Absolutely no problem about the name thing, thank you for asking. Marc
----- Original Message -----
> From: "Kenneth Heafield" <[email protected]>
> To: [email protected]
> Sent: Wednesday 24 August 2011 12:24:27
> Subject: Re: [Moses-support] Using Moses language models
>
> Sorry about the spam. Should have remembered you said to ignore
> PhraseDictionaryTree. FWIW, you can use std::auto_ptr from #include
> <memory> but that's set to be deprecated with C++0x.
>
> Merged your memory leak fix in a slightly different way. Also, since
> I'm merging part of the branch, do you mind if it says my name on the
> change but the commentary says you? Or you can teach me more svn...
>
> Kenneth
>
> On 08/24/11 11:19, Marc LEGENDRE wrote:
> > Yes I understood this from another discussion.
> > The point in PhraseDictionaryTree.cpp was just memory management.
> > (admittedly, to silence Valgrind; but hey, don't we all strive for
> > perfection? :-)
> >
> > I don't need this, I guess I should have removed it from my branch
> > if I wanted to merge.
> > It's done.
> >
> > ----- Original Message -----
> >> From: "Kenneth Heafield" <[email protected]>
> >> To: [email protected]
> >> Sent: Wednesday 24 August 2011 11:52:19
> >> Subject: Re: [Moses-support] Using Moses language models
> >>
> >> I support depending on Boost but sadly some people don't.
> >> PhraseDictionaryTree.cpp:3 in your branch includes a boost header.
> >>
> >> Kenneth
> >>
> >> On 08/24/11 10:17, Marc LEGENDRE wrote:
> >>> Hi,
> >>>
> >>> I merged the trunk into my branch; it looks ok.
> >>> May my little modification to LMKen.h/cpp be finally merged into
> >>> the trunk?
> >>> (not the useless changes to PhraseDictionaryTree)
> >>>
> >>> Thanks, (And sorry for being slow to respond, I hope you remember me!)
> >>>
> >>> Marc
> >>>
> >>> ----- Original Message -----
> >>>> From: "Hieu Hoang" <[email protected]>
> >>>> To: "Marc LEGENDRE" <[email protected]>
> >>>> Cc: "Kenneth Heafield" <[email protected]>,
> >>>> [email protected]
> >>>> Sent: Wednesday 27 July 2011 13:34:35
> >>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>
> >>>> hi marc,
> >>>>
> >>>> thx for the commits.
> >>>>
> >>>> the regression test failed probably because the decoder wasn't
> >>>> compiled with SRI or IRST LM, which some of the regression tests
> >>>> specify. I compiled your branch & it passes.
> >>>>
> >>>> I suppose for convenience, we should change it to use KenLM, with
> >>>> specific tests for IRST & SRI.
> >>>>
> >>>> On 25/07/2011 21:51, Marc LEGENDRE wrote:
> >>>>> Well, I actually committed in the augmLMResult branch.
> >>>>>
> >>>>> I inserted a class between LMKen and LMSingleFactor to prevent the
> >>>>> inclusion of kenlm headers.
> >>>>> (And yes, I now realize this may be the kind of thing you write in
> >>>>> a commit message)
> >>>>>
> >>>>> Since the LanguageModelKen.h header now contains functions I want
> >>>>> to use,
> >>>>> can we add it to the list of the installed files? (& How?)
> >>>>>
> >>>>>
> >>>>> Also, I can't get the regression tests to work.
> >>>>> I downloaded the test data & extracted it in /tmp; I read what
> >>>>> I found, and this is the command I came up with:
> >>>>> ./regression-testing/run-test-suite.pl
> >>>>> --decoder-phrase=moses-cmd/src/moses
> >>>>> --decoder-chart=moses-chart-cmd/src/moses_chart
> >>>>> But every test ends with a "MOSES CRASHED" message. (And the same
> >>>>> thing happens with the trunk build)
> >>>>> I tried to understand, and I noticed that .ini files for the tests
> >>>>> contain:
> >>>>> [lmodel-file]
> >>>>> 0 0 3 moses-reg-test-data-5/lm/europarl.en.srilm.gz
> >>>>>
> >>>>> Is that OK for kenlm?
> >>>>>
> >>>>> Marc
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>> From: "Kenneth Heafield" <[email protected]>
> >>>>>> To: "Marc LEGENDRE" <[email protected]>
> >>>>>> Cc: [email protected], [email protected]
> >>>>>> Sent: Friday 22 July 2011 20:18:21
> >>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>
> >>>>>> Hi Marc,
> >>>>>>
> >>>>>> This sounds like a simple change, so a branch is probably too much
> >>>>>> overhead. Please do one of the following:
> >>>>>>
> >>>>>> 1. Send a patch as generated by diff -rupN $old $new. Do a make
> >>>>>> clean first.
> >>>>>> 2. Attach the files you modified and send them, along with the
> >>>>>> revision you based changes on.
> >>>>>> 3. Make a branch (if you already did).
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Kenneth
> >>>>>>
> >>>>>> On 07/22/11 04:21, Marc LEGENDRE wrote:
> >>>>>>> Well, we (me and the people I work with) were hoping not to have
> >>>>>>> to maintain a modified version of Moses.
> >>>>>>>
> >>>>>>> Luckily, obviousness just hit me like a truck: if something is
> >>>>>>> specific to a LM, it does not have to be in the top layer.
> >>>>>>> Having a common interface does not prevent subclasses from having
> >>>>>>> a specific behaviour,
> >>>>>>> we could have a LanguageModelKen method, say
> >>>>>>> GetValueForgotStateKen(...) which would return
> >>>>>>> something specific, say a LMKenResult, which would contain a
> >>>>>>> LMResult plus other things
> >>>>>>> like, say, a ngram_length field :-).
> >>>>>>> And the virtual GetValueForgotState() method would simply return
> >>>>>>> the LMResult from there.
> >>>>>>>
> >>>>>>> This way, no need to break the high level API,
> >>>>>>> and no extra maintenance cost for us (me and the peop... Well, you
> >>>>>>> know).
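A minimal sketch of the layering Marc describes in the message above, not the actual Moses code. The names LMResult, LMKenResult, GetValueForgotState, GetValueForgotStateKen, and ngram_length come from the thread; the `score` field, the vector-of-strings context type, and the placeholder scoring are assumptions made only so the example compiles on its own.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the generic result; the real structure may differ.
// 'unknown' is mentioned in the thread, 'score' is an assumption.
struct LMResult {
  float score;
  bool unknown;
};

// KenLM-specific result: the generic result plus the matched n-gram length.
struct LMKenResult {
  LMResult base;
  unsigned char ngram_length;
};

// Simplified stand-in for the common interface used by the decoder.
class LanguageModelImplementation {
 public:
  virtual ~LanguageModelImplementation() {}
  virtual LMResult GetValueForgotState(
      const std::vector<std::string>& context) const = 0;
};

class LanguageModelKen : public LanguageModelImplementation {
 public:
  // Specific entry point for callers that know they hold a KenLM model.
  LMKenResult GetValueForgotStateKen(
      const std::vector<std::string>& context) const {
    // Placeholder scoring (assumption): pretend the whole context matched.
    LMKenResult result;
    result.base.score = -1.0f * static_cast<float>(context.size());
    result.base.unknown = context.empty();
    result.ngram_length = static_cast<unsigned char>(context.size());
    return result;
  }

  // The generic virtual method just drops the KenLM-only fields,
  // so the high-level API stays unchanged.
  LMResult GetValueForgotState(
      const std::vector<std::string>& context) const override {
    return GetValueForgotStateKen(context).base;
  }
};
```

Code that only sees LanguageModelImplementation keeps calling GetValueForgotState(), while code that holds a LanguageModelKen can call GetValueForgotStateKen() and read ngram_length.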
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> From: "Hieu Hoang" <[email protected]>
> >>>>>>>> To: "Kenneth Heafield" <[email protected]>
> >>>>>>>> Cc: [email protected]
> >>>>>>>> Sent: Friday 22 July 2011 04:50:14
> >>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> true, & there's no right answer to it.
> >>>>>>>>
> >>>>>>>> I suppose 1 goal of the trunk is to make sure that the core
> >>>>>>>> functionality of translating isn't affected too much, in terms of
> >>>>>>>> quality, speed, or memory. Another goal is to make sure not to
> >>>>>>>> overburden the API with things no-one else uses or implements.
> >>>>>>>>
> >>>>>>>> therefore, i think a good strategy is to branch & do what you like
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 21 July 2011 22:46, Kenneth Heafield <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Marc makes a good point. When one language model provides more
> >>>>>>>> information than do other language models, it's difficult to
> >>>>>>>> maintain a common abstraction layer. Currently we're looking at
> >>>>>>>> n-gram length. SRILM doesn't provide access to that (but you can
> >>>>>>>> get right-looking state length which is usually the same thing).
> >>>>>>>>
> >>>>>>>> I'm working on making this issue more severe with left-looking
> >>>>>>>> state optimization and explicit hypothesis bounds. How do we change
> >>>>>>>> the decoder to use these features if not all of the language models
> >>>>>>>> support them?
> >>>>>>>>
> >>>>>>>> Maybe another class in the language model hierarchy supporting
> >>>>>>>> these additional features. But it's going to make the decoder look
> >>>>>>>> ugly if you want to support both.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 07/21/11 11:14, Hieu Hoang wrote:
> >>>>>>>>> hi marc,
> >>>>>>>>>
> >>>>>>>>> it'll be good for people to see your changes.
> >>>>>>>>>
> >>>>>>>>> i suppose you should create a branch and make your changes in
> >>>>>>>>> there.
> >>>>>>>>>
> >>>>>>>>> If there are other people interested, you can point them to your
> >>>>>>>>> branch.
> >>>>>>>>> If more people are interested and it doesn't affect other people
> >>>>>>>>> too much, then we can move it to trunk.
> >>>>>>>>>
> >>>>>>>>> i'll email you offline with svn details
> >>>>>>>>>
> >>>>>>>>> On 21/07/2011 15:16, Marc LEGENDRE wrote:
> >>>>>>>>>> Alright, I gave this a try, and it did it for me.
> >>>>>>>>>> With kenlm, it is a ridiculously straightforward modification,
> >>>>>>>>>> but now I'm not sure how I can submit it:
> >>>>>>>>>> on one hand, I am not a "machine translation guy" and I don't
> >>>>>>>>>> imagine myself
> >>>>>>>>>> digging in every other LM to find how to set the ngram_length
> >>>>>>>>>> value;
> >>>>>>>>>> and on the other hand I would feel guilty to submit a 10-line
> >>>>>>>>>> patch and say
> >>>>>>>>>> "Guys, I need this, would you mind committing it and doing
> >>>>>>>>>> yourselves the
> >>>>>>>>>> necessary modifications in every other wrapper?"
> >>>>>>>>>>
> >>>>>>>>>> How do you, Moses developers, feel about this?
> >>>>>>>>>> Is it acceptable / outrageously stupid if I set the value to -1
> >>>>>>>>>> in the other wrappers,
> >>>>>>>>>> maybe with a TODO, and properly document it in the super class?
> >>>>>>>>>>
> >>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>> From: "Kenneth Heafield" <[email protected]>
> >>>>>>>>>>> To: [email protected]
> >>>>>>>>>>> Sent: Wednesday 13 July 2011 20:53:46
> >>>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>>>>
> >>>>>>>>>>> I'd suggest adding a ngram_length member to LMResult then
> >>>>>>>>>>> modifying each model's wrapper (or just mine) to set that value.
> >>>>>>>>>>>
> >>>>>>>>>>> You're welcome to move stuff from LanguageModelKen.cpp to
> >>>>>>>>>>> LanguageModelKen.h as necessary. I chose this setup to minimize
> >>>>>>>>>>> unnecessary includes.
> >>>>>>>>>>>
> >>>>>>>>>>> Kenneth
> >>>>>>>>>>>
> >>>>>>>>>>> On 07/13/11 14:33, Marc LEGENDRE wrote:
> >>>>>>>>>>>> Well, not only is the header not "public", so to speak, (which
> >>>>>>>>>>>> I agree is not a major obstacle)
> >>>>>>>>>>>> but also the desired pointer is a private member of the class,
> >>>>>>>>>>>> and sadly lacks a getter.
> >>>>>>>>>>>> As far as I know, it means that accessing it will involve
> >>>>>>>>>>>> questionable C++ tricks.
> >>>>>>>>>>>> (never tried, though)
> >>>>>>>>>>>>
> >>>>>>>>>>>> If modifying Moses is not too much of a chore, I'll give it a
> >>>>>>>>>>>> thought.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Anyway, thank you for your answers.
> >>>>>>>>>>>>
> >>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>> From: "Hieu Hoang" <[email protected]>
> >>>>>>>>>>>>> To: [email protected]
> >>>>>>>>>>>>> Sent: Wednesday 13 July 2011 18:40:11
> >>>>>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>>>>>> i guess lm::Model is specific to the ken lm implementation.
> >>>>>>>>>>>>> If you want to use it you should include the header yourself
> >>>>>>>>>>>>> and cast whatever you need to get the pointer.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> if you're feeling generous, maybe you can extend the moses LM
> >>>>>>>>>>>>> wrapper so that all LM implementations have the opportunity to
> >>>>>>>>>>>>> return the length of the n-gram match.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 13/07/2011 21:51, Marc LEGENDRE wrote:
> >>>>>>>>>>>>>> The length of the n-gram match is sufficient for what I want,
> >>>>>>>>>>>>>> indeed.
> >>>>>>>>>>>>>> I figured out how to get it using kenlm directly, but as I am
> >>>>>>>>>>>>>> running the decoder, I wanted to use the already loaded LM.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I first tried to dig my way through the Moses abstraction
> >>>>>>>>>>>>>> layers to retrieve a pointer to a lm::Model from kenlm, but the
> >>>>>>>>>>>>>> Moses::LanguageModelKen header is not part of the public
> >>>>>>>>>>>>>> headers of Moses; that's why I tried to use only the Moses
> >>>>>>>>>>>>>> interface.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> (I did not mention this alternative; if someone knows how to
> >>>>>>>>>>>>>> get such a pointer, I can carry on from there)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>>>> From: "Kenneth Heafield" <[email protected]>
> >>>>>>>>>>>>>>> To: "Marc LEGENDRE" <[email protected]>
> >>>>>>>>>>>>>>> Sent: Wednesday 13 July 2011 16:12:27
> >>>>>>>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>>>>>>>> The definition of unknown is that the word you asked for (the
> >>>>>>>>>>>>>>> rightmost one) is mapped to <unk> i.e. an OOV.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Are you looking for:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 1) Length of n-gram matched in the model
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> or
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 2) Length of state you must keep for valid continuation to the
> >>>>>>>>>>>>>>> right
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> These are slightly different things due to state minimization.
> >>>>>>>>>>>>>>> The moses abstraction layer does not return either in a
> >>>>>>>>>>>>>>> general way.
> >>>>>>>>>>>>>>> However, if you're using KenLM, #2 is in the returned state's
> >>>>>>>>>>>>>>> valid_length_. Further, #1 is in FullScoreReturn.ngram_length.
> >>>>>>>>>>>>>>> So if you call KenLM directly these are easy to obtain (and
> >>>>>>>>>>>>>>> you can decide whether to expose them through the Moses
> >>>>>>>>>>>>>>> abstraction layer).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Outside the decoder, you can run
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> kenlm/query model_file null
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> then provide your trigrams on stdin.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Here's an example with kenlm/query kenlm/lm/test.arpa null
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> looking on a
> >>>>>>>>>>>>>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
> >>>>>>>>>>>>>>> Total: -1.79818 OOV: 0
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The format is "word=vocab_id ngram_length score". So this is a
> >>>>>>>>>>>>>>> trigram in the model because "a=5 3" appears.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
> >>>>>>>>>>>>>>>> Hello,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I am trying to use the language models loaded by Moses;
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I am using a 3-gram LM, and I need to know whether it
> >>>>>>>>>>>>>>>> contains a given N-gram or not.
> >>>>>>>>>>>>>>>> I tried to play around with
> >>>>>>>>>>>>>>>> LanguageModelImplementation::GetValueForgotState(...),
> >>>>>>>>>>>>>>>> but the boolean 'unknown' in the returned structure does not
> >>>>>>>>>>>>>>>> seem to be what I'm looking for.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Is there any simple way of getting this piece of information?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>> Marc Legendre
> >>>>>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>>>>> Moses-support mailing list
> >>>>>>>>>>>>>>>> [email protected]
> >>>>>>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
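For readers who only need the check outside the decoder, here is a rough sketch of parsing the query output quoted in the thread, whose format Kenneth gives as "word=vocab_id ngram_length score". The names QueryToken, ParseQueryLine, and FullNGramMatched are hypothetical helpers, not part of KenLM or Moses, and the sketch assumes the line holds only the queried words, as in the quoted example, optionally followed by the "Total:" summary.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One "word=vocab_id ngram_length score" triple from the query output.
struct QueryToken {
  std::string word;
  unsigned long vocab_id;
  int ngram_length;
  double score;
};

// Parses triples until the input ends or a field without '=' is reached
// (for example the trailing "Total: ... OOV: ..." summary).
std::vector<QueryToken> ParseQueryLine(const std::string& line) {
  std::vector<QueryToken> tokens;
  std::istringstream in(line);
  std::string word_and_id;
  int ngram_length = 0;
  double score = 0.0;
  while (in >> word_and_id >> ngram_length >> score) {
    // Split on the last '=' so that words containing '=' still parse.
    const std::size_t eq = word_and_id.rfind('=');
    if (eq == std::string::npos) break;
    QueryToken token;
    token.word = word_and_id.substr(0, eq);
    token.vocab_id = std::stoul(word_and_id.substr(eq + 1));
    token.ngram_length = ngram_length;
    token.score = score;
    tokens.push_back(token);
  }
  return tokens;
}

// True when the last word was scored with an n-gram as long as the whole
// query, i.e. the complete n-gram is present in the model.
bool FullNGramMatched(const std::vector<QueryToken>& tokens) {
  return !tokens.empty() &&
         static_cast<std::size_t>(tokens.back().ngram_length) == tokens.size();
}
```

Applied to the example line from the thread, the last token is "a" with ngram_length 3 for a three-word query, which is the "a=5 3" condition Kenneth points out.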
