Yes, I understood this from another discussion. The point of the change in PhraseDictionaryTree.cpp was just memory management (admittedly, to silence Valgrind; but hey, don't we all strive for perfection? :-)
I don't need this; I guess I should have removed it from my branch if I wanted to merge. It's done.

----- Original message -----
> From: "Kenneth Heafield" <[email protected]>
> To: [email protected]
> Sent: Wednesday 24 August 2011 11:52:19
> Subject: Re: [Moses-support] Using Moses language models
>
> I support depending on Boost but sadly some people don't.
> PhraseDictionaryTree.cpp:3 in your branch includes a boost header.
>
> Kenneth
>
> On 08/24/11 10:17, Marc LEGENDRE wrote:
> > Hi,
> >
> > I merged the trunk into my branch; it looks ok.
> > May my little modification to LMKen.h/cpp finally be merged into
> > the trunk?
> > (not the useless changes to PhraseDictionaryTree)
> >
> > Thanks, (And sorry for my low reactivity, I hope you remember me!)
> >
> > Marc
> >
> > ----- Original message -----
> >> From: "Hieu Hoang" <[email protected]>
> >> To: "Marc LEGENDRE" <[email protected]>
> >> Cc: "Kenneth Heafield" <[email protected]>,
> >> [email protected]
> >> Sent: Wednesday 27 July 2011 13:34:35
> >> Subject: Re: [Moses-support] Using Moses language models
> >>
> >> hi marc,
> >>
> >> thx for the commits.
> >>
> >> the regression test failed probably because the decoder wasn't
> >> compiled with SRI or IRST LM, which some of the regression tests
> >> specify. I compiled your branch & it passes.
> >>
> >> I suppose for convenience, we should change it to use KenLM, with
> >> specific tests for IRST & SRI.
> >>
> >> On 25/07/2011 21:51, Marc LEGENDRE wrote:
> >>> Well, I actually committed in the augmLMResult branch.
> >>>
> >>> I inserted a class between LMKen and LMSingleFactor to prevent the
> >>> inclusion of kenlm headers.
> >>> (And yes, I now realize this may be the kind of thing you write in
> >>> a commit message)
> >>>
> >>> Since the LanguageModelKen.h header now contains functions I want
> >>> to use, can we add it to the list of the installed files? (And how?)
> >>>
> >>>
> >>> Also, I can't get the regression tests to work.
> >>> I downloaded the test data and extracted it in /tmp; I read what
> >>> I found, and this is the command I came up with:
> >>> ./regression-testing/run-test-suite.pl
> >>> --decoder-phrase=moses-cmd/src/moses
> >>> --decoder-chart=moses-chart-cmd/src/moses_chart
> >>> But every test ends with a "MOSES CRASHED" message. (And the same
> >>> thing happens with the trunk build)
> >>> I tried to understand, and I noticed that the .ini files for the
> >>> tests contain:
> >>> [lmodel-file]
> >>> 0 0 3 moses-reg-test-data-5/lm/europarl.en.srilm.gz
> >>>
> >>> Is that OK for kenlm?
> >>>
> >>> Marc
> >>>
> >>> ----- Original message -----
> >>>> From: "Kenneth Heafield"<[email protected]>
> >>>> To: "Marc LEGENDRE"<[email protected]>
> >>>> Cc: [email protected], [email protected]
> >>>> Sent: Friday 22 July 2011 20:18:21
> >>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>
> >>>> Hi Marc,
> >>>>
> >>>> This sounds like a simple change, so a branch is probably too much
> >>>> overhead. Please do one of the following:
> >>>>
> >>>> 1. Send a patch as generated by diff -rupN $old $new . Do a make
> >>>> clean first.
> >>>> 2. Attach the files you modified and send them, along with the
> >>>> revision you based changes on.
> >>>> 3. Make a branch (if you already did).
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Kenneth
> >>>>
> >>>> On 07/22/11 04:21, Marc LEGENDRE wrote:
> >>>>> Well, we (me and the people I work with) were hoping not to have
> >>>>> to maintain a modified version of Moses.
> >>>>>
> >>>>> Luckily, obviousness just hit me like a truck: if something is
> >>>>> specific to a LM, it does not have to be in the top layer.
> >>>>> Having a common interface does not prevent subclasses from having
> >>>>> a specific behaviour,
> >>>>> we could have a LanguageModelKen method, say
> >>>>> GetValueForgotStateKen(...)
> >>>>> which would return something specific, say a LMKenResult, which
> >>>>> would contain a LMResult plus other things like, say, a
> >>>>> ngram_length field :-).
> >>>>> And the virtual GetValueForgotState() method would simply return
> >>>>> the LMResult from there.
> >>>>>
> >>>>> This way, no need to break the high level API,
> >>>>> and no extra maintenance cost for us (me and the peop... Well, you
> >>>>> know).
> >>>>>
> >>>>> ----- Original message -----
> >>>>>> From: "Hieu Hoang"<[email protected]>
> >>>>>> To: "Kenneth Heafield"<[email protected]>
> >>>>>> Cc: [email protected]
> >>>>>> Sent: Friday 22 July 2011 04:50:14
> >>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>
> >>>>>>
> >>>>>> true, & there's no right answer to it.
> >>>>>>
> >>>>>> I suppose 1 goal of the trunk is to make sure that the core
> >>>>>> functionality of translating isn't affected too much, in terms of
> >>>>>> quality, speed, or memory. Another goal is to make sure not to
> >>>>>> overburden the API with things no-one else uses or implements.
> >>>>>>
> >>>>>> therefore, i think a good strategy is to branch & do what you like
> >>>>>>
> >>>>>>
> >>>>>> On 21 July 2011 22:46, Kenneth Heafield < [email protected] > wrote:
> >>>>>>
> >>>>>>
> >>>>>> Marc makes a good point. When one language model provides more
> >>>>>> information than do other language models, it's difficult to
> >>>>>> maintain a common abstraction layer. Currently we're looking at
> >>>>>> n-gram length. SRILM doesn't provide access to that (but you can
> >>>>>> get right-looking state length which is usually the same thing).
> >>>>>>
> >>>>>> I'm working on making this issue more severe with left-looking
> >>>>>> state optimization and explicit hypothesis bounds.
> >>>>>> How do we change the decoder to use these features if not all of
> >>>>>> the language models support them?
> >>>>>>
> >>>>>> Maybe another class in the language model hierarchy supporting
> >>>>>> these additional features. But it's going to make the decoder look
> >>>>>> ugly if you want to support both.
> >>>>>>
> >>>>>>
> >>>>>> On 07/21/11 11:14, Hieu Hoang wrote:
> >>>>>>> hi marc,
> >>>>>>>
> >>>>>>> it'll be good for people to see your changes.
> >>>>>>>
> >>>>>>> i suppose you should create a branch and make your changes in
> >>>>>>> there.
> >>>>>>>
> >>>>>>> If there are other people interested, you can point them to your
> >>>>>>> branch.
> >>>>>>> If more people are interested and it doesn't affect other people
> >>>>>>> too much, then we can move it to trunk.
> >>>>>>>
> >>>>>>> i'll email you offline with svn details
> >>>>>>>
> >>>>>>> On 21/07/2011 15:16, Marc LEGENDRE wrote:
> >>>>>>>> Alright, I gave this a try, and it did it for me.
> >>>>>>>> With kenlm, it is a ridiculously straightforward modification,
> >>>>>>>> but now I'm not sure how I can submit it:
> >>>>>>>> on one hand, I am not a "machine translation guy" and I don't
> >>>>>>>> imagine myself digging in every other LM to find how to set the
> >>>>>>>> ngram_length value;
> >>>>>>>> and on the other hand I would feel guilty to submit a 10-line
> >>>>>>>> patch and say
> >>>>>>>> "Guys, I need this, would you mind committing it and doing
> >>>>>>>> yourselves the necessary modifications in every other wrapper?"
> >>>>>>>>
> >>>>>>>> How do you, Moses developers, feel about this?
> >>>>>>>> Is it acceptable / outrageously stupid if I set the value to -1
> >>>>>>>> in the other wrappers, maybe with a TODO, and properly document
> >>>>>>>> it in the super class?
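A minimal sketch of the "-1 in the other wrappers" idea from the message above. The struct and function names here (LMResultSketch, MakeResultWithLength, MakeResultWithoutLength, HasNgramLength) are hypothetical stand-ins, not the actual Moses definitions; the real LMResult may have different members.

```cpp
#include <cassert>

// Hypothetical stand-in for the LMResult structure discussed in the thread.
// The sentinel is documented once, here, as suggested for the super class.
struct LMResultSketch {
  float score;
  bool unknown;
  // Length of the matched n-gram, or -1 when the wrapper cannot report it.
  int ngram_length;
  LMResultSketch() : score(0.0f), unknown(false), ngram_length(-1) {}
};

// What a wrapper that knows the matched length (e.g. a KenLM one) could do.
LMResultSketch MakeResultWithLength(float score, int matched_length) {
  LMResultSketch r;
  r.score = score;
  r.ngram_length = matched_length;
  return r;
}

// What the other wrappers could do: leave the sentinel untouched.
// TODO-style gap: these wrappers would need their own way to get the length.
LMResultSketch MakeResultWithoutLength(float score) {
  LMResultSketch r;
  r.score = score;
  return r;
}

// Callers must check the sentinel before trusting the field.
bool HasNgramLength(const LMResultSketch &r) {
  return r.ngram_length >= 0;
}
```

The cost of this approach is that every caller has to test for the sentinel, which is one reason the thread later moves towards a KenLM-specific result type instead.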
> >>>>>>>>
> >>>>>>>> ----- Original message -----
> >>>>>>>>> From: "Kenneth Heafield"< [email protected] >
> >>>>>>>>> To: [email protected]
> >>>>>>>>> Sent: Wednesday 13 July 2011 20:53:46
> >>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>>
> >>>>>>>>> I'd suggest adding a ngram_length member to LMResult then
> >>>>>>>>> modifying each model's wrapper (or just mine) to set that value.
> >>>>>>>>>
> >>>>>>>>> You're welcome to move stuff from LanguageModelKen.cpp to
> >>>>>>>>> LanguageModelKen.h as necessary. I chose this setup to minimize
> >>>>>>>>> unnecessary includes.
> >>>>>>>>>
> >>>>>>>>> Kenneth
> >>>>>>>>>
> >>>>>>>>> On 07/13/11 14:33, Marc LEGENDRE wrote:
> >>>>>>>>>> Well, not only is the header not "public", so to speak, (which
> >>>>>>>>>> I agree is not a major obstacle)
> >>>>>>>>>> but also the desired pointer is a private member of the class,
> >>>>>>>>>> and sadly lacks a getter.
> >>>>>>>>>> As far as I know, it means that accessing it will involve
> >>>>>>>>>> questionable C++ tricks.
> >>>>>>>>>> (never tried, though)
> >>>>>>>>>>
> >>>>>>>>>> If modifying Moses is not too much of a chore, I'll give it a
> >>>>>>>>>> thought.
> >>>>>>>>>>
> >>>>>>>>>> Anyway, thank you for your answers.
> >>>>>>>>>>
> >>>>>>>>>> ----- Original message -----
> >>>>>>>>>>> From: "Hieu Hoang"< [email protected] >
> >>>>>>>>>>> To: [email protected]
> >>>>>>>>>>> Sent: Wednesday 13 July 2011 18:40:11
> >>>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>>>> i guess lm::Model is specific to the ken lm implementation.
> >>>>>>>>>>> If you want to use it you should include the header yourself
> >>>>>>>>>>> and cast whatever you need to get the pointer.
> >>>>>>>>>>>
> >>>>>>>>>>> if you're feeling generous, maybe you can extend the moses LM
> >>>>>>>>>>> wrapper so that all LM implementations have the opportunity to
> >>>>>>>>>>> return the length of the n-gram match.
> >>>>>>>>>>>
> >>>>>>>>>>> On 13/07/2011 21:51, Marc LEGENDRE wrote:
> >>>>>>>>>>>> The length of the n-gram match is sufficient for what I want,
> >>>>>>>>>>>> indeed.
> >>>>>>>>>>>> I figured out how to get it using kenlm directly, but as I am
> >>>>>>>>>>>> running the decoder, I wanted to use the already loaded LM.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I first tried to dig my way through the Moses abstraction
> >>>>>>>>>>>> layers to retrieve a pointer to a lm::Model from kenlm, but the
> >>>>>>>>>>>> Moses::LanguageModelKen header is not part of the public
> >>>>>>>>>>>> headers of Moses; that's why I tried to use only the Moses
> >>>>>>>>>>>> interface.
> >>>>>>>>>>>>
> >>>>>>>>>>>> (I did not mention this alternative; if someone knows how to
> >>>>>>>>>>>> get such a pointer, I can carry on from there)
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> ----- Original message -----
> >>>>>>>>>>>>> From: "Kenneth Heafield"< [email protected] >
> >>>>>>>>>>>>> To: "Marc LEGENDRE"< [email protected] >
> >>>>>>>>>>>>> Sent: Wednesday 13 July 2011 16:12:27
> >>>>>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>>>>>> The definition of unknown is that the word you asked for (the
> >>>>>>>>>>>>> rightmost one) is mapped to <unk> i.e. an OOV.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Are you looking for:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 1) Length of n-gram matched in the model
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> or
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 2) Length of state you must keep for valid continuation to the
> >>>>>>>>>>>>> right
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> These are slightly different things due to state minimization.
> >>>>>>>>>>>>> The moses abstraction layer does not return either in a general
> >>>>>>>>>>>>> way. However, if you're using KenLM, #2 is in the returned
> >>>>>>>>>>>>> state's valid_length_. Further, #1 is in
> >>>>>>>>>>>>> FullScoreReturn.ngram_length. So if you call KenLM directly
> >>>>>>>>>>>>> these are easy to obtain (and you can decide whether to expose
> >>>>>>>>>>>>> them through the Moses abstraction layer).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Outside the decoder, you can run
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> kenlm/query model_file null
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> then provide your trigrams on stdin.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Here's an example with kenlm/query kenlm/lm/test.arpa null
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> looking on a
> >>>>>>>>>>>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
> >>>>>>>>>>>>> Total: -1.79818 OOV: 0
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The format is "word=vocab_id ngram_length score". So this is a
> >>>>>>>>>>>>> trigram in the model because "a=5 3" appears.
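The output format described in the message above ("word=vocab_id ngram_length score") can be checked mechanically. The following is only a sketch under that assumption: QueryToken, ParseQueryLine and FullNgramMatched are hypothetical helper names, not part of KenLM, and other versions of the query tool may print a different layout.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One "word=vocab_id ngram_length score" triple from the query output.
struct QueryToken {
  std::string word;
  int vocab_id;
  int ngram_length;
  double score;
};

// Parse the per-word line of the output; stops early at a "Total:" field
// or at anything that does not look like "word=vocab_id".
std::vector<QueryToken> ParseQueryLine(const std::string &line) {
  std::vector<QueryToken> tokens;
  std::istringstream in(line);
  std::string field;
  while (in >> field && field != "Total:") {
    std::string::size_type eq = field.rfind('=');
    if (eq == std::string::npos) break;
    QueryToken t;
    t.word = field.substr(0, eq);
    t.vocab_id = std::stoi(field.substr(eq + 1));
    if (!(in >> t.ngram_length >> t.score)) break;
    tokens.push_back(t);
  }
  return tokens;
}

// True when the n-gram matched at the last word is at least as long as the
// whole input, i.e. the queried n-gram itself is in the model.
bool FullNgramMatched(const std::vector<QueryToken> &tokens) {
  return !tokens.empty() &&
         tokens.back().ngram_length >= static_cast<int>(tokens.size());
}
```

For the example line above, ParseQueryLine yields three tokens and FullNgramMatched returns true, because the last token "a" reports an n-gram length of 3 for a three-word query.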
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
> >>>>>>>>>>>>>> Hello,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I am trying to use the language models loaded by Moses;
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I am using a 3-gram LM, and I need to know whether it contains
> >>>>>>>>>>>>>> a given N-gram or not.
> >>>>>>>>>>>>>> I tried to play around with
> >>>>>>>>>>>>>> LanguageModelImplementation::GetValueForgotState(...),
> >>>>>>>>>>>>>> but the boolean 'unknown' in the returned structure does not
> >>>>>>>>>>>>>> seem to be what I'm looking for.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Is there any simple way of getting this piece of information?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Marc Legendre
> >>>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>>> Moses-support mailing list
> >>>>>>>>>>>>>> [email protected]
> >>>>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
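For reference, the design proposed in the 22 July message of this thread (a KenLM-specific method returning a richer result, with the generic virtual method returning only the common part) could look roughly like the sketch below. Class names ending in Sketch, the struct members, and the single int parameter are assumptions made for illustration; the real Moses signatures take different arguments, and the scores here are placeholders, not language model queries.

```cpp
#include <cassert>

// Hypothetical stand-ins named after the thread; members are assumptions.
struct LMResult {
  float score;
  bool unknown;
};

// The common result plus the KenLM-only extra information.
struct LMKenResult {
  LMResult base;
  int ngram_length;
};

// Common interface: every language model returns only the shared LMResult.
class LanguageModelSketch {
 public:
  virtual ~LanguageModelSketch() {}
  virtual LMResult GetValueForgotState(int word_count) const = 0;
};

class LanguageModelKenSketch : public LanguageModelSketch {
 public:
  // KenLM-specific entry point; only callers that know they hold a
  // KenLM model use it, so the high-level API is left unchanged.
  LMKenResult GetValueForgotStateKen(int word_count) const {
    LMKenResult r;
    r.base.score = -1.0f * word_count;  // placeholder, not a real LM score
    r.base.unknown = false;
    r.ngram_length = word_count;  // placeholder: pretend everything matched
    return r;
  }

  // The generic method simply drops the extra information.
  virtual LMResult GetValueForgotState(int word_count) const {
    return GetValueForgotStateKen(word_count).base;
  }
};
```

Code that only sees the base class keeps working unchanged, while code that holds a LanguageModelKenSketch can call GetValueForgotStateKen to read ngram_length.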
