Hi, I merged the trunk into my branch, and it looks OK. Can my small modification to LMKen.h/cpp finally be merged into the trunk? (Not the useless changes to PhraseDictionaryTree.)
Thanks,
(And sorry for my slow responses, I hope you remember me!)

Marc

----- Original message -----
> From: "Hieu Hoang" <[email protected]>
> To: "Marc LEGENDRE" <[email protected]>
> Cc: "Kenneth Heafield" <[email protected]>, [email protected]
> Sent: Wednesday 27 July 2011 13:34:35
> Subject: Re: [Moses-support] Using Moses language models
>
> hi marc,
>
> thx for the commits.
>
> the regression test failed probably because the decoder wasn't compiled
> with SRI or IRST LM, which some of the regression tests specify. I
> compiled your branch & it passes.
>
> I suppose for convenience, we should change it to use KenLM, with
> specific tests for IRST & SRI.
>
> On 25/07/2011 21:51, Marc LEGENDRE wrote:
> > Well, I actually committed in the augmLMResult branch.
> >
> > I inserted a class between LMKen and LMSingleFactor to prevent the
> > inclusion of kenlm headers.
> > (And yes, I now realize this may be the kind of thing you write in
> > a commit message)
> >
> > Since the LanguageModelKen.h header now contains functions I want
> > to use, can we add it to the list of the installed files? (& how?)
> >
> >
> > Also, I can't get the regression tests to work.
> > I downloaded the test data & extracted it in /tmp; I read what
> > I found, and this is the command I came up with:
> > ./regression-testing/run-test-suite.pl
> > --decoder-phrase=moses-cmd/src/moses
> > --decoder-chart=moses-chart-cmd/src/moses_chart
> > But every test ends with a "MOSES CRASHED" message. (And the same
> > thing happens with the trunk build)
> > I tried to understand, and I noticed that the .ini files for the
> > tests contain:
> > [lmodel-file]
> > 0 0 3 moses-reg-test-data-5/lm/europarl.en.srilm.gz
> >
> > Is that OK for kenlm?
> >
> > Marc
> >
> > ----- Original message -----
> >> From: "Kenneth Heafield" <[email protected]>
> >> To: "Marc LEGENDRE" <[email protected]>
> >> Cc: [email protected], [email protected]
> >> Sent: Friday 22 July 2011 20:18:21
> >> Subject: Re: [Moses-support] Using Moses language models
> >>
> >> Hi Marc,
> >>
> >> This sounds like a simple change, so a branch is probably too much
> >> overhead. Please do one of the following:
> >>
> >> 1. Send a patch as generated by diff -rupN $old $new . Do a make
> >>    clean first.
> >> 2. Attach the files you modified and send them, along with the
> >>    revision you based changes on.
> >> 3. Make a branch (if you already did).
> >>
> >> Thanks,
> >>
> >> Kenneth
> >>
> >> On 07/22/11 04:21, Marc LEGENDRE wrote:
> >>> Well, we (me and the people I work with) were hoping not to have
> >>> to maintain a modified version of Moses.
> >>>
> >>> Luckily, obviousness just hit me like a truck: if something is
> >>> specific to a LM, it does not have to be in the top layer.
> >>> Having a common interface does not prevent subclasses from having
> >>> a specific behaviour:
> >>> we could have a LanguageModelKen method, say
> >>> GetValueForgotStateKen(...), which would return
> >>> something specific, say a LMKenResult, which would contain a
> >>> LMResult plus other things
> >>> like, say, a ngram_length field :-).
> >>> And the virtual GetValueForgotState() method would simply return
> >>> the LMResult from there.
> >>>
> >>> This way, no need to break the high level API,
> >>> and no extra maintenance cost for us (me and the peop... Well,
> >>> you know).
> >>>
> >>> ----- Original message -----
> >>>> From: "Hieu Hoang" <[email protected]>
> >>>> To: "Kenneth Heafield" <[email protected]>
> >>>> Cc: [email protected]
> >>>> Sent: Friday 22 July 2011 04:50:14
> >>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>
> >>>>
> >>>> true, & there's no right answer to it.
> >>>>
> >>>> I suppose 1 goal of the trunk is to make sure that the core
> >>>> functionality of translating isn't affected too much, in terms
> >>>> of quality, speed, or memory. Another goal is not to overburden
> >>>> the API with things no-one else uses or implements.
> >>>>
> >>>> therefore, i think a good strategy is to branch & do what you
> >>>> like
> >>>>
> >>>>
> >>>> On 21 July 2011 22:46, Kenneth Heafield < [email protected] >
> >>>> wrote:
> >>>>
> >>>>
> >>>> Marc makes a good point. When one language model provides more
> >>>> information than do other language models, it's difficult to
> >>>> maintain a common abstraction layer. Currently we're looking at
> >>>> n-gram length. SRILM doesn't provide access to that (but you can
> >>>> get right-looking state length which is usually the same thing).
> >>>>
> >>>> I'm working on making this issue more severe with left-looking
> >>>> state optimization and explicit hypothesis bounds. How do we
> >>>> change the decoder to use these features if not all of the
> >>>> language models support them?
> >>>>
> >>>> Maybe another class in the language model hierarchy supporting
> >>>> these additional features. But it's going to make the decoder
> >>>> look ugly if you want to support both.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 07/21/11 11:14, Hieu Hoang wrote:
> >>>>> hi marc,
> >>>>>
> >>>>> it'll be good for people to see your changes.
> >>>>>
> >>>>> i suppose you should create a branch and make your changes in
> >>>>> there.
> >>>>>
> >>>>> If there are other people interested, you can point them to
> >>>>> your branch.
> >>>>> If more people are interested and it doesn't affect other
> >>>>> people too much, then we can move it to trunk.
> >>>>>
> >>>>> i'll email you offline with svn details
> >>>>>
> >>>>> On 21/07/2011 15:16, Marc LEGENDRE wrote:
> >>>>>> Alright, I gave this a try, and it did it for me.
> >>>>>> With kenlm, it is a ridiculously straightforward modification,
> >>>>>> but now I'm not sure how I can submit it:
> >>>>>> on one hand, I am not a "machine translation guy" and I don't
> >>>>>> imagine myself
> >>>>>> digging in every other LM to find how to set the ngram_length
> >>>>>> value;
> >>>>>> and on the other hand I would feel guilty to submit a 10-line
> >>>>>> patch and say
> >>>>>> "Guys, I need this, would you mind committing it and making
> >>>>>> the necessary modifications in every other wrapper
> >>>>>> yourselves?"
> >>>>>>
> >>>>>> How do you, Moses developers, feel about this?
> >>>>>> Is it acceptable / outrageously stupid if I set the value to
> >>>>>> -1 in the other wrappers,
> >>>>>> maybe with a TODO, and properly document it in the super
> >>>>>> class?
> >>>>>>
> >>>>>> ----- Original message -----
> >>>>>>> From: "Kenneth Heafield" < [email protected] >
> >>>>>>> To: [email protected]
> >>>>>>> Sent: Wednesday 13 July 2011 20:53:46
> >>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>
> >>>>>>> I'd suggest adding a ngram_length member to LMResult then
> >>>>>>> modifying each model's wrapper (or just mine) to set that
> >>>>>>> value.
> >>>>>>>
> >>>>>>> You're welcome to move stuff from LanguageModelKen.cpp to
> >>>>>>> LanguageModelKen.h as necessary. I chose this setup to
> >>>>>>> minimize unnecessary includes.
> >>>>>>>
> >>>>>>> Kenneth
> >>>>>>>
> >>>>>>> On 07/13/11 14:33, Marc LEGENDRE wrote:
> >>>>>>>> Well, not only is the header not "public", so to speak
> >>>>>>>> (which I agree is not a major obstacle),
> >>>>>>>> but also the desired pointer is a private member of the
> >>>>>>>> class, and sadly lacks a getter.
> >>>>>>>> As far as I know, it means that accessing it will involve
> >>>>>>>> questionable C++ tricks.
> >>>>>>>> (never tried, though)
> >>>>>>>>
> >>>>>>>> If modifying Moses is not too much of a chore, I'll give it
> >>>>>>>> a thought.
> >>>>>>>>
> >>>>>>>> Anyway, thank you for your answers.
> >>>>>>>>
> >>>>>>>> ----- Original message -----
> >>>>>>>>> From: "Hieu Hoang" < [email protected] >
> >>>>>>>>> To: [email protected]
> >>>>>>>>> Sent: Wednesday 13 July 2011 18:40:11
> >>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>> i guess lm::Model is specific to the ken lm implementation.
> >>>>>>>>> If you want to use it you should include the header
> >>>>>>>>> yourself and cast whatever you need to get the pointer.
> >>>>>>>>>
> >>>>>>>>> if you're feeling generous, maybe you can extend the moses
> >>>>>>>>> LM wrapper so that all LM implementations have the
> >>>>>>>>> opportunity to return the length of the n-gram match.
> >>>>>>>>>
> >>>>>>>>> On 13/07/2011 21:51, Marc LEGENDRE wrote:
> >>>>>>>>>> The length of the n-gram match is sufficient for what I
> >>>>>>>>>> want, indeed.
> >>>>>>>>>> I figured out how to get it using kenlm directly, but as
> >>>>>>>>>> I am running the decoder, I wanted to use the already
> >>>>>>>>>> loaded LM.
> >>>>>>>>>>
> >>>>>>>>>> I first tried to dig my way through the Moses abstraction
> >>>>>>>>>> layers to retrieve a pointer to a lm::Model from kenlm,
> >>>>>>>>>> but the Moses::LanguageModelKen header is not part of the
> >>>>>>>>>> public headers of Moses; that's why I tried to use only
> >>>>>>>>>> the Moses interface.
> >>>>>>>>>>
> >>>>>>>>>> (I did not mention this alternative; if someone knows how
> >>>>>>>>>> to get such a pointer, I can carry on from there)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> ----- Original message -----
> >>>>>>>>>>> From: "Kenneth Heafield" < [email protected] >
> >>>>>>>>>>> To: "Marc LEGENDRE" < [email protected] >
> >>>>>>>>>>> Sent: Wednesday 13 July 2011 16:12:27
> >>>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>>>> The definition of unknown is that the word you asked for
> >>>>>>>>>>> (the rightmost one) is mapped to <unk> i.e. an OOV.
> >>>>>>>>>>>
> >>>>>>>>>>> Are you looking for:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) Length of n-gram matched in the model
> >>>>>>>>>>>
> >>>>>>>>>>> or
> >>>>>>>>>>>
> >>>>>>>>>>> 2) Length of state you must keep for valid continuation
> >>>>>>>>>>> to the right
> >>>>>>>>>>>
> >>>>>>>>>>> These are slightly different things due to state
> >>>>>>>>>>> minimization. The moses abstraction layer does not return
> >>>>>>>>>>> either in a general way.
> >>>>>>>>>>> However, if you're using KenLM, #2 is in the returned
> >>>>>>>>>>> state's valid_length_. Further, #1 is in
> >>>>>>>>>>> FullScoreReturn.ngram_length. So if you call KenLM
> >>>>>>>>>>> directly these are easy to obtain (and you can decide
> >>>>>>>>>>> whether to expose them through the Moses abstraction
> >>>>>>>>>>> layer).
> >>>>>>>>>>>
> >>>>>>>>>>> Outside the decoder, you can run
> >>>>>>>>>>>
> >>>>>>>>>>> kenlm/query model_file null
> >>>>>>>>>>>
> >>>>>>>>>>> then provide your trigrams on stdin.
> >>>>>>>>>>>
> >>>>>>>>>>> Here's an example with kenlm/query kenlm/lm/test.arpa
> >>>>>>>>>>> null
> >>>>>>>>>>>
> >>>>>>>>>>> looking on a
> >>>>>>>>>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
> >>>>>>>>>>> Total: -1.79818 OOV: 0
> >>>>>>>>>>>
> >>>>>>>>>>> The format is "word=vocab_id ngram_length score". So this
> >>>>>>>>>>> is a trigram in the model because "a=5 3" appears.
> >>>>>>>>>>>
> >>>>>>>>>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
> >>>>>>>>>>>> Hello,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am trying to use the language models loaded by Moses;
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am using a 3-gram LM, and I need to know whether it
> >>>>>>>>>>>> contains a given N-gram or not.
> >>>>>>>>>>>> I tried to play around with
> >>>>>>>>>>>> LanguageModelImplementation::GetValueForgotState(...),
> >>>>>>>>>>>> but the boolean 'unknown' in the returned structure does
> >>>>>>>>>>>> not seem to be what I'm looking for.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is there any simple way of getting this piece of
> >>>>>>>>>>>> information?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Marc Legendre
> >>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>> Moses-support mailing list
> >>>>>>>>>>>> [email protected]
> >>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
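
For readers following the thread, the design described in the quoted message of 22 July (a KenLM-specific result that wraps the generic one, with the virtual method simply returning the generic part) might look roughly like the sketch below. It is only an illustration under assumptions: the struct layouts, the LanguageModelBase and LanguageModelKenSketch class names, the std::vector<std::string> context type and the placeholder scoring are all hypothetical and are not the actual Moses or KenLM code. Only the names LMResult, LMKenResult, ngram_length, GetValueForgotState and GetValueForgotStateKen come from the discussion above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the generic result shared by all LM wrappers.
struct LMResult {
  float score;
  bool unknown;
};

// Hypothetical KenLM-specific result: the generic part plus extra fields.
struct LMKenResult {
  LMResult base;
  unsigned char ngram_length;  // length of the n-gram matched in the model
};

// Simplified stand-in for the common language model interface.
class LanguageModelBase {
 public:
  virtual ~LanguageModelBase() {}
  virtual LMResult GetValueForgotState(
      const std::vector<std::string>& context) const = 0;
};

// Sketch of a subclass that exposes extra information without touching
// the common interface.
class LanguageModelKenSketch : public LanguageModelBase {
 public:
  // Specific entry point. The scoring here is a placeholder, not a real
  // KenLM query: it pretends every context word matched, up to order 3.
  LMKenResult GetValueForgotStateKen(
      const std::vector<std::string>& context) const {
    const std::size_t matched = context.size() < 3 ? context.size() : 3;
    LMKenResult result;
    result.base.score = -1.0f * static_cast<float>(context.size());
    result.base.unknown = context.empty();
    result.ngram_length = static_cast<unsigned char>(matched);
    return result;
  }

  // The generic virtual method just forwards and drops the extra fields.
  LMResult GetValueForgotState(
      const std::vector<std::string>& context) const override {
    return GetValueForgotStateKen(context).base;
  }
};
```

Code that only knows the common interface keeps calling GetValueForgotState(), while code that holds a LanguageModelKenSketch can call GetValueForgotStateKen() and read ngram_length, so the other wrappers need no change.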
