Hi Marc, thanks for the commits.

The regression tests probably failed because the decoder wasn't compiled with SRI or IRST LM, which some of the regression tests specify. I compiled your branch and it passes. For convenience, I suppose we should change the tests to use KenLM, with specific tests for IRST and SRI.

On 25/07/2011 21:51, Marc LEGENDRE wrote:
> Well, I actually committed in the augmLMResult branch.
>
> I inserted a class between LMKen and LMSingleFactor to prevent the inclusion
> of kenlm headers.
> (And yes, I now realize this may be the kind of thing you write in a commit
> message)
>
> Since the LanguageModelKen.h header now contains functions I want to use,
> can we add it to the list of the installed files? (And how?)
>
> Also, I can't get the regression tests to work.
> I downloaded the test data and extracted it in /tmp; I read what I found,
> and this is the command I came up with:
> ./regression-testing/run-test-suite.pl --decoder-phrase=moses-cmd/src/moses --decoder-chart=moses-chart-cmd/src/moses_chart
> But every test ends with a "MOSES CRASHED" message. (And the same thing
> happens with the trunk build)
> I tried to understand, and I noticed that the .ini files for the tests contain:
> [lmodel-file]
> 0 0 3 moses-reg-test-data-5/lm/europarl.en.srilm.gz
>
> Is that OK for kenlm?
>
> Marc
>
> ----- Original Message -----
>> From: "Kenneth Heafield"<[email protected]>
>> To: "Marc LEGENDRE"<[email protected]>
>> Cc:[email protected],[email protected]
>> Sent: Friday, 22 July 2011 20:18:21
>> Subject: Re: [Moses-support] Using Moses language models
>>
>> Hi Marc,
>>
>> This sounds like a simple change, so a branch is probably too much
>> overhead. Please do one of the following:
>>
>> 1. Send a patch as generated by diff -rupN $old $new . Do a make clean
>> first.
>> 2. Attach the files you modified and send them, along with the revision
>> you based changes on.
>> 3. Make a branch (if you already did).
>>
>> Thanks,
>>
>> Kenneth
>>
>> On 07/22/11 04:21, Marc LEGENDRE wrote:
>>> Well, we (me and the people I work with) were hoping not to have to
>>> maintain a modified version of Moses.
>>>
>>> Luckily, obviousness just hit me like a truck: if something is
>>> specific to a LM, it does not have to be in the top layer.
>>> Having a common interface does not prevent subclasses from having
>>> specific behaviour. We could have a LanguageModelKen method, say
>>> GetValueForgotStateKen(...), which would return something specific,
>>> say a LMKenResult, which would contain a LMResult plus other things
>>> like, say, a ngram_length field :-).
>>> And the virtual GetValueForgotState() method would simply return
>>> the LMResult from there.
>>>
>>> This way, no need to break the high level API,
>>> and no extra maintenance cost for us (me and the peop... Well, you
>>> know).
>>>
>>> ----- Original Message -----
>>>> From: "Hieu Hoang"<[email protected]>
>>>> To: "Kenneth Heafield"<[email protected]>
>>>> Cc:[email protected]
>>>> Sent: Friday, 22 July 2011 04:50:14
>>>> Subject: Re: [Moses-support] Using Moses language models
>>>>
>>>> true, & there's no right answer to it.
>>>>
>>>> I suppose 1 goal of the trunk is to make sure that the core
>>>> functionality of translating isn't affected too much, in terms of
>>>> quality, speed, or memory. Another goal is to make sure not to
>>>> overburden the API with things no-one else uses or implements.
>>>>
>>>> therefore, i think a good strategy is to branch & do what you like
>>>>
>>>> On 21 July 2011 22:46, Kenneth Heafield< [email protected] > wrote:
>>>>
>>>> Marc makes a good point. When one language model provides more
>>>> information than do other language models, it's difficult to
>>>> maintain a common abstraction layer. Currently we're looking at
>>>> n-gram length.
>>>> SRILM doesn't provide access to that (but you can get right-looking
>>>> state length which is usually the same thing).
>>>>
>>>> I'm working on making this issue more severe with left-looking state
>>>> optimization and explicit hypothesis bounds. How do we change the
>>>> decoder to use these features if not all of the language models
>>>> support them?
>>>>
>>>> Maybe another class in the language model hierarchy supporting these
>>>> additional features. But it's going to make the decoder look ugly if
>>>> you want to support both.
>>>>
>>>> On 07/21/11 11:14, Hieu Hoang wrote:
>>>>> hi marc,
>>>>>
>>>>> it'll be good for people to see your changes.
>>>>>
>>>>> i suppose you should create a branch and make your changes in there.
>>>>>
>>>>> If there are other people interested, you can point them to your
>>>>> branch.
>>>>> If more people are interested and it doesn't affect other people too
>>>>> much, then we can move it to trunk.
>>>>>
>>>>> i'll email you offline with svn details
>>>>>
>>>>> On 21/07/2011 15:16, Marc LEGENDRE wrote:
>>>>>> Alright, I gave this a try, and it did it for me.
>>>>>> With kenlm, it is a ridiculously straightforward modification,
>>>>>> but now I'm not sure how I can submit it:
>>>>>> on one hand, I am not a "machine translation guy" and I don't
>>>>>> imagine myself digging in every other LM to find how to set the
>>>>>> ngram_length value;
>>>>>> and on the other hand I would feel guilty to submit a 10-line
>>>>>> patch and say
>>>>>> "Guys, I need this, would you mind committing it and doing
>>>>>> yourselves the necessary modifications in every other wrapper?"
>>>>>>
>>>>>> How do you, Moses developers, feel about this?
>>>>>> Is it acceptable / outrageously stupid if I set the value to -1 in
>>>>>> the other wrappers,
>>>>>> maybe with a TODO, and properly document it in the super class?
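For what it's worth, the "-1 in the other wrappers" idea might look roughly like the sketch below. It is only an illustration under assumptions: LMResult here is a simplified stand-in rather than the actual Moses struct, and kNGramLengthUnknown, ScoreWithoutLength, ScoreWithLength and HasFullMatch are hypothetical names that do not exist in Moses or KenLM.

```cpp
#include <cassert>

// Hypothetical sentinel: the wrapper cannot report a matched n-gram length.
const int kNGramLengthUnknown = -1;

// Simplified stand-in for the result struct discussed in the thread.
struct LMResult {
  float score = 0.0f;
  bool unknown = false;                    // rightmost word mapped to <unk>
  int ngram_length = kNGramLengthUnknown;  // length of the matched n-gram
};

// Wrapper for a toolkit that does not expose the matched length.
// TODO: set ngram_length once the backend offers it.
LMResult ScoreWithoutLength(float score, bool is_oov) {
  LMResult r;
  r.score = score;
  r.unknown = is_oov;
  return r;  // ngram_length stays at the sentinel
}

// Wrapper for a toolkit that does report the matched length.
LMResult ScoreWithLength(float score, bool is_oov, int matched_length) {
  assert(matched_length >= 0);
  LMResult r;
  r.score = score;
  r.unknown = is_oov;
  r.ngram_length = matched_length;
  return r;
}

// Caller side: true only when a length was reported and it equals the order.
bool HasFullMatch(const LMResult &r, int order) {
  return r.ngram_length != kNGramLengthUnknown && r.ngram_length == order;
}
```

A caller that needs the match length would then have to read -1 as "not available" rather than "no match", which is the part that would need documenting in the super class.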
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "Kenneth Heafield"< [email protected] >
>>>>>>> To:[email protected]
>>>>>>> Sent: Wednesday, 13 July 2011 20:53:46
>>>>>>> Subject: Re: [Moses-support] Using Moses language models
>>>>>>>
>>>>>>> I'd suggest adding a ngram_length member to LMResult then modifying
>>>>>>> each model's wrapper (or just mine) to set that value.
>>>>>>>
>>>>>>> You're welcome to move stuff from LanguageModelKen.cpp to
>>>>>>> LanguageModelKen.h as necessary. I chose this setup to minimize
>>>>>>> unnecessary includes.
>>>>>>>
>>>>>>> Kenneth
>>>>>>>
>>>>>>> On 07/13/11 14:33, Marc LEGENDRE wrote:
>>>>>>>> Well, not only is the header not "public", so to speak (which I
>>>>>>>> agree is not a major obstacle),
>>>>>>>> but also the desired pointer is a private member of the class, and
>>>>>>>> sadly lacks a getter.
>>>>>>>> As far as I know, it means that accessing it will involve
>>>>>>>> questionable C++ tricks.
>>>>>>>> (never tried, though)
>>>>>>>>
>>>>>>>> If modifying Moses is not too much of a chore, I'll give it a
>>>>>>>> thought.
>>>>>>>>
>>>>>>>> Anyway, thank you for your answers.
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Hieu Hoang"< [email protected] >
>>>>>>>>> To:[email protected]
>>>>>>>>> Sent: Wednesday, 13 July 2011 18:40:11
>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
>>>>>>>>> i guess lm::Model is specific to the ken lm implementation. If you
>>>>>>>>> want to use it you should include the header yourself and cast
>>>>>>>>> whatever you need to get the pointer.
>>>>>>>>>
>>>>>>>>> if you're feeling generous, maybe you can extend the moses LM
>>>>>>>>> wrapper so that all LM implementations have the opportunity to
>>>>>>>>> return the length of the n-gram match.
>>>>>>>>>
>>>>>>>>> On 13/07/2011 21:51, Marc LEGENDRE wrote:
>>>>>>>>>> The length of the n-gram match is sufficient for what I want,
>>>>>>>>>> indeed.
>>>>>>>>>> I figured out how to get it using kenlm directly, but as I am
>>>>>>>>>> running the decoder, I wanted to use the already loaded LM.
>>>>>>>>>>
>>>>>>>>>> I first tried to dig my way through the Moses abstraction layers
>>>>>>>>>> to retrieve a pointer to a lm::Model from kenlm, but the
>>>>>>>>>> Moses::LanguageModelKen header is not part of the public headers
>>>>>>>>>> of Moses; that's why I tried to use only the Moses interface.
>>>>>>>>>>
>>>>>>>>>> (I did not mention this alternative; if someone knows how to
>>>>>>>>>> get such a pointer, I can carry on from there)
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>> From: "Kenneth Heafield"< [email protected] >
>>>>>>>>>>> To: "Marc LEGENDRE"< [email protected] >
>>>>>>>>>>> Sent: Wednesday, 13 July 2011 16:12:27
>>>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
>>>>>>>>>>> The definition of unknown is that the word you asked for (the
>>>>>>>>>>> rightmost one) is mapped to <unk>, i.e. an OOV.
>>>>>>>>>>>
>>>>>>>>>>> Are you looking for:
>>>>>>>>>>>
>>>>>>>>>>> 1) Length of n-gram matched in the model
>>>>>>>>>>>
>>>>>>>>>>> or
>>>>>>>>>>>
>>>>>>>>>>> 2) Length of state you must keep for valid continuation to the
>>>>>>>>>>> right
>>>>>>>>>>>
>>>>>>>>>>> These are slightly different things due to state minimization.
>>>>>>>>>>> The moses abstraction layer does not return either in a general
>>>>>>>>>>> way. However, if you're using KenLM, #2 is in the returned
>>>>>>>>>>> state's valid_length_. Further, #1 is in
>>>>>>>>>>> FullScoreReturn.ngram_length.
>>>>>>>>>>> So if you call KenLM directly these are easy to obtain (and you
>>>>>>>>>>> can decide whether to expose them through the Moses abstraction
>>>>>>>>>>> layer).
>>>>>>>>>>>
>>>>>>>>>>> Outside the decoder, you can run
>>>>>>>>>>>
>>>>>>>>>>> kenlm/query model_file null
>>>>>>>>>>>
>>>>>>>>>>> then provide your trigrams on stdin.
>>>>>>>>>>>
>>>>>>>>>>> Here's an example with kenlm/query kenlm/lm/test.arpa null
>>>>>>>>>>>
>>>>>>>>>>> looking on a
>>>>>>>>>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
>>>>>>>>>>> Total: -1.79818 OOV: 0
>>>>>>>>>>>
>>>>>>>>>>> The format is "word=vocab_id ngram_length score". So this is a
>>>>>>>>>>> trigram in the model because "a=5 3" appears.
>>>>>>>>>>>
>>>>>>>>>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> I am trying to use the language models loaded by Moses ;
>>>>>>>>>>>>
>>>>>>>>>>>> I am using a 3-gram LM, and I need to know whether it contains
>>>>>>>>>>>> a given N-gram or not.
>>>>>>>>>>>> I tried to play around with
>>>>>>>>>>>> LanguageModelImplementation::GetValueForgotState(...),
>>>>>>>>>>>> but the boolean 'unknown' in the returned structure does not
>>>>>>>>>>>> seem to be what I'm looking for.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there any simple way of getting this piece of information ?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Marc Legendre
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Moses-support mailing list
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
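As a side note on the kenlm/query output quoted earlier in the thread ("word=vocab_id ngram_length score"), the check "is this trigram in the model" can be scripted by looking at the ngram_length reported for the last word. Below is a minimal sketch, assuming the per-word fields are whitespace-separated exactly as in the quoted example. QueryToken, ParseQueryLine and LastWordMatched are hypothetical helper names, not part of KenLM.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One "word=vocab_id ngram_length score" group from the query output.
struct QueryToken {
  std::string word;
  int vocab_id = 0;
  int ngram_length = 0;
  double score = 0.0;
};

// Parses a per-word output line such as
//   "looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513"
// and stops at the first field that does not fit the pattern.
std::vector<QueryToken> ParseQueryLine(const std::string &line) {
  std::vector<QueryToken> tokens;
  std::istringstream in(line);
  std::string head;
  while (in >> head) {
    // Split on the last '=' so that words containing '=' still parse.
    std::size_t eq = head.rfind('=');
    if (eq == std::string::npos) break;
    QueryToken t;
    t.word = head.substr(0, eq);
    std::istringstream id_in(head.substr(eq + 1));
    if (!(id_in >> t.vocab_id)) break;
    if (!(in >> t.ngram_length >> t.score)) break;
    tokens.push_back(t);
  }
  return tokens;
}

// True when the last word was scored with an n-gram of length n,
// i.e. the full n-gram ending at that word is in the model.
bool LastWordMatched(const std::vector<QueryToken> &tokens, int n) {
  assert(n > 0);
  return !tokens.empty() && tokens.back().ngram_length == n;
}
```

With the example line from the thread, the last token is "a" with ngram_length 3, so LastWordMatched(tokens, 3) holds. The "Total: ..." summary line contains no "word=id" group and parses to an empty list.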
