Well, I actually committed to the augmLMResult branch. I inserted a class between LMKen and LMSingleFactor to prevent the inclusion of kenlm headers. (And yes, I now realize this is the kind of thing you write in a commit message.)
Since the LanguageModelKen.h header now contains functions I want to use, can we add it to the list of installed files? (And how?)

Also, I can't get the regression tests to work. I downloaded the test data and extracted it in /tmp. From what I read, this is the command I came up with:

    ./regression-testing/run-test-suite.pl --decoder-phrase=moses-cmd/src/moses --decoder-chart=moses-chart-cmd/src/moses_chart

But every test ends with a "MOSES CRASHED" message. (The same thing happens with the trunk build.)

I tried to understand why, and I noticed that the .ini files for the tests contain:

    [lmodel-file]
    0 0 3 moses-reg-test-data-5/lm/europarl.en.srilm.gz

Is that OK for kenlm?

Marc

----- Original message -----
> From: "Kenneth Heafield" <[email protected]>
> To: "Marc LEGENDRE" <[email protected]>
> Cc: [email protected], [email protected]
> Sent: Friday, 22 July 2011 20:18:21
> Subject: Re: [Moses-support] Using Moses language models
>
> Hi Marc,
>
> This sounds like a simple change, so a branch is probably too much
> overhead. Please do one of the following:
>
> 1. Send a patch as generated by diff -rupN $old $new . Do a make clean
>    first.
> 2. Attach the files you modified and send them, along with the revision
>    you based changes on.
> 3. Make a branch (if you already did).
>
> Thanks,
>
> Kenneth
>
> On 07/22/11 04:21, Marc LEGENDRE wrote:
> > Well, we (me and the people I work with) were hoping not to have to
> > maintain a modified version of Moses.
> >
> > Luckily, obviousness just hit me like a truck: if something is
> > specific to an LM, it does not have to be in the top layer.
> > Having a common interface does not prevent subclasses from having
> > specific behaviour: we could have a LanguageModelKen method, say
> > GetValueForgotStateKen(...), which would return something specific,
> > say a LMKenResult, which would contain a LMResult plus other things
> > like, say, a ngram_length field :-).
> > And the virtual GetValueForgotState() method would simply return
> > the LMResult from there.
> >
> > This way, no need to break the high level API,
> > and no extra maintenance cost for us (me and the peop... Well, you
> > know).
> >
> > ----- Original message -----
> >> From: "Hieu Hoang" <[email protected]>
> >> To: "Kenneth Heafield" <[email protected]>
> >> Cc: [email protected]
> >> Sent: Friday, 22 July 2011 04:50:14
> >> Subject: Re: [Moses-support] Using Moses language models
> >>
> >> true, & there's no right answer to it.
> >>
> >> I suppose 1 goal of the trunk is to make sure that the core
> >> functionality of translating isn't affected too much, in terms of
> >> quality, speed, or memory. Another goal is to make sure not to
> >> overburden the API with things no-one else uses or implements.
> >>
> >> therefore, i think a good strategy is to branch & do what you like
> >>
> >> On 21 July 2011 22:46, Kenneth Heafield < [email protected] > wrote:
> >>
> >> Marc makes a good point. When one language model provides more
> >> information than do other language models, it's difficult to
> >> maintain a common abstraction layer. Currently we're looking at
> >> n-gram length. SRILM doesn't provide access to that (but you can get
> >> right-looking state length which is usually the same thing).
> >>
> >> I'm working on making this issue more severe with left-looking state
> >> optimization and explicit hypothesis bounds. How do we change the
> >> decoder to use these features if not all of the language models
> >> support them?
> >>
> >> Maybe another class in the language model hierarchy supporting these
> >> additional features. But it's going to make the decoder look ugly if
> >> you want to support both.
> >>
> >> On 07/21/11 11:14, Hieu Hoang wrote:
> >>> hi marc,
> >>>
> >>> it'll be good for people to see your changes.
> >>>
> >>> i suppose you should create a branch and make your changes in
> >>> there.
> >>>
> >>> If there are other people interested, you can point them to your
> >>> branch.
> >>> If more people are interested and it doesn't affect other people too
> >>> much, then we can move it to trunk.
> >>>
> >>> i'll email you offline with svn details
> >>>
> >>> On 21/07/2011 15:16, Marc LEGENDRE wrote:
> >>>> Alright, I gave this a try, and it did the trick for me.
> >>>> With kenlm, it is a ridiculously straightforward modification,
> >>>> but now I'm not sure how I can submit it:
> >>>> on one hand, I am not a "machine translation guy" and I don't
> >>>> imagine myself digging in every other LM to find how to set the
> >>>> ngram_length value;
> >>>> and on the other hand I would feel guilty to submit a 10-line
> >>>> patch and say
> >>>> "Guys, I need this, would you mind committing it and making the
> >>>> necessary modifications in every other wrapper yourselves?"
> >>>>
> >>>> How do you, Moses developers, feel about this?
> >>>> Is it acceptable / outrageously stupid if I set the value to -1 in
> >>>> the other wrappers,
> >>>> maybe with a TODO, and properly document it in the super class?
> >>>>
> >>>> ----- Original message -----
> >>>>> From: "Kenneth Heafield"< [email protected] >
> >>>>> To: [email protected]
> >>>>> Sent: Wednesday, 13 July 2011 20:53:46
> >>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>
> >>>>> I'd suggest adding a ngram_length member to LMResult then modifying
> >>>>> each model's wrapper (or just mine) to set that value.
> >>>>>
> >>>>> You're welcome to move stuff from LanguageModelKen.cpp to
> >>>>> LanguageModelKen.h as necessary. I chose this setup to minimize
> >>>>> unnecessary includes.
> >>>>>
> >>>>> Kenneth
> >>>>>
> >>>>> On 07/13/11 14:33, Marc LEGENDRE wrote:
> >>>>>> Well, not only is the header not "public", so to speak (which I
> >>>>>> agree is not a major obstacle),
> >>>>>> but the desired pointer is also a private member of the class, and
> >>>>>> sadly lacks a getter.
> >>>>>> As far as I know, it means that accessing it will involve
> >>>>>> questionable C++ tricks.
> >>>>>> (never tried, though)
> >>>>>>
> >>>>>> If modifying Moses is not too much of a chore, I'll give it a
> >>>>>> thought.
> >>>>>>
> >>>>>> Anyway, thank you for your answers.
> >>>>>>
> >>>>>> ----- Original message -----
> >>>>>>> From: "Hieu Hoang"< [email protected] >
> >>>>>>> To: [email protected]
> >>>>>>> Sent: Wednesday, 13 July 2011 18:40:11
> >>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>> i guess lm::Model is specific to the ken lm implementation. If you
> >>>>>>> want to use it you should include the header yourself and cast
> >>>>>>> whatever you need to get the pointer.
> >>>>>>>
> >>>>>>> if you're feeling generous, maybe you can extend the moses LM
> >>>>>>> wrapper so that all LM implementations have the opportunity to
> >>>>>>> return the length of the n-gram match.
> >>>>>>>
> >>>>>>> On 13/07/2011 21:51, Marc LEGENDRE wrote:
> >>>>>>>> The length of the n-gram match is sufficient for what I want,
> >>>>>>>> indeed.
> >>>>>>>> I figured out how to get it using kenlm directly, but as I am
> >>>>>>>> running the decoder, I wanted to use the already loaded LM.
> >>>>>>>>
> >>>>>>>> I first tried to dig my way through the Moses abstraction layers
> >>>>>>>> to retrieve a pointer to a lm::Model from kenlm, but the
> >>>>>>>> Moses::LanguageModelKen header is not part of the public headers
> >>>>>>>> of Moses; that's why I tried to use only the Moses interface.
> >>>>>>>>
> >>>>>>>> (I did not mention this alternative; if someone knows how to
> >>>>>>>> get such a pointer, I can carry on from there)
> >>>>>>>>
> >>>>>>>> ----- Original message -----
> >>>>>>>>> From: "Kenneth Heafield"< [email protected] >
> >>>>>>>>> To: "Marc LEGENDRE"< [email protected] >
> >>>>>>>>> Sent: Wednesday, 13 July 2011 16:12:27
> >>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
> >>>>>>>>> The definition of unknown is that the word you asked for (the
> >>>>>>>>> rightmost one) is mapped to <unk> i.e. an OOV.
> >>>>>>>>>
> >>>>>>>>> Are you looking for:
> >>>>>>>>>
> >>>>>>>>> 1) Length of n-gram matched in the model
> >>>>>>>>>
> >>>>>>>>> or
> >>>>>>>>>
> >>>>>>>>> 2) Length of state you must keep for valid continuation to the
> >>>>>>>>>    right
> >>>>>>>>>
> >>>>>>>>> These are slightly different things due to state minimization.
> >>>>>>>>> The moses abstraction layer does not return either in a general
> >>>>>>>>> way. However, if you're using KenLM, #2 is in the returned
> >>>>>>>>> state's valid_length_. Further, #1 is in
> >>>>>>>>> FullScoreReturn.ngram_length. So if you call KenLM directly
> >>>>>>>>> these are easy to obtain (and you can decide whether to expose
> >>>>>>>>> them through the Moses abstraction layer).
> >>>>>>>>>
> >>>>>>>>> Outside the decoder, you can run
> >>>>>>>>>
> >>>>>>>>> kenlm/query model_file null
> >>>>>>>>>
> >>>>>>>>> then provide your trigrams on stdin.
> >>>>>>>>>
> >>>>>>>>> Here's an example with kenlm/query kenlm/lm/test.arpa null
> >>>>>>>>>
> >>>>>>>>> looking on a
> >>>>>>>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
> >>>>>>>>> Total: -1.79818 OOV: 0
> >>>>>>>>>
> >>>>>>>>> The format is "word=vocab_id ngram_length score".
> >>>>>>>>> So this is a trigram in the model because "a=5 3" appears.
> >>>>>>>>>
> >>>>>>>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
> >>>>>>>>>> Hello,
> >>>>>>>>>>
> >>>>>>>>>> I am trying to use the language models loaded by Moses;
> >>>>>>>>>>
> >>>>>>>>>> I am using a 3-gram LM, and I need to know whether it contains
> >>>>>>>>>> a given N-gram or not.
> >>>>>>>>>> I tried to play around with
> >>>>>>>>>> LanguageModelImplementation::GetValueForgotState(...),
> >>>>>>>>>> but the boolean 'unknown' in the returned structure does not
> >>>>>>>>>> seem to be what I'm looking for.
> >>>>>>>>>>
> >>>>>>>>>> Is there any simple way of getting this piece of information?
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Marc Legendre
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Moses-support mailing list
> >>>>>>>>>> [email protected]
> >>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
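
For readers following the thread, here is a rough sketch of the design Marc describes: a KenLM-specific method returns a richer result (LMResult plus ngram_length), and the generic virtual method just hands back the LMResult part. This is not the code from the augmLMResult branch. The struct fields, the simplified signatures (a vector of words instead of Moses' real argument types), and the ToyLanguageModelKen class with its set-based lookup are all assumptions made for illustration; only the names LMResult, LMKenResult, ngram_length, GetValueForgotState and GetValueForgotStateKen come from the messages above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for Moses' LMResult (the real struct may differ).
struct LMResult {
  float score;   // placeholder here, always 0 in this toy
  bool unknown;  // true when the rightmost word is an OOV
};

// KenLM-specific result: the generic LMResult plus the matched n-gram length.
struct LMKenResult {
  LMResult base;
  unsigned ngram_length;
};

// Simplified stand-in for the common LM interface (hypothetical signature).
class LanguageModelImplementation {
 public:
  virtual ~LanguageModelImplementation() {}
  virtual LMResult GetValueForgotState(
      const std::vector<std::string>& context) const = 0;
};

// Toy replacement for LanguageModelKen: n-grams live in a std::set of
// space-joined strings instead of a real KenLM model.
class ToyLanguageModelKen : public LanguageModelImplementation {
 public:
  explicit ToyLanguageModelKen(const std::set<std::string>& ngrams)
      : ngrams_(ngrams) {}

  // KenLM-only entry point: also reports the longest matched n-gram
  // that ends at the rightmost word of the context.
  LMKenResult GetValueForgotStateKen(
      const std::vector<std::string>& context) const {
    assert(!context.empty());
    LMKenResult result = {{0.0f, true}, 0};
    std::string suffix;
    // Grow the suffix leftwards from the rightmost word; stop at the first miss.
    for (std::size_t i = context.size(); i-- > 0;) {
      suffix = suffix.empty() ? context[i] : context[i] + " " + suffix;
      if (ngrams_.count(suffix) == 0) break;
      result.ngram_length = static_cast<unsigned>(context.size() - i);
    }
    result.base.unknown = (result.ngram_length == 0);
    return result;
  }

  // The common interface is unchanged: it drops the KenLM-only extras.
  virtual LMResult GetValueForgotState(
      const std::vector<std::string>& context) const {
    return GetValueForgotStateKen(context).base;
  }

 private:
  std::set<std::string> ngrams_;
};
```

Code that only knows the common interface keeps calling GetValueForgotState(), while code that knows it holds the KenLM wrapper can call GetValueForgotStateKen() and compare ngram_length with the length of its query to see whether the full n-gram is in the model. The other wrappers need no change, which is the point Marc makes about not breaking the high level API.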
