The length of the n-gram match is indeed sufficient for what I want. I figured out how to get it using kenlm directly, but as I am running the decoder, I wanted to use the already loaded LM.
I first tried to dig my way through the Moses abstraction layers to retrieve a pointer to a lm::Model from kenlm, but the Moses::LanguageModelKen header is not part of the public headers of Moses; that's why I tried to use only the Moses interface. (I did not mention this alternative; if someone knows how to get such a pointer, I can carry on from there.)

----- Original Message -----
> From: "Kenneth Heafield" <[email protected]>
> To: "Marc LEGENDRE" <[email protected]>
> Sent: Wednesday, 13 July 2011 16:12:27
> Subject: Re: [Moses-support] Using Moses language models
>
> The definition of unknown is that the word you asked for (the rightmost
> one) is mapped to <unk>, i.e. an OOV.
>
> Are you looking for:
>
> 1) Length of n-gram matched in the model
>
> or
>
> 2) Length of state you must keep for valid continuation to the right
>
> These are slightly different things due to state minimization. The
> Moses abstraction layer does not return either in a general way.
> However, if you're using KenLM, #2 is in the returned state's
> valid_length_. Further, #1 is in FullScoreReturn.ngram_length. So if
> you call KenLM directly these are easy to obtain (and you can decide
> whether to expose them through the Moses abstraction layer).
>
> Outside the decoder, you can run
>
> kenlm/query model_file null
>
> then provide your trigrams on stdin.
>
> Here's an example with kenlm/query kenlm/lm/test.arpa null
>
> looking on a
> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
> Total: -1.79818 OOV: 0
>
> The format is "word=vocab_id ngram_length score". So this is a trigram
> in the model because "a=5 3" appears.
>
> On 07/13/11 08:50, Marc LEGENDRE wrote:
> >
> > Hello,
> >
> > I am trying to use the language models loaded by Moses;
> >
> > I am using a 3-gram LM, and I need to know whether it contains a
> > given N-gram or not.
> > I tried to play around with
> > LanguageModelImplementation::GetValueForgotState(...),
> > but the boolean 'unknown' in the returned structure does not seem to
> > be what I'm looking for.
> >
> > Is there any simple way of getting this piece of information?
> >
> >
> > Regards,
> > Marc Legendre
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
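
For anyone who wants to check n-grams outside the decoder, the query output format described in the quoted message ("word=vocab_id ngram_length score") can be read with a few lines of C++. This is only a rough sketch: it relies solely on the format as shown in the example above, and the names QueryToken, ParseQueryLine and FullNgramMatched are made up for illustration; they are not part of KenLM or Moses.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One "word=vocab_id ngram_length score" triple from the query output.
// (Hypothetical helper type, not a KenLM structure.)
struct QueryToken {
  std::string word;
  unsigned vocab_id;
  unsigned ngram_length;
  double score;
};

// Parse the per-word triples of one output line. Parsing stops at the
// trailing "Total:" summary or at the first field that does not fit
// the assumed format.
std::vector<QueryToken> ParseQueryLine(const std::string &line) {
  std::vector<QueryToken> tokens;
  std::istringstream in(line);
  std::string head;
  while (in >> head) {
    if (head == "Total:") break;
    // Split on the last '=' so that words containing '=' still parse.
    std::size_t eq = head.rfind('=');
    if (eq == std::string::npos) break;
    QueryToken t;
    t.word = head.substr(0, eq);
    std::istringstream id(head.substr(eq + 1));
    if (!(id >> t.vocab_id)) break;
    if (!(in >> t.ngram_length >> t.score)) break;
    tokens.push_back(t);
  }
  return tokens;
}

// True if the rightmost word was matched by an n-gram of at least
// `order` words, e.g. order = 3 for "is this trigram in the model?".
bool FullNgramMatched(const std::vector<QueryToken> &tokens, unsigned order) {
  return !tokens.empty() && tokens.back().ngram_length >= order;
}
```

Fed the example line from the quoted message, ParseQueryLine returns three tokens, the last one being "a" with ngram_length 3, so FullNgramMatched(tokens, 3) is true: the trigram "looking on a" is in the model.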
