The length of the n-gram match is indeed sufficient for what I want.
I figured out how to get it using kenlm directly, but as I am running the 
decoder, I wanted to use the already loaded LM.

I first tried to dig my way through the Moses abstraction layers to retrieve a 
pointer to a lm::Model from kenlm, but the Moses::LanguageModelKen header is 
not part of the public headers of Moses; that is why I tried to use only the 
Moses interface.

(I did not mention this alternative before; if someone knows how to get such a 
pointer, I can carry on from there.)
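
As a stopgap outside the decoder, the query output described in the quoted 
message below ("word=vocab_id ngram_length score") can be checked mechanically. 
Here is a minimal C++ sketch of that check. It is not part of Moses or kenlm: 
the names ScoredWord, ParseQueryLine and LineIsFullNgram are made up for 
illustration, and it assumes the output looks exactly like the quoted example.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One scored word from a line of query output:
// "word=vocab_id ngram_length score". (Hypothetical helper type.)
struct ScoredWord {
  std::string word;
  unsigned long vocab_id;
  unsigned ngram_length;
  double score;
};

// Parse one output line such as
//   "looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513"
// Parsing stops at the first token group that does not fit the format,
// so a summary line like "Total: ... OOV: 0" yields an empty vector.
std::vector<ScoredWord> ParseQueryLine(const std::string &line) {
  std::vector<ScoredWord> words;
  std::istringstream in(line);
  std::string token;
  while (in >> token) {
    // Split on the last '=' so that words containing '=' still parse.
    std::string::size_type eq = token.rfind('=');
    if (eq == std::string::npos || eq == 0) break;
    ScoredWord w;
    w.word = token.substr(0, eq);
    std::istringstream id(token.substr(eq + 1));
    if (!(id >> w.vocab_id)) break;
    if (!(in >> w.ngram_length >> w.score)) break;
    words.push_back(w);
  }
  return words;
}

// True when the line holds exactly `order` words and the last one was
// scored with an n-gram of length `order`, i.e. the whole n-gram typed
// on stdin was matched in the model.
bool LineIsFullNgram(const std::string &line, unsigned order) {
  assert(order > 0);
  std::vector<ScoredWord> words = ParseQueryLine(line);
  return !words.empty() && words.size() == order &&
         words.back().ngram_length == order;
}
```

For a 3-gram LM, calling LineIsFullNgram(line, 3) on each scored output line 
says whether the trigram given on stdin is in the model. This only works 
outside the decoder, so it does not replace access to the already loaded LM.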


----- Original message -----
> From: "Kenneth Heafield" <[email protected]>
> To: "Marc LEGENDRE" <[email protected]>
> Sent: Wednesday, 13 July 2011 16:12:27
> Subject: Re: [Moses-support] Using Moses language models
> The definition of unknown is that the word you asked for (the
> rightmost one) is mapped to <unk> i.e. an OOV.
> 
> Are you looking for:
> 
> 1) Length of n-gram matched in the model
> 
> or
> 
> 2) Length of state you must keep for valid continuation to the right
> 
> These are slightly different things due to state minimization. The
> moses abstraction layer does not return either in a general way.
> However, if you're using KenLM, #2 is in the returned state's
> valid_length_. Further, #1 is in FullScoreReturn.ngram_length. So if
> you call KenLM directly these are easy to obtain (and you can decide
> whether to expose them through the Moses abstraction layer).
> 
> Outside the decoder, you can run
> 
> kenlm/query model_file null
> 
> then provide your trigrams on stdin.
> 
> Here's an example with kenlm/query kenlm/lm/test.arpa null
> 
> looking on a
> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
> Total: -1.79818 OOV: 0
> 
> The format is "word=vocab_id ngram_length score". So this is a trigram
> in the model because "a=5 3" appears.
> 
> On 07/13/11 08:50, Marc LEGENDRE wrote:
> >
> > Hello,
> >
> > I am trying to use the language models loaded by Moses ;
> >
> > I am using a 3-gram LM, and I need to know whether it contains a
> > given N-gram or not.
> > I tried to play around with
> > LanguageModelImplementation::GetValueForgotState(...),
> > but the boolean 'unknown' in the returned structure does not seem to
> > be what I'm looking for.
> >
> > Is there any simple way of getting this piece of information ?
> >
> >
> > Regards,
> > Marc Legendre
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support

