Alright, I gave this a try, and it works for me.
With kenlm, it is a ridiculously straightforward modification,
but now I'm not sure how I should submit it:
on one hand, I am not a "machine translation guy" and I don't see myself
digging into every other LM to find out how to set the ngram_length value;
on the other hand, I would feel guilty submitting a 10-line patch and saying
"Guys, I need this, would you mind committing it and making the
necessary modifications in every other wrapper yourselves?"

How do you, Moses developers, feel about this?
Is it acceptable (or outrageously stupid) if I set the value to -1 in the other
wrappers, maybe with a TODO, and properly document it in the superclass?
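
To make the question concrete, here is a minimal sketch of the idea. It is not the actual Moses code: LMResult is simplified, and LMWrapper, ReportingWrapper, LegacyWrapper, HasFullMatch and kNgramLengthUnsupported are names made up for illustration. Only the ngram_length member and the -1 convention come from the discussion above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed sentinel: the wrapper cannot report the match length.
const int kNgramLengthUnsupported = -1;

// Simplified, hypothetical LMResult with the proposed extra member.
struct LMResult {
  float score;       // log-probability (placeholder value in this sketch)
  bool unknown;      // true if the rightmost word is an OOV
  int ngram_length;  // length of the matched n-gram, or -1 if not reported
};

// Hypothetical superclass; this is where the -1 convention would be documented.
class LMWrapper {
 public:
  virtual ~LMWrapper() {}
  virtual LMResult Score(const std::vector<std::string>& ngram) const = 0;
};

// Toy wrapper that reports a match length (stands in for a kenlm-backed one).
class ReportingWrapper : public LMWrapper {
 public:
  explicit ReportingWrapper(std::size_t longest_match)
      : longest_match_(longest_match) {}

  LMResult Score(const std::vector<std::string>& ngram) const override {
    LMResult result;
    result.score = 0.0f;
    result.unknown = false;
    // Pretend the model matches at most longest_match_ words.
    const std::size_t matched =
        ngram.size() < longest_match_ ? ngram.size() : longest_match_;
    result.ngram_length = static_cast<int>(matched);
    return result;
  }

 private:
  std::size_t longest_match_;
};

// Wrapper that has not been updated yet.
class LegacyWrapper : public LMWrapper {
 public:
  LMResult Score(const std::vector<std::string>&) const override {
    LMResult result;
    result.score = 0.0f;
    result.unknown = false;
    result.ngram_length = kNgramLengthUnsupported;  // TODO: report real length
    return result;
  }
};

enum class Match { Yes, No, Unknown };

// Callers must treat -1 as "don't know", not as "n-gram absent".
Match HasFullMatch(const LMWrapper& lm, const std::vector<std::string>& ngram) {
  assert(!ngram.empty());
  const LMResult result = lm.Score(ngram);
  if (result.ngram_length == kNgramLengthUnsupported) return Match::Unknown;
  return result.ngram_length == static_cast<int>(ngram.size()) ? Match::Yes
                                                               : Match::No;
}
```

The point of the three-valued answer is that a wrapper still returning -1 cannot be mistaken for a model that does not contain the n-gram.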

----- Original message -----
> From: "Kenneth Heafield" <[email protected]>
> To: [email protected]
> Sent: Wednesday, 13 July 2011 20:53:46
> Subject: Re: [Moses-support] Using Moses language models
> 
> I'd suggest adding an ngram_length member to LMResult, then modifying
> each model's wrapper (or just mine) to set that value.
> 
> You're welcome to move stuff from LanguageModelKen.cpp to
> LanguageModelKen.h as necessary.  I chose this setup to minimize
> unnecessary includes.
> 
> Kenneth
> 
> On 07/13/11 14:33, Marc LEGENDRE wrote:
> > Well, not only is the header not "public", so to speak (which I
> > agree is not a major obstacle),
> > but the desired pointer is also a private member of the class, and
> > sadly lacks a getter.
> > As far as I know, that means accessing it would involve
> > questionable C++ tricks
> > (never tried, though).
> > 
> > If modifying Moses is not too much of a chore, I'll give it a
> > thought.
> > 
> > Anyway, thank you for your answers.
> > 
> > ----- Original message -----
> >> From: "Hieu Hoang" <[email protected]>
> >> To: [email protected]
> >> Sent: Wednesday, 13 July 2011 18:40:11
> >> Subject: Re: [Moses-support] Using Moses language models
> >> I guess lm::Model is specific to the KenLM implementation. If you
> >> want to use it, you should include the header yourself and cast
> >> whatever you need to get the pointer.
> >>
> >> If you're feeling generous, maybe you can extend the Moses LM
> >> wrapper so that all LM implementations have the opportunity to
> >> return the length of the n-gram match.
> >>
> >> On 13/07/2011 21:51, Marc LEGENDRE wrote:
> >>> The length of the n-gram match is sufficient for what I want, indeed.
> >>> I figured out how to get it using kenlm directly, but as I am
> >>> running the decoder, I wanted to use the already loaded LM.
> >>>
> >>> I first tried to dig my way through the Moses abstraction layers
> >>> to retrieve a pointer to a lm::Model from kenlm, but the
> >>> Moses::LanguageModelKen header is not part of the public headers
> >>> of Moses; that's why I tried to use only the Moses interface.
> >>>
> >>> (I did not mention this alternative; if someone knows how to
> >>> get such a pointer, I can carry on from there.)
> >>>
> >>>
> >>> ----- Original message -----
> >>>> From: "Kenneth Heafield"<[email protected]>
> >>>> To: "Marc LEGENDRE"<[email protected]>
> >>>> Sent: Wednesday, 13 July 2011 16:12:27
> >>>> Subject: Re: [Moses-support] Using Moses language models
> >>>> The definition of unknown is that the word you asked for (the
> >>>> rightmost one) is mapped to <unk>, i.e. an OOV.
> >>>>
> >>>> Are you looking for:
> >>>>
> >>>> 1) Length of n-gram matched in the model
> >>>>
> >>>> or
> >>>>
> >>>> 2) Length of state you must keep for valid continuation to the
> >>>> right
> >>>>
> >>>> These are slightly different things due to state minimization.
> >>>> The Moses abstraction layer does not return either in a general way.
> >>>> However, if you're using KenLM, #2 is in the returned state's
> >>>> valid_length_. Further, #1 is in FullScoreReturn.ngram_length.
> >>>> So if you call KenLM directly these are easy to obtain (and you can
> >>>> decide whether to expose them through the Moses abstraction layer).
> >>>>
> >>>> Outside the decoder, you can run
> >>>>
> >>>> kenlm/query model_file null
> >>>>
> >>>> then provide your trigrams on stdin.
> >>>>
> >>>> Here's an example with kenlm/query kenlm/lm/test.arpa null
> >>>>
> >>>> looking on a
> >>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
> >>>> Total: -1.79818 OOV: 0
> >>>>
> >>>> The format is "word=vocab_id ngram_length score". So this is a
> >>>> trigram in the model because "a=5 3" appears.
> >>>>
> >>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
> >>>>> Hello,
> >>>>>
> >>>>> I am trying to use the language models loaded by Moses.
> >>>>>
> >>>>> I am using a 3-gram LM, and I need to know whether it contains
> >>>>> a given N-gram or not.
> >>>>> I tried to play around with
> >>>>> LanguageModelImplementation::GetValueForgotState(...),
> >>>>> but the boolean 'unknown' in the returned structure does not
> >>>>> seem to be what I'm looking for.
> >>>>>
> >>>>> Is there any simple way of getting this piece of information?
> >>>>>
> >>>>>
> >>>>> Regards,
> >>>>> Marc Legendre
> >>>>> _______________________________________________
> >>>>> Moses-support mailing list
> >>>>> [email protected]
> >>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
