Hi Marc,
This sounds like a simple change, so a branch is probably too much
overhead. Please do one of the following:
1. Send a patch as generated by diff -rupN $old $new . Do a make clean
first.
2. Attach the files you modified and send them, along with the revision
you based changes on.
3. Use a branch (if you already made one).
Thanks,
Kenneth
On 07/22/11 04:21, Marc LEGENDRE wrote:
> Well, we (me and the people I work with) were hoping not to have to maintain
> a modified version of Moses.
>
> Luckily, the obvious just hit me like a truck: if something is specific to
> one LM, it does not have to be in the top layer.
> Having a common interface does not prevent subclasses from having specific
> behaviour. We could have a LanguageModelKen method, say
> GetValueForgotStateKen(...), which would return something specific, say an
> LMKenResult, which would contain an LMResult plus other things like, say,
> an ngram_length field :-).
> The virtual GetValueForgotState() method would then simply return the
> LMResult from there.
>
> This way, no need to break the high level API,
> and no extra maintenance cost for us (me and the peop... Well, you know).
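
A minimal sketch of the layering described above, to make the idea concrete. The type and method names LMResult, LMKenResult, GetValueForgotState and GetValueForgotStateKen come from the message; everything else is an assumption: the signatures are simplified (the real Moses methods take different arguments), and the scoring is a placeholder rather than a call into KenLM.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the generic result type shared by all wrappers.
struct LMResult {
  float score = 0.0f;
  bool unknown = false;
};

// Proposed Ken-specific result: the generic LMResult plus extra fields.
struct LMKenResult {
  LMResult base;
  int ngram_length = 0;
};

// Hypothetical, simplified top-layer interface.
class LanguageModelImplementation {
 public:
  virtual ~LanguageModelImplementation() = default;
  virtual LMResult GetValueForgotState(
      const std::vector<std::string>& context) const = 0;
};

class LanguageModelKen : public LanguageModelImplementation {
 public:
  // Ken-specific entry point. The values below are placeholders:
  // a real wrapper would query KenLM here.
  LMKenResult GetValueForgotStateKen(
      const std::vector<std::string>& context) const {
    assert(!context.empty());
    LMKenResult result;
    result.base.score = -1.0f * static_cast<float>(context.size());
    result.ngram_length = static_cast<int>(context.size());
    return result;
  }

  // The generic method forwards to the specific one and drops the extras,
  // so the high-level API stays unchanged.
  LMResult GetValueForgotState(
      const std::vector<std::string>& context) const override {
    return GetValueForgotStateKen(context).base;
  }
};
```

Code that only knows the base class keeps calling GetValueForgotState(), while code that holds a LanguageModelKen can call GetValueForgotStateKen() and read ngram_length.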
>
> ----- Original Message -----
>> From: "Hieu Hoang" <[email protected]>
>> To: "Kenneth Heafield" <[email protected]>
>> Cc: [email protected]
>> Sent: Friday, 22 July 2011 04:50:14
>> Subject: Re: [Moses-support] Using Moses language models
>>
>>
>> true, & there's no right answer to it.
>>
>> I suppose 1 goal of the trunk is to make sure that the core
>> functionality of translating isn't affected too much, in terms of
>> quality, speed, or memory. Another goal is to make sure not to overburden
>> the API with things no-one else uses or implements.
>>
>> therefore, i think a good strategy is to branch & do what you like
>>
>>
>> On 21 July 2011 22:46, Kenneth Heafield < [email protected] >
>> wrote:
>>
>>
>> Marc makes a good point. When one language model provides more
>> information than do other language models, it's difficult to maintain
>> a common abstraction layer. Currently we're looking at n-gram length.
>> SRILM doesn't provide access to that (but you can get right-looking
>> state length which is usually the same thing).
>>
>> I'm working on making this issue more severe with left-looking state
>> optimization and explicit hypothesis bounds. How do we change the
>> decoder to use these features if not all of the language models
>> support them?
>>
>> Maybe another class in the language model hierarchy supporting these
>> additional features. But it's going to make the decoder look ugly if
>> you want to support both.
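
One way the "another class in the hierarchy" idea might look, as a rough sketch only. All class and function names here (LanguageModel, NGramLengthLM, BasicLM, RichLM, MatchedLengthOrUnknown) are hypothetical and are not Moses classes. The caller has to test which interface a model supports, which is the branching that tends to make the decoder ugly.

```cpp
#include <cassert>

// Hypothetical base interface that every language model implements.
class LanguageModel {
 public:
  virtual ~LanguageModel() = default;
  virtual float Score() const = 0;
};

// Hypothetical extended interface for models that can report n-gram length.
class NGramLengthLM : public LanguageModel {
 public:
  virtual int NGramLength() const = 0;
};

// A model that only supports the base interface (placeholder score).
class BasicLM : public LanguageModel {
 public:
  float Score() const override { return -2.0f; }
};

// A model that also supports the extended interface (placeholder values).
class RichLM : public NGramLengthLM {
 public:
  float Score() const override { return -1.5f; }
  int NGramLength() const override { return 3; }
};

// Decoder-side helper: returns the matched n-gram length, or -1 when the
// model does not expose it.
int MatchedLengthOrUnknown(const LanguageModel* lm) {
  assert(lm != nullptr);
  if (const auto* rich = dynamic_cast<const NGramLengthLM*>(lm)) {
    return rich->NGramLength();
  }
  return -1;
}
```

Every feature that only some models support adds another check of this kind to the calling code.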
>>
>>
>>
>>
>> On 07/21/11 11:14, Hieu Hoang wrote:
>>> hi marc,
>>>
>>> it'll be good for people to see your changes.
>>>
>>> i suppose you should create a branch and make your changes in
>>> there.
>>>
>>> If there are other people interested, you can point them to your
>>> branch. If more people are interested and it doesn't affect other
>>> people too much, then we can move it to trunk.
>>>
>>> i'll email you offline with svn details
>>>
>>> On 21/07/2011 15:16, Marc LEGENDRE wrote:
>>>> Alright, I gave this a try, and it worked for me.
>>>> With kenlm, it is a ridiculously straightforward modification,
>>>> but now I'm not sure how I can submit it:
>>>> on one hand, I am not a "machine translation guy" and I don't
>>>> imagine myself digging into every other LM to find out how to set
>>>> the ngram_length value;
>>>> and on the other hand I would feel guilty submitting a 10-line
>>>> patch and saying "Guys, I need this, would you mind committing it
>>>> and making the necessary modifications in every other wrapper
>>>> yourselves?"
>>>>
>>>> How do you, Moses developers, feel about this?
>>>> Is it acceptable / outrageously stupid if I set the value to -1 in
>>>> the other wrappers, maybe with a TODO, and properly document it in
>>>> the superclass?
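
A small sketch of the "-1 in the other wrappers" option, for illustration. LMResult and ngram_length are named in the thread; the constant kNGramLengthUnsupported, the two wrapper functions and ContainsFullNGram are hypothetical, and the scores are placeholders rather than real model output.

```cpp
#include <cassert>

// Sentinel meaning "this wrapper does not report the matched n-gram length".
constexpr int kNGramLengthUnsupported = -1;

// Simplified stand-in for the shared result structure.
struct LMResult {
  float score = 0.0f;
  bool unknown = false;
  // Length of the matched n-gram, or kNGramLengthUnsupported if not set.
  int ngram_length = kNGramLengthUnsupported;
};

// A wrapper that knows the length fills it in (placeholder values).
LMResult ScoreWithKenLikeWrapper() {
  LMResult r;
  r.score = -0.0483513f;
  r.ngram_length = 3;
  return r;
}

// A wrapper that has not been updated yet leaves the sentinel in place.
// TODO: set ngram_length once the underlying toolkit exposes it.
LMResult ScoreWithOtherWrapper() {
  LMResult r;
  r.score = -0.5f;
  return r;
}

// Callers must check the sentinel before trusting the field.
bool ContainsFullNGram(const LMResult& r, int order) {
  assert(order > 0);
  return r.ngram_length != kNGramLengthUnsupported && r.ngram_length >= order;
}
```

The cost of this approach is that every caller has to remember the sentinel check, which is why documenting it in the superclass matters.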
>>>>
>>>> ----- Original Message -----
>>>>> From: "Kenneth Heafield" <[email protected]>
>>>>> To: [email protected]
>>>>> Sent: Wednesday, 13 July 2011 20:53:46
>>>>> Subject: Re: [Moses-support] Using Moses language models
>>>>>
>>>>> I'd suggest adding an ngram_length member to LMResult, then
>>>>> modifying each model's wrapper (or just mine) to set that value.
>>>>>
>>>>> You're welcome to move stuff from LanguageModelKen.cpp to
>>>>> LanguageModelKen.h as necessary. I chose this setup to minimize
>>>>> unnecessary includes.
>>>>>
>>>>> Kenneth
>>>>>
>>>>> On 07/13/11 14:33, Marc LEGENDRE wrote:
>>>>>> Well, not only is the header not "public", so to speak (which I
>>>>>> agree is not a major obstacle),
>>>>>> but the desired pointer is also a private member of the class,
>>>>>> and sadly lacks a getter.
>>>>>> As far as I know, that means accessing it would involve
>>>>>> questionable C++ tricks
>>>>>> (never tried, though).
>>>>>>
>>>>>> If modifying Moses is not too much of a chore, I'll give it a
>>>>>> thought.
>>>>>>
>>>>>> Anyway, thank you for your answers.
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "Hieu Hoang" <[email protected]>
>>>>>>> To: [email protected]
>>>>>>> Sent: Wednesday, 13 July 2011 18:40:11
>>>>>>> Subject: Re: [Moses-support] Using Moses language models
>>>>>>> i guess lm::Model is specific to the ken lm implementation. If
>>>>>>> you want to use it you should include the header yourself and
>>>>>>> cast whatever you need to get the pointer.
>>>>>>>
>>>>>>> if you're feeling generous, maybe you can extend the moses LM
>>>>>>> wrapper so that all LM implementations have the opportunity to
>>>>>>> return the length of the n-gram match.
>>>>>>>
>>>>>>> On 13/07/2011 21:51, Marc LEGENDRE wrote:
>>>>>>>> The length of the n-gram match is sufficient for what I want,
>>>>>>>> indeed.
>>>>>>>> I figured out how to get it using kenlm directly, but as I am
>>>>>>>> running the decoder, I wanted to use the already loaded LM.
>>>>>>>>
>>>>>>>> I first tried to dig my way through the Moses abstraction
>>>>>>>> layers to retrieve a pointer to a lm::Model from kenlm, but the
>>>>>>>> Moses::LanguageModelKen header is not part of the public
>>>>>>>> headers of Moses; that's why I tried to use only the Moses
>>>>>>>> interface.
>>>>>>>>
>>>>>>>> (I did not mention this alternative; if someone knows how to
>>>>>>>> get such a pointer, I can carry on from there.)
>>>>>>>>
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Kenneth Heafield" <[email protected]>
>>>>>>>>> To: "Marc LEGENDRE" <[email protected]>
>>>>>>>>> Sent: Wednesday, 13 July 2011 16:12:27
>>>>>>>>> Subject: Re: [Moses-support] Using Moses language models
>>>>>>>>> The definition of unknown is that the word you asked for (the
>>>>>>>>> rightmost one) is mapped to <unk>, i.e. an OOV.
>>>>>>>>>
>>>>>>>>> Are you looking for:
>>>>>>>>>
>>>>>>>>> 1) Length of n-gram matched in the model
>>>>>>>>>
>>>>>>>>> or
>>>>>>>>>
>>>>>>>>> 2) Length of state you must keep for valid continuation to
>>>>>>>>> the right
>>>>>>>>>
>>>>>>>>> These are slightly different things due to state
>>>>>>>>> minimization. The Moses abstraction layer does not return
>>>>>>>>> either in a general way.
>>>>>>>>> However, if you're using KenLM, #2 is in the returned state's
>>>>>>>>> valid_length_. Further, #1 is in
>>>>>>>>> FullScoreReturn.ngram_length. So if you call KenLM directly
>>>>>>>>> these are easy to obtain (and you can decide
>>>>>>>>> whether to expose them through the Moses abstraction layer).
>>>>>>>>>
>>>>>>>>> Outside the decoder, you can run
>>>>>>>>>
>>>>>>>>> kenlm/query model_file null
>>>>>>>>>
>>>>>>>>> then provide your trigrams on stdin.
>>>>>>>>>
>>>>>>>>> Here's an example with kenlm/query kenlm/lm/test.arpa null
>>>>>>>>>
>>>>>>>>> looking on a
>>>>>>>>> looking=23 1 -1.28594 on=25 2 -0.46389 a=5 3 -0.0483513
>>>>>>>>> Total: -1.79818 OOV: 0
>>>>>>>>>
>>>>>>>>> The format is "word=vocab_id ngram_length score". So this is
>>>>>>>>> a trigram in the model because "a=5 3" appears.
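
For scripting this check outside the decoder, the per-word output line can be parsed mechanically. The following is a sketch that assumes only the "word=vocab_id ngram_length score" format shown above; the names ScoredWord, ParseQueryLine and LastWordMatchesOrder are hypothetical helpers, not part of KenLM.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One "word=vocab_id ngram_length score" group from the query output.
struct ScoredWord {
  std::string word;
  unsigned vocab_id = 0;
  int ngram_length = 0;
  double score = 0.0;
};

// Parses one per-word output line. Stops at the first group that does not
// fit the expected format and returns what was parsed so far.
std::vector<ScoredWord> ParseQueryLine(const std::string& line) {
  std::vector<ScoredWord> words;
  std::istringstream in(line);
  std::string token;
  ScoredWord w;
  while (in >> token >> w.ngram_length >> w.score) {
    // Split at the last '=' so that words containing '=' still parse.
    const std::string::size_type eq = token.rfind('=');
    if (eq == std::string::npos) break;
    std::istringstream id_in(token.substr(eq + 1));
    if (!(id_in >> w.vocab_id)) break;
    w.word = token.substr(0, eq);
    words.push_back(w);
  }
  return words;
}

// True if the last word on the line was matched as an n-gram of this order.
bool LastWordMatchesOrder(const std::string& line, int order) {
  assert(order > 0);
  const std::vector<ScoredWord> words = ParseQueryLine(line);
  return !words.empty() && words.back().ngram_length == order;
}
```

With the example line above, LastWordMatchesOrder(line, 3) is true because the last group is "a=5 3 -0.0483513".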
>>>>>>>>>
>>>>>>>>> On 07/13/11 08:50, Marc LEGENDRE wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I am trying to use the language models loaded by Moses.
>>>>>>>>>>
>>>>>>>>>> I am using a 3-gram LM, and I need to know whether it
>>>>>>>>>> contains a given N-gram or not.
>>>>>>>>>> I tried to play around with
>>>>>>>>>> LanguageModelImplementation::GetValueForgotState(...),
>>>>>>>>>> but the boolean 'unknown' in the returned structure does not
>>>>>>>>>> seem to be what I'm looking for.
>>>>>>>>>>
>>>>>>>>>> Is there any simple way of getting this piece of
>>>>>>>>>> information?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Marc Legendre
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Moses-support mailing list
>>>>>>>>>> [email protected]
>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support