Dear Jie,

SRILM may already have an option for this:
- http://www.speech.sri.com/pipermail/srilm-user/2013q2/001509.html
- http://www.speech.sri.com/projects/srilm/manpages/ngram.1.html:
    -skipoovs
Instruct the LM to skip over contexts that contain out-of-vocabulary words,
instead of using a backoff strategy in these cases.

If it is not there, maybe that is for a reason...

Bing appears to have indexed this thread quickly:
http://comments.gmane.org/gmane.comp.nlp.moses.user/14570


*Best Regards,*
Ergun

Ergun Biçici
DFKI Projektbüro Berlin


On Fri, Jan 15, 2016 at 2:37 PM, Jie Jiang <mail.jie.ji...@gmail.com> wrote:

> Hi Ergun:
>
> The original request in Quang's post was:
>
> *For instance, with the n-gram: "the <unk> house <unk> in", I would like
> the decoder to assign it the probability of the phrase: "the house in"
> (existing in the LM).*
>
> So each time a <unk> comes up while calculating the LM score, you need to
> look one more word further.
>
> I believe this cannot be achieved with current LM tools without modifying
> the source code, as Kenneth has already clarified.
>
>
> 2016-01-15 13:20 GMT+00:00 Ergun Bicici <ergun.bic...@dfki.de>:
>
>>
>> Dear Kenneth,
>>
>> The Moses manual mentions the -drop-unknown switch:
>>
>> 4.7.2 Handling Unknown Words
>> Unknown words are copied verbatim to the output. They are also scored by
>> the language model, and may be placed out of order. Alternatively, you may
>> want to drop unknown words. To do so add the switch -drop-unknown.
>>
>> Alternatively, you can write a script that replaces all OOV tokens with
>> an OOV token identifier such as <unk> before sending them for translation.
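A minimal sketch of such a preprocessing script, assuming whitespace-tokenized input and a vocabulary given as one word per line. The names load_vocab and replace_oov are hypothetical helpers for illustration, not part of Moses or SRILM, and which vocabulary to use (LM or phrase table) is left open here.

```python
def load_vocab(lines):
    """Build a vocabulary set from an iterable of lines, one word per line."""
    return {line.strip() for line in lines if line.strip()}


def replace_oov(sentence, vocab, unk="<unk>"):
    """Replace every whitespace-separated token not in vocab with unk."""
    return " ".join(tok if tok in vocab else unk for tok in sentence.split())


# Toy vocabulary; in practice the lines would come from a vocabulary file.
vocab = load_vocab(["the", "house", "in"])
print(replace_oov("the blarg house zorp in", vocab))
# prints: the <unk> house <unk> in
```

Note that this only normalizes the input; it does not change how the LM scores the resulting <unk> tokens.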
>>
>>
>> *Best Regards,*
>> Ergun
>>
>> Ergun Biçici
>> DFKI Projektbüro Berlin
>>
>>
>> On Fri, Jan 15, 2016 at 12:22 AM, Kenneth Heafield <mo...@kheafield.com>
>> wrote:
>>
>>> Hi,
>>>
>>>         I think oov-feature=1 just activates the OOV count feature while
>>> leaving LM score unchanged.  So it would still include p(<unk> | in).
>>>
>>>         One might try setting the OOV feature weight to -weight_LM *
>>> weird_moses_internal_constant * log p(<unk>) in an attempt to cancel out
>>> the log p(<unk>) terms.  However that won't work either because:
>>>
>>> 1) It will still charge backoff penalties, b(the)b(house) in the example.
>>>
>>> 2) The context will be lost each time so it's p(house) not p(house |
>>> the).
>>>
>>> If the <unk>s follow a pattern, such as appearing every other word, one
>>> could insert them into the ARPA file though that would waste memory.
>>>
>>> I don't think there's any way to accomplish exactly what OP asked for
>>> without coding (though it wouldn't be that hard once one understands how
>>> the LM infrastructure works).
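As a rough illustration of the two points above, here is a minimal sketch with a toy bigram backoff model. All probabilities and backoff weights are invented for the example, and the function names are hypothetical; none of this is Moses, KenLM or SRILM code. It compares the ordinary backoff score of "the <unk> house <unk> in", the same score with the log p(<unk>) terms cancelled, and the score the original post asks for.

```python
UNK = "<unk>"

# Toy ARPA-style bigram model in log10. All numbers are made up.
UNIGRAMS = {  # word -> (log10 probability, log10 backoff weight)
    "<s>": (-99.0, -0.5),
    "the": (-1.0, -0.4),
    "house": (-2.0, -0.3),
    "in": (-1.5, -0.2),
    UNK: (-3.0, 0.0),
}
BIGRAMS = {  # (context, word) -> log10 probability
    ("<s>", "the"): -0.5,
    ("the", "house"): -0.8,
    ("house", "in"): -0.6,
}


def bigram_logprob(context, word):
    """log10 p(word | context), backing off to the unigram when needed."""
    if (context, word) in BIGRAMS:
        return BIGRAMS[(context, word)]
    prob, _ = UNIGRAMS[word]
    _, backoff = UNIGRAMS[context]
    return backoff + prob  # backoff penalty of the context is charged here


def score(tokens):
    """Standard backoff score of a token sequence, starting from <s>."""
    context, total = "<s>", 0.0
    for word in tokens:
        total += bigram_logprob(context, word)
        context = word
    return total


def score_with_unk_cancelled(tokens):
    """Standard score with every log p(<unk>) term subtracted back out."""
    return score(tokens) - tokens.count(UNK) * UNIGRAMS[UNK][0]


def score_skipping_unk(tokens):
    """Requested behaviour: score the sequence as if <unk> were absent."""
    return score([w for w in tokens if w != UNK])


sentence = ["the", UNK, "house", UNK, "in"]
print("standard backoff: ", score(sentence))
print("p(<unk>) cancelled:", score_with_unk_cancelled(sentence))
print("skipping <unk>:   ", score_skipping_unk(sentence))
```

In this toy setup the cancelled score still contains b(the) and b(house) and uses p(house) and p(in) instead of p(house | the) and p(in | house), so it does not match the score obtained by skipping <unk>.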
>>>
>>> Kenneth
>>>
>>> On 01/14/2016 11:07 PM, Philipp Koehn wrote:
>>> > Hi,
>>> >
>>> > You may get the behavior you want by adding
>>> >   "oov-feature=1"
>>> > to your LM specification line in moses.ini
>>> > and also add a second weight with value "0" to the corresponding LM
>>> > weight setting.
>>> >
>>> > This will then only use the scores
>>> > p(the|<s>)
>>> > p(house|<s>,the,<unk>) ---> backoff to p(house)
>>> > p(in|<s>,the,<unk>,house,<unk>) ---> backoff to p(in)
>>> >
>>> > -phi
>>> >
>>> > On Thu, Jan 14, 2016 at 8:25 AM, LUONG NGOC Quang
>>> > <quangngoclu...@gmail.com> wrote:
>>> >
>>> >     Dear All,
>>> >
>>> >     I am currently using a SRILM Language Model (LM) in my Moses
>>> >     decoder. Does anyone know how I can ask the decoder, at decoding
>>> >     time, to skip all out-of-vocabulary words when computing the LM
>>> >     score (instead of doing back-off)?
>>> >
>>> >     For instance, with the n-gram: "the <unk> house <unk> in", I would
>>> >     like the decoder to assign it the probability of the phrase: "the
>>> >     house in" (existing in the LM).
>>> >
>>> >     Do I need more options/declarations in the moses.ini file?
>>> >
>>> >     Any help is very much appreciated,
>>> >
>>> >     Best,
>>> >     Quang
>>> >
>>> >
>>> >
>>> >     _______________________________________________
>>> >     Moses-support mailing list
>>> >     Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>>> >     http://mailman.mit.edu/mailman/listinfo/moses-support
>>> >
>>> >
>>> >
>>> >
>>>
>>
>>
>>
>>
>
>
> --
>
> Best regards!
>
> Jie Jiang
>
>