Hi Ergun,

The original request in Quang's post was:

*For instance, with the n-gram: "the <unk> house <unk> in", I would like the
decoder to assign it the probability of the phrase: "the house in" (existing
in the LM).*

So each time there is a <unk> when calculating the LM score, you need to look
one word further. I believe this cannot be achieved with current LM tools
without modifying the source code, as Kenneth has already explained.

2016-01-15 13:20 GMT+00:00 Ergun Bicici <ergun.bic...@dfki.de>:

> Dear Kenneth,
>
> In the Moses manual, -drop-unknown switch is mentioned:
>
> 4.7.2 Handling Unknown Words
> Unknown words are copied verbatim to the output. They are also scored by
> the language model, and may be placed out of order. Alternatively, you may
> want to drop unknown words. To do so add the switch -drop-unknown.
>
> Alternatively, you can write a script that replaces all OOV tokens with
> some OOV-token-identifier such as <unk> before sending for translation.
>
> *Best Regards,*
> Ergun
>
> Ergun Biçici
> DFKI Projektbüro Berlin
>
> On Fri, Jan 15, 2016 at 12:22 AM, Kenneth Heafield <mo...@kheafield.com>
> wrote:
>
>> Hi,
>>
>> I think oov-feature=1 just activates the OOV count feature while
>> leaving LM score unchanged. So it would still include p(<unk> | in).
>>
>> One might try setting the OOV feature weight to -weight_LM *
>> weird_moses_internal_constant * log p(<unk>) in an attempt to cancel out
>> the log p(<unk>) terms. However that won't work either because:
>>
>> 1) It will still charge backoff penalties, b(the)b(house) in the example.
>>
>> 2) The context will be lost each time so it's p(house) not p(house | the).
>>
>> If the <unk>s follow a pattern, such as appearing every other word, one
>> could insert them into the ARPA file though that would waste memory.
>>
>> I don't think there's any way to accomplish exactly what OP asked for
>> without coding (though it wouldn't be that hard once one understands how
>> the LM infrastructure works).
>>
>> Kenneth
>>
>> On 01/14/2016 11:07 PM, Philipp Koehn wrote:
>> > Hi,
>> >
>> > You may get the behavior you want by adding
>> > "oov-feature=1"
>> > to your LM specification line in moses.ini
>> > and also add a second weight with value "0" to the corresponding LM
>> > weight setting.
>> >
>> > This will then only use the scores
>> > p(the|<s>)
>> > p(house|<s>,the,<unk>) ---> backoff to p(house)
>> > p(in|<s>,the,<unk>,house,<unk>) ---> backoff to p(in)
>> >
>> > -phi
>> >
>> > On Thu, Jan 14, 2016 at 8:25 AM, LUONG NGOC Quang
>> > <quangngoclu...@gmail.com <mailto:quangngoclu...@gmail.com>> wrote:
>> >
>> >     Dear All,
>> >
>> >     I am currently using a SRILM Language Model (LM) in my Moses
>> >     decoder. Does anyone know how can I ask the decoder, at the decoding
>> >     time, skip all out-of-vocabulary words when computing the LM score
>> >     (instead of doing back-off)?
>> >
>> >     For instance, with the n-gram: "the <unk> house <unk> in", I would
>> >     like the decoder to assign it the probability of the phrase: "the
>> >     house in" (existing in the LM).
>> >
>> >     Do I need more options/declarations in moses.ini file?
>> >
>> >     Any help is very much appreciated,
>> >
>> >     Best,
>> >     Quang
>> >
>> >     _______________________________________________
>> >     Moses-support mailing list
>> >     Moses-support@mit.edu <mailto:Moses-support@mit.edu>
>> >     http://mailman.mit.edu/mailman/listinfo/moses-support

--
Best regards!

Jie Jiang
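
P.S. To make the difference concrete, here is a minimal Python sketch with a toy backoff bigram model. It is not Moses, SRILM or KenLM code: the probabilities, the backoff weights and the helper names (word_logprob, score, score_skipping_oov) are all made up for illustration, and the sentence-start token <s> is left out to keep it short. It contrasts ordinary backoff scoring of "the <unk> house <unk> in" with the behaviour Quang asked for, where <unk> is skipped and the context carries over to the next known word.

```python
# Toy ARPA-style model: word -> (log10 prob, log10 backoff weight).
# All values are invented for illustration only.
UNIGRAMS = {
    "<unk>": (-2.0, 0.0),
    "the": (-1.0, -0.3),
    "house": (-1.5, -0.2),
    "in": (-1.2, -0.4),
}
# Bigram log10 probabilities, keyed by (previous word, word).
BIGRAMS = {
    ("the", "house"): -0.5,
    ("house", "in"): -0.7,
}


def word_logprob(prev, word):
    """log10 p(word | prev), backing off to the unigram if the bigram is missing."""
    if prev is not None and (prev, word) in BIGRAMS:
        return BIGRAMS[(prev, word)]
    backoff = UNIGRAMS[prev][1] if prev is not None else 0.0
    return backoff + UNIGRAMS[word][0]


def score(words):
    """Ordinary scoring: OOV words are mapped to <unk> and scored like any word."""
    total, prev = 0.0, None
    for w in words:
        w = w if w in UNIGRAMS else "<unk>"
        total += word_logprob(prev, w)
        prev = w
    return total


def score_skipping_oov(words):
    """Requested behaviour: drop OOV words first, so context skips over them."""
    known = [w for w in words if w in UNIGRAMS and w != "<unk>"]
    return score(known)


SENT = ["the", "<unk>", "house", "<unk>", "in"]
with_backoff = score(SENT)
skipping = score_skipping_oov(SENT)
# Cancelling only the two log p(<unk>) terms, as in the OOV-weight idea.
cancelled = with_backoff - 2 * UNIGRAMS["<unk>"][0]
print(with_backoff, cancelled, skipping)
```

In this toy setup the "cancelled" score still differs from the "skipping" score, for the two reasons Kenneth gave: the backoff weights b(the) and b(house) are still charged, and "house" and "in" are scored as unigrams rather than as p(house | the) and p(in | house).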