>> I allow pass-through of all words, with a penalty that is also
>> learned by MERT.
>
> Interesting stuff. Do you have results published on this?

This was easiest to implement when I wrote cdec, and the results seemed
good enough, so I never did a proper comparison. I will describe the
newer innovation of giving OOVs a tunable penalty in this year's WMT
system description.
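Below is a minimal sketch of the pass-through idea described above,
assuming a toy word-by-word model; the feature name "PassThrough" and
these interfaces are illustrative, not cdec's actual API.

    def passthrough_translate(source_tokens, phrase_table):
        """Translate word by word, copying OOVs through verbatim and
        counting them so a tuner such as MERT can learn their penalty."""
        output, oov_count = [], 0
        for tok in source_tokens:
            if tok in phrase_table:
                output.append(phrase_table[tok])
            else:
                output.append(tok)  # pass the unknown word through
                oov_count += 1
        return output, {"PassThrough": float(oov_count)}

    def score(features, weights):
        # Standard linear model score; the tuner sets
        # weights["PassThrough"], typically to a negative value, so
        # copying is penalized but never forbidden outright.
        return sum(weights.get(k, 0.0) * v for k, v in features.items())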
Also, for completeness on this topic: I have a third tunable feature
that counts the number of non-ASCII characters in the target (a small
sketch follows at the end of this message). So far, I've only used
this when translating into English from Chinese or Arabic.

>> With the open-class LM, I use the -unk option in SRILM, which
>> reserves a bit of probability mass for OOVs. What exactly it does is
>> a bit unclear to me (it's more than just replacing singletons with
>> <unk>, but that's probably a reasonable approximation).
>
> I would assume that it does the usual discounting (Good-Turing or
> Kneser-Ney), and gives the discounted probability mass to <unk>.
>
> -phi
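The non-ASCII counter mentioned above might look like the following
(an illustrative sketch; the feature name is made up and the real
implementation may differ):

    def non_ascii_feature(target_tokens):
        """Count non-ASCII characters in the hypothesis. When the
        target language is English, passed-through Chinese or Arabic
        words are entirely non-ASCII, so a tuned negative weight
        discourages leaving them in the output."""
        n = sum(1 for tok in target_tokens for ch in tok if ord(ch) > 127)
        return {"NonASCII": float(n)}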
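And a toy example of phi's point, that discounting shaves probability
mass off the seen words which can then be handed to <unk>, assuming
simple absolute discounting on unigrams:

    # An open-vocabulary LM in SRILM is typically built with, e.g.:
    #   ngram-count -order 3 -text train.txt -unk -kndiscount \
    #       -interpolate -lm model.lm
    # The code below is NOT SRILM's algorithm; it only illustrates how
    # discounting reserves mass that can be assigned to <unk>.

    from collections import Counter

    def unigram_lm_with_unk(tokens, discount=0.75):
        """Absolute discounting: subtract a constant D from every seen
        count, then hand the total subtracted mass to <unk>."""
        counts = Counter(tokens)
        total = sum(counts.values())
        probs = {w: (c - discount) / total for w, c in counts.items()}
        probs["<unk>"] = discount * len(counts) / total  # reserved mass
        return probs

    lm = unigram_lm_with_unk("the cat sat on the mat".split())
    assert abs(sum(lm.values()) - 1.0) < 1e-9  # still sums to one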
