Hi, just to add one comment: we have recently experimented with training models on truecased data (where only the first word of each sentence is converted into its most common casing), with mixed results. Such a truecaser should also be adjusted to deal with ALL-CAPS HEADLINES and so on.
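A minimal sketch of the kind of truecaser described above (illustrative only, not a production tool; assumes whitespace-tokenized input, and does not yet handle the ALL-CAPS case mentioned):

```python
from collections import Counter, defaultdict

def train_truecaser(sentences):
    """Count the casing variants of each word. Sentence-initial tokens
    are skipped, since their capitalization is forced by position."""
    counts = defaultdict(Counter)
    for sent in sentences:
        tokens = sent.split()
        for tok in tokens[1:]:
            counts[tok.lower()][tok] += 1
    # Map each lowercased word to its most frequent observed casing.
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def truecase(sentence, model):
    """Replace the sentence-initial word with its most common casing;
    unknown words are left unchanged."""
    tokens = sentence.split()
    if tokens:
        tokens[0] = model.get(tokens[0].lower(), tokens[0])
    return " ".join(tokens)
```

So a sentence-initial "john" would be restored to "John" if the model has only ever seen that word capitalized in non-initial positions.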
Maybe someone out there has a tool for this?

-phi

On Thu, Jul 17, 2008 at 3:40 PM, John D. Burger <[EMAIL PROTECTED]> wrote:
> Sanne Korzec wrote:
>
>> I am having trouble understanding what the recaser is doing exactly
>> when evaluating a (dev) test set.
>>
>> Why do we need to train a recaser?
>
> Because the default setup in Moses is to train caseless models. This
> is done by lowercasing the parallel corpus before anything else
> happens. But this means that all output will be lowercase, which is
> ugly - users uniformly hate it. Plus, in the NIST evaluations,
> scoring is done casefully.
>
> The Moses recaser is a separate MT model that translates between the
> languages "lowercase english" and "mixed-CASE English". It is
> trained on a parallel corpus constructed from the lowercase version
> of the English and the original English.
>
>> Is there some documentation about which arguments to give to
>> train-recaser.perl?
>
> There's a little bit here:
>
> http://www.statmt.org/wmt08/baseline.html
>
> But it's pretty minimal.
>
>> Why is there yet another moses.ini file here? I thought at this
>> stage we are finished training and thus we do not need the
>> moses.ini file anymore.
>
> Because the recaser is a completely separate set of Moses models.
> Even its language model is different - it's trained on the original
> English, while the "main" language model is trained on the lowercase
> English, to match what the main translation model produces.
>
> It's worth noting that there are other ways to deal with translating
> case. You could simply leave the corpus unaltered and train
> everything on caseful data. Then Moses would treat "burger" and
> "Burger" as completely unrelated words (and likewise "the" and
> "The"). Or you could train a caseless translation model but use a
> caseful language model to disambiguate between the possible case
> patterns for each word.
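The recaser's training data described above - lowercase source paired with the original mixed-case target - can be sketched like this (a toy illustration of the data construction only, not the actual train-recaser.perl interface):

```python
def make_recaser_corpus(cased_lines):
    """Build the two sides of the recaser's 'parallel corpus':
    the source side is the lowercased text, the target side is the
    original mixed-case text. The recaser is then trained to
    'translate' the former into the latter."""
    source = [line.lower() for line in cased_lines]
    target = list(cased_lines)
    return source, target
```

For example, "The President spoke ." on the target side is paired with "the president spoke ." on the source side, so the model learns word-by-word case restoration in context.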
> There are a couple of ways people have done the latter: either using
> SRILM's disambig tool, or hacking the phrase table to include every
> likely case pattern for each phrase. I think early versions of Moses
> used the former approach, while one of Google's entries in the NIST
> evals used the latter.
>
> Hope this lengthy explanation helped.
>
> - John Burger (not to be confused with john burger)
>   MITRE
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
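To illustrate the second family of approaches John describes (enumerate the case patterns seen for each word, and let a caseful language model choose among them) - here is a toy sketch that uses a unigram LM in place of the n-gram LM and search a real system would use; all names are illustrative:

```python
from collections import Counter, defaultdict

def build_case_models(cased_sentences):
    """From cased text, collect (a) the case variants observed for each
    lowercased word and (b) unigram counts serving as a toy 'caseful LM'."""
    variants = defaultdict(set)
    unigrams = Counter()
    for sent in cased_sentences:
        for tok in sent.split():
            variants[tok.lower()].add(tok)
            unigrams[tok] += 1
    return variants, unigrams

def recase_with_lm(lower_sentence, variants, unigrams):
    """For each lowercase token, pick the case variant the unigram LM
    prefers; unseen words are passed through unchanged."""
    out = []
    for tok in lower_sentence.split():
        cands = variants.get(tok, {tok})
        out.append(max(cands, key=lambda c: unigrams[c]))
    return " ".join(out)
```

With a real n-gram LM, context would also let the system distinguish "burger" the food from "Burger" the surname, which a unigram model cannot do.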
