Sanne Korzec wrote:

> I am having trouble understanding what the recaser is doing exactly  
> when evaluating a (dev) test set.
>
> Why do we need to train a recaser?


Because the default setup in Moses is to train caseless models.  This  
is done by lowercasing the parallel corpus before anything else  
happens.  But this means that all output will be lowercase, which is  
ugly - users uniformly hate it.  Plus, in the NIST evaluations,  
scoring is done casefully.

The Moses recaser is a separate MT model that translates between the  
languages "lowercase english" and "mixed-CASE English".  This is  
trained from a parallel corpus constructed from the lowercase version  
of the English, and the original English.
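
As a rough illustration of how such a parallel corpus is put together, here is a minimal Python sketch. It is not what train-recaser.perl actually runs; the function name and the sample sentences are made up for the example, and it assumes the text is already tokenized, one sentence per line.

```python
def make_recaser_pairs(cased_lines):
    """Pair each lowercased sentence (source side) with its original
    mixed-case form (target side). Hypothetical helper, not part of Moses."""
    pairs = []
    for line in cased_lines:
        target = line.strip()
        if not target:
            continue  # skip empty lines so both sides stay aligned
        pairs.append((target.lower(), target))
    return pairs


# Made-up sample sentences, already tokenized.
pairs = make_recaser_pairs([
    "The Moses recaser restores case .",
    "",
    "John Burger works at MITRE .",
])
for source, target in pairs:
    print(source, "|||", target)
```

The "source" side is what the main translation model produces, and the "target" side is what we want the user to see.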

> Is there some documentation about which arguments to give to  
> train-recaser.perl


There's a little bit here:

   http://www.statmt.org/wmt08/baseline.html

But it's pretty minimal.

> Why is there yet another moses.ini file here. I thought at this  
> stage we are finished training and thus we do not need the  
> moses.ini file anymore.

Because the recaser is a completely separate set of Moses models.   
Even the language model is different - it's trained from the original  
English, while the "main" language model is trained from the  
lowercase English, to match what the main translation model wants to  
produce.

It's worth noting that there are other ways to deal with translating  
case.  You could simply leave the corpus unaltered, and train  
everything on caseful data.  Then Moses would treat "burger" and  
"Burger" as completely unrelated words (but it would do the same for  
"the" and "The").  Or you could train a caseless translation model, but use a  
caseful language model to disambiguate between the possible case  
patterns for each word.  There are a couple of ways people have done the  
latter, by either using SRILM's disambiguate tool, or by hacking the  
phrase table to have every likely case pattern for each phrase.  I  
think early versions of Moses used the former approach, while one of  
Google's entries in the NIST evals used the latter.
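
To make the "disambiguate between case patterns" idea concrete, here is a toy Python sketch. It is only a stand-in: it picks the most frequent cased form of each word in isolation, whereas SRILM's disambiguate tool and the phrase-table approach both use a real caseful language model that looks at context. The function names and the tiny corpus are invented for the example.

```python
from collections import Counter, defaultdict


def train_case_counts(cased_sentences):
    """Count how often each cased form occurs for every lowercased word."""
    counts = defaultdict(Counter)
    for sentence in cased_sentences:
        for token in sentence.split():
            counts[token.lower()][token] += 1
    return counts


def recase(lowercase_sentence, counts):
    """Replace each token with its most frequent cased form.
    Words never seen in training are left as they are."""
    out = []
    for token in lowercase_sentence.split():
        forms = counts.get(token)
        out.append(forms.most_common(1)[0][0] if forms else token)
    return " ".join(out)


# Made-up caseful training text.
corpus = [
    "the recaser was built at MITRE",
    "MITRE uses the Moses decoder",
    "we train Moses on lowercase text",
]
counts = train_case_counts(corpus)
result = recase("moses was built at mitre", counts)
print(result)
```

A per-word vote like this cannot capitalize the first word of a sentence or tell "burger" from "Burger", which is exactly why the real approaches bring in a language model.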

Hope this lengthy explanation helped.

- John Burger (not to be confused with john burger)
   MITRE
