Hi, just to add one comment: we have recently experimented with training models on truecased data (where only the first word of each sentence is converted into its most common casing), with mixed results. Such a truecaser should also be adjusted to deal with ALL-CAPS HEADLINES and so on.
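A minimal sketch of the kind of truecaser described above (illustrative only, not a production tool; assumes whitespace-tokenized input, and does not yet handle the ALL-CAPS case mentioned):

```python
from collections import Counter, defaultdict

def train_truecaser(sentences):
    """Count the casing variants of each word. Sentence-initial tokens
    are skipped, since their capitalization is forced by position."""
    counts = defaultdict(Counter)
    for sent in sentences:
        tokens = sent.split()
        for tok in tokens[1:]:
            counts[tok.lower()][tok] += 1
    # Map each lowercased word to its most frequent observed casing.
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def truecase(sentence, model):
    """Replace the sentence-initial word with its most common casing;
    unknown words are left unchanged."""
    tokens = sentence.split()
    if tokens:
        tokens[0] = model.get(tokens[0].lower(), tokens[0])
    return " ".join(tokens)
```

So a sentence-initial "john" would be restored to "John" if the model has only ever seen that word capitalized in non-initial positions.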
Maybe someone out there has a tool for this?

-phi

On Thu, Jul 17, 2008 at 3:40 PM, John D. Burger <[EMAIL PROTECTED]> wrote:
> Sanne Korzec wrote:
>
>> I am having trouble understanding what the recaser is doing exactly
>> when evaluating a (dev) test set.
>>
>> Why do we need to train a recaser?
>
> Because the default setup in Moses is to train caseless models. This
> is done by lowercasing the parallel corpus before anything else
> happens. But this means that all output will be lowercase, which is
> ugly - users uniformly hate it. Plus, in the NIST evaluations,
> scoring is done casefully.
>
> The Moses recaser is a separate MT model that translates between the
> languages "lowercase english" and "mixed-CASE English". It is
> trained on a parallel corpus constructed from the lowercase version
> of the English and the original English.
>
>> Is there some documentation about which arguments to give to
>> train-recaser.perl?
>
> There's a little bit here:
>
> http://www.statmt.org/wmt08/baseline.html
>
> But it's pretty minimal.
>
>> Why is there yet another moses.ini file here? I thought at this
>> stage we are finished training and thus we do not need the
>> moses.ini file anymore.
>
> Because the recaser is a completely separate set of Moses models.
> Even its language model is different - it's trained on the original
> English, while the "main" language model is trained on the lowercase
> English, to match what the main translation model produces.
>
> It's worth noting that there are other ways to deal with translating
> case. You could simply leave the corpus unaltered and train
> everything on caseful data. Then Moses would treat "burger" and
> "Burger" as completely unrelated words (and likewise "the" and
> "The"). Or you could train a caseless translation model but use a
> caseful language model to disambiguate between the possible case
> patterns for each word.
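The recaser's training data described above - lowercase source paired with the original mixed-case target - can be sketched like this (a toy illustration of the data construction only, not the actual train-recaser.perl interface):

```python
def make_recaser_corpus(cased_lines):
    """Build the two sides of the recaser's 'parallel corpus':
    the source side is the lowercased text, the target side is the
    original mixed-case text. The recaser is then trained to
    'translate' the former into the latter."""
    source = [line.lower() for line in cased_lines]
    target = list(cased_lines)
    return source, target
```

For example, "The President spoke ." on the target side is paired with "the president spoke ." on the source side, so the model learns word-by-word case restoration in context.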
> There are a couple of ways people have done the latter: either using
> SRILM's disambig tool, or hacking the phrase table to include every
> likely case pattern for each phrase. I think early versions of Moses
> used the former approach, while one of Google's entries in the NIST
> evals used the latter.
>
> Hope this lengthy explanation helped.
>
> - John Burger (not to be confused with john burger)
>   MITRE
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
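To illustrate the second family of approaches John describes (enumerate the case patterns seen for each word, and let a caseful language model choose among them) - here is a toy sketch that uses a unigram LM in place of the n-gram LM and search a real system would use; all names are illustrative:

```python
from collections import Counter, defaultdict

def build_case_models(cased_sentences):
    """From cased text, collect (a) the case variants observed for each
    lowercased word and (b) unigram counts serving as a toy 'caseful LM'."""
    variants = defaultdict(set)
    unigrams = Counter()
    for sent in cased_sentences:
        for tok in sent.split():
            variants[tok.lower()].add(tok)
            unigrams[tok] += 1
    return variants, unigrams

def recase_with_lm(lower_sentence, variants, unigrams):
    """For each lowercase token, pick the case variant the unigram LM
    prefers; unseen words are passed through unchanged."""
    out = []
    for tok in lower_sentence.split():
        cands = variants.get(tok, {tok})
        out.append(max(cands, key=lambda c: unigrams[c]))
    return " ".join(out)
```

With a real n-gram LM, context would also let the system distinguish "burger" the food from "Burger" the surname, which a unigram model cannot do.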
