Hi John, yes, the translation model is trained on case-sensitive data, and the recaser only uppercases the first word of each sentence. For some of the results, check out the ACL WMT-08 workshop paper. For other language pairs we saw slight increases or drops in scores (in the 0.1-0.3 range) compared to the traditional approach.
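
To make the setup concrete, here is a minimal sketch in Python of the two steps: truecasing the first word of each training sentence to its most common casing, and uppercasing the first word of the output. This is only an illustration, not the actual scripts. The function names (`train_truecaser`, `truecase`, `recase`) are made up for this example, and skipping the sentence-initial token when collecting counts is an assumption. ALL-CAPS HEADLINES are not handled.

```python
from collections import Counter, defaultdict


def train_truecaser(sentences):
    """Map each lowercased word to its most frequent surface form."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Assumption: ignore the first token, since its casing is
        # determined by sentence position rather than by the word itself.
        for token in sentence.split()[1:]:
            counts[token.lower()][token] += 1
    return {word: forms.most_common(1)[0][0] for word, forms in counts.items()}


def truecase(sentence, model):
    """Convert only the first word to its most common casing."""
    tokens = sentence.split()
    if tokens:
        first = tokens[0]
        # Words never seen in training are left unchanged.
        tokens[0] = model.get(first.lower(), first)
    return " ".join(tokens)


def recase(sentence):
    """Uppercase the first character of the translated sentence."""
    return sentence[:1].upper() + sentence[1:]
```

With a model trained on tokenized text, `truecase("The house is small .", model)` would lowercase "The" if "the" is the most common form, while a sentence-initial proper name keeps its capital. `recase` is then applied to the system output.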
-phi

On Thu, Jul 17, 2008 at 9:17 PM, John D. Burger <[EMAIL PROTECTED]> wrote:
> Philipp Koehn wrote:
>
>> just to add one comment: We have recently experimented
>> with training models on truecased model (where only the
>> first word of each sentence is converted into its most common
>> casing), with mixed results. Such a truecaser should also
>> be adjusted to deal with ALL-CAPS HEADLINES and so on.
>
> Hi Philipp -
>
> Just to clarify, you mean you have built case-sensitive translation
> models, with no separate recaser? Or are you talking about a
> slightly different approach to training a recaser?
>
> Can you expand on your "mixed results" comment? Thanks!
>
> - John D. Burger
> MITRE
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
