Dear David,

Basically, you want an SMT model whose parameter settings are optimal for producing understandable translations. This is hard to do directly, however, because you need humans to grade the understandability (or general quality) of a translation. This is why people created different sorts of proxies for the actual translation quality. One sort is evaluation metrics like BLEU and Meteor, which are not numerically well-behaved but are aimed more directly at representing translation quality. The other sort is criteria such as the (log-)likelihood of the training data (which is what a maxent learner uses). These are numerically well-behaved, but they cannot be applied to arbitrary (i.e., non-statistical) translation systems, and they don't necessarily correspond to a quantity that makes sense from a "this system produced great translations" perspective.
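To make "numerically well-behaved" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the n-best lists, the two features and the "quality" numbers are made up, the quality number only stands in for a sentence-level metric score (it is not real BLEU), and the function names are hypothetical rather than taken from any toolkit. It sweeps one feature weight and evaluates both kinds of objective on the same toy data.

```python
import math

# Hypothetical toy data: for each source sentence, an n-best list of
# (feature_vector, quality) pairs. "quality" stands in for a sentence-level
# metric score; all numbers are invented for illustration.
NBEST = [
    [((1.0, 0.2), 0.9), ((0.4, 1.0), 0.3), ((0.7, 0.6), 0.5)],
    [((0.2, 0.9), 0.8), ((0.9, 0.1), 0.2), ((0.5, 0.5), 0.6)],
    [((0.6, 0.7), 0.7), ((0.8, 0.3), 0.4), ((0.1, 0.8), 0.1)],
]


def model_score(weights, features):
    """Linear model score: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def metric_objective(weights):
    """Mean quality of the model's 1-best candidate (metric-style objective)."""
    total = 0.0
    for candidates in NBEST:
        best = max(candidates, key=lambda c: model_score(weights, c[0]))
        total += best[1]
    return total / len(NBEST)


def loglik_objective(weights):
    """Log-likelihood of the highest-quality candidate under a softmax model."""
    total = 0.0
    for candidates in NBEST:
        scores = [model_score(weights, feats) for feats, _ in candidates]
        log_z = math.log(sum(math.exp(s) for s in scores))
        oracle = max(range(len(candidates)), key=lambda i: candidates[i][1])
        total += scores[oracle] - log_z
    return total


def sweep(objective, steps=200):
    """Evaluate an objective while the second weight runs from -2 to 2."""
    return [objective((1.0, -2.0 + 4.0 * i / steps)) for i in range(steps + 1)]


metric_values = sweep(metric_objective)
loglik_values = sweep(loglik_objective)

# The metric-style objective only changes when some 1-best choice flips,
# so it takes few distinct values; the log-likelihood changes at every step.
print("distinct metric values:", len(set(metric_values)))
print("distinct log-likelihood values:", len(set(loglik_values)))
```

The metric-style objective is a step function of the weight: it is flat almost everywhere and jumps when a different candidate becomes 1-best, so it gives a gradient-based optimizer nothing to work with. The log-likelihood on the same data is smooth and concave in the weights.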
MERT optimizes the first sort of criterion, the hairy-but-closer-to-what-you-want sort of function, and is therefore limited to optimizing a relatively small number of parameters (a dozen to a couple hundred). Furthermore, parameters obtained by optimizing on one evaluation measure are fairly often not optimal for a different evaluation measure. MaxEnt learning optimizes the second sort of criterion: functions that are well-behaved but useful as a goodness criterion mostly from a statistical point of view. For these you can use much more efficient optimization algorithms, which can handle a much larger number of parameters (up to about a million, commonly a few hundred thousand). Trying to do the large-scale optimization task (i.e., lots of parameters) directly on an evaluation metric is usually a bad idea, because that function is rather spiky and may have local optima that are simply due to a combination of overfitting and quirks of the evaluation metric (and that wouldn't carry over to the test dataset).

Best,
Yannick

--
SFB 833 "Bedeutungskonstitution"
Nauklerstr. 35 - D-72074 Tübingen
Tel.: +49-7071-77155; Fax: +49-7071-5830
_______________________________________________
Mt-list mailing list
