Dear David,

Basically, you want an SMT model whose parameter settings are optimal
for producing understandable translations. This is hard to do directly,
however, because you need humans to grade the understandability (or
general quality) of a translation.
This is why people have created different kinds of proxies for actual
translation quality:
One kind is evaluation metrics such as BLEU and Meteor, which are not
numerically well-behaved but are aimed more directly at representing
translation quality.
The other kind is criteria such as the (log-)likelihood of the training
data (which is what a maxent learner uses). These are numerically
well-behaved, but they cannot be applied to arbitrary (i.e.,
non-statistical) translation systems, and they don't necessarily
correspond to any quantity that makes sense from a "this system
produced great translations" perspective.

MERT optimizes the first sort of criterion, the
hairy-but-closer-to-what-you-want sort of function, and is therefore
limited to optimizing a relatively small number of parameters (a dozen
to a couple hundred). Furthermore, parameters obtained by optimizing
on one evaluation measure are quite often not optimal for a different
evaluation measure.
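
To make the "hairy" part concrete, here is a minimal sketch of what
tuning on a metric looks like. It is not the actual MERT algorithm
(which does an exact line search on corpus-level BLEU), only a grid
sweep over one weight on two made-up n-best lists. The data, the
function names and the use of a simple error count as a stand-in for
the evaluation metric are all my own assumptions for illustration.

```python
# Toy n-best lists (hypothetical values). Each hypothesis is
# (feature_vector, errors); "errors" stands in for a real metric loss.
NBEST = [
    [((-1.0, -4.0), 2), ((-2.0, -1.0), 0), ((-1.5, -3.0), 1)],  # sentence 1
    [((-0.5, -5.0), 3), ((-3.0, -0.5), 1), ((-1.0, -2.0), 0)],  # sentence 2
]


def model_score(features, weights):
    """Linear model score: dot product of features and weights."""
    return sum(f * w for f, w in zip(features, weights))


def corpus_error(weights):
    """Sum the errors of the top-scoring hypothesis of every sentence."""
    total = 0
    for hyps in NBEST:
        best = max(hyps, key=lambda h: model_score(h[0], weights))
        total += best[1]
    return total


def sweep_second_weight(steps=21):
    """Fix the first weight at 1.0 and sweep the second over [0, 2]."""
    curve = []
    for i in range(steps):
        w2 = 2.0 * i / (steps - 1)
        curve.append((w2, corpus_error((1.0, w2))))
    return curve


curve = sweep_second_weight()
best_w2, best_err = min(curve, key=lambda point: point[1])
for w2, err in curve:
    print(f"w2={w2:.1f}  errors={err}")
```

The printed curve is a step function: the error only changes when the
top-ranked hypothesis of some sentence changes, and it is flat (zero
gradient) everywhere else. That is why you need a search procedure
rather than gradient-based optimization here, and why this gets
expensive quickly as the number of weights grows.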

MaxEnt learning optimizes the second sort of criterion: functions that
are well-behaved but are useful as a goodness criterion mostly from
a statistical point of view. You can use much more efficient optimization
algorithms for these, which can handle a much larger number of
parameters (up to about a million, commonly a few hundred thousand).
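
As a contrast, here is a small sketch of the maxent side, assuming a
conditional log-linear model p(y|x) proportional to exp(w . f(x, y))
and plain gradient ascent on the log-likelihood. The toy data and the
function names are hypothetical, and real maxent packages use better
optimizers (e.g. L-BFGS) and usually add regularization; the point is
only that the objective is smooth and has a usable gradient.

```python
import math

# Hypothetical toy data: (candidate feature vectors, index of the gold one).
# The labels are deliberately noisy so that the optimum is finite.
DATA = [
    ([(1.0, 0.0), (0.0, 1.0)], 0),
    ([(1.0, 0.0), (0.0, 1.0)], 0),
    ([(1.0, 0.0), (0.0, 1.0)], 1),
    ([(1.0, 1.0), (0.0, 0.0)], 0),
    ([(1.0, 1.0), (0.0, 0.0)], 0),
    ([(1.0, 1.0), (0.0, 0.0)], 1),
]


def probs(cands, w):
    """Softmax over the candidates' linear scores."""
    scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in cands]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def log_likelihood(w):
    """Conditional log-likelihood of the gold candidates."""
    return sum(math.log(probs(cands, w)[gold]) for cands, gold in DATA)


def gradient(w):
    """Observed feature values minus expected feature values."""
    grad = [0.0] * len(w)
    for cands, gold in DATA:
        p = probs(cands, w)
        for k in range(len(w)):
            expected = sum(pj * f[k] for pj, f in zip(p, cands))
            grad[k] += cands[gold][k] - expected
    return grad


def train(steps=200, lr=0.5):
    """Plain gradient ascent starting from all-zero weights."""
    w = [0.0, 0.0]
    for _ in range(steps):
        g = gradient(w)
        w = [wi + lr * gi for wi, gi in zip(w, g)]
    return w


w = train()
print(w, log_likelihood([0.0, 0.0]), log_likelihood(w))
```

Because the log-likelihood is concave in the weights, every step gives
you a direction that actually improves the objective, and the same
recipe works unchanged whether you have two weights or a few hundred
thousand.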

Trying to do the large-scale optimization task (i.e., lots of parameters)
directly on an evaluation metric is usually a bad idea because that
function is rather spiky and may have local optima that are simply
due to a combination of overfitting and quirks of the evaluation metric
(and wouldn't carry over to the test dataset).

Best,
Yannick
-- 
SFB 833 "Bedeutungskonstitution"
Nauklerstr. 35 - D-72074 Tübingen
Tel.: +49-7071-77155; Fax: +49-7071-5830
