Please welcome a new parameter tuner to Moses: k-best batch MIRA. This
is hope-fear MIRA built as a drop-in replacement for MERT; it conducts
online training using aggregated k-best lists as an approximation to
the decoder's true search space. This allows it to handle large feature
sets, and it often outperforms MERT once feature counts get above
10. The new code has been pushed into the master branch, as well as
into miramerge.
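
For a rough feel of what kbmira does internally, here is a small Python
sketch of the hope-fear update from the paper cited below. It is an
illustration only: the function name, the (features, sentence-BLEU)
layout of the k-best lists, and the helper code are invented here, and
the real implementation differs in detail.

def mira_pass(kbests, w, C=0.01):
    """One hope-fear MIRA pass over aggregated k-best lists (sketch only).

    kbests: for each dev sentence, a list of (features, sent_bleu) pairs,
            where features is a list of floats aligned with the weights w.
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    for hyps in kbests:
        scores = [dot(w, f) for f, _ in hyps]
        # hope: high model score AND high BLEU; fear: high score, low BLEU
        hope = max(range(len(hyps)), key=lambda i: scores[i] + hyps[i][1])
        fear = max(range(len(hyps)), key=lambda i: scores[i] - hyps[i][1])
        diff = [a - b for a, b in zip(hyps[hope][0], hyps[fear][0])]
        # margin violation: the BLEU gap the model scores fail to reflect
        loss = (hyps[hope][1] - hyps[fear][1]) - dot(w, diff)
        if loss > 0:
            step = min(C, loss / max(dot(diff, diff), 1e-9))  # C caps the step
            w = [wi + step * di for wi, di in zip(w, diff)]
    return w

The [-J n] option described below controls how many such passes are made
over the aggregated lists, and [-C n] sets the C that caps each step,
which is why it acts as a regularizer.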

You can tune using this system by adding [--batch-mira] to your
mert-moses.pl command. This replaces the normal call to the mert
executable with a call to kbmira.

I recommend also adding the flag [--return-best-dev] to mert-moses.pl.
This will copy the moses.ini file corresponding to the highest-scoring
development run (as determined by the evaluator executable using BLEU
on run*.out) into the final moses.ini. This can make a fairly big
difference for MIRA's test-time accuracy.

You can also pass through options to kbmira by adding
[--batch-mira-args 'whatever'] to mert-moses.pl. Useful kbmira options
include:

[-J n] : changes the number of inner MIRA loops to n passes over the
data. Increasing this value to 100 or 300 can be good for working with
small development sets. The default, 60, is ideal for development sets
with more than 1000 sentences.

[-C n] : changes MIRA's C-value to n. This controls regularization.
The default, 0.01, works well for most situations, but if it looks
like MIRA is over-fitting or not converging, decreasing C to 0.001 or
0.0001 can sometimes help.

[--streaming] : stream k-best lists from disk rather than load them
into memory. This results in very slow training, but may be necessary
in low-memory environments or with very large development sets.

Run kbmira --help for a full list of options.

So, a complete call might look like this:

$MOSES_SCRIPTS/training/mert-moses.pl work/dev.fr work/dev.en \
  $MOSES_BIN/moses work/model/moses.ini \
  --mertdir $MOSES_BIN --rootdir $MOSES_SCRIPTS \
  --batch-mira --return-best-dev \
  --batch-mira-args '-J 300' \
  --decoder-flags '-threads 8 -v 0'

Please give it a try. If it's not working as advertised, send me an
e-mail and I'll see what I can do.

For more information on batch MIRA, or to cite us, check out our paper:

Colin Cherry and George Foster
Batch Tuning Strategies for Statistical Machine Translation
NAACL, June 2012
https://sites.google.com/site/colinacherry/Cherry_Foster_NAACL_2012.pdf

Anticipating some questions:

Q: Does it only handle BLEU?
A: Yes, for now. There's nothing stopping people from implementing
other metrics, so long as a reasonable sentence-level version of the
metric can be worked out. Note that you generally need to retune
kbmira's C-value for different metrics. I'd also change
--return-best-dev to use the new metric.
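
For a rough idea of what a "sentence-level version" of a metric looks
like, here is a generic add-one smoothed sentence BLEU in Python. This
is an illustration only, not the scorer kbmira actually uses.

import math
from collections import Counter

def smoothed_sentence_bleu(hyp, ref, max_n=4):
    """Add-one smoothed sentence-level BLEU (generic illustration)."""
    hyp, ref = hyp.split(), ref.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        matches = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = sum(hyp_ngrams.values())
        # add-one smoothing keeps one empty n-gram order from zeroing the score
        log_prec += math.log((matches + 1.0) / (total + 1.0)) / max_n
    brevity = min(0.0, 1.0 - len(ref) / max(len(hyp), 1))  # log brevity penalty
    return math.exp(brevity + log_prec)

A new metric would need to supply a comparable per-sentence score so
that kbmira can pick its hope and fear hypotheses.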

Q: Have you tested this on a cluster?
A: No, I don't have access to a Sun Grid Engine cluster; I would love it if
someone would test that scenario for me. But it works just fine using
multi-threaded decoding. Since training happens in a batch, decoding
is embarrassingly parallel.