Wow, that was a long read. Still reading, though :) but I see that tuning is essential. I am fairly new to Moses, so could you please check whether the commands I ran were correct (minus the tuning part)? I just adapted the commands from the Moses website for building a baseline system. Below are the commands I ran. My training files are "compilation.en" and "compilation.sm"; my test files are "test.en" and "test.sm".
~/mosesdecoder/scripts/tokenizer/tokenizer.perl -l en < ~/corpus/training/compilation.en > ~/corpus/compilation.tok.en
~/mosesdecoder/scripts/tokenizer/tokenizer.perl -l sm < ~/corpus/training/compilation.sm > ~/corpus/compilation.tok.sm

~/mosesdecoder/scripts/recaser/train-truecaser.perl --model ~/corpus/truecase-model.en --corpus ~/corpus/compilation.tok.en
~/mosesdecoder/scripts/recaser/train-truecaser.perl --model ~/corpus/truecase-model.sm --corpus ~/corpus/compilation.tok.sm

~/mosesdecoder/scripts/recaser/truecase.perl --model ~/corpus/truecase-model.en < ~/corpus/compilation.tok.en > ~/corpus/compilation.true.en
~/mosesdecoder/scripts/recaser/truecase.perl --model ~/corpus/truecase-model.sm < ~/corpus/compilation.tok.sm > ~/corpus/compilation.true.sm

~/mosesdecoder/scripts/training/clean-corpus-n.perl ~/corpus/compilation.true sm en ~/corpus/compilation.clean 1 80

cd ~/working
nohup nice ~/mosesdecoder/scripts/training/train-model.perl -root-dir train \
  -corpus ~/corpus/compilation.clean -f sm -e en \
  -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
  -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8 \
  -external-bin-dir ~/mosesdecoder/tools >& training.out &

cd ~/corpus
~/mosesdecoder/scripts/tokenizer/tokenizer.perl -l en < test.en > test.tok.en
~/mosesdecoder/scripts/tokenizer/tokenizer.perl -l sm < test.sm > test.tok.sm
~/mosesdecoder/scripts/recaser/truecase.perl --model truecase-model.en < test.tok.en > test.true.en
~/mosesdecoder/scripts/recaser/truecase.perl --model truecase-model.sm < test.tok.sm > test.true.sm

cd ~/working
~/mosesdecoder/scripts/training/filter-model-given-input.pl filtered-test train/model/moses.ini ~/corpus/test.true.sm -Binarizer ~/mosesdecoder/bin/processPhraseTableMin

nohup nice ~/mosesdecoder/bin/moses -f ~/working/filtered-test/moses.ini < ~/corpus/test.true.sm > ~/working/test.translated.en 2> ~/working/test.out

~/mosesdecoder/scripts/generic/multi-bleu.perl -lc ~/corpus/test.true.en <
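Since the missing piece is the tuning step, here is a sketch of how it would slot in, following the Moses baseline-system tutorial's mert-moses.pl invocation. This assumes a hypothetical held-out development set (dev.true.sm / dev.true.en, tokenized and truecased the same way as the test data) that is distinct from both the training and test files; it is not runnable without a local Moses installation.

```shell
# Hypothetical tuning step (sketch, per the Moses baseline tutorial):
# tunes the feature weights in train/model/moses.ini against a held-out
# dev set before any BLEU scores are reported.
cd ~/working
nohup nice ~/mosesdecoder/scripts/training/mert-moses.pl \
  ~/corpus/dev.true.sm ~/corpus/dev.true.en \
  ~/mosesdecoder/bin/moses train/model/moses.ini \
  --mertdir ~/mosesdecoder/bin \
  --decoder-flags="-threads 4" &> mert.out &
```

The tuned weights land in mert-work/moses.ini, which would then replace train/model/moses.ini in the filtering and decoding steps above.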
~/working/test.translated.en

On 22 June 2015 at 01:20, Marcin Junczys-Dowmunt <[email protected]> wrote:

> Hm. That's interesting. The language should not matter.
>
> 1) Do not report results without tuning. They are meaningless. There is a
> whole thread on that; look for "Major bug found in Moses". If you ignore
> the trollish aspects, it contains many good descriptions of why this is a
> mistake.
>
> 2) Assuming it was the same data every time (was it?), without tuning,
> however, I do not quite see where the variance is coming from. This rather
> suggests you have something weird in your pipeline. MGIZA is the only
> stochastic element there, but usually its results are quite consistent.
> For the same weights in your ini file you should have very similar
> results. Tuning would be the part that introduces instability, but even
> then these differences would be a little on the extreme end, though
> possible.
>
> On 22.06.2015 08:12, Hokage Sama wrote:
>
>> Thanks Marcin. It's for a new resource-poor language, so I only trained
>> it with what I could collect so far (i.e. only 190,630 words of parallel
>> data). I retrained the entire system each time without any tuning.
>>
>> On 22 June 2015 at 01:00, Marcin Junczys-Dowmunt <[email protected]> wrote:
>>
>> Hi,
>> I think the average is OK; your variance, however, is quite high. Did
>> you retrain the entire system or just optimize parameters a couple of
>> times?
>>
>> Two useful papers on the topic:
>>
>> https://www.cs.cmu.edu/~jhclark/pubs/significance.pdf
>> http://www.mt-archive.info/MTS-2011-Cettolo.pdf
>>
>> On 22.06.2015 02:37, Hokage Sama wrote:
>> > Hi,
>> >
>> > Since MT training is non-convex and thus the BLEU score varies, which
>> > score should I use for my system? I trained my system three times
>> > using the same data and obtained the three different scores below.
>> > Should I take the average or the best score?
>> >
>> > BLEU = 17.84, 49.1/22.0/12.5/7.5 (BP=1.000, ratio=1.095,
>> > hyp_len=3952, ref_len=3609)
>> > BLEU = 16.51, 48.4/20.7/11.4/6.5 (BP=1.000, ratio=1.093,
>> > hyp_len=3945, ref_len=3609)
>> > BLEU = 15.33, 48.2/20.1/10.3/5.5 (BP=1.000, ratio=1.087,
>> > hyp_len=3924, ref_len=3609)
>> >
>> > Thanks,
>> > Hilton
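Marcin's point that the variance is quite high can be made concrete. A quick sketch (the three BLEU values are copied from the runs quoted above; any POSIX awk will do) computing the mean and sample standard deviation:

```shell
# Mean and sample standard deviation of the three reported BLEU scores.
printf '17.84\n16.51\n15.33\n' | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END {
    mean = sum / n
    sd = sqrt((sumsq - sum * sum / n) / (n - 1))   # sample (n-1) std dev
    printf "mean=%.2f sd=%.2f\n", mean, sd
  }'
# prints: mean=16.56 sd=1.26
```

A spread of roughly 1.26 BLEU between retrainings on identical data is large, which is consistent with Marcin's suggestion that something beyond MGIZA's usual randomness may be varying in the pipeline.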
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
