Hello,
I followed the Moses baseline tutorial to train and tune a translation
model for English to German. After finishing, the system seemed to work
quite well at first, but then I noticed that the tuning step seems to
have actually made my system worse! I really don't know what I did
wrong; I stuck very closely to the tutorial. Here is what I did in detail:
1. Training the TM into working/train/model.
2. Tuning with a corpus that is a cut-down version of news-test2008. The
main result of this process is the set of weights in the new file
mert-work/moses.ini, right?
3. Filtering mert-work/moses.ini against a testing corpus (a cut-down
version of newstest2011).
4. Translating the testing corpus and calculating the BLEU score. I got
a score of 7.42.
5. In a second test I used the default moses.ini file instead of the
tuned one (with the same filtered and binarized model) and got a score
of 8.22 on the same testing corpus!
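For reference, here is a sketch of how I ran steps 3 and 4 (paths are
illustrative for my setup, and the commands are guarded so the sketch is
a no-op on a machine without a Moses installation):

```shell
# Sketch of steps 3-4 from the baseline tutorial; all paths are
# hypothetical and should be adjusted to the local layout.
MOSES="$HOME/mosesdecoder"
if [ -x "$MOSES/bin/moses" ]; then
  # Step 3: filter the tuned model down to the test set.
  "$MOSES/scripts/training/filter-model-given-input.pl" \
    filtered-newstest2011 mert-work/moses.ini \
    "$HOME/corpus/newstest2011-small.en"
  # Step 4: translate the test set and score against the reference.
  "$MOSES/bin/moses" -f filtered-newstest2011/moses.ini \
    < "$HOME/corpus/newstest2011-small.en" > newstest2011.translated.de
  "$MOSES/scripts/generic/multi-bleu.perl" \
    "$HOME/corpus/newstest2011-small.de" < newstest2011.translated.de
fi
```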
Something is probably wrong with the tuned moses.ini file. To find out,
I translated the tuning corpus with both ini files and calculated the
scores:
Untuned: 7.01
Tuned: 6.70 (!)
Now this is really odd! Furthermore, the tuned moses.ini file contains
the line:
# BLEU 0.0755253 on dev /home/rh/Studium/aktuell/LSS/moses/mosesdecoder/corpus/dev-small.en
Why do I get a score of 6.7 instead of the reported 7.55? The files
dev-small.en and dev-small.de were my tuning corpora.
Do you have any idea, what I might have done wrong?
For the tuning step, I used:
cd ~/working
nohup nice ~/mosesdecoder/scripts/training/mert-moses.pl \
  ~/corpus/dev-small.en ~/corpus/dev-small.de \
  ~/mosesdecoder/bin/moses train/model/moses.ini \
  --mertdir ~/mosesdecoder/bin/ &> mert.out &
I attached mert.log and the tuned moses.ini file. Has anyone here built
a system for English to German who can comment on the trained weights in
moses.ini? Do they seem okay?
Thank you very much for your help!
Greetings,
Raphi
# MERT optimized configuration
# decoder /home/rh/Studium/aktuell/LSS/moses/mosesdecoder/bin/moses
# BLEU 0.0755253 on dev /home/rh/Studium/aktuell/LSS/moses/mosesdecoder/corpus/dev-small.en
# We were before running iteration 16
# finished Mi 18. Nov 03:42:26 CET 2015
### MOSES CONFIG FILE ###
#########################
# input factors
[input-factors]
0
# mapping steps
[mapping]
0 T 0
[distortion-limit]
6
# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/rh/Studium/aktuell/LSS/moses/mosesdecoder/working/train/model/phrase-table.gz input-factor=0 output-factor=0
LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/rh/Studium/aktuell/LSS/moses/mosesdecoder/working/train/model/reordering-table.wbe-msd-bidirectional-fe.gz
Distortion
KENLM lazyken=0 name=LM0 factor=0 path=/home/rh/Studium/aktuell/LSS/moses/mosesdecoder/languagemodel/news-commentary-v8.de-en.blm.de order=3
# dense weights for feature functions
[weight]
LexicalReordering0= 0.0249924 0.227188 0.0517476 0.19267 -0.0499753 0.0222798
Distortion0= 0.0758859
LM0= 0.0681075
WordPenalty0= -0.125367
PhrasePenalty0= -0.0569564
TranslationModel0= 0.00738901 0.0353195 0.0442114 0.0179101
UnknownWordPenalty0= 1
shard_size = 0 shard_count = 0
Seeding random numbers with system clock
name: case value: true
Data::m_score_type BLEU
Data::Scorer type from Scorer: BLEU
Loading Data from: run1.scores.dat and run1.features.dat
loading feature data from run1.features.dat
loading score data from run1.scores.dat
Loading Data from: run2.scores.dat and run2.features.dat
loading feature data from run2.features.dat
loading score data from run2.scores.dat
Loading Data from: run3.scores.dat and run3.features.dat
loading feature data from run3.features.dat
loading score data from run3.scores.dat
Loading Data from: run4.scores.dat and run4.features.dat
loading feature data from run4.features.dat
loading score data from run4.scores.dat
Loading Data from: run5.scores.dat and run5.features.dat
loading feature data from run5.features.dat
loading score data from run5.scores.dat
Loading Data from: run6.scores.dat and run6.features.dat
loading feature data from run6.features.dat
loading score data from run6.scores.dat
Loading Data from: run7.scores.dat and run7.features.dat
loading feature data from run7.features.dat
loading score data from run7.scores.dat
Loading Data from: run8.scores.dat and run8.features.dat
loading feature data from run8.features.dat
loading score data from run8.scores.dat
Loading Data from: run9.scores.dat and run9.features.dat
loading feature data from run9.features.dat
loading score data from run9.scores.dat
Loading Data from: run10.scores.dat and run10.features.dat
loading feature data from run10.features.dat
loading score data from run10.scores.dat
Loading Data from: run11.scores.dat and run11.features.dat
loading feature data from run11.features.dat
loading score data from run11.scores.dat
Loading Data from: run12.scores.dat and run12.features.dat
loading feature data from run12.features.dat
loading score data from run12.scores.dat
Loading Data from: run13.scores.dat and run13.features.dat
loading feature data from run13.features.dat
loading score data from run13.scores.dat
Loading Data from: run14.scores.dat and run14.features.dat
loading feature data from run14.features.dat
loading score data from run14.scores.dat
Loading Data from: run15.scores.dat and run15.features.dat
loading feature data from run15.features.dat
loading score data from run15.scores.dat
Loading Data from: run16.scores.dat and run16.features.dat
loading feature data from run16.features.dat
loading score data from run16.scores.dat
Data loaded : [Wall 9.19283 CPU 8.956] seconds.
Creating a pool of 1 threads
Best point: 0.0249924 0.227188 0.0517476 0.19267 -0.0499753 0.0222798 0.0758859 0.0681075 -0.125367 -0.0569564 0.00738901 0.0353195 0.0442114 0.0179101 => 0.0755253
Stopping... : [Wall 233.787 CPU 227.764] seconds.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support