Hello,
I followed the Moses baseline tutorial to train and tune a translation
model for English to German. After finishing, the system seemed to work
quite well at first, but then I noticed that the tuning step seems to
have actually made my system worse! I really don't know what I did
wrong; I stuck very closely to the tutorial. Here is what I did in detail:
1. Training the TM into working/train/model.
2. Tuning with a corpus that is a cut-down version of news-test2008. The
main result of this process is the set of weights in the new file
mert-work/moses.ini, right?
3. Filtering mert-work/moses.ini against a testing corpus (a cut-down
version of newstest2011).
4. Translating the testing corpus and calculating the BLEU score. I got
a score of 7.42.
5. In a second test I used the default moses.ini file instead of the
tuned one (with the same filtered and binarized model) and got a score
of 8.22 on the same testing corpus!
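For reference, here is a sketch of how I ran steps 3 and 4 (paths are
illustrative for my setup, and the commands are guarded so the sketch is
a no-op on a machine without a Moses installation):

```shell
# Sketch of steps 3-4 from the baseline tutorial; all paths are
# hypothetical and should be adjusted to the local layout.
MOSES="$HOME/mosesdecoder"
if [ -x "$MOSES/bin/moses" ]; then
  # Step 3: filter the tuned model down to the test set.
  "$MOSES/scripts/training/filter-model-given-input.pl" \
    filtered-newstest2011 mert-work/moses.ini \
    "$HOME/corpus/newstest2011-small.en"
  # Step 4: translate the test set and score against the reference.
  "$MOSES/bin/moses" -f filtered-newstest2011/moses.ini \
    < "$HOME/corpus/newstest2011-small.en" > newstest2011.translated.de
  "$MOSES/scripts/generic/multi-bleu.perl" \
    "$HOME/corpus/newstest2011-small.de" < newstest2011.translated.de
fi
```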
Something is probably wrong with the tuned moses.ini file. To find out,
I translated the tuning corpus with both ini files and calculated the
scores:
Untuned: 7.01
Tuned: 6.70 (!)
Now this is really odd! Furthermore, the tuned moses.ini file contains
the line:
# BLEU 0.0755253 on dev /home/rh/Studium/aktuell/LSS/moses/mosesdecoder/corpus/dev-small.en
Why do I get a score of 6.7 instead of the reported 7.55? The files
dev-small.en and dev-small.de were my tuning corpora.
Do you have any idea, what I might have done wrong?
For the tuning step, I used:
cd ~/working
nohup nice ~/mosesdecoder/scripts/training/mert-moses.pl \
  ~/corpus/dev-small.en ~/corpus/dev-small.de \
  ~/mosesdecoder/bin/moses train/model/moses.ini \
  --mertdir ~/mosesdecoder/bin/ &> mert.out &
I attached mert.log and the tuned moses.ini file. Has anyone here built
a system for English to German who can comment on the trained weights in
moses.ini? Do they seem okay?
Thank you very much for your help!
Greetings,
Raphi
# MERT optimized configuration
# decoder /home/rh/Studium/aktuell/LSS/moses/mosesdecoder/bin/moses
# BLEU 0.0755253 on dev /home/rh/Studium/aktuell/LSS/moses/mosesdecoder/corpus/dev-small.en
# We were before running iteration 16
# finished Mi 18. Nov 03:42:26 CET 2015
### MOSES CONFIG FILE ###
#########################
# input factors
[input-factors]
0
# mapping steps
[mapping]
0 T 0
[distortion-limit]
6
# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/rh/Studium/aktuell/LSS/moses/mosesdecoder/working/train/model/phrase-table.gz input-factor=0 output-factor=0
LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/rh/Studium/aktuell/LSS/moses/mosesdecoder/working/train/model/reordering-table.wbe-msd-bidirectional-fe.gz
Distortion
KENLM lazyken=0 name=LM0 factor=0 path=/home/rh/Studium/aktuell/LSS/moses/mosesdecoder/languagemodel/news-commentary-v8.de-en.blm.de order=3
# dense weights for feature functions
[weight]
LexicalReordering0= 0.0249924 0.227188 0.0517476 0.19267 -0.0499753 0.0222798
Distortion0= 0.0758859
LM0= 0.0681075
WordPenalty0= -0.125367
PhrasePenalty0= -0.0569564
TranslationModel0= 0.00738901 0.0353195 0.0442114 0.0179101
UnknownWordPenalty0= 1
shard_size = 0 shard_count = 0
Seeding random numbers with system clock
name: case value: true
Data::m_score_type BLEU
Data::Scorer type from Scorer: BLEU
Loading Data from: run1.scores.dat and run1.features.dat
loading feature data from run1.features.dat
loading score data from run1.scores.dat
Loading Data from: run2.scores.dat and run2.features.dat
loading feature data from run2.features.dat
loading score data from run2.scores.dat
Loading Data from: run3.scores.dat and run3.features.dat
loading feature data from run3.features.dat
loading score data from run3.scores.dat
Loading Data from: run4.scores.dat and run4.features.dat
loading feature data from run4.features.dat
loading score data from run4.scores.dat
Loading Data from: run5.scores.dat and run5.features.dat
loading feature data from run5.features.dat
loading score data from run5.scores.dat
Loading Data from: run6.scores.dat and run6.features.dat
loading feature data from run6.features.dat
loading score data from run6.scores.dat
Loading Data from: run7.scores.dat and run7.features.dat
loading feature data from run7.features.dat
loading score data from run7.scores.dat
Loading Data from: run8.scores.dat and run8.features.dat
loading feature data from run8.features.dat
loading score data from run8.scores.dat
Loading Data from: run9.scores.dat and run9.features.dat
loading feature data from run9.features.dat
loading score data from run9.scores.dat
Loading Data from: run10.scores.dat and run10.features.dat
loading feature data from run10.features.dat
loading score data from run10.scores.dat
Loading Data from: run11.scores.dat and run11.features.dat
loading feature data from run11.features.dat
loading score data from run11.scores.dat
Loading Data from: run12.scores.dat and run12.features.dat
loading feature data from run12.features.dat
loading score data from run12.scores.dat
Loading Data from: run13.scores.dat and run13.features.dat
loading feature data from run13.features.dat
loading score data from run13.scores.dat
Loading Data from: run14.scores.dat and run14.features.dat
loading feature data from run14.features.dat
loading score data from run14.scores.dat
Loading Data from: run15.scores.dat and run15.features.dat
loading feature data from run15.features.dat
loading score data from run15.scores.dat
Loading Data from: run16.scores.dat and run16.features.dat
loading feature data from run16.features.dat
loading score data from run16.scores.dat
Data loaded : [Wall 9.19283 CPU 8.956] seconds.
Creating a pool of 1 threads
Best point: 0.0249924 0.227188 0.0517476 0.19267 -0.0499753 0.0222798 0.0758859 0.0681075 -0.125367 -0.0569564 0.00738901 0.0353195 0.0442114 0.0179101 => 0.0755253
Stopping... : [Wall 233.787 CPU 227.764] seconds.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support