Tuning always used to give a better BLEU score, by about 8 points or more. Now, suddenly, the tuning step (MERT) gives a degraded BLEU score from iteration 1 through the last one!

Marwa N. Refaie
To: [email protected]; [email protected]
From: [email protected]
Date: Wed, 6 Jan 2016 17:33:22 +0000
Subject: Re: [Moses-support] Baseline: Problem with tuned weights

Are you sure you got the source and target side of the tuning set the right way round?

And the tuning set must be tokenized and lowercased/truecased EXACTLY the same way that you tokenized and lowercased/truecased the training data.

On 04/01/16 17:16, Raphael Höps wrote:

Hello,

I did the moses-baseline tutorial to train and tune a translation model for English to German. After finishing the system it seemed to work quite well at first, but then I noticed that the tuning step seemed to have actually made my system worse! I really don't know what I did wrong. I stuck very close to the tutorial. Here is what I did in detail:

1. Training the TM to working/train/model.
2. Tuning with a corpus that is a cut-down version of news-test2008. The main result of this process is the weights in the new file mert-work/moses.ini, right?
3. Filtering of mert-work/moses.ini to a testing corpus (cut-down version of newstest2011).
4. Translating the testing corpus and calculating the BLEU score. I got a score of 7.42.
5. In a second test I used the default moses.ini file instead of the tuned one (and the same filtered and binarized model) and got a score of 8.22 on the same testing corpus!

Something is probably wrong with the tuned moses.ini file. To find out, I translated the corpus that was used for tuning with both ini files and calculated the scores:

Untuned: 7.01
Tuned: 6.70 (!)

Now this is really odd! Furthermore, in the tuned moses.ini file there is the line:

    # BLEU 0.0755253 on dev /home/rh/Studium/aktuell/LSS/moses/mosesdecoder/corpus/dev-small.en

Why do I get a score of 6.7 instead? The files dev-small.en and dev-small.de were my tuning corpora. Do you have any idea what I might have done wrong?
For the tuning step, I used:

    cd ~/working
    nohup nice ~/mosesdecoder/scripts/training/mert-moses.pl \
      ~/corpus/dev-small.en ~/corpus/dev-small.de \
      ~/mosesdecoder/bin/moses train/model/moses.ini --mertdir ~/mosesdecoder/bin/ \
      &> mert.out &

I have attached mert.log and the tuned moses.ini file. Has anyone ever built a system for English to German who can say something about the trained weights in moses.ini? Do they seem okay?

Thank you very much for your help!

Greetings,
Raphi

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

--
Hieu Hoang
http://www.hoang.co.uk/hieu
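Hieu's two suggestions can be partly checked mechanically before re-running MERT. The following is a minimal shell sketch and not part of Moses: the function names are made up for illustration, the casing test only looks at ASCII letters (so it ignores characters such as "Ä"), and it cannot tell whether source and target are swapped or whether the tokenization matches.

```shell
#!/usr/bin/env bash
# Rough pre-tuning sanity checks for a parallel tuning set.
# All helper names here are illustrative; none of them ship with Moses.

# Print the number of lines in a file.
line_count() {
  awk 'END { print NR }' "$1"
}

# Succeed only if both files have the same number of lines,
# as a sentence-aligned source/target pair must.
same_line_count() {
  [ "$(line_count "$1")" -eq "$(line_count "$2")" ]
}

# Succeed if the file contains at least one ASCII uppercase letter.
# LC_ALL=C keeps the [A-Z] range from matching lowercase in some locales.
has_uppercase() {
  LC_ALL=C grep -q '[A-Z]' "$1"
}

# Succeed if both files are cased the same way: either both contain
# uppercase letters (cased or truecased) or neither does (lowercased).
same_casing_style() {
  local a=1 b=1
  has_uppercase "$1" && a=0
  has_uppercase "$2" && b=0
  [ "$a" -eq "$b" ]
}
```

For the tuning set in this thread, `same_line_count ~/corpus/dev-small.en ~/corpus/dev-small.de` should succeed. `same_casing_style` is meant to be run on a training file and the tuning file for the same language; the thread does not give the training corpus path, so that argument has to be supplied by the reader.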
