I'm translating English to Indonesian and vice versa with Moses. I've noticed that when I run the pipeline on different machines, or even repeatedly on the same machine, the results can differ, especially after tuning.
So far I've found three places that cause the results to differ (see the P.S. below for roughly how I invoke everything):

1. mert-modified.pl: I just need to activate predictable-seed.
2. mkcls: I just set the seed for each run.
3. mgiza: here even the first iteration already differs between runs.

In one run:

    Model1: Iteration 1
    Model1: (1) TRAIN CROSS-ENTROPY 15.8786 PERPLEXITY 60246.2
    Model1: (1) VITERBI TRAIN CROSS-ENTROPY 20.5269 PERPLEXITY 1.51077e+06
    Model 1 Iteration: 1 took: 1 seconds

In a second run:

    Model1: Iteration 1
    Model1: (1) TRAIN CROSS-ENTROPY 15.928 PERPLEXITY 62347.7
    Model1: (1) VITERBI TRAIN CROSS-ENTROPY 20.5727 PERPLEXITY 1.55952e+06
    Model 1 Iteration: 1 took: 1 seconds

I have no idea where the randomization happens in MGIZA, even after looking at the code, which is hard to understand.

So my questions are:

1. How can I make the MGIZA cross-entropy results identical across runs? I think randomization happens somewhere, but I can't find it.

2. I read in some threads that we should run multiple times and report the averaged result. But how can I find the best combination of training and tuning parameters if every run gives a different result? For example, how do I decide which alignment heuristic and which reordering model work best?

3. Is it possible that tuning makes the results worse? My corpus is around 500,000 words and I use 100 sentences for tuning. Can the tuning sentences also be used for training, or are they supposed to be kept separate? I used 100 sentences that are not in the training set. My untuned NIST and BLEU scores are around 6.5 and 0.21, while the tuned results are around 6.1 and 0.19. Aren't these results a bit too low? I'm not sure how to improve them.

Sorry for packing multiple questions into one post; I could split them up, but I don't want to spam the mailing list.

Thanks. Any help will be appreciated.

Best regards,
Jelita
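
P.S. In case the exact commands matter, this is roughly how I run training, tuning, and evaluation. The directory layout, corpus and LM names, and the single-CPU MGIZA setting below are placeholders and guesses rather than my exact setup, and the tuning step is written with the stock mert-moses.pl options (my mert-modified.pl is just a modified copy of it):

    # Training. Paths and corpus names are placeholders. Limiting MGIZA to one
    # CPU is only a guess at removing thread-related randomness; I don't know
    # whether that is the right knob.
    train-model.perl \
        -root-dir work -corpus corpus/train.clean -f en -e id \
        -alignment grow-diag-final-and -reordering msd-bidirectional-fe \
        -lm 0:3:$PWD/lm/news.id.arpa.gz:0 \
        -external-bin-dir $HOME/mgiza/bin \
        -mgiza -mgiza-cpus 1
    # (mkcls is called from inside train-model.perl; fixing its seed is done
    # separately and is not shown here.)

    # Tuning on the 100 held-out sentences, with predictable seeds so that
    # repeated MERT runs start from the same random restarts.
    mert-moses.pl tune/tune.en tune/tune.id \
        $HOME/moses/bin/moses work/train/model/moses.ini \
        --mertdir $HOME/moses/bin \
        --working-dir work/mert \
        --predictable-seeds

    # Evaluation on a separate held-out test set (not the tuning sentences).
    moses -f work/mert/moses.ini < test/test.en > test/test.out
    multi-bleu.perl test/test.id < test/test.out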
