> Here's what I didn't expect. I shuffled the order of the pairs in the
> evaluation set and ran mteval-12.pl again for each set. For each set,
> the same data shuffled in a different order and run through
> mteval-12.pl resulted in different cumulative BLEU scores. These scores
> varied from 0.8520 to 0.8627. Same data, different evaluation order.

That is extremely odd. The only explanations I can think of are a
floating-point precision problem or a bug in mteval-12.pl. Would it be
possible to send me the dataset you're using? I'll take a look at it.
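
To illustrate what I mean by a precision problem, here is a minimal Python sketch. It is not taken from mteval-12.pl, and `naive_sum` is a hypothetical helper of my own. It only shows that floating-point addition is not associative, so accumulating the same numbers in a different order can give a different total. Whether that is what is happening in your runs is something I can't say without the data.

```python
import math


def naive_sum(values):
    """Accumulate left to right, the way a simple running total would."""
    total = 0.0
    for v in values:
        total += v
    return total


# Same four numbers, two different orders.
# (Illustrative values only, not BLEU statistics.)
order_a = [1e16, 1.0, -1e16, 1.0]
order_b = [1e16, -1e16, 1.0, 1.0]

sum_a = naive_sum(order_a)  # the first 1.0 is lost when added to 1e16
sum_b = naive_sum(order_b)  # the large terms cancel first, nothing is lost

print(sum_a, sum_b)

# math.fsum tracks the lost low-order bits, so its result
# does not depend on the order of the inputs here.
print(math.fsum(order_a), math.fsum(order_b))
```

If the scorer turns out to accumulate its statistics in floating point, comparing a naive running total against an order-independent sum like `math.fsum` on the same counts would be one way to check this.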


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
