Hi,

I have a 2k-sentence tuning set, a 1k-sentence evaluation set and a
70k-sentence training corpus in the IT software domain. After tuning I
get a slightly lower BLEU score, but the reordering is much better, so
the subjective translation quality is better.

In this case I wouldn't recommend using BLEU as the metric, but rather
METEOR, which gives me a more accurate quality measurement:

http://www.cs.cmu.edu/~alavie/METEOR/examples.html
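
To illustrate why METEOR reacts to reordering, here is a rough Python
sketch of its fragmentation penalty. It is only an approximation under
some assumptions: exact word matches only (no stemming or synonyms), a
greedy left-to-right alignment, and the parameter values of the
originally published METEOR formulation (Fmean = 10PR / (R + 9P),
penalty = 0.5 * (chunks / matches)^3). The function name meteor_sketch
and the example sentences are made up for illustration; this is not the
actual METEOR tool.

```python
def meteor_sketch(hypothesis, reference):
    """Simplified METEOR-style score: exact matches, greedy alignment."""
    hyp = hypothesis.split()
    ref = reference.split()

    # Greedily align each hypothesis word to the first unused
    # identical reference word (assumption: the real tool searches
    # for the alignment with the fewest crossings).
    used = [False] * len(ref)
    alignment = []  # list of (hyp_index, ref_index) pairs
    for i, word in enumerate(hyp):
        for j, ref_word in enumerate(ref):
            if not used[j] and word == ref_word:
                used[j] = True
                alignment.append((i, j))
                break

    matches = len(alignment)
    if matches == 0:
        return 0.0

    precision = matches / len(hyp)
    recall = matches / len(ref)
    # Recall-weighted harmonic mean.
    fmean = 10 * precision * recall / (recall + 9 * precision)

    # A chunk is a run of matches that is contiguous and in the same
    # order in both sentences. Fewer chunks means better word order.
    chunks = 1
    for (i1, j1), (i2, j2) in zip(alignment, alignment[1:]):
        if not (i2 == i1 + 1 and j2 == j1 + 1):
            chunks += 1

    penalty = 0.5 * (chunks / matches) ** 3
    return fmean * (1 - penalty)


reference = "the file was saved to disk"
in_order = meteor_sketch("the file was saved to disk", reference)
reordered = meteor_sketch("to disk the file was saved", reference)
print(in_order, reordered)
```

Both hypotheses contain exactly the same words, so the unigram
precision and recall are identical; only the number of chunks differs,
and the badly ordered hypothesis gets the lower score.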


Regards,
Andreas
