Re: [Moses-support] BLEU evaluation at WMT - contractions

Marcin Junczys-Dowmunt Fri, 10 Oct 2014 10:01:56 -0700

Thanks for the quick answer.

I admire the stoicism :) I find it painful to see that contractions are not handled by the official script. You get two errors for not hitting "we" and "are" when you have "we're", which is actually the same (modulo style). Also, I guess the news domain has fewer issues with contractions; otherwise you might have heard more complaints. Unfortunately I have to provide results in WMT-style, so there is no way around that script. METEOR does it right, by the way.
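
To make the "two errors" point concrete, here is a minimal sketch of clipped unigram matching, the counting step that BLEU's unigram precision is built on. The tokenizer below is a hypothetical stand-in that only lowercases and splits on whitespace; it is not the NIST script's tokenizer, it merely shares the property discussed here of leaving "we're" as a single token. The example sentences and all function names are made up for illustration.

```python
from collections import Counter


def tokenize_keep_apostrophes(text):
    # Hypothetical stand-in tokenizer (an assumption, not the NIST rules):
    # lowercase and split on whitespace, so "we're" stays one token.
    return text.lower().split()


def clipped_unigram_matches(hypothesis_tokens, reference_tokens):
    # Count hypothesis tokens that also occur in the reference,
    # clipping each token's count at its count in the reference.
    hyp_counts = Counter(hypothesis_tokens)
    ref_counts = Counter(reference_tokens)
    return sum(min(count, ref_counts[token]) for token, count in hyp_counts.items())


reference = "we are going home"   # made-up reference sentence
hypothesis = "we're going home"   # same meaning, contracted form

ref_tokens = tokenize_keep_apostrophes(reference)
hyp_tokens = tokenize_keep_apostrophes(hypothesis)

matches = clipped_unigram_matches(hyp_tokens, ref_tokens)

# Reference tokens that no hypothesis token accounts for.
missed_reference_tokens = sorted((Counter(ref_tokens) - Counter(hyp_tokens)).elements())

print(matches, len(hyp_tokens))   # 2 of 3 hypothesis tokens match
print(missed_reference_tokens)    # ['are', 'we']
```

With this tokenization, "we're" matches nothing in the reference, and both "we" and "are" go unmatched. The bigrams around the contraction are lost as well, which is why the effect on the full score is larger than the unigram count alone suggests.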


On 10.10.2014 at 17:44, Philipp Koehn wrote:

Hi,

there are a lot of issues with tokenization.

The BLEU scores we report in WMT are using the standard NIST script,
which expects detokenized and properly cased output. The script does
its own internal tokenization; we just accept that.

Another way to compute BLEU scores is with multi-bleu.perl - which
completely accepts your tokenization.

-phi

On Fri, Oct 10, 2014 at 11:12 AM, Marcin Junczys-Dowmunt <[email protected]> wrote:


    Hi,

    slightly off-topic: I have a question concerning the evaluation
    practice during WMT. I have noticed that the standard NIST script
    mteval-v1.3a.pl (or any other versions)
    does not split on apostrophes for English contractions. How was
    this handled during the WMT? Did you use the official NIST scripts
    for BLEU calculation after detokenization? If yes, this would
    severely penalize the use of contractions over non-contracted
    forms (around 2-3% BLEU). Is this just generally accepted?

    Thanks,

    Marcin


    _______________________________________________
    Moses-support mailing list
    [email protected]
    http://mailman.mit.edu/mailman/listinfo/moses-support
