Hi,

the NIST script tokenizes its input internally, while the multi-bleu
script assumes that the data is already tokenized. The two scripts also
compute the brevity penalty differently when there are multiple
reference translations.
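
To make both points concrete, here is a rough Python sketch (not taken
from either script). It uses the standard BLEU brevity penalty and
reproduces the BP=0.905 that multi-bleu reports in the quoted message
below. The "shortest" and "closest" modes are two common conventions
for picking the reference length when there are several references;
which script uses which is an assumption to check against the script
sources. naive_tokenize is a hypothetical stand-in for internal
tokenization, not the NIST tokenizer.

```python
import math
import re


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty: 1.0 if the hypothesis is at least as
    long as the effective reference length, else exp(1 - ref_len / hyp_len)."""
    if hyp_len == 0:
        return 0.0
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def effective_ref_len(hyp_len, ref_lens, mode):
    """Pick one reference length for a segment that has several references.

    The mode names are illustrative, not flags of either script.
    """
    if mode == "shortest":
        return min(ref_lens)
    if mode == "closest":
        # Closest to the hypothesis length; ties go to the shorter reference.
        return min(ref_lens, key=lambda r: (abs(r - hyp_len), r))
    raise ValueError(f"unknown mode: {mode}")


def naive_tokenize(text):
    """Hypothetical tokenizer: split punctuation off from words."""
    return re.findall(r"\w+|[^\w\s]", text)


# Single reference, lengths from the multi-bleu output quoted below.
print(round(brevity_penalty(281, 309), 3))  # 0.905

# Two references of length 8 and 12, hypothesis of length 11:
# the two conventions pick different lengths, hence different penalties.
for mode in ("shortest", "closest"):
    ref_len = effective_ref_len(11, [8, 12], mode)
    print(mode, ref_len, round(brevity_penalty(11, ref_len), 4))

# Whitespace splitting vs. splitting off punctuation gives different
# token counts, so the n-gram precisions change as well.
print("Hello, world.".split())
print(naive_tokenize("Hello, world."))
```

With a single reference the two length conventions coincide, so in
that case any remaining gap in the n-gram precisions would have to
come from tokenization.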

-phi

On Fri, Mar 19, 2010 at 7:49 PM, Adam Lopez <[email protected]> wrote:
> IIRC, the principal difference is the calculation of the brevity
> penalty, but there also seem to be some slight differences in
> tokenization between the scripts.
>
> On Fri, Mar 19, 2010 at 9:32 AM, Mark Fishel <[email protected]> wrote:
>> Dear list,
>>
>> I am getting different BLEU scores from the NIST mteval script
>> (version) and the multi-bleu.perl script within Moses's distribution
>> for the same reference and hypothesis translations -- even the
>> individual n-gram precisions are different:
>>
>> BLEU = 16.80, 53.0/26.2/13.4/6.4 (BP=0.905, ratio=0.909, hyp_len=281, ref_len=309)
>>
>> and
>>
>> BLEU score = 0.1681 for system "x"
>>
>> Individual N-gram scoring
>>        1-gram   2-gram   3-gram   4-gram   5-gram   6-gram   7-gram   8-gram   9-gram
>>        ------   ------   ------   ------   ------   ------   ------   ------   ------
>>  BLEU:  0.5246   0.2591   0.1326   0.0630   0.0328   0.0213   0.0133   0.0046   0.0000  "x"
>>
>> The files that produced the scores are here: mtj.ut.ee/diffbleu.tgz .
>>
>> Does everyone else get different scores? Can anyone suggest a reason
>> for that? It's not the smoothing of the NIST script, both support UTF8
>> i/o, etc; so I'm out of ideas, and before comparing the
>> implementations I wanted to ask for opinions.
>>
>> Thanks in advance,
>> Mark
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>
