Hi, I think you are right - the first set of numbers is the n-gram precision for each order of n-gram. The second set is what you get if you take the geometric mean of the n-gram precisions up to that order. Hence, the number under 4-gram is the usual BLEU score.
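To make that concrete, here is a minimal Python sketch (not the mteval-v13a.pl code itself; the function name and the example precisions are made up) of how the cumulative column follows from the individual n-gram precisions - the score at order n is the brevity penalty times the geometric mean of precisions 1..n:

```python
import math

def cumulative_bleu(precisions, brevity_penalty=1.0):
    """Cumulative BLEU at each order 1..len(precisions).

    precisions: modified n-gram precisions p_1, p_2, ...
    brevity_penalty: exp of the log penalty (1.0 when penalty (log) is 0).
    """
    scores = []
    log_sum = 0.0
    for n, p in enumerate(precisions, start=1):
        log_sum += math.log(p)  # accumulate log precisions
        # geometric mean of p_1..p_n, scaled by the brevity penalty
        scores.append(brevity_penalty * math.exp(log_sum / n))
    return scores

# With made-up precisions, the 4th entry is the usual 4-gram BLEU.
scores = cumulative_bleu([0.5, 0.3, 0.2, 0.1])
```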
The BLEU score is traditionally computed for 1-4 grams; the original BLEU paper discusses this. There was the expectation that if machine translation gets better, we should use higher-order BLEU, but we never did.

-phi

On Wed, Oct 26, 2016 at 12:44 AM, Nat Gillin <nat.gil...@gmail.com> wrote:

> Dear Moses community,
>
> Ah, I found out what the cumulative means. The cumulative scores are the
> usual BLEU scores that we report, because each one includes the orders of
> n-grams before the desired order.
>
> The only odd numbers from mteval-v13a.pl are the individual BLEU scores.
> Is it right that the individual BLEU scores are the bp * weights *
> modified_precision for each order of n-gram? Are there corresponding
> papers that investigate these numbers?
>
> Regards,
> Nat
>
> On Tue, Oct 25, 2016 at 12:02 PM, Nat Gillin <nat.gil...@gmail.com> wrote:
>
>> Dear Moses community,
>>
>> To make the question clearer:
>>
>> Why does the cumulative score add the brevity penalty before taking the
>> exponent at every order of n-gram, while the individual score only takes
>> the brevity penalty into account at
>> https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/mteval-v13a.pl#L874
>>
>> Any pointers to the papers describing the cumulative score would be
>> nice =)
>>
>> Thanks in advance again,
>> Nat
>>
>> On Tue, Oct 25, 2016 at 11:58 AM, Nat Gillin <nat.gil...@gmail.com> wrote:
>>
>>> Dear Moses Community,
>>>
>>> When using mteval-v13a.pl, we note that the output looks like this:
>>>
>>> length ratio: 1.07303974221267 (1998/1862), penalty (log): 0
>>> NIST score = 5.0564  BLEU score = 0.2318 for system "Google"
>>>
>>> # ------------------------------------------------------------------------
>>>
>>> Individual N-gram scoring
>>>         1-gram  2-gram  3-gram  4-gram  5-gram  6-gram  7-gram  8-gram  9-gram
>>>         ------  ------  ------  ------  ------  ------  ------  ------  ------
>>> NIST:   4.4488  0.5554  0.0477  0.0045  0.0000  0.0000  0.0000  0.0000  0.0000  "Google"
>>> BLEU:   0.5415  0.2972  0.1752  0.1025  0.0626  0.0354  0.0193  0.0085  0.0017  "Google"
>>>
>>> # ------------------------------------------------------------------------
>>>
>>> Cumulative N-gram scoring
>>>         1-gram  2-gram  3-gram  4-gram  5-gram  6-gram  7-gram  8-gram  9-gram
>>>         ------  ------  ------  ------  ------  ------  ------  ------  ------
>>> NIST:   4.4488  5.0043  5.0520  5.0564  5.0564  5.0564  5.0564  5.0564  5.0564  "Google"
>>> BLEU:   0.5415  0.4012  0.3044  0.2318  0.1784  0.1362  0.1031  0.0754  0.0493  "Google"
>>>
>>> And at
>>> https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/mteval-v13a.pl#L823,
>>> it calculates the cumulative score by accumulating the individual n-gram
>>> precisions, at each order adding to the running sum and normalizing it,
>>> before computing the cumulative score for that order of n-gram.
>>>
>>> The question is: why does it add the brevity penalty (i.e. $len_score)?
>>>
>>> Also, is this score discussed in any paper?
>>>
>>> Thanks in advance for the clarifications!
>>>
>>> Regards,
>>> Nat
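For what it's worth, the distinction being asked about can be sketched in a few lines of Python (hypothetical function names, not the mteval-v13a.pl code; log_bp stands for the log brevity penalty, which is 0 in the output above so the penalty factor is 1):

```python
import math

def individual_score(p_n, log_bp=0.0):
    """Individual n-gram score: brevity penalty applied to one precision."""
    return math.exp(log_bp + math.log(p_n))

def cumulative_score(precisions, log_bp=0.0):
    """Cumulative score at order n = len(precisions).

    The log brevity penalty is added to the *averaged* sum of log
    precisions before exponentiating, so the penalty is applied once to
    the geometric mean rather than once per order.
    """
    n = len(precisions)
    return math.exp(log_bp + sum(math.log(p) for p in precisions) / n)
```

With log_bp = 0 both reduce to the raw precision and the plain geometric mean, which matches the example output where the 1-gram individual and cumulative scores coincide.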
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support