Re: [Moses-support] sacréBLEU

Matt Post Sun, 12 Nov 2017 08:33:06 -0800

Hi, yes, I could add this easily. There are currently "wmt16/B" and "wmt17/B" 
test sets that include just the second reference. Do you anticipate using 
*just* the second reference? If so, I can create new test sets "wmt16/2" and 
"wmt17/2" that use both references. If you don't care about using just the 
second reference, I will repurpose the "*/B" sets to use both.


matt
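
For readers wondering what "use both references" changes in the score: in the standard BLEU definition, each hypothesis n-gram count is clipped to the largest count seen in any single reference, so a second reference can only add matches. The snippet below is a minimal sketch of that clipping step, not sacreBLEU's own code. The function names and the toy sentences are made up for illustration, and the tokens are split on whitespace only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(hyp, refs, n):
    """Return (matched, total) n-gram counts for one hypothesis.

    Each hypothesis n-gram is credited at most as often as it occurs
    in the single reference that contains it most often.
    """
    hyp_counts = ngrams(hyp, n)
    max_ref = Counter()
    for ref in refs:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return matched, total


# Toy data (hypothetical, not taken from any WMT test set).
hyp = "the cat sat on the mat".split()
ref1 = "the cat is on the mat".split()
ref2 = "a cat sat on a mat".split()

for label, refs in [("ref1 only", [ref1]), ("ref2 only", [ref2]), ("both", [ref1, ref2])]:
    print(label, clipped_matches(hyp, refs, 1), clipped_matches(hyp, refs, 2))
```

With these toy sentences, neither reference alone covers every hypothesis unigram, but the two together do, which is why scores against one reference and against both are not comparable.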


> On Nov 12, 2017, at 10:47 AM, Jörg Tiedemann <[email protected]> wrote:
> 
> 
> This is nice! Could your tool even support an option that makes use of the 
> multi-reference test sets that are available for English-Finnish in 2016 and 
> 2017? They would finally be used for something if there were a simple 
> option to download and use those sets for standard evaluation. 
> Thanks!
> 
> Jörg
> 
> **********************************************************************************************
> Jörg Tiedemann
> Department of Modern Languages        http://blogs.helsinki.fi/tiedeman/
> University of Helsinki                http://blogs.helsinki.fi/language-technology/
> 
> 
> 
>> On 11 Nov 2017, at 12:37, Matt Post <[email protected]> wrote:
>> 
>> Hi,
>>  
>> I’ve written a BLEU scoring tool called “sacreBLEU” that may be of use to 
>> people here. The goal is to get people to start reporting WMT-matrix 
>> compatible scores in their papers (i.e., scoring on detokenized outputs with 
>> a fixed reference tokenization) so that numbers can be compared directly, in 
>> the spirit of Rico Sennrich's multi-bleu-detok.pl. The nice part for you is 
>> that it auto-downloads WMT datasets and makes it so you no longer have to 
>> deal with SGML. You can install it via pip:
>>  
>>     pip3 install sacrebleu
>>  
>> For starters, you can use it to easily download datasets:
>>  
>>     sacrebleu -t wmt17 -l en-de --echo src > wmt17.en-de.en
>>     sacrebleu -t wmt17 -l en-de --echo ref > wmt17.en-de.de
>>  
>> You don’t need to download the reference, though. You can just score against 
>> it using sacreBLEU directly. After decoding and detokenizing, try:
>>  
>>     cat output.detok.txt | sacrebleu -t wmt17 -l en-de
>>  
>> I have tested it, and it produces exactly the same numbers as Moses' 
>> mteval-v13a.pl, the official scoring script for WMT, for all 153 WMT17 
>> system submissions (column BLEU-cased at matrix.statmt.org). For example:
>>  
>>     $ cat newstest2017.uedin-nmt.4722.en-de | sacrebleu -t wmt17 -l en-de
>>     BLEU+case.mixed+lang.en-de+numrefs.1+smooth.exp+test.wmt17+tok.13a+version.1.1.4 = 28.30 59.9/34.0/21.8/14.4 (BP = 1.000 ratio = 1.026 hyp_len = 62873 ref_len = 61287)
>> 
>> This means numbers computed with it are directly comparable across papers. 
>> As you can see, in addition to the score, it outputs a version string that 
>> records the exact BLEU parameters used. The output string is compatible with 
>> the output of multi-bleu.pl, so your old code for parsing the BLEU score out 
>> of multi-bleu.pl should still work.
>>  
>> You can also use the tool in a backward-compatible mode with arbitrary 
>> references:
>>  
>>     cat output.detok.txt | sacrebleu ref1 [ref2 …]
>>  
>> The official code is in sockeye (Amazon’s NMT system):
>> 
>>     github.com/awslabs/sockeye/tree/master/contrib/sacrebleu
>> 
>> I will also likely maintain a clone here:
>>  
>>     github.com/mjpost/sacreBLEU
>>  
>> matt
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
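
The score line quoted in the example above has a regular shape: a signature of "+"-joined key.value pairs, the BLEU score, the four n-gram precisions, and the brevity-penalty details in parentheses. The following is a rough sketch of how one might pull those fields apart and sanity-check the arithmetic. It assumes the exact 1.1.4 layout shown in this thread (other versions may print a different format), and the names `parse_score_line` and `recompute_bleu` are made up for illustration.

```python
import math
import re

# Score line copied from the example in this thread (version 1.1.4 format).
LINE = (
    "BLEU+case.mixed+lang.en-de+numrefs.1+smooth.exp+test.wmt17+tok.13a"
    "+version.1.1.4 = 28.30 59.9/34.0/21.8/14.4 "
    "(BP = 1.000 ratio = 1.026 hyp_len = 62873 ref_len = 61287)"
)

PATTERN = re.compile(
    r"^(?P<signature>\S+) = (?P<bleu>[\d.]+) "
    r"(?P<precisions>[\d.]+(?:/[\d.]+)*) "
    r"\(BP = (?P<bp>[\d.]+) ratio = (?P<ratio>[\d.]+) "
    r"hyp_len = (?P<hyp_len>\d+) ref_len = (?P<ref_len>\d+)\)$"
)


def parse_score_line(line):
    """Split a score line of the layout above into a dict of typed fields."""
    match = PATTERN.match(line.strip())
    if match is None:
        raise ValueError("line does not match the assumed score format")
    # The first "+"-separated token is the metric name; the rest are key.value.
    metric, *pairs = match.group("signature").split("+")
    return {
        "metric": metric,
        "params": dict(pair.split(".", 1) for pair in pairs),
        "bleu": float(match.group("bleu")),
        "precisions": [float(p) for p in match.group("precisions").split("/")],
        "bp": float(match.group("bp")),
        "ratio": float(match.group("ratio")),
        "hyp_len": int(match.group("hyp_len")),
        "ref_len": int(match.group("ref_len")),
    }


def recompute_bleu(precisions, bp):
    """BLEU as brevity penalty times the geometric mean of the precisions."""
    mean_log = sum(math.log(p / 100.0) for p in precisions) / len(precisions)
    return 100.0 * bp * math.exp(mean_log)


parsed = parse_score_line(LINE)
print(parsed["params"])
print(parsed["bleu"], round(recompute_bleu(parsed["precisions"], parsed["bp"]), 2))
```

The recomputed value differs slightly from the printed 28.30 because the precisions in the score line are already rounded to one decimal place. The printed ratio is simply hyp_len divided by ref_len.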

