Thanks Mark,
My results are still off.
My data is encoded in UTF-8.
Your script reports an actual BLEU score of 0.024447 for my hypothesis 1.
The score reported by mteval with the -e option is 0.2459.
The score reported by mteval without the -e option is 0.0268.
So your script roughly agrees with mteval without -e, while the two
mteval scores differ by about an order of magnitude.
I'm not sure which score is more accurate since I can't read the
language.
Is the -e option to mteval bogus?
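
To make the question concrete, here is a minimal Python sketch of what I understand the substitution s/([^[:ascii:]])/ $1 /g to do, and how it can change n-gram matching. It assumes the text has already been decoded to Unicode characters (not raw bytes). The function names and the sample strings are made up for illustration, and the clipped unigram precision below is only one ingredient of BLEU, not the full metric as computed by mteval or the bootstrap script.

```python
import re
from collections import Counter

# Python analogue of the Perl substitution s/([^[:ascii:]])/ $1 /g,
# assuming the input is a decoded Unicode string (an assumption, not
# something taken from mteval or the bootstrap script).
NON_ASCII = re.compile(r"([^\x00-\x7f])")


def wrap_non_ascii(line):
    """Surround every non-ASCII character with spaces."""
    return NON_ASCII.sub(r" \1 ", line)


def tokens(line, split_non_ascii):
    """Whitespace-tokenize, optionally after wrapping non-ASCII characters."""
    if split_non_ascii:
        line = wrap_non_ascii(line)
    return line.split()


def unigram_precision(hyp_tokens, ref_tokens):
    """Clipped unigram precision (one component of BLEU, not BLEU itself)."""
    hyp_counts = Counter(hyp_tokens)
    ref_counts = Counter(ref_tokens)
    matched = sum(min(n, ref_counts[tok]) for tok, n in hyp_counts.items())
    return matched / len(hyp_tokens)


# Hypothetical sample sentences, chosen only to show the effect.
hyp = "привет мир"
ref = "привет мирам"

for split in (False, True):
    h, r = tokens(hyp, split), tokens(ref, split)
    print(split, len(h), unigram_precision(h, r))
```

On this toy pair the word-level precision is 0.5, while with every non-ASCII character as its own token it is 1.0, because a partially wrong word still shares most of its characters with the reference. If that is what -e does to my data, it might explain the higher score, but I have not verified this against mteval itself.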

John


On 11/29/10, Mark Fishel <[email protected]> wrote:
> Hi John,
>
> Thanks for pointing out the issue. I added support for arbitrary
> encodings to the script. By default it is set to UTF8, but you can
> change the global variable on line 23 for other encodings. Just update
> the file from SVN.
>
> As far as I understand, treating non-ascii characters as separate
> tokens by wrapping them in spaces is probably not the right thing to
> do in the general case.
>
> Best,
> Mark
>
> On Mon, Nov 29, 2010 at 12:34 AM, John Morgan
> <[email protected]> wrote:
>> Hi,
>> I'd like to use the script
>> bootstrap-hypothesis-difference-significance.pl
>> to compare 2 systems that translate from English into languages that
>> use non-ascii character encodings.
>> I think this script is written for English hypothesis and reference files.
>> I guess that an option similar to the -e option to mteval needs to be
>> added to the script to make it work for non-ascii files.
>> I added the following line to the script at line 240 after the "while"
>> statement slurps in a line from the opened file:
>> s/([^[:ascii:]])/ $1 /g
>> It looks like this is all the -e option to mteval does.
>> I have 2 questions:
>> Is this the correct way to get the bootstrap script to work on
>> non-ascii text files?
>> If yes, can anyone explain to me why?
>> Why do we need to wrap white space around non-ascii characters?
>>
>> When I do this the BLEU scores look reasonable (but I could be fooling
>> myself).
>>
>>
>> --
>> Regards,
>> John J Morgan
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>


-- 
Regards,
John J Morgan
