Hi John,

My guess is that one of your files isn't tokenized. Does the language you're working with mark word boundaries?
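To make the tokenization question concrete, here is a minimal sketch of what wrapping non-ASCII characters in spaces does to the token count. It is written in Python rather than Perl and is not taken from mteval or from the bootstrap script. The function names and the two sample strings are made up for illustration, and the character class [^\x00-\x7F] is my assumption for an equivalent of Perl's [^[:ascii:]].

```python
import re

# Assumed equivalent of the Perl substitution s/([^[:ascii:]])/ $1 /g
NON_ASCII = re.compile(r"([^\x00-\x7F])")


def wrap_non_ascii(line):
    """Put a space on each side of every non-ASCII character."""
    return NON_ASCII.sub(r" \1 ", line)


def tokens(line, wrap):
    """Split on whitespace, optionally wrapping non-ASCII characters first."""
    text = wrap_non_ascii(line) if wrap else line
    return text.split()


# Hypothetical samples: one alphabetic word with a single non-ASCII letter,
# and one string from a script written without spaces between words.
alphabetic = "sõber"
unsegmented = "我喜欢猫"

print(tokens(alphabetic, wrap=False))   # ['sõber']
print(tokens(alphabetic, wrap=True))    # ['s', 'õ', 'ber']
print(tokens(unsegmented, wrap=False))  # ['我喜欢猫']
print(tokens(unsegmented, wrap=True))   # ['我', '喜', '欢', '猫']
```

In the first case the wrapping breaks an ordinary word into pieces; in the second it turns one unsegmented string into one token per character. Either way the n-grams that BLEU counts change, so a large difference between the scores with and without the wrapping is not surprising in itself.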
The -e option of mteval does just what you described: it encloses non-ASCII characters in spaces. The way I see it, if the alphabet of your language is mostly non-ASCII, then the option should not be used, since it would split regular words into single characters. If, on the other hand, non-ASCII characters are rare and not part of regular words, or if every symbol is a word by itself, then the -e option is appropriate.

You can also compare the scores against the multi-bleu.perl script from the Moses package; the scoring implementation in bootstrap-hypothesis-difference-significance.pl is identical to it.

If nothing helps, you can send me the hypothesis and reference files, and I'll repeat the process on my own machine and see what I can do.

Best,
Mark

On Mon, Nov 29, 2010 at 2:52 PM, John Morgan <[email protected]> wrote:
> Thanks Mark,
> My results are still off.
> My data is encoded in utf-8.
> Your script reports an actual BLEU score of 0.024447 for my hypothesis 1.
> The score reported by mteval using the -e option is 0.2459.
> The score reported by mteval without the -e option is 0.0268.
> I'm not sure which score is more accurate since I can't read the
> language, but the 2 scores are off by an order of magnitude.
> Is the -e option to mteval bogus?
>
> John
>
>
> On 11/29/10, Mark Fishel <[email protected]> wrote:
>> Hi John,
>>
>> Thanks for pointing out the issue; I added support for arbitrary
>> encodings to the script, by default it's set to UTF8 but you can
>> change the global variable on line 23 for other encodings; just update
>> the file from SVN.
>>
>> Treating non-ascii characters as separate tokens by wrapping them in
>> spaces should not be the right thing to do in the general case, as far
>> as I understand.
>>
>> Best,
>> Mark
>>
>> On Mon, Nov 29, 2010 at 12:34 AM, John Morgan
>> <[email protected]> wrote:
>>> Hi,
>>> I'd like to use the script
>>> bootstrap-hypothesis-difference-significance.pl
>>> to compare 2 systems that translate from English into languages that
>>> use non-ascii character encodings.
>>> I think this script is written for English hypothesis and reference files.
>>> I guess that an option similar to the -e option to mteval needs to be
>>> added to the script to make it work for non-ascii files.
>>> I added the following line to the script at line 240 after the "while"
>>> statement slurps in a line from the opened file:
>>> s/([^[:ascii:]])/ $1 /g
>>> It looks like this is all the -e option to mteval does.
>>> I have 2 questions:
>>> Is this the correct way to get the bootstrap script to work on
>>> non-ascii text files?
>>> If yes, can anyone explain to me why?
>>> Why do we need to wrap white space around nonascii characters?
>>>
>>> When I do this the BLEU scores look reasonable (but I could be fooling
>>> myself).
>>>
>>>
>>> --
>>> Regards,
>>> John J Morgan
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>
>
> --
> Regards,
> John J Morgan
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
