Hi,

some years ago I noticed a "non-determinism" related to the *order* in which the reference translation files are supplied. With multiple references, the "closest" reference length can be ambiguous: for a hypothesis of 10 words, references of 8 and 12 words are equally close. In that case, the order of the reference translations mattered...
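To make the tie concrete, here is a minimal Python sketch. It is not the code from mteval-12.pl; the function names are hypothetical, and I am assuming a "first closest reference wins" rule together with the standard BLEU brevity penalty, exp(1 - r/c) when the hypothesis length c is shorter than the reference length r.

```python
import math


def closest_ref_length(hyp_len, ref_lens):
    """Pick the reference length closest to the hypothesis length.

    Assumption: ties are not broken explicitly. Python's min() returns the
    first minimal element, so the earlier reference wins a tie.
    """
    return min(ref_lens, key=lambda r: abs(r - hyp_len))


def closest_ref_length_stable(hyp_len, ref_lens):
    """Same choice, but ties go to the shorter reference.

    This makes the result independent of the order of the references.
    """
    return min(ref_lens, key=lambda r: (abs(r - hyp_len), r))


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty for one hypothesis/reference length pair."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    hyp_len = 10
    for ref_lens in ([8, 12], [12, 8]):
        r = closest_ref_length(hyp_len, ref_lens)
        s = closest_ref_length_stable(hyp_len, ref_lens)
        print(ref_lens, "order-dependent:", r, brevity_penalty(hyp_len, r),
              "stable:", s, brevity_penalty(hyp_len, s))
```

Under these assumptions, supplying the references as 8, 12 gives no penalty, while 12, 8 gives a penalty of exp(-0.2), so the same data yields two different scores. Breaking ties explicitly, here in favour of the shorter reference, removes the dependence on order.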
Cheers, O.

"Tom Hoar" <[email protected]> wrote:
>Hieu,
>
>I hadn't thought about floating point accuracy. Possible. I
>can't share this data set. I'll make another set that I can share and
>try to replicate the results.
>
>I have one other question about
>mteval-12.pl (or v11). Is there any way to turn off its internal
>tokenizer? I want to test my tokenization & not theirs. I didn't see an
>easy a way other than editing the script.
>
>Tom
>
>On 2012-08-18 21:34, Hieu Hoang wrote:
>
>>> Here's what I didn't expect. I shuffled the order of the pairs in the
>>> evaluation set and ran mteval-12.pl again for each set. For each set,
>>> the same data shuffled in a different order and run through
>>> mteval-12.pl resulted in different cumulative BLEU scores. These scores
>>> varied from 0.8520 to 0.8627. Same data, different evaluation order.
>>
>> that is extremely odd. The only thing I can think of is a floating point
>> numerical precision problem. Or a bug in mteval-12.pl. Would it be
>> possible to send me the dataset you're using? I'll take a look at it
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>/usr/bin/perl -w /home/tahoar/domy-2.5/bin/train-model.perl
> --do-steps 1
> --alignment-file /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.aligned.7.f-e
> --cores 6
> --corpus /opt/domy/BUILDS/tm/lotto_1/bitext
> --corpus-dir /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.classes
> --e e
> --external-bin-dir /home/tahoar/domy-2.5/bin
> --extract-file /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.extract.f-e/ext.7-gram
> --f f
> --giza-e2f /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.e-f
> --giza-f2e /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.f-e
> --glue-grammar-file /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/glue-grammar
> --lexical-file /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.lexicon.f-e/lex.7-gram
> --lm 0:0:/tmp/placeholder.lm:0
> --mgiza
> --mgiza-cpus 6
> --model-dir /opt/domy/ENGINES/tables/phrase-s=f-t=e-p=lotto_1-a=giza-g=7
> --parts 4
> --root-dir /opt/domy/ENGINES/tables
> --temp-dir /tmp/domy/~train/filter,alt,1,0,train-tables-pro.tmp

--
Ondrej Bojar
http://www.cuni.cz/~obo
