Re: [Moses-support] Another non-deterministic discussion

Ondrej Bojar Sun, 19 Aug 2012 07:05:55 -0700

Hi,

some years ago, I noticed a "non-determinism" related to the *order* of the
reference translation files supplied. With multiple references, the "closest"
reference length can be ambiguous: for a hypothesis of 10 words, references of
8 and 12 words are equally close. Which of the two got picked (and with it the
brevity penalty) then depended on the order of the reference translations...
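
To make the tie concrete, here is a minimal Python sketch. It is not the
actual mteval code; the function names are made up for illustration, and the
"first reference wins" rule is only an assumption about how such an order
dependence can arise. The brevity penalty is the standard BLEU one.

```python
import math


def closest_ref_len_first_wins(hyp_len, ref_lens):
    """Pick the closest reference length; on a tie, the earlier one wins.

    Hypothetical tie-breaking rule: the result depends on the order
    in which the references are supplied.
    """
    best = ref_lens[0]
    for r in ref_lens[1:]:
        if abs(r - hyp_len) < abs(best - hyp_len):
            best = r
    return best


def closest_ref_len_stable(hyp_len, ref_lens):
    """Pick the closest reference length; on a tie, prefer the shorter one.

    The result no longer depends on the order of the references.
    """
    return min(ref_lens, key=lambda r: (abs(r - hyp_len), r))


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty for a hypothesis/reference length pair."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    hyp_len = 10
    for refs in ([8, 12], [12, 8]):
        r_first = closest_ref_len_first_wins(hyp_len, refs)
        r_stable = closest_ref_len_stable(hyp_len, refs)
        print(refs,
              "first-wins:", r_first, brevity_penalty(hyp_len, r_first),
              "stable:", r_stable, brevity_penalty(hyp_len, r_stable))
```

With the references in the order 8, 12 the first variant picks 8 (no
penalty); in the order 12, 8 it picks 12 and the penalty is exp(-0.2), about
0.82. The second variant picks 8 either way.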


Cheers,
  O.

"Tom Hoar" <[email protected]> wrote:

>Hieu,
>
>I hadn't thought about floating point accuracy. Possible. I can't share
>this data set. I'll make another set that I can share and try to
>replicate the results.
>
>I have one other question about mteval-12.pl (or v11). Is there any way
>to turn off its internal tokenizer? I want to test my tokenization & not
>theirs. I didn't see an easy way other than editing the script.
>
>Tom
>
>On 2012-08-18 21:34, Hieu Hoang wrote:
>
>>> Here's what I didn't expect. I shuffled the order of the pairs in the
>>> evaluation set and ran mteval-12.pl again for each set. For each set,
>>> the same data shuffled in a different order and run through
>>> mteval-12.pl resulted in different cumulative BLEU scores. These scores
>>> varied from 0.8520 to 0.8627. Same data, different evaluation order.
>>
>> that is extremely odd. The only thing I can think of is a floating point
>> numerical precision problem. Or a bug in mteval-12.pl. Would it be
>> possible to send me the dataset you're using? I'll take a look at it
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> /usr/bin/perl -w /home/tahoar/domy-2.5/bin/train-model.perl
>   --do-steps 1
>   --alignment-file /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.aligned.7.f-e
>   --cores 6
>   --corpus /opt/domy/BUILDS/tm/lotto_1/bitext
>   --corpus-dir /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.classes
>   --e e
>   --external-bin-dir /home/tahoar/domy-2.5/bin
>   --extract-file /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.extract.f-e/ext.7-gram
>   --f f
>   --giza-e2f /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.e-f
>   --giza-f2e /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.f-e
>   --glue-grammar-file /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/glue-grammar
>   --lexical-file /opt/domy/TRAININGS/alignments/align-lotto_1-e-f/giza.lexicon.f-e/lex.7-gram
>   --lm 0:0:/tmp/placeholder.lm:0
>   --mgiza
>   --mgiza-cpus 6
>   --model-dir /opt/domy/ENGINES/tables/phrase-s=f-t=e-p=lotto_1-a=giza-g=7
>   --parts 4
>   --root-dir /opt/domy/ENGINES/tables
>   --temp-dir /tmp/domy/~train/filter,alt,1,0,train-tables-pro.tmp


-- 
Ondrej Bojar
http://www.cuni.cz/~obo
