Re: [Moses-support] BLEU score difference about 0.13 for one dataset is normal?

Tom Hoar Wed, 14 Oct 2015 10:19:20 -0700

Davood,

I don't know enough about your data and use cases to recommend one way or another. Running MERT multiple times will give you different BLEU scores, but I have never found the deltas to make a difference in a production environment.


Tom


On 10/14/2015 12:50 PM, Davood Mohammadifar wrote:

Thanks Michael for the paper and thanks Tom.
Based on the paper, one solution is to replicate MERT and testing at least three times.
My ideas have subtle effects on BLEU. Do you recommend that I run MERT and testing three times or more? Should I increase the number of sentences for tuning?
My dataset for Persian to English includes:
Training: about 240000 sentences
Tune: 1000 sentences
Test: 1000 sentences
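
As a rough sketch of the replication idea above, the scores from repeated tune-and-test runs can be summarized by their mean and sample standard deviation, so that a change smaller than the run-to-run spread is not mistaken for a real gain. The function name `summarize_bleu` and the list layout are assumptions for illustration and are not part of Moses; the two scores are the ones reported in this thread.

```python
import statistics


def summarize_bleu(scores):
    """Return the run count, mean, sample stdev, and max-min spread of BLEU scores."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variation")
    return {
        "runs": len(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation (n - 1)
        "spread": max(scores) - min(scores),
    }


# BLEU from the two identical tune-and-test runs reported in this thread.
# Scores from a third (or later) replication would be appended to this list.
baseline_runs = [21.95, 21.82]

summary = summarize_bleu(baseline_runs)
print(summary)
```

With only two or three runs the standard deviation is a coarse estimate, so it is best read as a lower bound on how large a difference between two systems needs to be before it deserves attention.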

------------------------------------------------------------------------
From: [email protected]
Date: Sun, 11 Oct 2015 12:53:37 +0700
To: [email protected]
Subject: Re: [Moses-support] BLEU score difference about 0.13 for one dataset is normal?
Yes. Each tuning with the same test set will give you small variations in the final BLEU. Yours looks like they're in a normal range.
Date: Sun, 11 Oct 2015 04:23:56 +0000
From: Davood Mohammadifar <[email protected]>
Subject: [Moses-support] BLEU score difference about 0.13 for one
dataset is normal?
To: Moses Support <[email protected]>

Hello everyone
I noticed different BLEU scores for the same dataset. The difference is not large, about 0.13.
I trained on my dataset and tuned on the development set for Persian-English translation. After testing, the score was 21.95. The second time I did the same process and obtained 21.82. (My tools were mgiza, mert, ...)
Is this difference normal?

My system:
CPU: Core i7-4790K
RAM: 16GB
OS: Ubuntu 12.04

Thanks

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
