Re: [Moses-support] baseline-system has very low BLEU-Score

Raphael Hoeps Wed, 18 Nov 2015 07:17:51 -0800

Hi,
I really tried to find my mistakes and checked every step, but without 
success. Here is what I have:


- As I want to translate from English into German, I trained a German 
language model. I think it works quite well, giving me good (low) 
perplexities for some German sentences I wrote as input.

- I trained the translation model (English to German) and tuned it over 
development corpora (500 sentences).

- Then I translated the test set and, as far as I can tell when I look at 
it, the translation is not that bad! A lot of short sentences in the 
test set are translated absolutely correctly (like "What will they do?" 
or "I really don't know."). Other sentences are rough but often still 
understandable.

- When calculating the BLEU-score of this test set and the translation 
of the test set with multi-bleu.perl I get the poor result of 3.76.

- When translating with the untuned system, the score is 3.56. The 
translated sentences seem pretty similar to the translation of the tuned 
system.
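
As a cross-check on the arithmetic, the sketch below recombines the 
components of the score line from the original message quoted further 
down (n-gram precisions 24.1/6.2/2.2/0.9, hyp_len=68049, ref_len=74753) 
using the standard BLEU definition: brevity penalty times the geometric 
mean of the n-gram precisions. The function name is made up for 
illustration and this is not the code of multi-bleu.perl.

```python
import math

def bleu_from_components(precisions_pct, hyp_len, ref_len):
    """Recombine n-gram precisions (in percent) and lengths into corpus BLEU."""
    # Brevity penalty: 1 if the hypothesis is at least as long as the
    # reference, otherwise exp(1 - ref_len / hyp_len).
    bp = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    # Geometric mean of the n-gram precisions with uniform weights.
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * bp * math.exp(log_mean), bp

# Values copied from the reported score line.
score, bp = bleu_from_components([24.1, 6.2, 2.2, 0.9], hyp_len=68049, ref_len=74753)
print(f"BLEU = {score:.2f}, BP = {bp:.3f}")
```

Up to rounding of the inputs, this reproduces the reported 3.76 and 
BP=0.906. So the low score comes from the n-gram precisions themselves 
(about a quarter of the output words match the reference, and almost no 
4-grams do), not from the length penalty.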

What really confuses me is that I get such a low score on a translated 
document that doesn't seem like nonsense at all when I read it. 
Can this be normal? If it helps, I can send you the English test set and 
the German translation.

Do you have any tips on how to find my mistake?
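
One way to test the possibility raised in the reply quoted below (that 
the system copies the source text, or that the wrong file was used as 
the reference) is to measure how many output tokens also occur in the 
aligned line of a second file. This is a minimal sketch under stated 
assumptions: the files are tokenized and line-aligned, and the function 
name and toy sentences are hypothetical, not part of Moses.

```python
def copy_rate(source_lines, output_lines):
    """Average share of output tokens that also occur in the aligned source line."""
    rates = []
    for src, out in zip(source_lines, output_lines):
        src_tokens = set(src.split())
        out_tokens = out.split()
        if out_tokens:  # skip empty output lines
            hits = sum(1 for tok in out_tokens if tok in src_tokens)
            rates.append(hits / len(out_tokens))
    return sum(rates) / len(rates) if rates else 0.0

# Toy data standing in for the real test set and system output.
english = ["what will they do ?", "i really do n't know ."]
copied = list(english)
translated = ["was werden sie tun ?", "ich weiß es wirklich nicht ."]

print(copy_rate(english, copied))      # every output token is in the source
print(copy_rate(english, translated))  # only punctuation overlaps
```

A rate close to 1 between the source file and the system output would 
point to source text being copied through. Running the same comparison 
between the reference file and the source file would show whether the 
reference passed to multi-bleu.perl is in the wrong language.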

Thanks a lot,
Raphael






On 18.11.2015 at 14:56, Rico Sennrich wrote:
> Hello Raphael,
>
> I suggest that you check if you mixed up the languages somewhere, and
> check if your translation output is actually English.
>
> 3.76 BLEU is possible to achieve without translation (because names and
> some function words are the same between English and German), and it's
> possible that you used the wrong reference file when measuring BLEU, or
> that your SMT system is broken in some way and just copies the source
> text to the output.
>
> best wishes,
> Rico
>
>
> On 18.11.2015 13:36, Raphael Hoeps wrote:
>> Hi,
>> I'm a Computer Science student from Germany working on an SMT project. I
>> tried to get into the Moses system a little bit and did the
>> baseline tutorial found here:
>> http://www.statmt.org/moses/?n=Moses.Baseline. I stuck to this
>> tutorial but used the German/English corpora.
>>
>> Unfortunately in the end I got a poor BLEU-score of only 3.76:
>> BLEU = 3.76, 24.1/6.2/2.2/0.9 (BP=0.906, ratio=0.910, hyp_len=68049,
>> ref_len=74753)
>> In the tutorial, a score of 23.5 was obtained.
>>
>> I think I did everything as shown in the tutorial, except for one thing:
>> in the tuning part I cut down the two development corpora to 500 lines
>> (from 2000), because my laptop is quite old and I wanted to speed up the
>> process a little bit. (It still took my laptop 6 hours.)
>> Do you think that this is the reason for my poor score?
>> Is it maybe because I used the German/English corpora, so the score
>> can't be compared to the English/French system in the tutorial?
>> Or did I just make a mistake when typing all the commands? Any ideas how
>> to find this mistake?
>>
>> Thank you very much for your help,
>> Raphael
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
