Hi,

I really tried to find my mistake and checked every step, but without success. Here is what I have:
- As I want to translate from English into German, I trained a German language model. I think it works quite well: it gives me good (low) perplexities for some German sentences I wrote as input.
- I trained the translation model (English to German) and tuned it on the development corpora (500 sentences).
- Then I translated the test set, and as far as I can tell from reading it, the translation is not that bad! A lot of short sentences in the test set are translated absolutely correctly (like "What will they do?" or "I really don't know."). Other sentences are rough but sometimes quite understandable.
- When I calculate the BLEU score of the test set and its translation with multi-bleu.perl, I get the poor result of 3.76.
- When translating with the untuned system, the score is 3.56. The translated sentences look pretty similar to those of the tuned system.

What really confuses me is that I get such a low score for a translated document that doesn't read like nonsense to me at all. Can this be normal? If it helps, I can send you the English test set and the German translation. Do you have any tips on how to find my mistake?

Thanks a lot,
Raphael

On 18.11.2015 at 14:56, Rico Sennrich wrote:
> Hello Raphael,
>
> I suggest that you check if you mixed up the languages somewhere, and
> check if your translation output is actually English.
>
> 3.76 BLEU is possible to achieve without translation (because names and
> some function words are the same between English and German), and it's
> possible that you used the wrong reference file when measuring BLEU, or
> that your SMT system is broken in some way and just copies the source
> text to the output.
>
> best wishes,
> Rico
>
>
> On 18.11.2015 13:36, Raphael Hoeps wrote:
>> Hi,
>> I'm a Computer Science student from Germany working on an SMT project. I
>> tried to get into the Moses system a little bit and did the
>> baseline tutorial found here:
>> http://www.statmt.org/moses/?n=Moses.Baseline. I stuck to this
>> tutorial but used the German/English corpora.
>>
>> Unfortunately, in the end I got a poor BLEU score of only 3.76:
>> BLEU = 3.76, 24.1/6.2/2.2/0.9 (BP=0.906, ratio=0.910, hyp_len=68049,
>> ref_len=74753)
>> In the tutorial, a score of 23.5 was achieved.
>>
>> I think I did everything as shown in the tutorial, except for one thing:
>> in the tuning part I cut down the two development corpora to 500 lines
>> (from 2000), because my laptop is quite old and I wanted to speed up the
>> process a little bit. (It still took my laptop 6 hours.)
>> Do you think that this is the reason for my poor score?
>> Is it maybe because I used the German/English corpora, so the score
>> can't be compared to the English/French system in the tutorial?
>> Or did I just make a mistake when typing all the commands? Any ideas how
>> to find this mistake?
>>
>> Thank you very much for your help,
>> Raphael
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
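
For anyone following the thread, here is a minimal Python sketch of two sanity checks on the numbers and suggestions above. It is not part of Moses, and the function names (brevity_penalty, bleu_from_precisions, copy_rate) are made up for illustration. The first two functions recombine the n-gram precisions and lengths that multi-bleu.perl printed, assuming the standard BLEU formula (brevity penalty times the geometric mean of the 1- to 4-gram precisions). The third gives a rough idea of whether the output mostly copies the source, by counting output tokens that also appear in the matching source line; it assumes both files are tokenized and line-aligned.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty: 1 unless the hypothesis is shorter than the reference."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def bleu_from_precisions(precisions_pct, hyp_len, ref_len):
    """Combine n-gram precisions (given in percent) into a BLEU score in percent."""
    if min(precisions_pct) <= 0:
        return 0.0  # a zero precision makes the geometric mean zero
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)


def copy_rate(source_lines, output_lines):
    """Fraction of output tokens that also occur in the corresponding source line."""
    copied = 0
    total = 0
    for src, out in zip(source_lines, output_lines):
        src_tokens = set(src.split())
        out_tokens = out.split()
        copied += sum(1 for tok in out_tokens if tok in src_tokens)
        total += len(out_tokens)
    return copied / total if total else 0.0


if __name__ == "__main__":
    # Values taken from the multi-bleu.perl line quoted in the original message.
    print(brevity_penalty(68049, 74753))
    print(bleu_from_precisions([24.1, 6.2, 2.2, 0.9], 68049, 74753))
```

Since the reported precisions are rounded, the recomputed score only agrees with the printed 3.76 up to rounding. A copy_rate close to 1.0 over the whole test set would point to the "output just copies the source" case, while a low value does not rule out a wrong reference file.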
