Dear moses-support,

I tried the NPLM model on the German-English baseline dataset (wget http://www.statmt.org/wmt13/training-parallel-nc-v8.tgz), and it improved the score from 0.2266 to 0.2317 BLEU.
I then tried the bilingual LM: http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#ntoc37

However:
- The vocab files were not written at the end, so I used extract_training.py to obtain them.
- I still got 'nan' scores from the bilingual LM model.

Error:

    Not a label, not a score 'nan'. Failed to parse the scores string:
    0 ||| ... айта ... болатын . ||| LexicalReordering0= -11.3723 -15.4848 -26.5152 -17.8301 -6.95664 -16.8553 -29.4425 -22.5538 OpSequenceModel0= -403.825 99 22 45 5 Distortion0= -146 LM0= -685.828 BLMcomb= nan WordPenalty0= -76 PhrasePenalty0= 53 TranslationModel0= -242.874 -179.189 -291.623 -342.085 ||| nan

The relevant LM configuration:

    KENLM name=LM0 factor=0 path=en-kk/lm.corpus.tok.kk.6.blm.bin order=6
    BilingualNPLM name=BLMcomb order=5 source_window=4 path=wmt19_en-kk/lm/comb.blm.2/train.10 source_vocab=wmt19_en-kk/lm/comb.blm.2/vocab.source target_vocab=wmt19_en-kk/lm/comb.blm.2/vocab.target

Since the 'nan' appears in the decoder's own output, this may be due to a bug in the Moses C++ code rather than in the input data or configuration.

The documentation also seems out of sync regarding the "average the <null> word embedding as per the instructions here <http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#anchorNULL>" part, since averageNullEmbedding.py asks for -i, -o, and -t.

I found a related note in a paper by Barry Haddow at WMT'15 saying that the model was not used in the final submission because the differences were insignificant. Do you have any recent results for the bilingual LM model?

--
Regards,
Ergun
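P.S. For anyone hitting similar output, here is a minimal sketch (my own helper, not part of Moses) that scans the feature-score field of a Moses n-best line, assuming the standard ' ||| '-separated format, and reports which named features produced NaN scores:

```python
import math

def find_nan_features(nbest_line):
    """Return the names of features whose scores are NaN in a Moses
    n-best line of the form: id ||| hypothesis ||| scores ||| total."""
    fields = nbest_line.split(" ||| ")
    if len(fields) < 3:
        return []
    bad = []
    current = None
    for tok in fields[2].split():
        if tok.endswith("="):           # feature label, e.g. "BLMcomb="
            current = tok.rstrip("=")
        else:
            try:
                if math.isnan(float(tok)):
                    bad.append(current)
            except ValueError:
                pass                     # non-numeric token, skip
    return bad

line = ("0 ||| hyp ||| LM0= -685.828 BLMcomb= nan WordPenalty0= -76 "
        "||| nan")
print(find_nan_features(line))  # -> ['BLMcomb']
```

In my case this points at BLMcomb (the BilingualNPLM feature) as the sole source of the NaN, which is why I suspect the model scoring code rather than the other features.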
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
