Hi all again,

A little more information, in case anyone has ideas, as I still haven't been able to figure this out.

When I tune models that have only one translation table, tuning works fine, but only with a non-factored tuning corpus. If I use a factored tuning corpus, Moses does one run and sets all weights to zero.

If I have two translation tables, Moses doesn't run the tuning because factors are missing. If I use the factored corpus, I get a result similar to the one above: tuning stops after one run and sets all weights to zero.

A similar error was mentioned a few months back, and the solution was to turn off MBR decoding; however, I am not using it. I just use the command:

    ~/mosesdecoder/scripts/training/mert-moses.pl \
        ~/working/IT_corpus/TMX/txt/tuning_corpus/tuning_corpus.tagged.en \
        ~/working/IT_corpus/TMX/txt/tuning_corpus/tuning_corpus.tagged.sl \
        ~/mosesdecoder/bin/moses \
        ~/working/IT_corpus/TMX/txt/factored_corpus/complex/model/moses.ini \
        --mertdir ~/mosesdecoder/bin/ \
        --decoder-flags="-threads 32"

Is there something I am missing? Do I have to add anything else for tuning a factored model? Any help will be greatly appreciated.

Best regards,
Saso

---------- Forwarded message ----------
From: Sašo Kuntaric <saso.kunta...@gmail.com>
Date: 2016-06-20 19:36 GMT+02:00
Subject: Binarization fails with the Segmentation Fault error
To: moses-support <moses-support@mit.edu>

Hi all,

Me again (last time I hope). I have successfully trained and tuned my factored model.
Here are both moses.ini files:

#########################
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0
1

# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1

[distortion-limit]
6

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz input-factor=0 output-factor=1
PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

# dense weights for feature functions
[weight]
# The default weights are NOT optimized for translation quality. You MUST tune the weights.
# Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
TranslationModel1= 0.2 0.2 0.2 0.2
GenerationModel0= 0.3 0
Distortion0= 0.3
LM0= 0.5
LM1= 0.5

# MERT optimized configuration
# decoder /home/ksaso/mosesdecoder/bin/moses
# BLEU 0 on dev /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/tuning/tuning-corpus.tagged.en
# We were before running iteration 2
# finished Mon Jun 20 16:19:08 CEST 2016
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0
1

# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1

[distortion-limit]
6

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz input-factor=0 output-factor=1
PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

# dense weights for feature functions
[threads]
16
[weight]
Distortion0= 0
LM0= 0
LM1= 0
WordPenalty0= 0
PhrasePenalty0= 0
TranslationModel0= 0 0 0 0
TranslationModel1= 0 0 0 0
GenerationModel0= 0 0
UnknownWordPenalty0= 1

First of all, is it strange that I get all zeroes after tuning?
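To spot this condition without reading the file by eye, here is a minimal Python sketch that lists every feature whose tuned weights are all zero. It assumes the plain-text [weight] layout of the moses.ini files above; the function names parse_weights and zeroed_features are made up for illustration and are not part of Moses.

```python
def parse_weights(ini_text):
    """Return {feature_name: [floats]} from the [weight] section of a moses.ini."""
    weights = {}
    in_weight = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            in_weight = (line == "[weight]")  # track which section we are in
            continue
        if in_weight and "=" in line:
            name, values = line.split("=", 1)
            weights[name.strip()] = [float(v) for v in values.split()]
    return weights


def zeroed_features(weights, ignore=("UnknownWordPenalty0",)):
    """Names of features (outside `ignore`) whose weights are all exactly zero."""
    return sorted(
        name for name, vals in weights.items()
        if name not in ignore and all(v == 0.0 for v in vals)
    )


# Shortened excerpt of the tuned configuration, for illustration only.
SAMPLE = """\
[threads]
16
[weight]
Distortion0= 0
LM0= 0
TranslationModel0= 0 0 0 0
UnknownWordPenalty0= 1
"""

print(zeroed_features(parse_weights(SAMPLE)))
# prints ['Distortion0', 'LM0', 'TranslationModel0']
```

Together with the "BLEU 0 on dev" comment in the tuned file, a non-empty list here suggests that tuning did not produce usable weights, although the sketch only reports the symptom and says nothing about the cause.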
My problem is that translation with this model is spectacularly slow (a few days to translate a couple of thousand words with a 2.4-million-line corpus), so naturally I tried to binarize my phrase tables with the commands

    ~/mosesdecoder/bin/processPhraseTableMin \
        -in ~/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz \
        -nscores 4 \
        -out ~/working/binarised_model/phrase-table.0-1

and

    ~/mosesdecoder/bin/processPhraseTableMin \
        -in ~/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz \
        -nscores 4 \
        -out ~/working/binarised_model/phrase-table.1-2

The process itself finishes without errors, and I can start the translation with the command

    ~/mosesdecoder/bin/moses -f /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/moses.ini

But when I try to enter my text, I get the following:

    Translating: use|NN of|IN light|JJ
    Line 1: Initialize search took 0.000 seconds total
    Segmentation fault (core dumped)

When I try to filter my model, I get the same error. Any ideas what could be causing this?
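Since the configuration declares two input factors (0 and 1), every input token should carry exactly two "|"-separated factors. A minimal Python sketch for ruling out malformed input lines follows; the function name malformed_tokens is hypothetical, and the check only validates the input format, so it does not by itself explain the segmentation fault.

```python
def malformed_tokens(line, expected_factors):
    """Return (position, token) pairs whose factor count differs from
    `expected_factors` or that contain an empty factor."""
    bad = []
    for pos, token in enumerate(line.split()):
        factors = token.split("|")
        if len(factors) != expected_factors or any(f == "" for f in factors):
            bad.append((pos, token))
    return bad


# The input line from the failing run has two factors per token.
print(malformed_tokens("use|NN of|IN light|JJ", 2))
# prints []

# A line with a missing factor on the second token is reported.
print(malformed_tokens("use|NN of light|JJ", 2))
# prints [(1, 'of')]
```

The same check can be run over each line of a tuning corpus before passing it to the tuning script.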
My final moses.ini file looks like this:

# MERT optimized configuration
# decoder /home/ksaso/mosesdecoder/bin/moses
# BLEU 0 on dev /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/tuning/tuning-corpus.tagged.en
# We were before running iteration 2
# finished Mon Jun 20 16:19:08 CEST 2016
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0
1

# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1

[distortion-limit]
6

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/phrase-table.0-1.minphr input-factor=0 output-factor=1
PhraseDictionaryCompact name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/phrase-table.1-2.minphr input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

# dense weights for feature functions
[threads]
16
[weight]
Distortion0= 0
LM0= 0
LM1= 0
WordPenalty0= 0
PhrasePenalty0= 0
TranslationModel0= 0 0 0 0
TranslationModel1= 0 0 0 0
GenerationModel0= 0 0
UnknownWordPenalty0= 1

And one more question: can I run a translation (with the ~/mosesdecoder/bin/moses command) multi-threaded?

Thanks for all the help and best regards,
Saso

--
lp,
Sašo
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support