Hi Patricia,

Could you try computing the perplexity of your binarized LM with compile-lm?

First on the ARPA format (SRILM):

    /home/lingua/smt/irstlm/bin/compile-lm complete_fr.truecased_unique_tok_clean.lm --eval <text-to-eval>

and then on the binarized version (before your symbolic link):

    /home/lingua/smt/irstlm/bin/compile-lm complete_fr.truecased_unique_tok_clean.blm --eval <text-to-eval>

It might be easier to debug by first looking at the direct output from IRSTLM.

Thanks,
Nick

________________________________
From: [email protected] [[email protected]] on behalf of Patricia Helmich [[email protected]]
Sent: Tuesday, July 03, 2012 3:07 PM
To: [email protected]
Subject: [Moses-support] IRSTLM - Error: dictionary::loadtxt wrong entry was found (0) in position 1

Hi,

I am using Moses in combination with SRILM and IRSTLM for several language pairs. After building LMs with SRILM and training the phrase model, I try to translate a sentence, for example:

    echo "this is a small house" | /home/lingua/smt/moses/bin/moses -f model/moses.ini

This works well for each language pair.

Then I produce an IRSTLM binary LM for each language pair, for example:

    /home/lingua/smt/irstlm/bin/compile-lm complete_fr.truecased_unique_tok_clean.lm complete_fr.truecased_unique_tok_clean.blm
    ln -s complete_fr.truecased_unique_tok_clean.blm complete_fr.truecased_unique_tok_clean.blm.mm

and I produce binary phrase tables and binary reordering tables:

    gzip -cd fr-en/f_en.e_fr/model/phrase-table.gz | LC_ALL=C sort | /home/lingua/smt/moses/bin/processPhraseTable -ttable 0 0 - -nscores 5 -out fr-en/f_en.e_fr/model/phrase-table
    gzip -cd fr-en/f_en.e_fr/model/reordering-table.wbe-msd-bidirectional-fe.gz | LC_ALL=C sort | /home/lingua/smt/moses/bin/processLexicalTable -out fr-en/f_en.e_fr/model/reordering-table

Then I create a copy of moses.ini (->moses-bin.ini) and set moses-bin.ini to use the binary files.

Now I try to translate a sentence with:

    echo "this is a small house" | TMP=/tmp /home/lingua/smt/moses/bin/moses -v 2 -f model/moses-bin.ini

This works well for each language pair, except for the language pair f: en, e: fr. The output is:

    Defined parameters (per moses.ini or switch):
        config: model/moses-bin.ini
        distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6 /home/lingua/Patricia/Corpora/Corpora_Biling/fr-en/f_en.e_fr/model/reordering-table
        distortion-limit: 6
        input-factors: 0
        lmodel-file: 1 0 3 /home/lingua/Patricia/Corpora/Corpora_Monoling_Complete/fr/complete_fr.truecased_unique_tok_clean.blm.mm
        mapping: 0 T 0
        ttable-file: 1 0 0 5 /home/lingua/Patricia/Corpora/Corpora_Biling/fr-en/f_en.e_fr/model/phrase-table
        ttable-limit: 20
        verbose: 2
        weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
        weight-l: 0.5000
        weight-t: 0.20 0.20 0.20 0.20 0.20
        weight-w: -1
    input type is: text input
    Loading lexical distortion models...have 1 models
    Creating lexical reordering...
    weights: 0.300 0.300 0.300 0.300 0.300 0.300
    binary file loaded, default OFF_T: -1
    Start loading LanguageModel /home/lingua/Patricia/Corpora/Corpora_Monoling_Complete/fr/complete_fr.truecased_unique_tok_clean.blm.mm : [0.000] seconds
    In LanguageModelIRST::Load: nGramOrder = 3
    Language Model Type of /home/lingua/Patricia/Corpora/Corpora_Monoling_Complete/fr/complete_fr.truecased_unique_tok_clean.blm.mm is 1
    blmt
    loadbin()
    lmtable::loadbin_dict()
    dictionary::loadtxt wrong entry was found (0) in position 1

I don't understand the reason for this error. Could you help me with this problem?

Thank you,
Patricia

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
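
The check suggested at the top of the thread could be wrapped in a small shell helper that runs compile-lm with --eval on the ARPA file and then on the binarized file, so the two outputs can be compared directly. This is only a sketch: the function name eval_both, the variables COMPILE_LM, LM_BASE and EVAL_TEXT, and the file name eval.txt are placeholders introduced here and are not part of IRSTLM. Only the compile-lm path, the LM file names and the --eval option are taken from the messages above.

```shell
#!/bin/sh
# Hypothetical helper, not part of IRSTLM or Moses.
# Defaults below come from the thread, except eval.txt, which is a placeholder
# for whatever text file you want to compute perplexity on.
COMPILE_LM="${COMPILE_LM:-/home/lingua/smt/irstlm/bin/compile-lm}"
LM_BASE="${LM_BASE:-complete_fr.truecased_unique_tok_clean}"
EVAL_TEXT="${EVAL_TEXT:-eval.txt}"

# Run compile-lm --eval on the ARPA LM (.lm) and then on the binary LM (.blm).
# The .blm file is used directly, not the .blm.mm symbolic link.
eval_both() {
    for ext in lm blm; do
        echo "== $LM_BASE.$ext =="
        "$COMPILE_LM" "$LM_BASE.$ext" --eval "$EVAL_TEXT" \
            || echo "compile-lm failed on $LM_BASE.$ext"
    done
}
```

Calling eval_both from the directory that holds the LM files prints one labelled section per format. If the .lm run reports a perplexity and the .blm run fails, the problem is likely in the binarized file itself rather than in the Moses configuration.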
