Can you give an example of your phrases? Are they in an extract file, in the extract format? The extract format is

   source ||| target ||| word alignment

or

   source ||| target ||| word alignment ||| fractional count
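
To check that a file follows this layout before scoring, a minimal Python sketch along these lines may help. It is not part of Moses: the function name parse_extract_line is hypothetical, and it assumes the word alignment is written as space-separated "i-j" index pairs. The sample line reuses the phrase pair from the error message in the quoted mail, with made-up alignment points.

```python
def parse_extract_line(line):
    """Split one extract line into (source words, target words, alignment, count).

    count is None when the optional fourth field (fractional count) is absent.
    """
    fields = [f.strip() for f in line.rstrip("\n").split("|||")]
    if len(fields) not in (3, 4):
        raise ValueError(f"expected 3 or 4 fields, got {len(fields)}: {line!r}")
    source, target, alignment = fields[:3]
    # Assumption: alignment points are "i-j" pairs of source and target word indices.
    points = [tuple(int(x) for x in p.split("-")) for p in alignment.split()]
    count = float(fields[3]) if len(fields) == 4 else None
    return source.split(), target.split(), points, count


# Hypothetical sample line; the alignment points are invented for illustration.
sample = "$ 48 pour une livre de ||| $ 48 for a pound of ||| 0-0 1-1 2-2 3-3 4-4 5-5"
print(parse_extract_line(sample))
```

A line that raises ValueError here has the wrong number of ||| separated fields and would be worth inspecting by hand.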
The extract file MUST already be sorted:

   LC_ALL=C sort < extract > extract.sorted

You can use the perl scripts or the programs directly. There are intermediate files between score and consolidate that must also be sorted in the same way, so if you use the programs directly you have to do the sorting yourself.

The words don't have to be in the lex file. If a word is not found in the file, it is just given a probability of 0.

On 6 February 2014 21:57, Varvara Logacheva <[email protected]> wrote:

> Dear all,
>
> I have a model trained by moses and would like to use it to score some
> new phrases that have been extracted not by Moses.
>
> I tried simply running step 6 of train-model.perl script:
>
> /home/varvara/soft/mosesdecoder/scripts/training/train-model.perl
> -dont-zip -first-step 6 -last-step 6 -external-bin-dir
> /home/varvara/soft/mosesdecoder/external-bin -f fr -e en -alignment
> grow-diag-final-and -max-phrase-length 7 -reordering
> msd-bidirectional-fe -score-options '--GoodTuring' -extract-file
> ./phrases -lexical-file
> /home/varvara/workspace/experiment/raw10/model/lex.1
> -phrase-translation-table ./phrase-table.3
>
> or score-parallel.perl:
>
> ~/soft/mosesdecoder/scripts/generic/score-parallel.perl 1 "sort "
> ~/soft/mosesdecoder/bin/score ./phrases.gz ../model/lex.1.f2e
> ./phrase-table.half.f2e 0
> ~/soft/mosesdecoder/scripts/generic/score-parallel.perl 1 "sort "
> ~/soft/mosesdecoder/bin/score ./phrases.gz ../model/lex.1.e2f
> ./phrase-table.half.e2f --Inverse 0
>
> Both resulted in empty phrase table, but didn't report any errors.
>
> If I try running score and consolidate directly:
>
> ~/soft/mosesdecoder/bin/score phrases.gz ../model/lex.1.f2e phrase-t
> ~/soft/mosesdecoder/bin/score phrases.gz ../model/lex.1.e2f phrase-t2
> --Inverse
> /home/varvara/soft/mosesdecoder/bin/consolidate phrase-t phrase-t2
> /dev/stdout | gzip -c > phrase-table.1.gz
>
> consolidate ends with an error:
>
> ERROR: target phrase does not match in line 1: '$ 48 pour une livre
> de' != '$ 48 for a pound of'
>
> What am I doing wrong?
> Does the file with extracted phrases need to be sorted? gzipped?
> Do all the words from new phrases have to be in the .lex file? How can
> I add them to the existing .lex file?
>
> Thank you,
>
> Varvara.
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
