Hi James

sh: 1: /media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/../bin/symal: not found


The script expects to find the binaries by navigating the file system relative to its own location (here, scripts/../bin). If you built Moses with a "--prefix" option, the script won't be able to find the binaries there. If you are running the script from the source tree, make sure the binaries are in the "bin" directory of that tree.
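A quick way to check this is sketched below. It only tests whether an executable symal sits in the "bin" directory next to "scripts". MOSES_ROOT and check_symal are names made up for this illustration, and the default path is simply the one from your log, so adjust it to your own checkout.

```shell
# MOSES_ROOT is an assumed variable name; the default is the path from the log above.
MOSES_ROOT="${MOSES_ROOT:-/media/bigdata/jcread/3rd_party_software/mosesdecoder}"

# check_symal is a hypothetical helper: it reports whether <root>/bin/symal
# exists and is executable, which is where scripts/../bin/symal resolves to.
check_symal() {
  if [ -x "$1/bin/symal" ]; then
    echo "found: $1/bin/symal"
  else
    echo "missing: $1/bin/symal"
    return 1
  fi
}

check_symal "$MOSES_ROOT" || true
```

If it reports "missing", one option (an assumption on my part, not the only fix) is to copy or symlink the installed binaries into that "bin" directory.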

In answer to your other question, varying the pre-processing pipeline is definitely possible (how do you think Moses deals with Chinese?). There are certain data formatting requirements, as Philipp pointed out, but other than that you have a lot of freedom.

cheers - Barry


On 02/12/15 15:21, Read, James C wrote:

nohup nice /media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/training/train-model.perl -root-dir phrase_table -corpus /media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/europarl-v7.it-en.1-0010.00001000 -f it -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 0:3:/media/bigdata/jcread/llv/lm:8 -external-bin-dir /media/bigdata/jcread/3rd_party_software/bin >& training.out &



It runs well for a while and then bombs out with the following output and error 127:



(3) generate word alignment @ Wed Dec  2 01:56:06 GMT 2015
Combining forward and inverted alignment from files:
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/giza.it-en/it-en.A3.final.{bz2,gz}
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/giza.en-it/en-it.A3.final.{bz2,gz}
Executing: mkdir -p /media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/model
Executing: /media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/training/giza2bal.pl -d "gzip -cd /media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/giza.en-it/en-it.A3.final.gz" -i "gzip -cd /media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/giza.it-en/it-en.A3.final.gz" |/media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/../bin/symal -alignment="grow" -diagonal="yes" -final="yes" -both="yes" > /media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/model/aligned.grow-diag-final-and
sh: 1: /media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/../bin/symal: not found
Exit code: 127
ERROR: Can't generate symmetrized alignment file


It seems this problem with the script has been encountered before:


http://comments.gmane.org/gmane.comp.nlp.moses.user/10489


I'm not sure I understand the accepted solution.


"Use absolute paths to all the scripts, and make sure your parallel files have the same names but the extension"


The command I issued uses only absolute paths. Is this referring to modifications in the training script itself?


James




_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.