Hi James
sh: 1: /media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/../bin/symal: not found
The script expects to find the binaries at a fixed location relative to
itself, here "scripts/../bin". If you built Moses with a "--prefix"
option, the binaries will have been installed under that prefix instead,
so the script won't find them. If you are running the script from the
source tree, make sure the binaries are in its "bin" directory.
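A quick way to check this before starting a long training run is sketched
below in Python. It assumes, based only on the error message above, that
the script looks for each binary at "<scripts dir>/../bin/<name>". The
function names are made up for illustration and are not part of Moses.

```python
import os
import sys


def expected_binary_path(scripts_dir, name):
    """Path the training script appears to use: <scripts_dir>/../bin/<name>.

    This layout is an assumption taken from the error message, not from
    the Moses documentation.
    """
    return os.path.normpath(os.path.join(scripts_dir, "..", "bin", name))


def missing_binaries(scripts_dir, names=("symal",)):
    """Return the expected paths that are not executable regular files."""
    missing = []
    for name in names:
        path = expected_binary_path(scripts_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Usage: python check_bins.py /path/to/mosesdecoder/scripts
    scripts_dir = sys.argv[1] if len(sys.argv) > 1 else "scripts"
    for path in missing_binaries(scripts_dir):
        print("missing or not executable:", path)
```

If it prints a path for symal, the binary is not where the script will
look for it, and you would see the same "not found" error at step (3).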
In answer to your other question, varying the pre-processing pipeline is
definitely possible (how do you think Moses deals with Chinese?). There
are certain data formatting requirements, as Philipp pointed out, but
other than that you have a lot of freedom.
cheers - Barry
On 02/12/15 15:21, Read, James C wrote:
nohup nice
/media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/training/train-model.perl
-root-dir phrase_table -corpus
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/europarl-v7.it-en.1-0010.00001000
-f it -e en -alignment grow-diag-final-and -reordering
msd-bidirectional-fe -lm 0:3:/media/bigdata/jcread/llv/lm:8
-external-bin-dir /media/bigdata/jcread/3rd_party_software/bin >&
training.out &
Runs well for a while and then bombs out with the following output and
exit code 127:
(3) generate word alignment @ Wed Dec 2 01:56:06 GMT 2015
Combining forward and inverted alignment from files:
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/giza.it-en/it-en.A3.final.{bz2,gz}
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/giza.en-it/en-it.A3.final.{bz2,gz}
Executing: mkdir -p
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/model
Executing:
/media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/training/giza2bal.pl
-d "gzip -cd
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/giza.en-it/en-it.A3.final.gz"
-i "gzip -cd
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/giza.it-en/it-en.A3.final.gz"
|/media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/../bin/symal
-alignment="grow" -diagonal="yes" -final="yes" -both="yes" >
/media/bigdata/jcread/llv/data/europarlv7/prealigned/tokenized_truecased_cleaned/1-0010/00001000/phrase_table/model/aligned.grow-diag-final-and
sh: 1: /media/bigdata/jcread/3rd_party_software/mosesdecoder/scripts/../bin/symal: not found
Exit code: 127
ERROR: Can't generate symmetrized alignment file
It seems this problem with the script has been encountered before:
http://comments.gmane.org/gmane.comp.nlp.moses.user/10489
I'm not sure I understand the accepted solution.
"Use absolute paths to all the scripts, and make sure your parallel
files have the same names but the extension"
The command I issued uses only absolute paths. Is this referring to
modifications in the training script itself?
James
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.