Hello!

I am trying to set up a working installation of Moses for our project. I followed the guidelines on the Moses Web pages (Baseline System, ...), and this was mostly successful, except for the use of mgiza.
History of what I did:

My system: virtual machine with Ubuntu 14.04 x64, 2 cores, 12 GB of memory.

1. Installed release 3.0 from the Web page;
   tried the commands from "Baseline System"
   ==> mgiza fails with signal 11, coredump

2. Compiled and installed the latest version of mgiza from GitHub;
   tried the commands from "Baseline System"
   ==> mgiza fails with signal 11, coredump

3. Compiled and installed the latest version of GIZA++ from GitHub;
   tried the commands from "Baseline System"
   ==> all OK

4. Compiled and installed the latest versions of Moses, GIZA++ and mgiza from GitHub;
   tried the commands from "Baseline System"
   ==> OK with GIZA++, fails with mgiza

I call GIZA++ and mgiza with the same command and the same input files; the only difference is the following two switches:

    GIZAOPT="-mgiza -mgiza-cpus 2"

Command:

    $HOME/mosesdecoder/scripts/training/train-model.perl -cores 2 $GIZAOPT \
        -root-dir train \
        -corpus $HOME/corpus/news-commentary-v8.fr-en.clean \
        -f fr -e en \
        -alignment grow-diag-final-and \
        -reordering msd-bidirectional-fe \
        -lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8 \
        -external-bin-dir $HOME/mosesdecoder/training-tools 2>&1 > train.out

When GIZA++ is called (GIZAOPT=""), all is OK. When mgiza is called (GIZAOPT="-mgiza ..."), it fails with:

    Executing: $HOME/mosesdecoder/training-tools/mgiza -CoocurrenceFile $HOME/tm/train/giza.fr-en/fr-en.cooc -c $HOME/tm/train/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -ncpus 2 -nodumps 1 -nsmooth 4 -o $HOME/tm/train/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s $HOME/tm/train/corpus/en.vcb -t $HOME/tm/train/corpus/fr.vcb
    Starting MGIZA
    Initializing Global Paras
    DEBUG: EnterERROR: Execution of: $HOME/mosesdecoder/training-tools/mgiza -CoocurrenceFile $HOME/tm/train/giza.fr-en/fr-en.cooc -c $HOME/tm/train/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -ncpus 2 -nodumps 1 -nsmooth 4 -o $HOME/tm/train/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s $HOME/tm/train/corpus/en.vcb -t $HOME/tm/train/corpus/fr.vcb
    died with signal 11, with coredump

GIZA++, on the other hand, runs normally:

    Executing: $HOME/mosesdecoder/training-tools/GIZA++ -CoocurrenceFile $HOME/tm/train/giza.fr-en/fr-en.cooc -c $HOME/tm/train/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o $HOME/tm/train/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s $HOME/tm/train/corpus/en.vcb -t $HOME/tm/train/corpus/fr.vcb
    Reading vocabulary file from:$HOME/tm/train/corpus/en.vcb
    Reading vocabulary file from:$HOME/tm/train/corpus/fr.vcb
    10000
    20000
    ...

What can I do to help determine where mgiza fails, and to get it up and running?

Sub-question: is it really worth running mgiza instead of GIZA++?

Best regards,
Matjaz

PS: I changed /home/... to $HOME in the above examples.