I'm training on a corpus of 50 million pairs, and each corpus file is over 2 GB, so I ran train-model.perl with "--parts 4" (full command below). Everything seems to be running fine, and step "(2.1a) running snt2cooc f-e" was split into 4 parts.

Step "(2.1b) running giza f-e" is running now. However, "top" shows mgiza at 100% CPU, not the 500% or more I expected. Other than the --parts option, the command line is the same as in my other runs with "--mgiza-cpus 6". Any ideas?

/usr/bin/perl -w /home/tahoar/domy-2.5/bin/train-model.perl --cores 6 --corpus /opt/domy/BUILDS/tm/set_1/bitext --e e --external-bin-dir /home/tahoar/domy-2.5/bin --f f --lm 0:0:/tmp/placeholder.lm:0 --mgiza --mgiza-cpus 6 --parts 4 --root-dir /opt/domy/ENGINES/tables
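
In case it helps narrow this down: a single 100% reading in "top" does not by itself say whether mgiza started only one thread or started several that are mostly idle. Below is a minimal sketch for reading a process's thread count, assuming Linux with /proc mounted. The function names parse_thread_count and thread_count are made up for illustration and are not part of Moses or mgiza; the only thing relied on is the "Threads:" field in /proc/<pid>/status.

```python
import sys
from pathlib import Path


def parse_thread_count(status_text: str) -> int:
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            # The field looks like "Threads:\t6"; take the part after the colon.
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid: int) -> int:
    """Read /proc/<pid>/status (Linux only) and return the thread count."""
    return parse_thread_count(Path(f"/proc/{pid}/status").read_text())


if __name__ == "__main__":
    # Hypothetical usage: python threads.py <pid-of-mgiza>
    if len(sys.argv) == 2:
        print(thread_count(int(sys.argv[1])))
```

If this reports a single thread for the mgiza process, the thread setting presumably never reached mgiza; if it reports several, the threads exist but are not busy.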
