Hi,

Lines like this:

> gzip: work/model/extract.o.gz already exists; do you wish to overwrite (y or n)? y

indicate that you are running on top of a previous failed run, which sometimes prevents the proper files from being created.

In your case, the empty files first appear at the word alignment step. I would assume that the GIZA++ alignment, which writes its output to giza.*/*.A3.final.gz, created an empty file in a previous run, and that this file cannot be overwritten with the correct one now. Another possibility is that your current GIZA++ run is failing, so please run step 2 by itself and check why it does not produce the proper output. The following page in the Moses documentation describes what should be produced:

http://www.statmt.org/moses/?n=FactoredTraining.RunGIZA

-phi

On Thu, Mar 24, 2011 at 3:08 PM, saadi <[email protected]> wrote:
> Hello,
>
> I followed the sample in Moses Installation and Training step by step,
> but after calling train-model.perl I get empty files in the work/model
> folder. Could you please tell me why, and how I should proceed to get
> the right results?
>
> Thanks,
> Halim
>
> saadi@saadi-HP-Pavilion-dv2700-Notebook-PC:~$ > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl > -scripts-root-dir tools/moses-scripts/scripts-20110321-2029/ -root-dir > work -corpus work/corpus/news-commentary.lowercased -f fr -e en > -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm > 0:3:/home/saadi/work/lm/news-commentary.lm > work/training.out > Using SCRIPTS_ROOTDIR: tools/moses-scripts/scripts-20110321-2029/ > Using single-thread GIZA > (1) preparing corpus @ Thu Mar 24 14:08:11 CET 2011 > Executing: mkdir -p work/corpus > (1.0) selecting factors @ Thu Mar 24 14:08:11 CET 2011 > (1.1) running mkcls @ Thu Mar 24 14:08:11 CET 2011 > /home/saadi/tools/bin/mkcls -c50 -n2 > -pwork/corpus/news-commentary.lowercased.fr -Vwork/corpus/fr.vcb.classes > opt > work/corpus/fr.vcb.classes already in place, reusing > (1.1) running mkcls
@ Thu Mar 24 14:08:11 CET 2011 > /home/saadi/tools/bin/mkcls -c50 -n2 > -pwork/corpus/news-commentary.lowercased.en -Vwork/corpus/en.vcb.classes > opt > work/corpus/en.vcb.classes already in place, reusing > (1.2) creating vcb file work/corpus/fr.vcb @ Thu Mar 24 14:08:11 CET > 2011 > (1.2) creating vcb file work/corpus/en.vcb @ Thu Mar 24 14:08:12 CET > 2011 > (1.3) numberizing corpus work/corpus/fr-en-int-train.snt @ Thu Mar 24 > 14:08:13 CET 2011 > work/corpus/fr-en-int-train.snt already in place, reusing > (1.3) numberizing corpus work/corpus/en-fr-int-train.snt @ Thu Mar 24 > 14:08:13 CET 2011 > work/corpus/en-fr-int-train.snt already in place, reusing > (2) running giza @ Thu Mar 24 14:08:13 CET 2011 > (2.1a) running snt2cooc fr-en @ Thu Mar 24 14:08:13 CET 2011 > > Executing: mkdir -p work/giza.fr-en > Executing: /home/saadi/tools/bin/snt2cooc.out work/corpus/en.vcb > work/corpus/fr.vcb work/corpus/fr-en-int-train.snt > > work/giza.fr-en/fr-en.cooc > line 1000 > line 2000 > line 3000 > line 4000 > line 5000 > line 6000 > line 7000 > line 8000 > line 9000 > line 10000 > line 11000 > line 12000 > line 13000 > line 14000 > line 15000 > line 16000 > line 17000 > line 18000 > line 19000 > line 20000 > line 21000 > line 22000 > line 23000 > line 24000 > line 25000 > line 26000 > line 27000 > line 28000 > line 29000 > line 30000 > line 31000 > line 32000 > line 33000 > line 34000 > line 35000 > line 36000 > line 37000 > line 38000 > line 39000 > line 40000 > line 41000 > line 42000 > line 43000 > line 44000 > END. 
> (2.1b) running giza fr-en @ Thu Mar 24 14:08:28 CET 2011 > /home/saadi/tools/bin/GIZA++ -CoocurrenceFile > work/giza.fr-en/fr-en.cooc -c work/corpus/fr-en-int-train.snt -m1 5 -m2 > 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 > -nsmooth 4 -o work/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s > work/corpus/en.vcb -t work/corpus/fr.vcb > (2.1a) running snt2cooc en-fr @ Thu Mar 24 14:08:28 CET 2011 > > Executing: mkdir -p work/giza.en-fr > Executing: /home/saadi/tools/bin/snt2cooc.out work/corpus/fr.vcb > work/corpus/en.vcb work/corpus/en-fr-int-train.snt > > work/giza.en-fr/en-fr.cooc > line 1000 > line 2000 > line 3000 > line 4000 > line 5000 > line 6000 > line 7000 > line 8000 > line 9000 > line 10000 > line 11000 > line 12000 > line 13000 > line 14000 > line 15000 > line 16000 > line 17000 > line 18000 > line 19000 > line 20000 > line 21000 > line 22000 > line 23000 > line 24000 > line 25000 > line 26000 > line 27000 > line 28000 > line 29000 > line 30000 > line 31000 > line 32000 > line 33000 > line 34000 > line 35000 > line 36000 > line 37000 > line 38000 > line 39000 > line 40000 > line 41000 > line 42000 > line 43000 > line 44000 > END. 
> (2.1b) running giza en-fr @ Thu Mar 24 14:08:44 CET 2011 > /home/saadi/tools/bin/GIZA++ -CoocurrenceFile > work/giza.en-fr/en-fr.cooc -c work/corpus/en-fr-int-train.snt -m1 5 -m2 > 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 > -nsmooth 4 -o work/giza.en-fr/en-fr -onlyaldumps 1 -p0 0.999 -s > work/corpus/fr.vcb -t work/corpus/en.vcb > (3) generate word alignment @ Thu Mar 24 14:08:44 CET 2011 > Combining forward and inverted alignment from files: > work/giza.fr-en/fr-en.A3.final.{bz2,gz} > work/giza.en-fr/en-fr.A3.final.{bz2,gz} > Executing: mkdir -p work/model > Executing: > tools/moses-scripts/scripts-20110321-2029//training/symal/giza2bal.pl -d > "gzip -cd work/giza.en-fr/en-fr.A3.final.gz" -i "gzip -cd > work/giza.fr-en/fr-en.A3.final.gz" | > tools/moses-scripts/scripts-20110321-2029//training/symal/symal > -alignment="grow" -diagonal="yes" -final="yes" -both="yes" > > work/model/aligned.grow-diag-final-and > symal: computing grow alignment: diagonal (1) final (1)both-uncovered > (1) > skip=<0> counts=<0> > (4) generate lexical translation table 0-0 @ Thu Mar 24 14:08:44 CET > 2011 > (work/corpus/news-commentary.lowercased.fr,work/corpus/news-commentary.lowercased.en,work/model/lex) > reusing: work/model/lex.f2e and work/model/lex.e2f > (5) extract phrases @ Thu Mar 24 14:08:44 CET 2011 > tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/extract > work/corpus/news-commentary.lowercased.en > work/corpus/news-commentary.lowercased.fr > work/model/aligned.grow-diag-final-and work/model/extract 7 orientation > --model wbe-msd > Executing: > tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/extract > work/corpus/news-commentary.lowercased.en > work/corpus/news-commentary.lowercased.fr > work/model/aligned.grow-diag-final-and work/model/extract 7 orientation > --model wbe-msd > PhraseExtract v1.4, written by Philipp Koehn > phrase extraction from an aligned parallel corpus > ....Executing: gzip 
work/model/extract.o > gzip: work/model/extract.o.gz already exists; do you wish to overwrite > (y or n)? y > Executing: gzip work/model/extract.inv > gzip: work/model/extract.inv.gz already exists; do you wish to overwrite > (y or n)? y > Executing: gzip work/model/extract > gzip: work/model/extract.gz already exists; do you wish to overwrite (y > or n)? y > (6) score phrases @ Thu Mar 24 14:08:51 CET 2011 > Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1365. > Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1366. > Use of uninitialized value $CORE_SCORE_OPTIONS in substitution (s///) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1368. > Use of uninitialized value $CORE_SCORE_OPTIONS in substitution (s///) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1369. 
> (6.1) sorting f2e @ Thu Mar 24 14:08:51 CET 2011 > Executing: gunzip < work/model/extract.gz | LC_ALL=C sort -T work/model >> work/model/extract.sorted > (6.2) creating table half work/model/phrase-table.half.f2e @ Thu Mar 24 > 14:08:51 CET 2011 > Executing: > tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/score > work/model/extract.sorted work/model/lex.f2e > work/model/phrase-table.half.f2e > Score v2.0 written by Philipp Koehn > scoring methods for extracted rules > Loading lexical translation table from work/model/lex.f2e > Executing: rm -f work/model/extract.sorted > (6.3) sorting e2f @ Thu Mar 24 14:08:51 CET 2011 > Executing: gunzip < work/model/extract.inv.gz | LC_ALL=C sort -T > work/model > work/model/extract.inv.sorted > (6.4) creating table half work/model/phrase-table.half.e2f @ Thu Mar 24 > 14:08:51 CET 2011 > Executing: > tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/score > work/model/extract.inv.sorted work/model/lex.e2f > work/model/phrase-table.half.e2f --Inverse > Score v2.0 written by Philipp Koehn > scoring methods for extracted rules > using inverse mode > Loading lexical translation table from work/model/lex.e2f > Executing: rm -f work/model/extract.inv.sorted > (6.5) sorting inverse e2f table@ Thu Mar 24 14:08:51 CET 2011 > Executing: LC_ALL=C sort -T work/model work/model/phrase-table.half.e2f >> work/model/phrase-table.half.e2f.sorted > Executing: rm -f work/model/phrase-table.half.e2f > (6.6) consolidating the two halves @ Thu Mar 24 14:08:51 CET 2011 > Executing: > tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/consolidate > work/model/phrase-table.half.f2e work/model/phrase-table.half.e2f.sorted > work/model/phrase-table > Consolidate v2.0 written by Philipp Koehn > consolidating direct and indirect rule tables > Executing: rm -f work/model/phrase-table.half.* > Executing: gzip work/model/phrase-table > gzip: work/model/phrase-table.gz already exists; do you wish to > 
overwrite (y or n)? y > (7) learn reordering model @ Thu Mar 24 14:08:52 CET 2011 > (7.1) [no factors] learn reordering model @ Thu Mar 24 14:08:52 CET 2011 > Executing: gunzip < work/model/extract.o.gz | LC_ALL=C sort -T > work/model > work/model/extract.o.sorted > (7.2) building tables @ Thu Mar 24 14:08:52 CET 2011 > Executing: > tools/moses-scripts/scripts-20110321-2029//training/lexical-reordering/score > work/model/extract.o.sorted 0.5 work/model/reordering-table. --model "wbe msd > wbe-msd-bidirectional-fe" > Lexical Reordering Scorer > scores lexical reordering models of several types (hierarchical, > phrase-based and word-based-extraction > Executing: rm work/model/extract.o.sorted > (8) learn generation model @ Thu Mar 24 14:08:52 CET 2011 > no generation model requested, skipping step > (9) create moses.ini @ Thu Mar 24 14:08:52 CET 2011 > Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1680. > Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1681. > Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1682. > Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1775. > Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1776. > Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at > tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line > 1777. 
>
> --------------------------------------------------------------------------
> saadi@saadi-HP-Pavilion-dv2700-Notebook-PC:~/work/corpus$ ls -l
> total 53200
> -rw-r--r-- 1 saadi saadi 6718735 2011-03-23 22:34 en-fr-int-train.snt
> -rw-r--r-- 1 saadi saadi  505468 2011-03-24 14:08 en.vcb
> -rw-r--r-- 1 saadi saadi  356848 2011-03-23 22:34 en.vcb.classes
> -rw-r--r-- 1 saadi saadi  274736 2011-03-23 22:34 en.vcb.classes.cats
> -rw-r--r-- 1 saadi saadi 6718735 2011-03-23 22:34 fr-en-int-train.snt
> -rw-r--r-- 1 saadi saadi  625341 2011-03-24 14:08 fr.vcb
> -rw-r--r-- 1 saadi saadi  448277 2011-03-23 22:32 fr.vcb.classes
> -rw-r--r-- 1 saadi saadi  348417 2011-03-23 22:32 fr.vcb.classes.cats
> -rw-r--r-- 1 saadi saadi 4939798 2011-03-23 22:04 news-commentary.clean.en
> -rw-r--r-- 1 saadi saadi 5969570 2011-03-23 22:04 news-commentary.clean.fr
> -rw-r--r-- 1 saadi saadi 4939798 2011-03-23 22:22 news-commentary.lowercased.en
> -rw-r--r-- 1 saadi saadi 5969570 2011-03-23 22:08 news-commentary.lowercased.fr
> -rw-r--r-- 1 saadi saadi 7489086 2011-03-23 21:58 news-commentary.tok.en
> -rw-r--r-- 1 saadi saadi 9127240 2011-03-23 21:52 news-commentary.tok.fr
> ----------------------------------------------------------------------------------------
>
> saadi@saadi-HP-Pavilion-dv2700-Notebook-PC:~/work/model$ ls -l
> total 24
> -rw-r--r-- 1 saadi saadi    0 2011-03-24 14:08 aligned.grow-diag-final-and
> -rw-r--r-- 1 saadi saadi   28 2011-03-24 14:08 extract.gz
> -rw-r--r-- 1 saadi saadi   32 2011-03-24 14:08 extract.inv.gz
> -rw-r--r-- 1 saadi saadi   30 2011-03-24 14:08 extract.o.gz
> -rw-r--r-- 1 saadi saadi    0 2011-03-23 14:11 lex.e2f
> -rw-r--r-- 1 saadi saadi    0 2011-03-23 14:11 lex.f2e
> -rw-r--r-- 1 saadi saadi 1266 2011-03-24 14:08 moses.ini
> -rw-r--r-- 1 saadi saadi   33 2011-03-24 14:08 phrase-table.gz
> -rw-r--r-- 1 saadi saadi   35 2011-03-24 14:08 reordering-table.wbe-msd-bidirectional-fe.gz
>
>
> _______________________________________________
> Moses-support mailing
list > [email protected] > http://mailman.mit.edu/mailman/listinfo/moses-support
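
A possible way to check which outputs of the earlier run are empty, before re-running the training as suggested at the top of this message, is a small script along the following lines. It is only a sketch and not part of Moses: the function name find_empty_outputs is made up for illustration, and it assumes the training root directory is the "work" directory used in the command above. It reports zero-byte files (such as lex.e2f and lex.f2e in the listing) and also .gz files that decompress to nothing. The second check matters because an empty gzip archive still occupies a few dozen bytes on disk, which is consistent with the 28-byte extract.gz shown above.

```python
import gzip
import os
import zlib


def find_empty_outputs(root):
    """Return a sorted list of files under `root` that hold no data.

    A file counts as empty if it is zero bytes long, or if it is a .gz
    archive whose decompressed content is empty. Archives that cannot be
    read at all are reported too, since they are equally suspect.
    """
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == 0:
                empty.append(path)
            elif name.endswith(".gz"):
                try:
                    with gzip.open(path, "rb") as handle:
                        # Reading a single byte is enough to tell whether
                        # the archive contains any data.
                        if handle.read(1) == b"":
                            empty.append(path)
                except (OSError, EOFError, zlib.error):
                    empty.append(path)
    return sorted(empty)
```

Calling find_empty_outputs("work") and printing the result should list the stale files. Removing them, or starting from a fresh root directory, before re-running train-model.perl should keep the script from reporting them as "already in place" and reusing them.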
