Hi,

Lines like this:

> gzip: work/model/extract.o.gz already exists; do you wish to overwrite (y or n)? y

indicate that you are running on top of the output of a previous
failed run, which sometimes prevents the proper files from being created.

In your case, the empty files first appear at the word alignment step.
My guess is that the GIZA++ alignment, which writes a file named
giza*final.gz (in your log, work/giza.fr-en/fr-en.A3.final.gz and
work/giza.en-fr/en-fr.A3.final.gz), created an empty file in a
previous run, and that this file is not being overwritten with the
correct one now.
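
To see which outputs are affected, something along these lines may help.
It is only a sketch: the function name list_empty_outputs is made up for
this example, and it assumes a POSIX shell with find, gzip and head
available. Note that a gzipped empty file is still a few dozen bytes on
disk, so checking the file size alone is not enough.

```shell
#!/bin/sh
# List files under a directory that have no content.
# Plain files count as empty at 0 bytes; .gz files count as empty when they
# decompress to nothing (a .gz that cannot be read is reported as well).
list_empty_outputs() {
    dir=$1
    find "$dir" -type f | sort | while IFS= read -r f; do
        case $f in
            *.gz)
                # Look at the first decompressed byte only.
                n=$(gzip -cd "$f" 2>/dev/null | head -c 1 | wc -c | tr -d ' ')
                ;;
            *)
                # -s is true when the file exists and is larger than 0 bytes.
                if [ -s "$f" ]; then n=1; else n=0; fi
                ;;
        esac
        if [ "$n" -eq 0 ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, "list_empty_outputs work/giza.fr-en" and
"list_empty_outputs work/model" would print every file in those
directories that holds no data.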

Another possibility is that your current GIZA++ run is itself failing.
Please run step 2 on its own, look at what it prints, and find out why
it does not produce the proper output. The following page in the Moses
documentation describes what should be produced:
http://www.statmt.org/moses/?n=FactoredTraining.RunGIZA
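
A rough sketch of how that could be done is below. The paths and options
are copied from the command line you quoted; the --first-step and
--last-step options are an assumption about your version of
train-model.perl, so please check that it accepts them. The helper names
set_aside and step2_command are made up for this example.

```shell
#!/bin/sh
# Sketch: set stale GIZA++ output aside, then build a command that runs
# only step 2 of training. Nothing is executed until you choose to run it.
SCRIPTS=tools/moses-scripts/scripts-20110321-2029
ROOT=work

# Rename old output directories instead of deleting them, so nothing is lost.
# Directories that do not exist are skipped.
set_aside() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            mv "$d" "$d.old.$$"
        fi
    done
}

# Print the original training command restricted to step 2.
# Assumption: train-model.perl supports --first-step and --last-step.
step2_command() {
    printf '%s\n' "$SCRIPTS/training/train-model.perl -scripts-root-dir $SCRIPTS/ -root-dir $ROOT -corpus $ROOT/corpus/news-commentary.lowercased -f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 0:3:/home/saadi/work/lm/news-commentary.lm --first-step 2 --last-step 2"
}
```

From your home directory you could then call
"set_aside work/giza.fr-en work/giza.en-fr", inspect the output of
"step2_command", and once it looks right pipe it to sh with both stdout
and stderr redirected to a log file, so that any GIZA++ error message is
captured.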

-phi

On Thu, Mar 24, 2011 at 3:08 PM, saadi <[email protected]> wrote:
> Hello,
>
> I followed step by step the sample in Moses Installation and Training,
> but after calling train-model.perl I get empty files as a result in
> the work/model folder. Please would you tell me why, and how I should
> proceed to get the right results?
> Thanks,
> Halim
>
> saadi@saadi-HP-Pavilion-dv2700-Notebook-PC:~$
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl
> -scripts-root-dir tools/moses-scripts/scripts-20110321-2029/ -root-dir
> work -corpus work/corpus/news-commentary.lowercased -f fr -e en
> -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm
> 0:3:/home/saadi/work/lm/news-commentary.lm > work/training.out
> Using SCRIPTS_ROOTDIR: tools/moses-scripts/scripts-20110321-2029/
> Using single-thread GIZA
> (1) preparing corpus @ Thu Mar 24 14:08:11 CET 2011
> Executing: mkdir -p work/corpus
> (1.0) selecting factors @ Thu Mar 24 14:08:11 CET 2011
> (1.1) running mkcls  @ Thu Mar 24 14:08:11 CET 2011
> /home/saadi/tools/bin/mkcls -c50 -n2
> -pwork/corpus/news-commentary.lowercased.fr -Vwork/corpus/fr.vcb.classes
> opt
>  work/corpus/fr.vcb.classes already in place, reusing
> (1.1) running mkcls  @ Thu Mar 24 14:08:11 CET 2011
> /home/saadi/tools/bin/mkcls -c50 -n2
> -pwork/corpus/news-commentary.lowercased.en -Vwork/corpus/en.vcb.classes
> opt
>  work/corpus/en.vcb.classes already in place, reusing
> (1.2) creating vcb file work/corpus/fr.vcb @ Thu Mar 24 14:08:11 CET
> 2011
> (1.2) creating vcb file work/corpus/en.vcb @ Thu Mar 24 14:08:12 CET
> 2011
> (1.3) numberizing corpus work/corpus/fr-en-int-train.snt @ Thu Mar 24
> 14:08:13 CET 2011
>  work/corpus/fr-en-int-train.snt already in place, reusing
> (1.3) numberizing corpus work/corpus/en-fr-int-train.snt @ Thu Mar 24
> 14:08:13 CET 2011
>  work/corpus/en-fr-int-train.snt already in place, reusing
> (2) running giza @ Thu Mar 24 14:08:13 CET 2011
> (2.1a) running snt2cooc fr-en @ Thu Mar 24 14:08:13 CET 2011
>
> Executing: mkdir -p work/giza.fr-en
> Executing: /home/saadi/tools/bin/snt2cooc.out work/corpus/en.vcb
> work/corpus/fr.vcb work/corpus/fr-en-int-train.snt >
> work/giza.fr-en/fr-en.cooc
> line 1000
> line 2000
> line 3000
> line 4000
> line 5000
> line 6000
> line 7000
> line 8000
> line 9000
> line 10000
> line 11000
> line 12000
> line 13000
> line 14000
> line 15000
> line 16000
> line 17000
> line 18000
> line 19000
> line 20000
> line 21000
> line 22000
> line 23000
> line 24000
> line 25000
> line 26000
> line 27000
> line 28000
> line 29000
> line 30000
> line 31000
> line 32000
> line 33000
> line 34000
> line 35000
> line 36000
> line 37000
> line 38000
> line 39000
> line 40000
> line 41000
> line 42000
> line 43000
> line 44000
> END.
> (2.1b) running giza fr-en @ Thu Mar 24 14:08:28 CET 2011
> /home/saadi/tools/bin/GIZA++  -CoocurrenceFile
> work/giza.fr-en/fr-en.cooc -c work/corpus/fr-en-int-train.snt -m1 5 -m2
> 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1
> -nsmooth 4 -o work/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s
> work/corpus/en.vcb -t work/corpus/fr.vcb
> (2.1a) running snt2cooc en-fr @ Thu Mar 24 14:08:28 CET 2011
>
> Executing: mkdir -p work/giza.en-fr
> Executing: /home/saadi/tools/bin/snt2cooc.out work/corpus/fr.vcb
> work/corpus/en.vcb work/corpus/en-fr-int-train.snt >
> work/giza.en-fr/en-fr.cooc
> line 1000
> line 2000
> line 3000
> line 4000
> line 5000
> line 6000
> line 7000
> line 8000
> line 9000
> line 10000
> line 11000
> line 12000
> line 13000
> line 14000
> line 15000
> line 16000
> line 17000
> line 18000
> line 19000
> line 20000
> line 21000
> line 22000
> line 23000
> line 24000
> line 25000
> line 26000
> line 27000
> line 28000
> line 29000
> line 30000
> line 31000
> line 32000
> line 33000
> line 34000
> line 35000
> line 36000
> line 37000
> line 38000
> line 39000
> line 40000
> line 41000
> line 42000
> line 43000
> line 44000
> END.
> (2.1b) running giza en-fr @ Thu Mar 24 14:08:44 CET 2011
> /home/saadi/tools/bin/GIZA++  -CoocurrenceFile
> work/giza.en-fr/en-fr.cooc -c work/corpus/en-fr-int-train.snt -m1 5 -m2
> 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1
> -nsmooth 4 -o work/giza.en-fr/en-fr -onlyaldumps 1 -p0 0.999 -s
> work/corpus/fr.vcb -t work/corpus/en.vcb
> (3) generate word alignment @ Thu Mar 24 14:08:44 CET 2011
> Combining forward and inverted alignment from files:
>  work/giza.fr-en/fr-en.A3.final.{bz2,gz}
>  work/giza.en-fr/en-fr.A3.final.{bz2,gz}
> Executing: mkdir -p work/model
> Executing:
> tools/moses-scripts/scripts-20110321-2029//training/symal/giza2bal.pl -d
> "gzip -cd work/giza.en-fr/en-fr.A3.final.gz" -i "gzip -cd
> work/giza.fr-en/fr-en.A3.final.gz" |
> tools/moses-scripts/scripts-20110321-2029//training/symal/symal
> -alignment="grow" -diagonal="yes" -final="yes" -both="yes" >
> work/model/aligned.grow-diag-final-and
> symal: computing grow alignment: diagonal (1) final (1)both-uncovered
> (1)
> skip=<0> counts=<0>
> (4) generate lexical translation table 0-0 @ Thu Mar 24 14:08:44 CET
> 2011
> (work/corpus/news-commentary.lowercased.fr,work/corpus/news-commentary.lowercased.en,work/model/lex)
>  reusing: work/model/lex.f2e and work/model/lex.e2f
> (5) extract phrases @ Thu Mar 24 14:08:44 CET 2011
> tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/extract 
> work/corpus/news-commentary.lowercased.en 
> work/corpus/news-commentary.lowercased.fr 
> work/model/aligned.grow-diag-final-and work/model/extract 7 orientation 
> --model wbe-msd
> Executing:
> tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/extract 
> work/corpus/news-commentary.lowercased.en 
> work/corpus/news-commentary.lowercased.fr 
> work/model/aligned.grow-diag-final-and work/model/extract 7 orientation 
> --model wbe-msd
> PhraseExtract v1.4, written by Philipp Koehn
> phrase extraction from an aligned parallel corpus
> ....Executing: gzip work/model/extract.o
> gzip: work/model/extract.o.gz already exists; do you wish to overwrite
> (y or n)? y
> Executing: gzip work/model/extract.inv
> gzip: work/model/extract.inv.gz already exists; do you wish to overwrite
> (y or n)? y
> Executing: gzip work/model/extract
> gzip: work/model/extract.gz already exists; do you wish to overwrite (y
> or n)? y
> (6) score phrases @ Thu Mar 24 14:08:51 CET 2011
> Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1365.
> Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1366.
> Use of uninitialized value $CORE_SCORE_OPTIONS in substitution (s///) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1368.
> Use of uninitialized value $CORE_SCORE_OPTIONS in substitution (s///) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1369.
> (6.1)  sorting f2e @ Thu Mar 24 14:08:51 CET 2011
> Executing: gunzip < work/model/extract.gz | LC_ALL=C sort -T work/model
>> work/model/extract.sorted
> (6.2)  creating table half work/model/phrase-table.half.f2e @ Thu Mar 24
> 14:08:51 CET 2011
> Executing:
> tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/score
> work/model/extract.sorted work/model/lex.f2e
> work/model/phrase-table.half.f2e
> Score v2.0 written by Philipp Koehn
> scoring methods for extracted rules
> Loading lexical translation table from work/model/lex.f2e
> Executing: rm -f work/model/extract.sorted
> (6.3)  sorting e2f @ Thu Mar 24 14:08:51 CET 2011
> Executing: gunzip < work/model/extract.inv.gz | LC_ALL=C sort -T
> work/model > work/model/extract.inv.sorted
> (6.4)  creating table half work/model/phrase-table.half.e2f @ Thu Mar 24
> 14:08:51 CET 2011
> Executing:
> tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/score
> work/model/extract.inv.sorted work/model/lex.e2f
> work/model/phrase-table.half.e2f  --Inverse
> Score v2.0 written by Philipp Koehn
> scoring methods for extracted rules
> using inverse mode
> Loading lexical translation table from work/model/lex.e2f
> Executing: rm -f work/model/extract.inv.sorted
> (6.5) sorting inverse e2f table@ Thu Mar 24 14:08:51 CET 2011
> Executing: LC_ALL=C sort -T work/model work/model/phrase-table.half.e2f
>> work/model/phrase-table.half.e2f.sorted
> Executing: rm -f work/model/phrase-table.half.e2f
> (6.6) consolidating the two halves @ Thu Mar 24 14:08:51 CET 2011
> Executing:
> tools/moses-scripts/scripts-20110321-2029//training/phrase-extract/consolidate
>  work/model/phrase-table.half.f2e work/model/phrase-table.half.e2f.sorted 
> work/model/phrase-table
> Consolidate v2.0 written by Philipp Koehn
> consolidating direct and indirect rule tables
> Executing: rm -f work/model/phrase-table.half.*
> Executing: gzip work/model/phrase-table
> gzip: work/model/phrase-table.gz already exists; do you wish to
> overwrite (y or n)? y
> (7) learn reordering model @ Thu Mar 24 14:08:52 CET 2011
> (7.1) [no factors] learn reordering model @ Thu Mar 24 14:08:52 CET 2011
> Executing: gunzip < work/model/extract.o.gz | LC_ALL=C sort -T
> work/model > work/model/extract.o.sorted
> (7.2) building tables @ Thu Mar 24 14:08:52 CET 2011
> Executing:
> tools/moses-scripts/scripts-20110321-2029//training/lexical-reordering/score 
> work/model/extract.o.sorted 0.5 work/model/reordering-table. --model "wbe msd 
> wbe-msd-bidirectional-fe"
> Lexical Reordering Scorer
> scores lexical reordering models of several types (hierarchical,
> phrase-based and word-based-extraction
> Executing: rm work/model/extract.o.sorted
> (8) learn generation model @ Thu Mar 24 14:08:52 CET 2011
>  no generation model requested, skipping step
> (9) create moses.ini @ Thu Mar 24 14:08:52 CET 2011
> Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1680.
> Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1681.
> Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1682.
> Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1775.
> Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1776.
> Use of uninitialized value $_SCORE_OPTIONS in pattern match (m//) at
> tools/moses-scripts/scripts-20110321-2029/training/train-model.perl line
> 1777.
>
> --------------------------------------------------------------------------
> saadi@saadi-HP-Pavilion-dv2700-Notebook-PC:~/work/corpus$ ls -l
> total 53200
> -rw-r--r-- 1 saadi saadi 6718735 2011-03-23 22:34 en-fr-int-train.snt
> -rw-r--r-- 1 saadi saadi  505468 2011-03-24 14:08 en.vcb
> -rw-r--r-- 1 saadi saadi  356848 2011-03-23 22:34 en.vcb.classes
> -rw-r--r-- 1 saadi saadi  274736 2011-03-23 22:34 en.vcb.classes.cats
> -rw-r--r-- 1 saadi saadi 6718735 2011-03-23 22:34 fr-en-int-train.snt
> -rw-r--r-- 1 saadi saadi  625341 2011-03-24 14:08 fr.vcb
> -rw-r--r-- 1 saadi saadi  448277 2011-03-23 22:32 fr.vcb.classes
> -rw-r--r-- 1 saadi saadi  348417 2011-03-23 22:32 fr.vcb.classes.cats
> -rw-r--r-- 1 saadi saadi 4939798 2011-03-23 22:04
> news-commentary.clean.en
> -rw-r--r-- 1 saadi saadi 5969570 2011-03-23 22:04
> news-commentary.clean.fr
> -rw-r--r-- 1 saadi saadi 4939798 2011-03-23 22:22
> news-commentary.lowercased.en
> -rw-r--r-- 1 saadi saadi 5969570 2011-03-23 22:08
> news-commentary.lowercased.fr
> -rw-r--r-- 1 saadi saadi 7489086 2011-03-23 21:58 news-commentary.tok.en
> -rw-r--r-- 1 saadi saadi 9127240 2011-03-23 21:52 news-commentary.tok.fr
> ----------------------------------------------------------------------------------------
>
> saadi@saadi-HP-Pavilion-dv2700-Notebook-PC:~/work/model$ ls -l
> total 24
> -rw-r--r-- 1 saadi saadi    0 2011-03-24 14:08
> aligned.grow-diag-final-and
> -rw-r--r-- 1 saadi saadi   28 2011-03-24 14:08 extract.gz
> -rw-r--r-- 1 saadi saadi   32 2011-03-24 14:08 extract.inv.gz
> -rw-r--r-- 1 saadi saadi   30 2011-03-24 14:08 extract.o.gz
> -rw-r--r-- 1 saadi saadi    0 2011-03-23 14:11 lex.e2f
> -rw-r--r-- 1 saadi saadi    0 2011-03-23 14:11 lex.f2e
> -rw-r--r-- 1 saadi saadi 1266 2011-03-24 14:08 moses.ini
> -rw-r--r-- 1 saadi saadi   33 2011-03-24 14:08 phrase-table.gz
> -rw-r--r-- 1 saadi saadi   35 2011-03-24 14:08
> reordering-table.wbe-msd-bidirectional-fe.gz
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
