Hello everyone,
I have a problem with phrase extraction in Moses.
I tested on two parallel corpora containing only sentences that are not too long,
but the phrase table it built does not contain all the phrases.

I also do not understand why it gives me these warnings:
!Use of uninitialized value $a in scalar chomp at ./train-model.perl line
1079, <A> line 1.
Use of uninitialized value $a in split at ./train-model.perl line 1082, <A>
line 1.
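
One check that might help narrow this down is to compare the line counts of the two corpus files and the alignment file. This is only a sketch, and it assumes the warning means the alignment file read through <A> has fewer lines than the corpus files, which I have not confirmed. The names count_lines and check_parallel are made up for this example; the files to pass in are the three listed after "FILE:" in the log below.

```python
import sys


def count_lines(path):
    """Return the number of lines in a file (read as bytes, so encoding is irrelevant)."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def check_parallel(paths):
    """Return (line count per path, True if every file has the same line count)."""
    counts = {path: count_lines(path) for path in paths}
    return counts, len(set(counts.values())) == 1


if __name__ == "__main__":
    # Hypothetical usage:
    #   python check_lines.py corpus.fr corpus.en aligned.grow-diag-final-and
    if len(sys.argv) < 3:
        print("usage: check_lines.py FILE FILE [FILE ...]")
    else:
        counts, consistent = check_parallel(sys.argv[1:])
        for path, n in counts.items():
            print(f"{n}\t{path}")
        print("line counts match" if consistent else "line counts differ")
```

If the counts differ, the alignment file may come from an earlier run on a different corpus, but that is a guess on my part.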


Any ideas, please?

Thank you

Best regards

Run:

Using SCRIPTS_ROOTDIR:
/home/cyrine/tools/moses-scripts/scripts-20100801-2053/
Using single-thread GIZA
(1) preparing corpus @ Sun Jan  9 17:21:13 CET 2011
Executing: mkdir -p /home/cyrine/tools/work/corpus
(1.0) selecting factors @ Sun Jan  9 17:21:13 CET 2011
(1.1) running mkcls  @ Sun Jan  9 17:21:13 CET 2011
/home/cyrine/tools/bin/mkcls -c50 -n2
-p/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.en
-V/home/cyrine/tools/work/corpus/en.vcb.classes opt
Executing: /home/cyrine/tools/bin/mkcls -c50 -n2
-p/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.en
-V/home/cyrine/tools/work/corpus/en.vcb.classes opt
WARNING: StatVar.cc
WARNING: StatVar.cc

***** 2 runs. (algorithm:TA)*****
;KategProblem:cats: 50   words: 12

start-costs: MEAN: 21.3177 (20.7944-21.8409)  SIGMA:0.523248
  end-costs: MEAN: 18.0218 (18.0218-18.0218)  SIGMA:0
   start-pp: MEAN: 1.90717 (1.85175-1.9626)  SIGMA:0.0554247
     end-pp: MEAN: 1.5874 (1.5874-1.5874)  SIGMA:0
 iterations: MEAN: 50021 (50020-50022)  SIGMA:1
       time: MEAN: 0.36 (0.36-0.36)  SIGMA:0
(1.1) running mkcls  @ Sun Jan  9 17:21:14 CET 2011
/home/cyrine/tools/bin/mkcls -c50 -n2
-p/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.fr
-V/home/cyrine/tools/work/corpus/fr.vcb.classes opt
Executing: /home/cyrine/tools/bin/mkcls -c50 -n2
-p/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.fr
-V/home/cyrine/tools/work/corpus/fr.vcb.classes opt
WARNING: StatVar.cc
WARNING: StatVar.cc

***** 2 runs. (algorithm:TA)*****
;KategProblem:cats: 50   words: 13

start-costs: MEAN: 10.2273 (8.31777-12.1369)  SIGMA:1.90954
  end-costs: MEAN: 5.54518 (5.54518-5.54518)  SIGMA:0
   start-pp: MEAN: 1.65709 (1.44727-1.86691)  SIGMA:0.20982
     end-pp: MEAN: 1.20303 (1.20303-1.20303)  SIGMA:0
 iterations: MEAN: 50021 (50019-50023)  SIGMA:2
       time: MEAN: 0.36 (0.36-0.36)  SIGMA:0
(1.2) creating vcb file /home/cyrine/tools/work/corpus/en.vcb @ Sun Jan  9
17:21:14 CET 2011
(1.2) creating vcb file /home/cyrine/tools/work/corpus/fr.vcb @ Sun Jan  9
17:21:14 CET 2011
(1.3) numberizing corpus /home/cyrine/tools/work/corpus/en-fr-int-train.snt
@ Sun Jan  9 17:21:14 CET 2011
(1.3) numberizing corpus /home/cyrine/tools/work/corpus/fr-en-int-train.snt
@ Sun Jan  9 17:21:14 CET 2011
(2) running giza @ Sun Jan  9 17:21:14 CET 2011
(2.1a) running snt2cooc en-fr @ Sun Jan  9 17:21:14 CET 2011

Executing: mkdir -p /home/cyrine/tools/work/giza.en-fr
Executing: /home/cyrine/tools/bin/snt2cooc.out
/home/cyrine/tools/work/corpus/fr.vcb /home/cyrine/tools/work/corpus/en.vcb
/home/cyrine/tools/work/corpus/en-fr-int-train.snt >
/home/cyrine/tools/work/giza.en-fr/en-fr.cooc
/home/cyrine/tools/bin/snt2cooc.out /home/cyrine/tools/work/corpus/fr.vcb
/home/cyrine/tools/work/corpus/en.vcb
/home/cyrine/tools/work/corpus/en-fr-int-train.snt >
/home/cyrine/tools/work/giza.en-fr/en-fr.cooc
END.
(2.1b) running giza en-fr @ Sun Jan  9 17:21:14 CET 2011
/home/cyrine/tools/bin/GIZA++  -CoocurrenceFile
/home/cyrine/tools/work/giza.en-fr/en-fr.cooc -c
/home/cyrine/tools/work/corpus/en-fr-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3
-model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o
/home/cyrine/tools/work/giza.en-fr/en-fr -onlyaldumps 1 -p0 0.999 -s
/home/cyrine/tools/work/corpus/fr.vcb -t
/home/cyrine/tools/work/corpus/en.vcb
  /home/cyrine/tools/work/giza.en-fr/en-fr.A3.final.gz seems finished,
reusing.
(2.1a) running snt2cooc fr-en @ Sun Jan  9 17:21:14 CET 2011

Executing: mkdir -p /home/cyrine/tools/work/giza.fr-en
Executing: /home/cyrine/tools/bin/snt2cooc.out
/home/cyrine/tools/work/corpus/en.vcb /home/cyrine/tools/work/corpus/fr.vcb
/home/cyrine/tools/work/corpus/fr-en-int-train.snt >
/home/cyrine/tools/work/giza.fr-en/fr-en.cooc
/home/cyrine/tools/bin/snt2cooc.out /home/cyrine/tools/work/corpus/en.vcb
/home/cyrine/tools/work/corpus/fr.vcb
/home/cyrine/tools/work/corpus/fr-en-int-train.snt >
/home/cyrine/tools/work/giza.fr-en/fr-en.cooc
END.
(2.1b) running giza fr-en @ Sun Jan  9 17:21:14 CET 2011
/home/cyrine/tools/bin/GIZA++  -CoocurrenceFile
/home/cyrine/tools/work/giza.fr-en/fr-en.cooc -c
/home/cyrine/tools/work/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3
-model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o
/home/cyrine/tools/work/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s
/home/cyrine/tools/work/corpus/en.vcb -t
/home/cyrine/tools/work/corpus/fr.vcb
  /home/cyrine/tools/work/giza.fr-en/fr-en.A3.final.gz seems finished,
reusing.
(3) generate word alignment @ Sun Jan  9 17:21:14 CET 2011
Combining forward and inverted alignment from files:
  /home/cyrine/tools/work/giza.en-fr/en-fr.A3.final.{bz2,gz}
  /home/cyrine/tools/work/giza.fr-en/fr-en.A3.final.{bz2,gz}
Executing: mkdir -p /home/cyrine/tools/work/model
Executing:
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/symal/
giza2bal.pl -d "gzip -cd
/home/cyrine/tools/work/giza.fr-en/fr-en.A3.final.gz" -i "gzip -cd
/home/cyrine/tools/work/giza.en-fr/en-fr.A3.final.gz"
|/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/symal/symal
-alignment="grow" -diagonal="yes" -final="yes" -both="yes" >
/home/cyrine/tools/work/model/aligned.grow-diag-final-and
symal: computing grow alignment: diagonal (1) final (1)both-uncovered (1)
skip=<0> counts=<1>
(4) generate lexical translation table 0-0 @ Sun Jan  9 17:21:14 CET 2011
(/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.en,/home/cyrine/tools/work/corpus/
news-commentary.lowercased.clean.fr,/home/cyrine/tools/work/model/lex)
!Use of uninitialized value $a in scalar chomp at ./train-model.perl line
1079, <A> line 1.
Use of uninitialized value $a in split at ./train-model.perl line 1082, <A>
line 1.

Saved: /home/cyrine/tools/work/model/lex.f2e and
/home/cyrine/tools/work/model/lex.e2f
FILE: /home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.fr
FILE: /home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.en
FILE: /home/cyrine/tools/work/model/aligned.grow-diag-final-and
(5) extract phrases @ Sun Jan  9 17:21:14 CET 2011
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/phrase-extract/extract
/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.fr
/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.en
/home/cyrine/tools/work/model/aligned.grow-diag-final-and
/home/cyrine/tools/work/model/extract 7 orientation --model wbe-msd
Executing:
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/phrase-extract/extract
/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.fr
/home/cyrine/tools/work/corpus/news-commentary.lowercased.clean.en
/home/cyrine/tools/work/model/aligned.grow-diag-final-and
/home/cyrine/tools/work/model/extract 7 orientation --model wbe-msd
PhraseExtract v1.4, written by Philipp Koehn
phrase extraction from an aligned parallel corpus
Executing: gzip /home/cyrine/tools/work/model/extract.o
Executing: gzip /home/cyrine/tools/work/model/extract.inv
Executing: gzip /home/cyrine/tools/work/model/extract
(6) score phrases @ Sun Jan  9 17:21:14 CET 2011
(6.1)  sorting f2e @ Sun Jan  9 17:21:15 CET 2011
Executing: gunzip < /home/cyrine/tools/work/model/extract.gz | LC_ALL=C sort
-T /home/cyrine/tools/work/model >
/home/cyrine/tools/work/model/extract.sorted
(6.2)  creating table half
/home/cyrine/tools/work/model/phrase-table.half.f2e @ Sun Jan  9 17:21:15
CET 2011
Executing:
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/phrase-extract/score
/home/cyrine/tools/work/model/extract.sorted
/home/cyrine/tools/work/model/lex.f2e
/home/cyrine/tools/work/model/phrase-table.half.f2e
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/phrase-extract/score
/home/cyrine/tools/work/model/extract.sorted
/home/cyrine/tools/work/model/lex.f2e
/home/cyrine/tools/work/model/phrase-table.half.f2e
Score v2.0 written by Philipp Koehn
scoring methods for extracted rules
Loading lexical translation table from /home/cyrine/tools/work/model/lex.f2e
Executing: rm -f /home/cyrine/tools/work/model/extract.sorted
(6.3)  sorting e2f @ Sun Jan  9 17:21:15 CET 2011
Executing: gunzip < /home/cyrine/tools/work/model/extract.inv.gz | LC_ALL=C
sort -T /home/cyrine/tools/work/model >
/home/cyrine/tools/work/model/extract.inv.sorted
(6.4)  creating table half
/home/cyrine/tools/work/model/phrase-table.half.e2f @ Sun Jan  9 17:21:15
CET 2011
Executing:
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/phrase-extract/score
/home/cyrine/tools/work/model/extract.inv.sorted
/home/cyrine/tools/work/model/lex.e2f
/home/cyrine/tools/work/model/phrase-table.half.e2f  --Inverse
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/phrase-extract/score
/home/cyrine/tools/work/model/extract.inv.sorted
/home/cyrine/tools/work/model/lex.e2f
/home/cyrine/tools/work/model/phrase-table.half.e2f  --Inverse
Score v2.0 written by Philipp Koehn
scoring methods for extracted rules
using inverse mode
Loading lexical translation table from /home/cyrine/tools/work/model/lex.e2f
Executing: rm -f /home/cyrine/tools/work/model/extract.inv.sorted
(6.5) sorting inverse e2f table@ Sun Jan  9 17:21:15 CET 2011
Executing: LC_ALL=C sort -T /home/cyrine/tools/work/model
/home/cyrine/tools/work/model/phrase-table.half.e2f >
/home/cyrine/tools/work/model/phrase-table.half.e2f.sorted
Executing: rm -f /home/cyrine/tools/work/model/phrase-table.half.e2f
(6.6) consolidating the two halves @ Sun Jan  9 17:21:15 CET 2011
Executing:
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/phrase-extract/consolidate
/home/cyrine/tools/work/model/phrase-table.half.f2e
/home/cyrine/tools/work/model/phrase-table.half.e2f.sorted
/home/cyrine/tools/work/model/phrase-table
Consolidate v2.0 written by Philipp Koehn
consolidating direct and indirect rule tables
Executing: rm -f /home/cyrine/tools/work/model/phrase-table.half.*
Executing: gzip /home/cyrine/tools/work/model/phrase-table
(7) learn reordering model @ Sun Jan  9 17:21:15 CET 2011
(7.1) [no factors] learn reordering model @ Sun Jan  9 17:21:15 CET 2011
Executing: gunzip < /home/cyrine/tools/work/model/extract.o.gz | LC_ALL=C
sort -T /home/cyrine/tools/work/model >
/home/cyrine/tools/work/model/extract.o.sorted
(7.2) building tables @ Sun Jan  9 17:21:15 CET 2011
Executing:
/home/cyrine/tools/moses-scripts/scripts-20100801-2053//training/lexical-reordering/score
/home/cyrine/tools/work/model/extract.o.sorted 0.5
/home/cyrine/tools/work/model/reordering-table. --model "wbe msd
wbe-msd-bidirectional-fe"
Lexical Reordering Scorer
scores lexical reordering models of several types (hierarchical,
phrase-based and word-based-extraction
Executing: rm /home/cyrine/tools/work/model/extract.o.sorted
(8) learn generation model @ Sun Jan  9 17:21:15 CET 2011
  no generation model requested, skipping step
(9) create moses.ini @ Sun Jan  9 17:21:15 CET 2011