Hi Moses-support,

I'm looking for help with a problem that arose while building a baseline
system.  Apart from changing FR to DE, I've tried to follow the
instructions at http://www.statmt.org/moses/?n=Moses.Baseline exactly.

When I run the script train-model.perl, the transcript training.out reports:

(line 2385) ERROR: Giza did not produce the output file
train/giza.de-en/de-en.A3.final. Is your corpus clean (reasonably-sized
sentences)? at /home/heather/mosesdecoder/dist/training/train-model.perl
line 1077.
(line 3285) ERROR: Giza did not produce the output file
train/giza.en-de/en-de.A3.final. Is your corpus clean (reasonably-sized
sentences)? at /home/heather/mosesdecoder/dist/training/train-model.perl
line 1077.

(I had indeed cleaned the corpus as instructed.  The output concluded with
Input sentences: 158840  Output sentences:  158020
So I take it this step went through ok.)
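
As a sanity check on the cleaned corpus beyond the sentence counts, something like the following rough sketch could be run over the two sides.  It only looks for differing line counts, empty lines, and very long sentences.  The function name and the 80-token limit are assumptions made for illustration, not taken from the transcript.

```python
def check_parallel_corpus(src_lines, tgt_lines, max_len=80):
    """Return (line_number, problem) tuples for a parallel corpus.

    max_len=80 is an assumed limit; use whatever limit the corpus
    was cleaned with.
    """
    problems = []
    # Both sides must have the same number of sentences.
    if len(src_lines) != len(tgt_lines):
        problems.append(
            (0, "line count differs: %d vs %d" % (len(src_lines), len(tgt_lines)))
        )
    for i, (s, t) in enumerate(zip(src_lines, tgt_lines), start=1):
        ns, nt = len(s.split()), len(t.split())
        if ns == 0 or nt == 0:
            problems.append((i, "empty sentence"))
        elif ns > max_len or nt > max_len:
            problems.append((i, "sentence longer than %d tokens" % max_len))
    return problems


# Hypothetical usage (the file names are placeholders):
#   with open("corpus.clean.de", encoding="utf-8") as f:
#       de = f.readlines()
#   with open("corpus.clean.en", encoding="utf-8") as f:
#       en = f.readlines()
#   for line_no, problem in check_parallel_corpus(de, en):
#       print(line_no, problem)
```

An empty result would suggest the corpus itself is not the cause, at least for these three kinds of problem.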

It seems that people have had this problem before, for instance
http://www.mail-archive.com/[email protected]/msg03434.html

Barry Haddow's suggestion in that thread was to "have a look at the giza
log file to see what went wrong. Maybe the merging of alignments failed."
Does "giza log file" mean the Step 2 part of training.out?  If so, I've
tried this, but I'm not exactly sure what I'm looking for.  There are a lot
of WARNINGS, mainly of the form "already N iterations in hillclimb," but no
other errors.

Any suggestions for what symptoms to look for in the giza log file would be
very welcome.
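
To make the question concrete: one way to get past the hillclimb warnings is to filter them out and list whatever else looks like a failure.  The sketch below does that.  The keyword list is a guess at what a failure message might contain, not a list of messages GIZA is known to print.

```python
import re

# The warning quoted above; these lines are skipped.
HILLCLIMB = re.compile(r"already \d+ iterations in hillclimb")

# Assumed keywords that might indicate a real failure (a guess).
SUSPICIOUS = re.compile(
    r"error|cannot|failed|abort|segmentation|killed|bad_alloc",
    re.IGNORECASE,
)


def suspicious_lines(log_lines):
    """Return (line_number, text) for log lines that look like failures."""
    hits = []
    for n, line in enumerate(log_lines, start=1):
        if HILLCLIMB.search(line):
            continue
        if SUSPICIOUS.search(line):
            hits.append((n, line.rstrip("\n")))
    return hits


# Hypothetical usage:
#   with open("training.out", encoding="utf-8", errors="replace") as f:
#       for n, text in suspicious_lines(f):
#           print(n, text)
```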


In case it's relevant, let me mention another error that happens later
(which I assume is a consequence of the first error):  the word alignment
step reports a "Sentence mismatch error" on almost every sentence.  Here's
the relevant part of the transcript, from the beginning of Step 3 (around
line 5500):

(3) generate word alignment @ Mon May 14 05:17:23 EDT 2012
Combining forward and inverted alignment from files:
  train/giza.de-en/de-en.A3.final.{bz2,gz}
  train/giza.en-de/en-de.A3.final.{bz2,gz}
Executing: mkdir -p train/model
Executing: /home/heather/mosesdecoder/dist/training/symal/giza2bal.pl -d
"gzip -cd train/giza.en-de/en-de.A3.final.gz" -i "gzip -cd
train/giza.de-en/de-en.A3.final.gz"
|/home/heather/mosesdecoder/dist/training/symal/symal -alignment="grow"
-diagonal="yes" -final="yes" -both="yes" >
train/model/aligned.grow-diag-final-and
symal: computing grow alignment: diagonal (1) final (1)both-uncovered (1)
Sentence mismatch error! Line #16665
Sentence mismatch error! Line #16666
Sentence mismatch error! Line #16667
Sentence mismatch error! Line #16668
Sentence mismatch error! Line #16669
....
Sentence mismatch error! Line #158018
Sentence mismatch error! Line #158019
Sentence mismatch error! Line #158020
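
Since the mismatches only begin at line 16665, it may be worth checking whether both alignment files actually contain all 158020 sentence pairs.  The sketch below counts them, assuming each pair in an A3.final file begins with a header line starting with "# Sentence pair".  The function names are made up for illustration.

```python
import gzip


def count_sentence_pairs(lines):
    """Count header lines, assuming one '# Sentence pair' line per pair."""
    return sum(1 for line in lines if line.startswith("# Sentence pair"))


def count_in_gz(path):
    """Count sentence pairs in a gzip-compressed A3.final file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        return count_sentence_pairs(f)


# Hypothetical usage, with the paths from the transcript above:
#   print(count_in_gz("train/giza.de-en/de-en.A3.final.gz"))
#   print(count_in_gz("train/giza.en-de/en-de.A3.final.gz"))
```

If the two counts differ, or either falls short of 158020, that would point to one direction's file being incomplete rather than to a problem in symal itself.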


Sincerely,
Heather Macbeth