Hi Moses-support,

I'm looking for help with a problem that arose while building a baseline system. Apart from changing FR to DE, I've tried to follow the instructions at http://www.statmt.org/moses/?n=Moses.Baseline exactly.
When I run the train-model script, the transcript training.out reports:

(line 2385)
ERROR: Giza did not produce the output file train/giza.de-en/de-en.A3.final. Is your corpus clean (reasonably-sized sentences)? at /home/heather/mosesdecoder/dist/training/train-model.perl line 1077.

(line 3285)
ERROR: Giza did not produce the output file train/giza.en-de/en-de.A3.final. Is your corpus clean (reasonably-sized sentences)? at /home/heather/mosesdecoder/dist/training/train-model.perl line 1077.

(I had indeed cleaned the corpus as instructed. The output concluded with

Input sentences: 158840 Output sentences: 158020

so I take it this step went through OK.)

It seems that people have had this problem before, for instance
http://www.mail-archive.com/[email protected]/msg03434.html

Barry Haddow's suggestion in that thread was to "have a look at the giza log file to see what went wrong. Maybe the merging of alignments failed."

Does "giza log file" mean the Step 2 part of training.out? If so, I've looked through it, but I'm not sure what I should be looking for. There are a lot of WARNINGS, mainly of the form "already N iterations in hillclimb," but no other errors. Any suggestions for what symptoms to look for in the giza log file would be very welcome.

In case it's relevant, let me mention another error that happens later (which I assume is a consequence of the first error): during word alignment, a "sentence mismatch error" is reported on almost every sentence.
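For the giza log question above, here is a minimal sketch of one way the transcript could be triaged, so that the many hillclimb warnings do not hide anything else. It assumes only what the excerpts in this message show: that the warnings contain the text "already N iterations in hillclimb" and that errors contain the word "ERROR". The function name summarize_log is hypothetical and not part of Moses or GIZA++.

```python
import re
from collections import Counter

# Pattern for the frequent warning quoted in this message.
HILLCLIMB = re.compile(r"already \d+ iterations in hillclimb")


def summarize_log(lines):
    """Count known-noisy warnings and collect everything else that looks suspicious.

    Returns (counts, interesting), where interesting is a list of
    (line_number, text) pairs for errors and for warnings that are
    not the hillclimb warning.
    """
    counts = Counter()
    interesting = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if HILLCLIMB.search(text):
            counts["hillclimb warning"] += 1
        elif "ERROR" in text:
            counts["error"] += 1
            interesting.append((number, text))
        elif "WARNING" in text.upper():
            counts["other warning"] += 1
            interesting.append((number, text))
    return counts, interesting
```

To try it on a real transcript, one could open training.out with open("training.out", errors="replace"), pass the file handle to summarize_log, and print the returned pairs. Whatever remains after the hillclimb warnings are set aside is probably the place to start reading.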
Here's the relevant part of the transcript for the sentence mismatch error, at the beginning of Step 3 (around line 5500):

(3) generate word alignment @ Mon May 14 05:17:23 EDT 2012
Combining forward and inverted alignment from files:
  train/giza.de-en/de-en.A3.final.{bz2,gz}
  train/giza.en-de/en-de.A3.final.{bz2,gz}
Executing: mkdir -p train/model
Executing: /home/heather/mosesdecoder/dist/training/symal/giza2bal.pl -d "gzip -cd train/giza.en-de/en-de.A3.final.gz" -i "gzip -cd train/giza.de-en/de-en.A3.final.gz" |/home/heather/mosesdecoder/dist/training/symal/symal -alignment="grow" -diagonal="yes" -final="yes" -both="yes" > train/model/aligned.grow-diag-final-and
symal: computing grow alignment: diagonal (1) final (1)both-uncovered (1)
Sentence mismatch error! Line #16665
Sentence mismatch error! Line #16666
Sentence mismatch error! Line #16667
Sentence mismatch error! Line #16668
Sentence mismatch error! Line #16669
....
Sentence mismatch error! Line #158018
Sentence mismatch error! Line #158019
Sentence mismatch error! Line #158020

Sincerely,
Heather Macbeth
