Dear Josh,

Many thanks for getting back to me on this. I'm guessing that in the following command:

bin/moses-scripts/scripts-YYYYMMDD-HHMM/training/train-factored-phrase-model.perl \
    -scripts-root-dir bin/moses-scripts/scripts-YYYYMMDD-HHMM \
    -root-dir working-dir \
    -corpus working-dir/corpus/europarl.lowercased \
    -f fr -e en \
    -alignment grow-diag-final-and \
    -reordering msd-bidirectional-fe \
    -lm 0:5:working-dir/lm/europarl.lm:0

the alignment is done before train-factored-phrase-model.perl is invoked, and it's not necessary to retreat to the ngram-count step?

It would be good to go back to the Mac as it has more RAM (2GB vs 1.5GB on the Ubuntu machine) and more disk space (66GB available). Is this enough for the europarl corpus? I read somewhere that 4GB RAM is desirable.

Llio

On 8/15/08, Josh Schroeder <[EMAIL PROTECTED]> wrote:
> I was able to regenerate the bug below on my Mac. As expected, it's a
> problem with OS X needing to use "gzcat" instead of "zcat", and
> train-factored-phrase-model.perl being hard-coded to zcat.
> OS X has a dumbed-down version of zcat that won't work with the script. The
> failure is actually a few steps earlier, in word alignment extraction. An
> empty word alignment file gets generated and things go downhill from there.
>
> There's a $ZCAT variable in the file you can set to "gzcat", but there was
> one instance that doesn't use it. I've checked in a change with $ZCAT used
> exclusively, but left the value as 'zcat' instead of 'gzcat' to keep the
> linux majority happy.
>
> http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/scripts/training/train-factored-phrase-model.perl?r1=1865&r2=1875&pathrev=1875
>
> change
> my $ZCAT = "zcat";
> to
> my $ZCAT = "gzcat";
>
> in the new file and it should work for all the Mac folks.
>
> There are a few other training scripts that do this, it's pretty easy to do
> a find and replace in them.
>
> -Josh
>
> On 13 Aug 2008, at 11:53, Josh Schroeder wrote:
>
> > > you may have already received my email on the following problem when
> > > building the language model:
> > >
> > > Executing: cat ./model/extract.0-0.o.part* > ./model/extract.0-0.o
> > > cat: ./model/extract.0-0.o.part*: No such file or directory
> > > Exit code: 1
> >
> > That's building the phrase table, not the language model. It seems like
> > several people on the list are having problems with this step, so I'm going
> > to take a look at the training process and post something to the list in the
> > next day or two.
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
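
For anyone on OS X who wants to apply the find and replace Josh mentions to the other training scripts, here is a minimal sketch. It assumes each script sets the variable with exactly the line quoted above (my $ZCAT = "zcat";) and that the scripts end in .perl; the function name use_gzcat is made up for illustration and is not part of Moses. Hard-coded zcat calls that bypass $ZCAT would still need fixing by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of Moses): switch the $ZCAT setting from
# "zcat" to "gzcat" in every .perl script under the given directory.
# Assumes the assignment appears exactly as: my $ZCAT = "zcat";
use_gzcat() {
    dir="$1"
    # List only the scripts that contain the exact assignment (fixed-string match).
    find "$dir" -type f -name '*.perl' -exec grep -lF 'my $ZCAT = "zcat";' {} + |
    while IFS= read -r script; do
        # -i.bak edits in place and keeps a .bak backup; this spelling is
        # accepted by both the BSD sed on OS X and GNU sed on Linux.
        sed -i.bak 's/my \$ZCAT = "zcat";/my $ZCAT = "gzcat";/' "$script"
        echo "patched: $script"
    done
}
```

It could be called as use_gzcat bin/moses-scripts/scripts-YYYYMMDD-HHMM/training; the .bak copies it leaves behind make it easy to revert on a Linux machine.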
