Hi, which version of symal are you using?
The one distributed with Moses has not changed, but I am aware that Nicola Bertoldi's online mgiza includes a version of symal with the reported behaviour. You should use the Moses one (in the Moses bin directory).

-phi

On Mon, Oct 6, 2014 at 4:00 AM, Maarten van Gompel <[email protected]> wrote:
> Hi,
>
> I'm using the latest git version of moses, and it seems as if the training
> pipeline got broken somehow, as the format of aligned.grow-diag-final changed.
>
> I'm invoking train-model.perl as follows:
>
> /vol/customopt/machine-translation/src/mosesdecoder/scripts/training/train-model.perl
> -external-bin-dir /vol/customopt/machine-translation/bin -root-dir .
> --corpus train --f fr --e en --first-step 1 --last-step 9 -reordering
> msd-bidirectional-fe --lm 0:3:/scratch/proycon/mosestest/train.fr.lm -mgiza
> -mgiza-cpus 20 -cores 20 -sort-buffer-size 10G -sort-batch-size 253
> -sort-compress gzip -sort-parallel 20
>
> And it fails with warnings like these on every sentence pair:
>
> WARNING: Et is a bad alignment point in sentence 44968
> T: If we do , I am sure we will be listened to .
> S: Et lorsque nous serons capables de le faire , je suis sûr qu' ils nous
> écouteront .
>
> Looking into the code of 'extract', I see aligned.grow-diag-final is supposed
> to consist of lines of space-separated %d-%d pairs (the alignments). But my
> aligned.grow-diag-final seems to be in a newer format and looks like this:
>
> Je trouve que ce n' est pas acceptable . {##} I consider this to be
> unacceptable . {##} 0-0 1-1 2-1 3-2 6-4 4-5 5-5 6-5 7-5 8-6
>
> The 'extract' program only expects the latter part. So I manually stripped
> the source and target sentences and left only that, and then it works. It
> seems something is going wrong in the training pipeline?
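For anyone who already has an alignment file in the "source {##} target {##} alignment" layout quoted above, the manual stripping can be scripted. What follows is only a minimal sketch of that workaround in Python, not part of Moses: the function name alignment_only is hypothetical, and the "{##}" delimiter is assumed from the single example line in this thread. Using the symal from the Moses bin directory remains the actual fix.

```python
import re

# One alignment point in the form 'extract' expects, e.g. "3-2".
ALIGNMENT_POINT = re.compile(r"\d+-\d+")


def alignment_only(line, delimiter="{##}"):
    """Return only the alignment points from one line of an alignment file.

    Hypothetical helper: the delimiter is assumed from the example in this
    thread. A line that is already plain "%d-%d" pairs passes through as is.
    """
    # Keep whatever follows the last delimiter (the whole line if none is present).
    points = line.rsplit(delimiter, 1)[-1].split()
    for point in points:
        # Fail loudly rather than hand 'extract' a token it cannot parse.
        if not ALIGNMENT_POINT.fullmatch(point):
            raise ValueError(f"not an alignment point: {point!r}")
    return " ".join(points)


if __name__ == "__main__":
    example = (
        "Je trouve que ce n' est pas acceptable . {##} "
        "I consider this to be unacceptable . {##} "
        "0-0 1-1 2-1 3-2 6-4 4-5 5-5 6-5 7-5 8-6"
    )
    print(alignment_only(example))
```

Applied to every line of the file, this yields the plain format; the validation step makes a line with unexpected content raise an error instead of being passed on silently.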
>
> Regards,
>
> --
>
> Maarten van Gompel
> Centre for Language Studies
> Radboud Universiteit Nijmegen
>
> [email protected]
> http://proycon.anaproy.nl
> http://github.com/proycon
>
> GnuPG key: 0x1A31555C XMPP: [email protected]
> Bitcoin: 1BRptZsKQtqRGSZ5qKbX2azbfiygHxJPsd
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
