Here is the list of commands I run:

rm -r ./model
rm ./giza.en-fr/*
rm ./giza.fr-en/*
rm -r ./corpus
/home/cuongh/mosesdecoder/scripts/training/train-model.perl -mgiza -mgiza-cpus 24 -cores 24 -parallel -sort-buffer-size 10G -sort-batch-size 1021 -sort-compress gzip -sort-parallel 24 -root-dir /home/cuongh/mosesdecoder -corpus /home/cuongh/mosesdecoder/corpus.lowercased -f en -e fr -alignment grow-diag-final-and -reordering msd-bidirectional-fe -giza-option m1=5,m2=3,mh=0,m3=3,m4=0 -external /home/cuongh/CODE/giza-pp -lm 0:3:/home/cuongh/mosesdecoder/corpus.lowercased.fr.lm

I think the commands themselves are okay and that the problem is in the corpus. Could you please point out some common training-data errors that people here have run into and that could cause this?

Thanks and best regards,
C. Hoang

On Tue, Dec 11, 2012 at 11:43 PM, Cuong Hoang <[email protected]> wrote:

> Hi all,
> I am training Moses on a slightly noisy corpus: only around 90%-95% of
> the sentence pairs are true bilingual pairs. I am running into some
> quite disturbing errors.
>
> When I use GIZA++, I get errors such as:
> *alignment point (28,29) out of range (0-14,0-9) in line 468517, ignoring*
> *alignment point (30,31) out of range (0-14,0-9) in line 468517, ignoring*
> *alignment point (31,31) out of range (0-14,0-9) in line 468517, ignoring*
> *alignment point (31,32) out of range (0-14,0-9) in line 468517, ignoring*
>
> and then results such as:
>
> *WARNING: sentence 415878 has alignment point (6, 7) out of bounds (7, 7)*
> *T: øksnes là một đô_thị ở hạt nordland*
> *S: øksnes is a municipality in nordland county*
>
> When I use MGIZA++ instead of GIZA++, I get errors such as:
>
> *Use of uninitialized value in substitution ...*
> *Use of uninitialized value $a in split*
> or
> *Use of uninitialized value $a in scalar chomp
> at scripts/training/LexicalTranslationModel.pm*
>
> Could you please give me any suggestions for this problem?
> Thanks and best regards,
> C. Hoang
> --
> Best Regards,
> C. Hoang
> SMT Nerd
>
>

--
Best Regards,
C. Hoang
SMT Nerd
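
For reference, the kind of corpus check asked about above could look roughly like the sketch below. It is only an illustration, not part of Moses: the function name check_parallel_corpus and the thresholds max_tokens and max_ratio are hypothetical, and the checks (mismatched line counts, empty sides, very long sentences, extreme length ratios, the '|' character that Moses uses as a factor separator) are assumptions about what might be wrong, not a confirmed diagnosis of the errors quoted here.

```python
from typing import Iterable, List, Tuple


def check_parallel_corpus(
    src_lines: Iterable[str],
    tgt_lines: Iterable[str],
    max_tokens: int = 100,    # hypothetical threshold, tune for your data
    max_ratio: float = 9.0,   # hypothetical threshold, tune for your data
) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs; line 0 means a file-level problem."""
    src = [line.rstrip("\n") for line in src_lines]
    tgt = [line.rstrip("\n") for line in tgt_lines]
    problems: List[Tuple[int, str]] = []

    # Both sides must have exactly the same number of lines.
    if len(src) != len(tgt):
        problems.append((0, f"line count mismatch: {len(src)} vs {len(tgt)}"))

    for i, (s, t) in enumerate(zip(src, tgt), start=1):
        s_tok, t_tok = s.split(), t.split()
        # A blank or whitespace-only side cannot be aligned.
        if not s_tok or not t_tok:
            problems.append((i, "empty side"))
            continue
        if len(s_tok) > max_tokens or len(t_tok) > max_tokens:
            problems.append((i, "too long"))
        longer, shorter = max(len(s_tok), len(t_tok)), min(len(s_tok), len(t_tok))
        if longer / shorter > max_ratio:
            problems.append((i, "length ratio"))
        # '|' is the factor separator in Moses input.
        if "|" in s or "|" in t:
            problems.append((i, "contains '|'"))
    return problems
```

With the -corpus, -f and -e values from the command above, the two inputs would presumably be corpus.lowercased.en and corpus.lowercased.fr, for example:

    with open("corpus.lowercased.en", encoding="utf-8") as f_en, \
         open("corpus.lowercased.fr", encoding="utf-8") as f_fr:
        for line_no, problem in check_parallel_corpus(f_en, f_fr):
            print(line_no, problem)
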
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
