Hello,

I am translating from French to English. To build a translation model, I prepared a class-based bilingual corpus: I replaced each word in the word-based bilingual corpus with the class it belongs to. Training on the original word-based corpus works fine, but when I train on the class-based corpus I get the following errors:

[bensalemraja@localhost simple_demo]$ /home/bensalemraja/moses-scripts/scripts-20101214-2126/training/train-model.perl -scripts-root-dir /home/bensalemraja/moses-scripts/scripts-20101214-2126/ -root-dir /media/win_d/simple_demo/travail_manel_classes -corpus /media/win_d/simple_demo/travail_manel_classes/corpus/corpus_classes.lowercased -f fr -e en -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 0:3:/media/win_d/simple_demo/travail_manel_classes/lm/corpus_classes.lm > /media/win_d/simple_demo/travail_manel_classes/training.out
Using SCRIPTS_ROOTDIR: /home/bensalemraja/moses-scripts/scripts-20101214-2126/
Using single-thread GIZA
(1) preparing corpus @ Tue Nov 22 15:49:25 CET 2011
(1.1)......
(1.2)......
(1.3) numberizing corpus /media/win_d/simple_demo/travail_manel_classes/corpus/fr-en-int-train.snt @ Tue Nov 22 15:49:33 CET 2011
Unknown word 'cluster72 '
Use of uninitialized value in concatenation (.) or string at /home/bensalemraja/moses-scripts/scripts-20101214-2126/training/train-model.perl line 782, <IN_EN> line 1112.
(....)
Use of uninitialized value in concatenation (.) or string at /home/bensalemraja/moses-scripts/scripts-20101214-2126/training/train-model.perl line 782, <IN_EN> line 24373.
(2) running giza @ Tue Nov 22 15:49:37 CET 2011
(2.1a) running snt2cooc fr-en @ Tue Nov 22 15:49:37 CET 2011
......
(2.1b) running giza fr-en @ Tue Nov 22 15:49:38 CET 2011
/media/win_d/demo/tools/bin/GIZA++ -CoocurrenceFile /media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en.cooc -c /media/win_d/simple_demo/travail_manel_classes/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o /media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s /media/win_d/simple_demo/travail_manel_classes/corpus/en.vcb -t /media/win_d/simple_demo/travail_manel_classes/corpus/fr.vcb
Executing: /media/win_d/demo/tools/bin/GIZA++ -CoocurrenceFile /media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en.cooc -c /media/win_d/simple_demo/travail_manel_classes/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o /media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s /media/win_d/simple_demo/travail_manel_classes/corpus/en.vcb -t /media/win_d/simple_demo/travail_manel_classes/corpus/fr.vcb
Reading vocabulary file from:/media/win_d/simple_demo/travail_manel_classes/corpus/en.vcb
Reading vocabulary file from:/media/win_d/simple_demo/travail_manel_classes/corpus/fr.vcb
ERROR: Forbidden zero sentence length 0
ERROR: Forbidden zero sentence length 0
ERROR: Forbidden zero sentence length 0
ERROR: Forbidden zero sentence length 0
ERROR: Execution of: /media/win_d/demo/tools/bin/GIZA++ -CoocurrenceFile /media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en.cooc -c /media/win_d/simple_demo/travail_manel_classes/corpus/fr-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o /media/win_d/simple_demo/travail_manel_classes/giza.fr-en/fr-en -onlyaldumps 1 -p0 0.999 -s /media/win_d/simple_demo/travail_manel_classes/corpus/en.vcb -t /media/win_d/simple_demo/travail_manel_classes/corpus/fr.vcb
died with signal 11, without coredump
------------------------------------------------
My class model looks like this:

=========english========
access;cluster25
accidental;cluster53
accidentally;cluster53
accompanied;cluster32
accompanying;cluster32
accordance;cluster78
account;cluster64
accrued;cluster99
accumulated;cluster37
accuracy;cluster99
======================

========french=========
absolues;cluster64
absolument;cluster45
absolus;cluster64
accent;cluster90
accents;cluster90
accentuation;cluster51
accentue;cluster51
accentué;cluster51
accentuées;cluster51
acceptables;cluster78
acceptant;cluster78
=====================

Can you help me? Thanks in advance.
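
P.S. To make the substitution step concrete, here is a minimal Python sketch of the idea: read the `word;clusterNN` entries into a dictionary, then replace every token of a sentence with its class label. It is an illustration only, not the exact preprocessing used to build the corpus above. The function names `load_classes` and `to_classes` and the fallback label `clusterUNK` for words missing from the class model are assumptions made for this example.

```python
def load_classes(lines, sep=";"):
    """Parse 'word;clusterNN' lines into a dict, skipping blank or malformed lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or sep not in line:
            continue
        # Split on the last separator so the class label is always the final field.
        word, cls = line.rsplit(sep, 1)
        mapping[word.strip()] = cls.strip()
    return mapping


def to_classes(sentence, mapping, unknown="clusterUNK"):
    """Replace each whitespace-separated token with its class label.

    'clusterUNK' is a hypothetical fallback for words absent from the class model.
    """
    return " ".join(mapping.get(tok, unknown) for tok in sentence.split())


# A few entries taken from the English class model above.
english = load_classes(["access;cluster25", "accidental;cluster53", "account;cluster64"])
print(to_classes("access account", english))  # prints: cluster25 cluster64
```

Both the entries and the emitted labels are stripped of surrounding whitespace, and tokens are joined with single spaces, so each output line contains only clean class labels.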
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
