On Thursday 27 January 2011 16:22, Roberto Rios wrote:
> Thank you for your previous answer; it was a lot of help. I am new at
> this and trying to learn how to use this tool. I am able to run
> mgiza. I am using a small corpus now for testing (news-commentary en es). I
> get this error after model4 iteration6:
>
> ERROR: Giza did not produce the output file
> /home/roberto/demo/tools/working-dir/giza.es-en/es-en.A3.final. Is your
> corpus clean (reasonably-sized sentences)? at ./train-model.perl line 1013.
>
> 1. The corpora were tokenized, cleaned and lowercased. What could be wrong?

Hi Roberto

You need to have a look at the giza log file to see what went wrong. Maybe
the merging of alignments failed.

>
> 2. Why do I have to run so many iterations per model (m1=5 hmm=5, H333444:
> Viterbi Training), and what is the difference between the models? Where can
> I find good information about it?

The best place would be Philipp Koehn's book:
http://www.statmt.org/book/

The original reference for the IBM models is this one:
http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf
and also see:
http://acl.ldc.upenn.edu/J/J03/J03-1002.pdf

The sequence of model iterations is one that people have found works
reasonably well in most circumstances, through a lot of experimentation.

best regards - Barry

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
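
As a rough aid for the "look at the giza log file" step, here is a minimal Python sketch that pulls out lines in a log that look like failures, together with a little preceding context. The keyword list is an assumption for illustration, not something GIZA or mgiza documents, and the log path is whatever your training run wrote (it is passed on the command line here; no particular file name is assumed).

```python
import re
import sys

# Words that often flag a problem in a log. This list is an assumption,
# not a documented set of GIZA/mgiza messages; adjust it to your log.
PATTERN = re.compile(
    r"error|abort|segmentation fault|cannot|killed|no such file",
    re.IGNORECASE,
)


def find_suspicious_lines(lines, context=2):
    """Return (line_number, text) pairs for every line matching PATTERN,
    plus up to `context` lines immediately before each match."""
    lines = [line.rstrip("\n") for line in lines]
    keep = set()
    for i, line in enumerate(lines):
        if PATTERN.search(line):
            keep.update(range(max(0, i - context), i + 1))
    return [(i + 1, lines[i]) for i in sorted(keep)]


def scan_log(path, context=2):
    """Read the log at `path` and return its suspicious lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_suspicious_lines(handle, context)


if __name__ == "__main__":
    # Usage: python scan_log.py <path-to-your-giza-log>
    if len(sys.argv) > 1:
        for number, text in scan_log(sys.argv[1]):
            print(f"{number}: {text}")
```

The preceding context lines are kept because the line just before a failure often says which step (for example, which model iteration or merge) was running when it went wrong.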
