On Thursday 27 January 2011 16:22, Roberto Rios wrote:
> Thank you for your previous answer, it was a lot of help. I am new at
> this and trying to learn how to use this tool. I am able to run
> mgiza. I am using a small corpus for testing (news-commentary en es). I
> get this error after model4 iteration6:
>
> ERROR: Giza did not produce the output file
> /home/roberto/demo/tools/working-dir/giza.es-en/es-en.A3.final. Is your
> corpus clean (reasonably-sized sentences)? at ./train-model.perl line 1013.
>
> 1. The corpora were tokenized, cleaned and lowercased. What could be wrong?

Hi Roberto

You need to have a look at the giza log file to see what went wrong. Maybe the 
merging of alignments failed. 
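
As a rough starting point for reading the log, a small Python sketch like the
one below lists the lines that look like errors and checks whether expected
output files exist and are non-empty. The search pattern and both function
names are my own assumptions, not anything mgiza or train-model.perl defines,
and the log path is whichever file you captured the training output in.

```python
import re
from pathlib import Path

# Assumed pattern for "interesting" lines; adjust it to what your log contains.
SUSPICIOUS = re.compile(r"error|cannot|failed|abort|segmentation", re.IGNORECASE)


def suspicious_lines(log_path, pattern=SUSPICIOUS):
    """Return (line_number, text) pairs for log lines matching the pattern."""
    hits = []
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits


def missing_or_empty(paths):
    """Return the given paths that do not exist as files or have zero size."""
    problems = []
    for path in paths:
        candidate = Path(path)
        if not candidate.is_file() or candidate.stat().st_size == 0:
            problems.append(path)
    return problems
```

For example, call suspicious_lines() on your saved log, and pass the
es-en.A3.final path from the error message to missing_or_empty() to confirm
whether the file is absent or merely empty.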

>
> 2. Why do I have to run so many iterations per model (m1=5 hmm=5, H333444:
> Viterbi Training) and what is the difference between the models? Where can
> I find good information about it?

The best place would be Philipp Koehn's book.
http://www.statmt.org/book/

The original reference for the IBM models is Brown et al. (1993):
http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf

Also see Och and Ney (2003), which compares the alignment models:
http://acl.ldc.upenn.edu/J/J03/J03-1002.pdf

The sequence of model iterations is one that people have found works 
reasonably well in most circumstances, through a lot of experimentation.
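
To get a feel for what one of those iterations does, here is a toy sketch of
EM training for IBM Model 1 only, the simplest model in the sequence. It is
not how mgiza is implemented; the function name, the "NULL" token string and
the uniform initialisation are my own choices for illustration. Each
iteration re-estimates the lexical translation probabilities from expected
counts, which is why the probabilities only sharpen gradually over several
passes.

```python
from collections import defaultdict


def train_model1(pairs, iterations=5):
    """Toy IBM Model 1 EM.

    pairs: list of (f_tokens, e_tokens) sentence pairs.
    Returns a dict-like t where t[(f, e)] estimates P(f | e).
    """
    f_vocab = {f for f_tokens, _ in pairs for f in f_tokens}
    uniform = 1.0 / len(f_vocab)
    # Start from a uniform distribution over the f vocabulary.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of links from e

        # E-step: spread each f word over the e words (plus NULL)
        # in proportion to the current probabilities.
        for f_tokens, e_tokens in pairs:
            e_with_null = ["NULL"] + list(e_tokens)
            for f in f_tokens:
                norm = sum(t[(f, e)] for e in e_with_null)
                for e in e_with_null:
                    fraction = t[(f, e)] / norm
                    count[(f, e)] += fraction
                    total[e] += fraction

        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t
```

On a two-sentence corpus such as "la casa" / "the house" and "la mesa" /
"the table", the probability of "casa" given "house" rises above that of
"la" given "house" after a couple of iterations. The later models (HMM,
Model 3, Model 4) add position and fertility information on top of this.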

best regards - Barry

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
