Hi,

1. When training a language model (LM).

2&3. Yes, you need a monolingual corpus to create a language model. It can be a stand-alone corpus or the target-language side of a parallel corpus. In other words, when you translate from language A to language B, you need a language model for language B. This LM can be trained on a monolingual corpus in language B or on the language-B part of a parallel corpus. It does not matter if the LM was trained on a monolingual corpus that is different from your parallel training corpus; you will still be able to perform translations.

If you use an already trained Moses model, you do not need to add another line to moses.ini; just replace the existing LM with your trained LM, if that is what you want. If you train your model from scratch, there is nothing more to do, because you have already specified the language model path in the training parameters.

4. For example, with SRILM, go to srilm/bin and run "./ngram-count -order 5 -unk -kndiscount -interpolate -text path_to_corpus -lm name_of_your_lm_file". This will create a 5-gram language model. The parameters are explained in the Moses manual or on the SRILM website.
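
In case it helps, here is the same ngram-count call as a small shell sketch. The variable names (SRILM_BIN, CORPUS, LM_FILE) and the lm_command helper are my own placeholders, not anything SRILM or Moses defines, and the paths are the same stand-ins as in the command in point 4. It prints the command and only runs it if the binary is actually found where SRILM_BIN points.

```shell
#!/bin/sh
# Placeholder locations (assumptions); override them in the environment.
SRILM_BIN="${SRILM_BIN:-srilm/bin}"
CORPUS="${CORPUS:-path_to_corpus}"
LM_FILE="${LM_FILE:-name_of_your_lm_file}"
ORDER=5

# Hypothetical helper: print the ngram-count invocation from point 4.
lm_command() {
    printf '%s\n' "$SRILM_BIN/ngram-count -order $ORDER -unk -kndiscount -interpolate -text $CORPUS -lm $LM_FILE"
}

# Show what would be executed.
lm_command

# Train the LM only if the binary exists at the assumed location.
if [ -x "$SRILM_BIN/ngram-count" ]; then
    "$SRILM_BIN/ngram-count" -order "$ORDER" -unk -kndiscount -interpolate \
        -text "$CORPUS" -lm "$LM_FILE"
else
    echo "ngram-count not found in $SRILM_BIN; the command above was not run" >&2
fi
```

You could call it as, for example, SRILM_BIN=/your/srilm/bin CORPUS=your_corpus LM_FILE=your_lm sh script.sh, where all three values are whatever applies on your machine.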

Stefan
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
