In the baseline translation system, the following command
"~/irstlm/bin/build-lm.sh \
   -i news-commentary-v8.fr-en.sb.en                  \
   -t ./tmp  -p -s improved-kneser-ney -o news-commentary-v8.fr-en.lm.en"

is used, according to the tutorial, to build a "3-gram language model,
removing singletons, smoothing with improved Kneser-Ney, and adding
sentence boundary symbols". How can I build a 2-gram language model
instead?
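Looking at the script's usage message, I suspect the model order is set
with the -n option (this is only my guess from the help output, so please
correct me if I am wrong), i.e. something like:

```shell
# Assumed: build-lm.sh's -n flag sets the n-gram order (2 = bigram).
# All other options are unchanged from the tutorial command above.
~/irstlm/bin/build-lm.sh \
   -i news-commentary-v8.fr-en.sb.en \
   -n 2 \
   -t ./tmp -p -s improved-kneser-ney \
   -o news-commentary-v8.fr-en.lm.en
```

I have not been able to verify this, hence the question.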

And, in general, which works better for a bilingual corpus of 160,000
sentences: a 2-gram or a 3-gram model?

Thanks in advance,
Rajkiran
College of Engineering Guindy, India
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
