In the baseline translation system, the tutorial says the command

    ~/irstlm/bin/build-lm.sh \
        -i news-commentary-v8.fr-en.sb.en \
        -t ./tmp -p -s improved-kneser-ney \
        -o news-commentary-v8.fr-en.lm.en

is used to create a "3-gram language model, removing singletons, smoothing with improved Kneser-Ney, and adding sentence boundary symbols". How can I build a 2-gram language model? And, in general, which is more efficient for a bilingual corpus of 160,000 sentences: 2-gram or 3-gram?

Thanks in advance,
Rajkiran
College of Engineering Guindy, India
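From what I can see in build-lm.sh's usage message, an -n option seems to set the n-gram order (I am assuming this from the IRSTLM documentation rather than having tested it), so something like the following might produce a 2-gram model:

    # Same command as above, but with -n 2 to request a 2-gram model.
    # The output filename *.2gram.lm.en is just a name I chose here.
    ~/irstlm/bin/build-lm.sh \
        -i news-commentary-v8.fr-en.sb.en \
        -t ./tmp -p -s improved-kneser-ney \
        -n 2 \
        -o news-commentary-v8.fr-en.2gram.lm.en

Is that the right way to do it?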
