I am running some preliminary experiments with different language model toolkits, including SRILM, RandLM, and KenLM. The problem I've noticed is a massive disagreement between the perplexities that SRILM and KenLM report. I realize the smoothing/backoff techniques used are different (Good-Turing/Katz vs. Kneser-Ney), but I doubt that this alone should contribute so much to the gap between the reported scores.
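For reference, here is a minimal sketch of the calculation I assume sits behind a reported perplexity: the summed log10 probability of the query text, normalized by the number of scored tokens. The function name and the toy counts are hypothetical, not taken from either toolkit. As far as I understand SRILM's output, its "ppl" figure counts end-of-sentence tokens in the denominator while "ppl1" does not, and both leave out OOV words, so the choice of denominator alone moves the number.

```python
def perplexity(total_log10_prob, n_tokens):
    """Perplexity from a summed base-10 log probability over n_tokens scored tokens."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return 10.0 ** (-total_log10_prob / n_tokens)


# Hypothetical toy numbers, for illustration only (not from my experiments).
words = 1000                # word tokens in the query text
sentences = 50              # one end-of-sentence token per sentence
oovs = 20                   # out-of-vocabulary words, assumed unscored
total_log10_prob = -1800.0  # summed log10 probability of the scored tokens

# Convention A: end-of-sentence tokens counted in the denominator.
ppl_with_eos = perplexity(total_log10_prob, words - oovs + sentences)

# Convention B: only the in-vocabulary words counted.
ppl_without_eos = perplexity(total_log10_prob, words - oovs)

print(f"with </s>:    {ppl_with_eos:.2f}")
print(f"without </s>: {ppl_without_eos:.2f}")
```

The same total log probability gives two different perplexities here, so when comparing toolkits it is worth checking that both numbers use the same token and OOV conventions.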
Here is what I do with each toolkit, and what I get.

On SRILM:

    ./ngram-count -order 5 -text test.txt -write text.ngrams
    ./ngram-count -order 5 -read text.ngrams -lm text.arpa

and then I query via:

    ./ngram -lm text.arpa -ppl query.txt

On KenLM:

    bin/lmplz -o 5 <text.txt >text.arpa

binarized with:

    bin/build_binary text.arpa text.binary

and then I query via:

    bin/query text.arpa <query.txt

The perplexity reported by KenLM is 8.98, while SRILM reports 73.7443. This is the same dataset, and I doubt that the backoff method alone would have such an effect.

K.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
