Hello, I wonder why the numerical precision used in SRILM is limited, and why this has never turned into a real problem when it is used inside Moses (for translation)?
It can be shown on a toy-sized dataset that for long test sentences (say 40 words) SRILM may report a perplexity about 1 point below what a precise computation yields. Given that perplexity is the measure people use to compare language models, relying on what SRILM produces for benchmarking purposes is clearly unfair. I suppose part of this precision problem is due to the fact that the precomputed probabilities are loaded from the ARPA files into some hacky suffix tree/array data structures, which demand a lossy encoding of the floating-point numbers to keep the size of the data structure reasonable. Is this the reason, or am I completely wrong?

Regards, -K
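To make the question concrete, here is a minimal sketch (not SRILM's actual code; the log-probability values are made up) of how accumulating per-word log10 probabilities over a 40-word sentence in single precision, rather than double precision, can shift the reported perplexity:

```python
import math
import random
import struct

def to_float32(x):
    """Round a Python float (double precision) to IEEE single precision."""
    return struct.unpack('f', struct.pack('f', x))[0]

random.seed(0)
# Hypothetical per-word log10 probabilities for a 40-word test sentence.
logprobs = [random.uniform(-4.0, -1.0) for _ in range(40)]
n = len(logprobs)

# Double-precision accumulation of the sentence log-probability.
acc64 = 0.0
for lp in logprobs:
    acc64 += lp

# Single-precision accumulation: both the stored values and every
# partial sum are rounded to float32, as a compact LM store might do.
acc32 = 0.0
for lp in logprobs:
    acc32 = to_float32(acc32 + to_float32(lp))

# Perplexity = 10^(-(sum of log10 probs) / N)
ppl64 = 10.0 ** (-acc64 / n)
ppl32 = 10.0 ** (-acc32 / n)
print(ppl64, ppl32, abs(ppl64 - ppl32))
```

In this sketch the discrepancy from the summation itself is small; a 1-point perplexity gap would more plausibly come from the stored probabilities being quantized to fewer bits than float32 when the ARPA file is packed into a compact structure.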
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
