Hi, I wonder whether SRI does any sort of implicit pruning or refinement. To be more precise, is there any way to force SRI not to prune anything (removing singletons, etc.)? I thought that my way of calling it does what I want (no pruning), but then I can't explain why I get different results. This is how I call SRI:
-----------------------------------------------------------------
./ngram-count -order 3 -text training.txt -write training.ngrams
./ngram-count -order 3 -read training.ngrams -lm training.binary -interpolate -ukndiscount -gt1min 0 -gt2min 0 -gt3min 0 -write-binary-lm
./ngram -order 3 -lm training.binary -ppl test.txt -debug 2

Am I missing or misusing something?
-----------------------------------------------------------------
An example to show this problem:

(Example-1):
Test: "13 13 13"
Training: "13 13 13 13 17"
Perplexity *matches* SRI: "2.79327"

(Example-2):
Test: "13 13 13"
Training: "13 13 13 13 13 13 17 17 17 17 17 14 14 15 15 15 16 16 16 16"
Perplexity *doesn't match* SRI: I get "4.51546", whereas SRI returns "4.242".
-----------------------------------------------------------------
Thanks in advance,
Koorm
