Hi,

Does SRI do any sort of implicit pruning or refinement? More precisely, is
there a way to force SRI not to prune anything (removing singletons, etc.)?
I thought the way I call it already disables pruning, but then I cannot
explain why I get different results. This is how I call SRI:

-----------------------------------------------------------------------------------------------------
./ngram-count -order 3 -text training.txt -write training.ngrams

./ngram-count -order 3 -read training.ngrams -lm training.binary \
    -interpolate -ukndiscount -gt1min 0 -gt2min 0 -gt3min 0 -write-binary-lm

./ngram -order 3 -lm training.binary -ppl test.txt -debug 2
-----------------------------------------------------------------------------------------------------

Am I missing or misusing something?
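
One way to check whether any n-grams were dropped is to count them
independently and compare the result with training.ngrams. A minimal Python
sketch, assuming whitespace tokenization and that every line is wrapped in
<s> and </s> markers before counting (my understanding of the ngram-count
default, not verified here); the name count_ngrams is hypothetical:

```python
from collections import Counter

def count_ngrams(lines, max_order=3):
    """Count all n-grams of order 1..max_order, one sentence per line."""
    counts = Counter()
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip empty lines
        # Assumption: sentence boundary markers are added around each line.
        tokens = ["<s>"] + words + ["</s>"]
        for order in range(1, max_order + 1):
            for i in range(len(tokens) - order + 1):
                counts[tuple(tokens[i:i + order])] += 1
    return counts

counts = count_ngrams(["13 13 13 13 17"])
# Trigrams seen exactly once are the ones a singleton cutoff would remove.
singletons = [ng for ng, c in counts.items() if len(ng) == 3 and c == 1]
print(counts[("13", "13", "13")], len(singletons))
```

If every n-gram counted this way also appears in the model with its full
count, the cutoffs are probably not what causes the mismatch.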

--------------------------------------------------------------------------------------------------------
An example to show this problem:
(Example-1):
Test: "13 13 13"
Training: "13 13 13 13 17"
Perplexity *matches* SRI: "2.79327"

(Example-2):
Test: "13 13 13"
Training: "13 13 13 13 13 13 17 17 17 17 17 14 14 15 15 15 16 16 16 16"
Perplexity *doesn't match* SRI: "4.51546", whereas SRI returns "4.242".
-------------------------------------------------------------------------------------------------------
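
When comparing numbers like these, it also helps to be sure both sides
normalize the same way. A minimal Python sketch of how the two perplexity
figures printed by ngram -ppl are derived from the total log10 probability,
assuming the formulas given in the SRILM FAQ (ppl counts the end-of-sentence
token, ppl1 does not); the function name and argument names are hypothetical:

```python
def perplexity(logprob10, words, oovs=0, sentences=1):
    """Return (ppl, ppl1) from a total base-10 log probability."""
    # ppl normalizes over scored words plus one </s> per sentence.
    tokens_with_eos = words - oovs + sentences
    # ppl1 normalizes over scored words only.
    tokens_without_eos = words - oovs
    ppl = 10 ** (-logprob10 / tokens_with_eos)
    ppl1 = 10 ** (-logprob10 / tokens_without_eos)
    return ppl, ppl1

# Hypothetical input: total logprob of -3.0 over a 2-word sentence.
ppl, ppl1 = perplexity(-3.0, words=2)
print(ppl, ppl1)
```

For the test sentence "13 13 13", words is 3 and sentences is 1, so ppl is
normalized over 4 tokens and ppl1 over 3. Comparing against the wrong one of
the two would by itself produce a mismatch.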

Thanks in advance,
Koorm
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
