SRILM prunes singletons for trigrams and above by default.  You're
likely to get better answers to SRILM-specific questions on srilm-user.

On 02/22/2015 06:28 AM, koormoosh wrote:
> Hi,
> 
> I wonder if SRI does any sort of implicit pruning or refinement? To be more
> precise, is there any way to force SRI not to prune anything (removing
> singletons, etc). I thought that my way of calling it does what I want (not
> pruning), but then I don't know how to explain getting different results.
> This is how I call SRI:
> 
> -----------------------------------------------------------------------------------------------------
> ./ngram-count -order 3 -text training.txt -write training.ngrams
> 
> ./ngram-count -order 3 -read training.ngrams -lm training.binary
> -interpolate -ukndiscount -gt1min 0 -gt2min 0 -gt3min 0 -write-binary-lm
> 
> ./ngram -order 3 -lm training.binary -ppl test.txt -debug 2
> 
> Am I missing/misusing something?
> 
> --------------------------------------------------------------------------------------------------------
> An example to show this problem:
> (Example-1):
> Test: "13 13 13"
> Training: "13 13 13 13 17"
> perplexity *matches* SRI: "2.79327"
> 
> (Example-2):
> Test: "13 13 13"
> Training: "13 13 13 13 13 13 17 17 17 17 17 14 14 15 15 15 16 16 16 16"
> perplexity *doesn't match* SRI: I get "4.51546", but SRI returns "4.242".
> -------------------------------------------------------------------------------------------------------
> 
> Thanks in advance,
> Koorm
> 
> 
> 
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
> 