Hi,

I created a 3-gram LM with the irstlm toolkit (5.0.22). The LM has about 
25M entries:

ngram 1= 300209
ngram 2= 4864097
ngram 3= 20336549


I tried to prune it with prune-lm on a Linux machine.

prune-lm --threshold=1e-6,1e-6 sun.irstlm.gz sun.pruned.irlstlm &> x.out

In the output file x.out I get the same error message repeated:

ng: qu  0 ts=1.00059 tbs=0.0196106 k=0 ns=20

probably more than 100M identical copies. After running the pruning 
overnight, the stderr output had reached 100 GB and I stopped the process.
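
For anyone who wants to check how repetitive such a log is without opening 
it in an editor, consecutive identical lines can be counted with uniq -c. 
The following is only a minimal sketch: the sample file is a small stand-in 
built on the spot (an assumption for illustration, not the real x.out), and 
the variable names are hypothetical.

```shell
# Stand-in for the real log (assumption): five copies of the message
# quoted above plus one unrelated line, written to a temporary file.
log=$(mktemp)
for i in 1 2 3 4 5; do
  echo "ng: qu  0 ts=1.00059 tbs=0.0196106 k=0 ns=20" >> "$log"
done
echo "some other line" >> "$log"

# uniq -c collapses each run of consecutive identical lines into
# "<count> <line>". It reads the file as a stream, so memory use stays small.
summary=$(uniq -c "$log")
echo "$summary"

# Number of distinct runs found in the log.
runs=$(printf '%s\n' "$summary" | wc -l | tr -d ' ')
echo "distinct runs: $runs"

rm -f "$log"
```

On the real file the same idea would be "uniq -c x.out | head", which shows 
whether the log really consists of one message or of several interleaved 
ones.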

Just from looking at the source code, I assume that lmtable::wdprune() 
loops endlessly on the "prune:" goto statement. Are there any problems 
with the pscale() routine?

Any hints on where to look would be highly appreciated.

Best regards,
Christof

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
