it has been a while since i looked at this, but compare these two runs (Good-Turing discounting):

*not pruning*

[rydell]miles: ./ngram-count -lm /tmp/test2.lm -order 3 -gt1min 0 -gt2min 0 -gt3min 0 -text ../../../mt/diskbased-lm-training/temp.txt
warning: discount coeff 1 is out of range: 5.55654e-17
warning: discount coeff 6 is out of range: 1.17267
warning: discount coeff 7 is out of range: 1.14801
warning: count of count 8 is zero -- lowering maxcount
warning: count of count 7 is zero -- lowering maxcount
warning: count of count 6 is zero -- lowering maxcount
[rydell]miles: head /tmp/test2.lm
\data\
ngram 1=161
ngram 2=306
ngram 3=328
\1-grams:
-1.980823 != -0.1440484

*pruning*

[rydell]miles: ./ngram-count -lm /tmp/test3.lm -order 3 -text ../../../mt/diskbased-lm-training/temp.txt
warning: discount coeff 1 is out of range: 5.55654e-17
warning: discount coeff 6 is out of range: 1.17267
warning: discount coeff 7 is out of range: 1.14801
warning: count of count 8 is zero -- lowering maxcount
warning: count of count 7 is zero -- lowering maxcount
warning: count of count 6 is zero -- lowering maxcount
[rydell]miles: head /tmp/test3.lm
\data\
ngram 1=161
ngram 2=306
ngram 3=44

Miles

2008/8/5 Miles Osborne <[EMAIL PROTECTED]>

> you want to also check that ngrams are not getting pruned by probability
> (in addition to counts)
>
> this whole business is a bit on the murky side and the only reason i know
> about it was when i was writing a disk-based version of ngram-count a year
> or so back
>
> Miles
>
> 2008/8/5 John D. Burger <[EMAIL PROTECTED]>
>
>> Miles Osborne wrote:
>>
>> > by default the srilm prunes singletons
>>
>> OK, that's good to know. But when I prune the IRST LM, I still get
>> lots =more= 4-grams than the SRI LM, but lots =fewer= 5-grams
>> (although less than a factor of two in either case).
>>
>> But perhaps I'm a bit in the weeds here ... :)
>>
>> - John Burger
>> MITRE

--
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.
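
For anyone who wants to see the count-cutoff effect in isolation, here is a rough Python sketch. It is not SRILM's code: the function names, the toy corpus, and the <s>/</s> padding are all made up for illustration. The cutoffs {1: 1, 2: 1, 3: 2} are my assumption about what ngram-count uses when no -gtNmin flags are given (check the man page), and passing zeros mimics the "not pruning" run above. It only counts and filters; it does no discounting and no probability-based pruning.

```python
from collections import Counter


def count_ngrams(sentences, order):
    """Count every n-gram of length 1..order in whitespace-tokenised sentences.

    Each sentence is padded with <s> and </s>; that padding is an
    illustrative assumption, not a claim about any toolkit's internals.
    """
    counts = {n: Counter() for n in range(1, order + 1)}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for n in range(1, order + 1):
            for i in range(len(tokens) - n + 1):
                counts[n][tuple(tokens[i:i + n])] += 1
    return counts


def apply_cutoffs(counts, min_counts):
    """Keep only n-grams whose count is at least the minimum for their order."""
    return {
        n: {gram: c for gram, c in grams.items() if c >= min_counts.get(n, 1)}
        for n, grams in counts.items()
    }


# Hypothetical toy corpus: one trigram context seen often, two rare endings.
corpus = ["a b c", "a b d", "a b c"]
counts = count_ngrams(corpus, order=3)

# No cutoffs at all, like -gt1min 0 -gt2min 0 -gt3min 0.
kept_all = apply_cutoffs(counts, {1: 0, 2: 0, 3: 0})

# Assumed defaults: singletons survive for orders 1 and 2, not for order 3.
kept_default = apply_cutoffs(counts, {1: 1, 2: 1, 3: 2})

for n in sorted(counts):
    print(f"ngram {n}: all={len(kept_all[n])} with_cutoffs={len(kept_default[n])}")
```

On this toy corpus only the trigram row shrinks, which is the same pattern as 328 versus 44 above: unigram and bigram counts are untouched while singleton trigrams disappear.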
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
