Re: [Moses-support] Left language model state in 4247

Tom Hoar Thu, 22 Sep 2011 09:40:29 -0700

 Excellent, Ken. I'll try the scoring again.

 Tom


 On Wed, 21 Sep 2011 17:33:09 +0100, Kenneth Heafield 
 <[email protected]> wrote:
> Dear Moses,
>
>     Trunk revision 4247 incorporates KenLM changes from MT Marathon
> (team: Hieu Hoang, Tetsuo Kiso, Marcello Federico, and myself) to
> minimize left language model state for chart decoding.  This resulted
> in a binary file format change.
>
>     Previously, if you used e.g. a 5-gram language model, the chart
> entries would be separated by their first 4 words (in addition to
> other constraints).  This change relaxes this to only as many words as
> required for correct scoring, leading to more recombination (so
> theoretically, you could lower the pop limit).  Further, the left
> state keeps pointers, in lieu of word indices, that make the language
> model scoring faster.  This change only impacts KenLM; other language
> models will still keep 4 words (IRSTLM is invited to read
> kenlm/lm/left.hh and implement the same interface).  As a result, you
> should expect to see better model scores (on average; theoretically it
> could not prune a hypothesis that later kicks out what would have
> become single-best) when using KenLM.  Also, chart now runs 5% faster
> with the same pruning settings.
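
A rough sketch of the "only as many words as required" idea, assuming a toy
model stored as a plain set of n-gram tuples. This is not KenLM's code (the
real interface is in kenlm/lm/left.hh); the names left_state_length and
extends_left are hypothetical. It uses one plausible criterion: a leading
word stays in the left state only while the n-gram from the phrase start up
to that word could still be extended to the left by some n-gram in the model.

```python
def left_state_length(words, ngrams, order):
    """Count the leading words of a phrase that must be kept as left state.

    words:  the phrase, as a list of tokens
    ngrams: toy model, a set of tuples (assumption, not KenLM's data structure)
    order:  the n-gram order of the model (e.g. 5 for a 5-gram model)
    """
    def extends_left(suffix):
        # True if the model holds an n-gram made of one extra word + suffix.
        return any(len(g) == len(suffix) + 1 and g[1:] == suffix
                   for g in ngrams)

    kept = 0
    # Never keep more than order - 1 words, the old fixed-size behaviour.
    for i in range(min(len(words), order - 1)):
        if not extends_left(tuple(words[: i + 1])):
            # No longer match is possible, so later words cannot be
            # rescored either once more left context arrives.
            break
        kept += 1
    return kept


# Toy trigram model, invented for illustration.
TOY_NGRAMS = {
    ("the",), ("cat",), ("sat",),
    ("the", "cat"), ("cat", "sat"),
    ("the", "cat", "sat"),
}
```

In this toy model a phrase that starts with "the" needs no left state at
all, because no n-gram ends in "the" preceded by another word. Two
hypotheses that differ only in such leading words can then recombine.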
>
>     When SRILM's default pruning keeps n-gram A B C D E, but removes
> B C D E, this leads to several nasty corner cases.  Previously, I
> re-inserted B C D E with a blank probability.  To avoid the corner
> cases, KenLM now fully restores these entries: p(B C D E) = p(C D E) +
> backoff(B C D) where p(C D E) may itself be restored.  This led to
> major changes in the trie builder, but it's passing tests.  Since the
> blank probability no longer needs to be encoded, quantization now
> gives you the full 2^b probability values instead of 2^b - 1 (but
> backoff still reserves two values for +/- 0).
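
The restoration rule p(B C D E) = p(C D E) + backoff(B C D) can be sketched
as follows, assuming log-domain values held in two plain dicts keyed by
n-gram tuples. The function name restore_logprob and the dict layout are
assumptions for illustration, not the trie builder's actual code. A missing
backoff weight is treated as 0 in the log domain, which is the usual
convention for backoff models.

```python
def restore_logprob(ngram, logprob, backoff):
    """Return the log probability of `ngram`, rebuilding it if it was pruned.

    ngram:   tuple of words, e.g. ("B", "C", "D", "E")
    logprob: dict mapping n-gram tuples to log probabilities
    backoff: dict mapping context tuples to log backoff weights
    """
    if ngram in logprob:
        return logprob[ngram]
    if len(ngram) == 1:
        # Unknown-word handling is out of scope for this sketch.
        raise KeyError(ngram)
    context = ngram[:-1]   # e.g. ("B", "C", "D")
    shorter = ngram[1:]    # e.g. ("C", "D", "E"), may itself be restored
    return restore_logprob(shorter, logprob, backoff) + backoff.get(context, 0.0)


# Toy values, invented for illustration: C D E was pruned as well,
# so it is rebuilt from D E before B C D E is rebuilt from it.
TOY_LOGPROB = {("D", "E"): -1.0}
TOY_BACKOFF = {("B", "C", "D"): -0.25, ("C", "D"): -0.125}
```

With these toy values, restoring ("B", "C", "D", "E") first restores
("C", "D", "E") from ("D", "E") and backoff(C D), then adds backoff(B C D).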
>
>     We've tested that LM scores come out correctly and that average
> model score goes up and I'm running more stuff.  Tom Hoar, here's
> your cue.
>
> Kenneth
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support

