2008/8/5 John D. Burger <[EMAIL PROTECTED]>:
> I'm starting to think it's a lost cause to try to get one LM
> implementation to act very much like the other. Thanks for the
> insights, though!

I also spent some time trying, unsuccessfully, to exactly match the SRILM toolkit's output. Aside from the various default settings, there is some pruning going on when using kndiscount. It's fairly easy to produce an LM that agrees to within a few digits of precision, but it's hard to replicate the output perfectly. Of course, those pesky last few digits change the LM scores quite a bit. You could just re-tune, but tuning is non-deterministic, so things are still not directly comparable; kind of annoying.

There is also the larger question of "What does it get you?" (aside from curiosity). At the time, we were interested in building monolithic SRI-style LMs on huge corpora. In the end, general interest seems to have moved towards distributed LMs, mooting the original exercise.

Um... Good luck!

~amittai
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
