Hi all. I implemented a naïve system to predict the most likely next word for a given sentence, using the KenLM Python interface. The algorithm is simple: append each word in the vocabulary to the last n-1 words of the sentence (where n is the order of the language model), then score each candidate with kenlm.Model.score(). The 1-grams of the language model are used as the vocabulary.
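For reference, a minimal sketch of this approach. The ranking is written against a generic scoring callable so it can be wired to kenlm.Model.score(); the model path "lm.arpa" and the helper name predict_next are my own placeholders, not part of any API:

```python
import heapq

def predict_next(score_fn, context_words, vocab, order, k=5):
    # Keep only the last n-1 words: an n-gram model cannot use more context.
    context = context_words[-(order - 1):]
    # Score every candidate fragment "context + word" and keep the top k.
    scored = ((score_fn(" ".join(context + [w])), w) for w in vocab)
    return heapq.nlargest(k, scored)

# With KenLM (hypothetical model path; vocab extracted from the 1-grams):
# import kenlm
# model = kenlm.Model("lm.arpa")
# top = predict_next(lambda s: model.score(s, bos=False, eos=False),
#                    "the quick brown".split(), vocab, model.order)
```

Note that this scores the full n-gram for every vocabulary word, which is exactly the O(|V|) loop that dominates the runtime.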
However, each prediction takes around 1 second. Is there any way to speed this up? I have thought of a couple of approaches:
- The code could be optimized by using fragment scores to cache intermediate state, but it seems that this feature is not implemented in the Python interface (https://github.com/kpu/kenlm/issues/78).
- The vocabulary could be pruned. It currently has around 400k words, but I cannot find a meaningful, justifiable way to prune it.
Thanks a lot.
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
