Hi all.

I implemented a naïve system to predict the most likely next word for a
given sentence, using the KenLM Python interface. The algorithm is simple:
append each word in the vocabulary to the last n-1 words of the sentence
(where n is the order of the language model), then score each resulting
n-gram with kenlm.Model.score(). The 1-grams of the language model are
used as the vocabulary.
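
For reference, the loop described above can be sketched roughly as follows.
This is a minimal illustration and not my exact code: predict_next_word is
a hypothetical helper name, and the scorer is passed in as a plain function
so the sketch runs without a model file. The commented lines show how it
could be wired to KenLM; the path and the bos=False/eos=False choice are
assumptions.

```python
from typing import Callable, Iterable, List, Tuple


def predict_next_word(
    score: Callable[[str], float],
    vocabulary: Iterable[str],
    sentence: str,
    order: int,
    top_k: int = 1,
) -> List[Tuple[str, float]]:
    """Rank every vocabulary word as a continuation of `sentence`."""
    # Only the last n-1 words can influence an n-gram model's prediction.
    context = sentence.split()[-(order - 1):] if order > 1 else []
    prefix = " ".join(context)

    scored = []
    for word in vocabulary:
        # One full scoring call per candidate: this is the slow part.
        candidate = f"{prefix} {word}" if prefix else word
        scored.append((word, score(candidate)))

    # Higher log-probability means a more likely continuation.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Possible wiring to KenLM (assumed path; not executed here):
#   import kenlm
#   model = kenlm.Model("model.arpa")
#   score = lambda s: model.score(s, bos=False, eos=False)
#   predict_next_word(score, vocabulary, "the quick brown", model.order)
```

Every candidate shares the same context, so the context's own contribution
to the score is identical across candidates and does not change the ranking.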

However, each prediction takes around 1 second. Is there any way to speed
up this process? I have thought of a couple of approaches:

   - The code can be further optimized by using the fragment score to cache
   some operations, but it seems that this feature is not implemented in the
   Python interface (https://github.com/kpu/kenlm/issues/78).
   - The vocabulary can be pruned. Currently, it has around 400k words, but
   I cannot find a meaningful way of pruning the model that I can also justify.
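
To make the second idea concrete, one possible heuristic (an assumption on
my part, not something I can justify yet) is to keep only the k most
probable unigrams as candidates. A rough sketch that reads them from the
\1-grams: section of an ARPA file; top_unigrams is a hypothetical helper
name, and skipping <s>, </s> and <unk> is also my own choice:

```python
from typing import Iterable, List

# Special tokens that make little sense as predicted words (assumed choice).
SPECIAL_TOKENS = {"<s>", "</s>", "<unk>"}


def top_unigrams(arpa_lines: Iterable[str], keep: int) -> List[str]:
    """Return the `keep` most probable words from the \\1-grams: section."""
    entries = []
    in_unigrams = False
    for raw in arpa_lines:
        line = raw.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if not in_unigrams or not line:
            continue
        if line.startswith("\\"):
            # Reached the next section (\2-grams: or \end\).
            break
        # Unigram line: log10 probability, word, optional backoff weight.
        fields = line.split()
        if fields[1] not in SPECIAL_TOKENS:
            entries.append((float(fields[0]), fields[1]))

    # Sort by log-probability, most probable first.
    entries.sort(reverse=True)
    return [word for _, word in entries[:keep]]


# Possible usage (assumed path; not executed here):
#   with open("model.arpa", encoding="utf-8") as handle:
#       vocabulary = top_unigrams(handle, 50000)
```

This only shrinks the candidate list; the language model itself stays
unchanged, so the scores of the remaining words are not affected.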

Thanks a lot.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
