Processing time is reduced by a factor of 6 compared with the original code. But these results use the old n-gram profiles (which gather n-grams of different sizes). I must rebuild the n-gram profiles with only 3-grams in order to benchmark the code correctly.

Nice improvement!


Sami, do you use the whole set available at http://people.csail.mit.edu/people/koehn/publications/europarl/ , or just some parts of the text to build the profiles? (If I remember my previous work on n-grams correctly, just a few MB are necessary to have a representative set of 3-grams.)

I used a relatively small subset - just a few MB to build the profiles.
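
For illustration, building a 3-gram-only profile from a small text sample could look roughly like the sketch below. This is not the code discussed in this thread: the function name, the "_" word-boundary padding and the top_n cutoff are all assumptions made for the example.

```python
from collections import Counter


def build_trigram_profile(text, top_n=300):
    """Return the top_n most frequent character 3-grams found in text.

    Hypothetical helper: the "_" padding and the cutoff value are
    assumptions for this sketch, not taken from the original code.
    """
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries
        # Slide a window of exactly 3 characters over the padded word.
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    # Keep only the most frequent 3-grams as the language profile.
    return [gram for gram, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = "the cat and the hat"  # stand-in for a few MB of corpus text
    print(build_trigram_profile(sample, top_n=5))
```

Because every entry has the same length, such a profile contains no 1-, 2- or 4-grams, which is what a fair benchmark of 3-gram-only code would need.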

--
 Sami Siren
