Hi,
Univ. of Amsterdam has made a language modelling version of Lucene
available for download. Their language model is not BM25, but it is
quite similar in nature. The version is at:
http://ilps.science.uva.nl/Resources/#lm-lucen
I have worked on their version a bit; they have created new clas
One thing that may be causing problems is that "cooc" is not summed over
all the terms for which the "ignore case" equality holds. Since you are
ignoring case, I assume the analyzer being used is not a lower-casing one,
so if you have terms f:a and f:A you would get a count of 1 instead