RE: wrong BM25 implementation in Lucene

2006-10-26 Thread J.Zhu
Hi, the Univ. of Amsterdam has provided a downloadable language-modelling version of Lucene. Their language model is not BM25 but is quite similar in nature. The version is at: http://ilps.science.uva.nl/Resources/#lm-lucen I have worked on their version a bit; they have created new clas

Re: wrong BM25 implementation in Lucene

2006-10-25 Thread Doron Cohen
One thing that may be causing problems is that "cooc" is not summing over the various cases for which the "ignore case equality" holds. Since you are ignoring case, I assume the analyzer being used is not a lowercasing one, so in this case if you have terms f:a and f:A you would get a count of 1 instead
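
To illustrate the counting issue described above, here is a minimal sketch in plain Python rather than Lucene code. The names term_counts, count_exact and count_ignoring_case are hypothetical and stand in for whatever the original "cooc" code does. It assumes the index holds two distinct terms, "a" and "A", because the analyzer does not lowercase.

```python
from collections import Counter

def count_exact(term_counts, term):
    """Count only the term whose spelling matches exactly (case-sensitive)."""
    return term_counts.get(term, 0)

def count_ignoring_case(term_counts, term):
    """Sum the counts of every indexed term that equals `term` when case is ignored."""
    target = term.lower()
    return sum(n for t, n in term_counts.items() if t.lower() == target)

# Hypothetical index contents for field f: the terms "a" and "A" are stored
# separately because no lowercasing was applied at analysis time.
term_counts = Counter(["a", "A", "b"])

exact = count_exact(term_counts, "a")            # sees only "a"
folded = count_ignoring_case(term_counts, "a")   # sees "a" and "A"
print(exact, folded)
```

If the two numbers differ for a given term, the counts are being taken per case variant rather than summed across them. Lowercasing at analysis time would avoid the need for the summation altogether.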