Sorry, I didn't want to write a very long email.
This is the BM25 scoring formula:
log((N - f + 0.5) / (f + 0.5)) · (k1 + 1) · c / (c + k1 · ((1 - b) + b · l / L))
where
N = total number of documents
f = document frequency (number of documents that contain the term)
c = term frequency in a document
l = length of the document
L = average document length
k1, b = constants
I think f corresponds to the document frequency that idf is computed from in
the default Lucene scoring formula, and c corresponds to tf.
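To make the formula concrete, here is a minimal standalone sketch of it in plain Java, independent of Lucene. The class name Bm25Formula and the method name score are hypothetical and chosen only for illustration; the parameters follow the symbols defined above. Note that the log term becomes negative when a term occurs in more than half of the documents, which this sketch does not try to correct.

```java
/** Hypothetical helper: evaluates the BM25 formula above for one term in one document. */
public final class Bm25Formula {

    private Bm25Formula() {
    }

    /**
     * @param n    total number of documents (N)
     * @param f    number of documents that contain the term
     * @param c    term frequency in the document
     * @param l    length of the document
     * @param avgL average document length (L)
     * @param k1   term-frequency saturation constant
     * @param b    length-normalization constant, between 0 and 1
     */
    public static double score(long n, long f, double c, double l,
                               double avgL, double k1, double b) {
        // log((N - f + 0.5) / (f + 0.5)): the idf-like part of the formula
        double idf = Math.log((n - f + 0.5) / (f + 0.5));
        // (1 - b) + b * l / L: penalizes documents longer than average
        double lengthNorm = (1 - b) + b * l / avgL;
        // (k1 + 1) * c / (c + k1 * lengthNorm): saturating term-frequency part
        return idf * (k1 + 1) * c / (c + k1 * lengthNorm);
    }
}
```

For example, Bm25Formula.score(1000, 10, 3, 100, 100, 1.2, 0.75) scores a term that occurs 3 times in a document of average length. The values k1 = 1.2 and b = 0.75 are commonly used choices and an assumption here, not something fixed by the formula.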
I implemented the BM25 formula in the score method of the BM25Scorer class (my
own class, which extends Scorer):
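import org.apache.lucene.search.Scorer;
import org.apache.lucene.search.Similarity;

public class BM25Scorer extends Scorer {
    public BM25Scorer(Similarity similarity) {
        super(similarity);
    }
    // score() and the other abstract Scorer methods are omitted here
}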
The problem is that I would also have to implement my own Similarity class,
with abstract methods such as queryNorm(float sumOfSquaredWeights), and I
don't know how to calculate sumOfSquaredWeights from the parameters of the
BM25 formula.
Do I only have to change the Query, Weight and Scorer classes, or do I also
need to create my own Similarity class?
Thanks
On 19/10/06, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
Please provide more information about what you have done so far.
On Oct 19, 2006, at 9:10 AM, beatriz ramos wrote:
> Hello,
> I'm trying to implement my own scoring algorithm with Lucene, but I
> don't get any results.
>
> The Lucene documentation explains how to implement new scoring by
> modifying the Query, Weight and Scorer classes. I have tried this, but
> it doesn't work.
>
> Do you have any ideas?
> I need an example to understand the process and the modifications.
>
> Thanks
--------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org