Hello,
I am really new to Lucene. Last week, through this list, I was able to
find a solution to my problem.
I have a new question now. I am trying to implement a new Similarity
class that uses the Jaccard coefficient. I have been reading the
javadocs and a lot of other web pages on the matter, but I still cannot
understand how to do it.
So far I know that I have to subclass DefaultSimilarity and (if I am
not wrong) override the built-in methods so that they return the correct
score. Since the Jaccard coefficient is the size of the intersection of
the query and document term sets divided by the size of their union, I
think I only need coord(q,d), and all the other factors in the default
similarity can return 1 in the score computation. My problem is that I
cannot find how to obtain the number of terms that each document has,
which I need for the size of the union.
Also, do you think this approach is correct?
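To make clear which score I am after, here is a minimal plain-Java
sketch of the coefficient itself. It does not use Lucene at all; the
class JaccardSketch and the method jaccard are names I made up for
illustration, and returning 0 for two empty sets is my own assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: Jaccard coefficient of two term sets.
final class JaccardSketch {

    // Size of the intersection divided by size of the union.
    // Returns 0 when both sets are empty (an assumption, to avoid 0/0).
    static double jaccard(Set<String> queryTerms, Set<String> docTerms) {
        Set<String> intersection = new HashSet<>(queryTerms);
        intersection.retainAll(docTerms);
        // |Q union D| = |Q| + |D| - |Q intersect D|
        int unionSize = queryTerms.size() + docTerms.size() - intersection.size();
        return unionSize == 0 ? 0.0 : (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<>(Arrays.asList("lucene", "similarity"));
        Set<String> doc = new HashSet<>(
                Arrays.asList("lucene", "scoring", "similarity", "jaccard"));
        // 2 shared terms out of 4 distinct terms overall.
        System.out.println(jaccard(query, doc));
    }
}
```

The union size in this sketch depends on the number of distinct terms
in the document, which is exactly the value I do not know how to get
from inside a Similarity subclass.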
I would be grateful if you could give me advice or point me towards a
tutorial on the matter, because two days of searching did not turn up
any example code.
Thank you in advance.