I think you might find BM25 useful:

http://nlp.uned.es/~jperezi/Lucene-BM25/

https://issues.apache.org/jira/browse/LUCENE-2091


Other than that, CLucene should follow Lucene's API especially in those areas...


Itamar.


On 21/10/2010 8:41 PM, Šplíchal Jiří wrote:

Hi,

I have to change CLucene's scoring so that the number of documents containing a given term does not influence the final score when querying with that term. In other words, I have to make a document's score independent of the rest of the index: it should be influenced only by the document itself and the query.

As I understand it, it is enough to override Similarity and make the inverse document frequency (idf) constant. This seems to work fine. However, I noticed that the base class Similarity contains two other idf() functions that are not virtual, in particular float_t Similarity::idf(Term* term, Searcher* searcher), which retrieves the document frequency and calls the virtual idf() function to compute the result.

Could we make those two other functions virtual as well, so that I can skip retrieving the document frequency altogether? I could then just return the constant value.
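
To illustrate the request, here is a minimal sketch built on hypothetical stand-in classes, not the real CLucene headers. Term, Searcher and Similarity below are simplified mock-ups, float replaces float_t, and the Term overload of idf() is declared virtual as proposed, which is an assumption about a possible change rather than the existing API. It compares overriding only the numeric idf() with overriding the Term overload as well:

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-ins for illustration only; NOT the real CLucene classes.
struct Term {};

struct Searcher {
    int docFreqCalls = 0;  // counts how often the (potentially costly) lookup runs
    int32_t docFreq(const Term*) { ++docFreqCalls; return 42; }
    int32_t maxDoc() const { return 1000; }
};

class Similarity {
public:
    virtual ~Similarity() = default;

    // Assumed to be virtual here, as proposed in the message above.
    virtual float idf(Term* term, Searcher* searcher) {
        // Default behaviour: fetch the document frequency, then delegate.
        return idf(searcher->docFreq(term), searcher->maxDoc());
    }

    // Numeric overload; the default is modeled on Lucene's classic formula.
    virtual float idf(int32_t docFreq, int32_t numDocs) {
        return static_cast<float>(
            std::log(numDocs / static_cast<double>(docFreq + 1)) + 1.0);
    }
};

// Overrides only the numeric overload: the score no longer depends on the
// index, but the document frequency is still retrieved on every call.
class ConstantIdfSimilarity : public Similarity {
public:
    using Similarity::idf;  // keep the Term overload visible
    float idf(int32_t, int32_t) override { return 1.0f; }
};

// Overrides both overloads: the document frequency lookup is skipped.
class LookupFreeSimilarity : public Similarity {
public:
    float idf(Term*, Searcher*) override { return 1.0f; }
    float idf(int32_t, int32_t) override { return 1.0f; }
};

// Returns how many docFreq() lookups a single idf(term, searcher) call causes.
int lookupsFor(Similarity& sim) {
    Term term;
    Searcher searcher;
    sim.idf(&term, &searcher);
    return searcher.docFreqCalls;
}
```

In this mock-up, lookupsFor() reports one lookup for ConstantIdfSimilarity and none for LookupFreeSimilarity, while both return the same constant idf. Whether the saved lookup matters in practice depends on how expensive docFreq() is for the searcher in use.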

Jiri



_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers