Hi all, I am wondering if the raw scores obtained from HitCollector can be used to compare relevance of documents to different queries?
E.g. two phrase queries are issued : (PQ1: "Barack Obama" and PQ2: "John McCain"). if a document (doc1) belongs to the result sets of both queries and has the raw score of 5 for PQ1 and 3 for PQ2, can I say that doc1 is more relevant to "Barack Obama" than to "John McCain"? There have been some previous discussions about this at [1,2]. On the other hand, the javadoc of the Similarity class says "*queryNorm(q) * is a normalizing factor used to make scores between queries comparable. This factor does not affect document ranking (since all ranked documents are multiplied by the same factor), but rather just attempts to make scores from different queries (or even different indexes) comparable. " Please advise. Thanks. Ng. [1] http://thread.gmane.org/gmane.comp.jakarta.lucene.user/10760/focus=10810 [2] http://www.gossamer-threads.com/lists/lucene/java-user/35051?search_string=compare%20score%20across%20queries;#35051 [3] http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/search/Similarity.html