On 4/22/07, jafarim <[EMAIL PROTECTED]> wrote:
I am trying to implement some TopScoreHitCollector class; a kind of
TopDocCollector which collects the documents the score of which is higher
than a threshold. The threshold will be configurable in the constructor of
the class. There is seemingly a document starvation about TopDocCollector as
I could not find anything other than javadocs. My copy of lucene book seems
to be too old and could not provide any help in this regard. I know this
must be quite easy but I don't know where to start. Any help?
--jaf
As far as documentation, TopDocsCollector is really an expert-level class
and is public more due to it needing in Nutch than it being a necessary
part of the public API.
TopDocsCollector is just a hit collector, so look at the collect()
method... and replace 0.0f with your score threshold. Be aware that
score thresholds don't work well in general since scores aren't really
comparable from one query to another.
// javadoc inherited
public void collect(int doc, float score) {
if (score > 0.0f) {
totalHits++;
if (hq.size() < numHits || score >= minScore) {
hq.insert(new ScoreDoc(doc, score));
minScore = ((ScoreDoc)hq.top()).score; // maintain minScore
}
}
}
-Yonik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]