I think the TermScorer could be used to produce some useful feedback on the performance of terms used in queries, with the addition of some new methods:

    int getNumDocMatches();
    float getAverageScore();

These could be used in the following scenarios:

* selecting which terms to offer spelling correction on (when numDocMatches==0)
* influencing the highlighter selections (doc fragments scored based on contained term weights)
* for "more like this" natural language type queries, the highlighter could highlight only "significantly" scored terms and ignore low-scoring noise words

The stats accumulation code that would need adding to TermScorer would add negligible overhead, but the main issue would be how to expose the TermScorer object to users. I had initially planned to do all of this with a new class that required no Lucene changes. That would have looked like this:

    // wrap normal query in a new query
    ProfilerQuery pq = new ProfilerQuery(anyLuceneQuery);

    // run query as normal
    searcher.search(pq...)

    // analyze results
    ProfiledTermStats[] ts = pq.getTermStats();
    for (int i = 0; i < ts.length; i++)
    {
        System.out.println(ts[i].getTerm() + " in " + ts[i].getNumMatches()
            + " docs, ave score=" + ts[i].getAverageScore());
    }

I quickly discovered this wasn't possible without requiring a change to the existing Lucene code.

Does anyone else find this a worthwhile change? I know it would be possible to derive all this information using existing APIs, but it would effectively involve another pass over the same index data.
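
To make the "negligible overhead" point concrete, here is a minimal sketch of the kind of accumulation I have in mind. It is plain Java with no Lucene dependency. The class name TermStatsAccumulator and its record() method are hypothetical names used only for illustration, and I am assuming the scorer would call record() once per matching document at the point where it computes that document's score. Only getNumDocMatches() and getAverageScore() correspond to the methods proposed above.

```java
// Hypothetical helper, not part of Lucene: accumulates per-term match statistics.
final class TermStatsAccumulator {
    private final String term;
    private int numDocMatches; // number of documents scored for this term so far
    private double scoreSum;   // running sum of the scores of those documents

    TermStatsAccumulator(String term) {
        this.term = term;
    }

    // Assumption: called once per matching document, with that document's score.
    void record(float score) {
        numDocMatches++;
        scoreSum += score;
    }

    String getTerm() {
        return term;
    }

    int getNumDocMatches() {
        return numDocMatches;
    }

    // Returns 0 when the term matched no documents, which avoids dividing by zero.
    float getAverageScore() {
        return numDocMatches == 0 ? 0f : (float) (scoreSum / numDocMatches);
    }
}
```

The per-document cost is one increment and one addition, which is why I would expect the overhead to be small, although I have not measured it. A term whose accumulator still reports zero matches after the search would be a candidate for spelling correction.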