With regard to Korfut's TermCollector proposal: I do not like the new requirement for all query classes to implement getTerms(). Expressing their high-level logic in primitive terms is effectively what they are already required to do in the query.rewrite() method.
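As a rough illustration of the idea (this is not the code in QueryHighlightExtractor.java), here is a minimal sketch of a getTerms() that rewrites a query first and then only has to understand the primitive forms. The Query, TermQuery, BooleanQuery and PrefixQuery types below are toy stand-ins written for this example, not the Lucene classes, and rewrite() takes a plain list of index terms in place of an index reader; both are assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TermExtractionSketch {

    // Toy stand-in for a query class (hypothetical, NOT the Lucene API).
    interface Query {
        // Express this query using only primitive query types.
        // indexTerms stands in for the index a real rewrite would consult.
        Query rewrite(List<String> indexTerms);
    }

    // Primitive: a single term. Already in its simplest form.
    static final class TermQuery implements Query {
        final String term;
        TermQuery(String term) { this.term = term; }
        public Query rewrite(List<String> indexTerms) { return this; }
    }

    // Primitive: a container of clauses. Rewrites each clause in turn.
    static final class BooleanQuery implements Query {
        final List<Query> clauses = new ArrayList<>();
        BooleanQuery add(Query q) { clauses.add(q); return this; }
        public Query rewrite(List<String> indexTerms) {
            BooleanQuery out = new BooleanQuery();
            for (Query c : clauses) out.add(c.rewrite(indexTerms));
            return out;
        }
    }

    // High-level query: it has no getTerms() of its own, only rewrite().
    static final class PrefixQuery implements Query {
        final String prefix;
        PrefixQuery(String prefix) { this.prefix = prefix; }
        public Query rewrite(List<String> indexTerms) {
            BooleanQuery out = new BooleanQuery();
            for (String t : indexTerms) {
                if (t.startsWith(prefix)) out.add(new TermQuery(t));
            }
            return out;
        }
    }

    // The one place that knows how to pull terms out of a query.
    static Set<String> getTerms(Query query, List<String> indexTerms) {
        Set<String> terms = new LinkedHashSet<>();
        collect(query.rewrite(indexTerms), terms);
        return terms;
    }

    // Walks a rewritten query; only the two primitive types need handling.
    private static void collect(Query primitive, Set<String> terms) {
        if (primitive instanceof TermQuery) {
            terms.add(((TermQuery) primitive).term);
        } else if (primitive instanceof BooleanQuery) {
            for (Query c : ((BooleanQuery) primitive).clauses) collect(c, terms);
        } else {
            throw new IllegalArgumentException(
                "not a primitive query: " + primitive.getClass().getSimpleName());
        }
    }

    public static void main(String[] args) {
        List<String> indexTerms = Arrays.asList("lucene", "luck", "query", "highlight");
        Query q = new BooleanQuery()
            .add(new PrefixQuery("lu"))
            .add(new TermQuery("highlight"));
        System.out.println(getTerms(q, indexTerms)); // prints [lucene, luck, highlight]
    }
}
```

The point of the sketch is that adding another high-level query type means writing only its rewrite(); getTerms() itself does not change.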
I believe the getTerms() implementation should make use of the existing rewrite() behaviour of all query objects (as I have done in QueryHighlightExtractor.java) rather than create a new set of requirements for all query classes. Let's not add complexity where it's not needed.

So I think the real question is: should there be a home for a getTerms() function that operates on primitive (rewritten) queries? We can move some of the logic in QueryHighlightExtractor.java to somewhere in core if the consensus is that this is a generally useful feature (though I have yet to think of a use for it outside of highlighting).

Incidentally, I am busy packaging up a getTopTerms() feature that analyses the contents of query result sets and returns the "significant" terms and phrases found in the result set, based on their frequency there relative to their frequency in the corpus. It's quite effective and of use in query expansion and highlighting. This may be of interest to those proposing query.getTerms() changes.

Cheers
Mark