With regard to Korfut's TermCollector proposal:
I do not like the new requirement for all query classes to implement
getTerms(). Expressing their high-level logic in primitive terms is
effectively what they are already required to do in the
query.rewrite() method.

I believe the getTerms() implementation should make use of this
existing feature of all query objects (as I have done in
QueryHighlightExtractor.java), and not create a new set of
requirements for all query classes. Let's not add complexity where it
is not needed.
So I think the real question is: should there be a home for a
getTerms() function that operates on primitive (rewritten) queries?
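
A minimal sketch of what such a function could look like, under stated
assumptions. The Query, TermQuery, BooleanQuery and PrefixQuery types
below are simplified, hypothetical stand-ins written only for this
illustration; they are not Lucene's actual classes, and the index is
modelled as a plain list of terms. The point is that only rewrite() is
required of each query class, while term collection lives in one
external helper.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; NOT Lucene's real API.
final class TermExtractionSketch {

    /** The only obligation on a query class: rewrite into primitives. */
    interface Query {
        Query rewrite(List<String> indexTerms);
    }

    /** Primitive query: matches a single term, so it rewrites to itself. */
    static final class TermQuery implements Query {
        final String term;
        TermQuery(String term) { this.term = term; }
        public Query rewrite(List<String> indexTerms) { return this; }
    }

    /** Primitive container: rewrites each clause and keeps the structure. */
    static final class BooleanQuery implements Query {
        final List<Query> clauses = new ArrayList<>();
        BooleanQuery add(Query q) { clauses.add(q); return this; }
        public Query rewrite(List<String> indexTerms) {
            BooleanQuery out = new BooleanQuery();
            for (Query q : clauses) out.add(q.rewrite(indexTerms));
            return out;
        }
    }

    /** High-level query: expands to one TermQuery per matching index term. */
    static final class PrefixQuery implements Query {
        final String prefix;
        PrefixQuery(String prefix) { this.prefix = prefix; }
        public Query rewrite(List<String> indexTerms) {
            BooleanQuery out = new BooleanQuery();
            for (String t : indexTerms) {
                if (t.startsWith(prefix)) out.add(new TermQuery(t));
            }
            return out;
        }
    }

    /** Rewrites the query first, then collects terms from primitives only. */
    static Set<String> getTerms(Query query, List<String> indexTerms) {
        Set<String> terms = new LinkedHashSet<>();
        collect(query.rewrite(indexTerms), terms);
        return terms;
    }

    private static void collect(Query q, Set<String> terms) {
        if (q instanceof TermQuery) {
            terms.add(((TermQuery) q).term);
        } else if (q instanceof BooleanQuery) {
            for (Query clause : ((BooleanQuery) q).clauses) collect(clause, terms);
        } else {
            // Anything else means rewrite() did not reduce to primitives.
            throw new IllegalArgumentException("Not a primitive query: " + q);
        }
    }
}
```

With this shape, a new high-level query class needs no getTerms() of
its own: as long as its rewrite() reduces to the primitive types, the
helper handles it unchanged.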

We can move some of the logic in QueryHighlightExtractor.java to
somewhere core if the consensus is that this is a generally useful
feature (though I have yet to think of a use for it outside of
highlighting).

Incidentally, I am busy packaging up a getTopTerms() feature that
analyses the contents of query result sets and returns the
"significant" terms and phrases found in the result set, based on
their frequency there relative to their frequency in the corpus.
It's quite effective and of use in query expansion and highlighting.
This may be of interest to those proposing query.getTerms() changes.
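
To illustrate the general idea of ranking by relative frequency, here
is a rough sketch. It is not the getTopTerms() code being packaged; the
class name, method signature and scoring formula are all assumptions
made for this example. It scores each term as the fraction of result
documents containing it divided by the fraction of corpus documents
containing it, and handles single terms only (no phrases).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the scoring formula is an assumption, not the
// actual getTopTerms() implementation.
final class TopTermsSketch {

    /**
     * Returns up to n terms, ordered by how over-represented they are in
     * the result set compared to the whole corpus.
     *
     * score = (resultDocFreq / resultSize) / (corpusDocFreq / corpusSize)
     */
    static List<String> getTopTerms(Map<String, Integer> resultDocFreq, int resultSize,
                                    Map<String, Integer> corpusDocFreq, int corpusSize,
                                    int n) {
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Integer> e : resultDocFreq.entrySet()) {
            Integer corpusFreq = corpusDocFreq.get(e.getKey());
            if (corpusFreq == null || corpusFreq == 0) {
                continue; // no corpus statistics for this term, so skip it
            }
            double inResults = e.getValue() / (double) resultSize;
            double inCorpus = corpusFreq / (double) corpusSize;
            scores.put(e.getKey(), inResults / inCorpus);
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        // Highest score first.
        ranked.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return new ArrayList<>(ranked.subList(0, Math.min(n, ranked.size())));
    }
}
```

A term that appears in every document of the corpus scores 1.0 however
often it appears in the results, which is why common words drop out of
the top of the list.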

Cheers
Mark

