Having switched the highlighter from lots of Query-specific code to the generic Query.extractTerms API, I realize I have both gained something (support for all query types) and lost something (detailed boost info for each term in the tree, e.g. for Fuzzy spelling variants). The boost info was useful for selecting snippets and grading highlight intensity.
This exercise has led me to the conclusion that extractTerms is not the best way to provide information about queries. I see a clear analogy with the way exceptions are/were implemented in Java: there used to be no standard way of unravelling nested exceptions, and this was solved in JDK 1.4 by adding a "getCause()" method to exceptions to allow progressive unravelling of all exception types. Unfortunately, Query.extractTerms(Set) is a bit like solving the Java nested exceptions problem by providing a method like Throwable.getMessageStrings(Set): it only gives part of the information about the tree elements (i.e. no boost info) and provides no indication of the nested structure.

Maybe we should have as a standard part of Query:

    //immediate child queries only
    Query[] getNestedQueries();

and...

    //immediate terms only
    Term[] getTerms();

A generic highlighter implementation could then:
a) work with any query type
b) more accurately assess the score contribution each term provides, based on its position in the stack and the boosts applied to each parent query on that branch

This doesn't seem a particularly onerous API to implement, and a more feature-rich Query introspection API may well enable other applications such as Query optimizers.

Cheers,
Mark
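P.S. Here is a rough sketch of how a generic highlighter might walk such an API. Everything in it is hypothetical: IntrospectableQuery, SimpleQuery and BoostWalker are stand-ins I made up for illustration, not Lucene classes, and terms are plain Strings rather than Term objects. Multiplying boosts down each branch is only an assumption about how a highlighter might grade terms, not a claim about how Lucene scores.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Query with the two proposed methods.
interface IntrospectableQuery {
    float getBoost();
    List<IntrospectableQuery> getNestedQueries(); // immediate child queries only
    List<String> getTerms();                      // immediate terms only
}

// Minimal toy implementation, just enough to build a query tree.
final class SimpleQuery implements IntrospectableQuery {
    private final float boost;
    private final List<IntrospectableQuery> children;
    private final List<String> terms;

    SimpleQuery(float boost, List<IntrospectableQuery> children, List<String> terms) {
        this.boost = boost;
        this.children = children;
        this.terms = terms;
    }

    // A leaf query holding a single term.
    static SimpleQuery term(String text, float boost) {
        return new SimpleQuery(boost,
                Collections.<IntrospectableQuery>emptyList(),
                Collections.singletonList(text));
    }

    public float getBoost() { return boost; }
    public List<IntrospectableQuery> getNestedQueries() { return children; }
    public List<String> getTerms() { return terms; }
}

final class BoostWalker {
    // Maps each term to the product of the boosts on its branch.
    // If a term appears on several branches, the largest product wins.
    static Map<String, Float> effectiveBoosts(IntrospectableQuery root) {
        Map<String, Float> out = new LinkedHashMap<String, Float>();
        walk(root, 1.0f, out);
        return out;
    }

    private static void walk(IntrospectableQuery q, float parentBoost,
                             Map<String, Float> out) {
        float boost = parentBoost * q.getBoost();
        for (String term : q.getTerms()) {
            out.merge(term, boost, Float::max);
        }
        for (IntrospectableQuery child : q.getNestedQueries()) {
            walk(child, boost, out);
        }
    }
}

final class IntrospectionSketch {
    // Boolean-style parent (boost 2.0) over a plain term and a fuzzy-style
    // query whose spelling variants carry their own boosts.
    static IntrospectableQuery sampleQuery() {
        IntrospectableQuery fuzzy = new SimpleQuery(1.0f,
                Arrays.<IntrospectableQuery>asList(
                        SimpleQuery.term("roam", 1.0f),
                        SimpleQuery.term("foam", 0.5f)),
                Collections.<String>emptyList());
        return new SimpleQuery(2.0f,
                Arrays.<IntrospectableQuery>asList(
                        SimpleQuery.term("lucene", 1.0f), fuzzy),
                Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        System.out.println(BoostWalker.effectiveBoosts(sampleQuery()));
    }
}
```

The point is that the walker needs no knowledge of concrete query types, yet still sees the per-variant boosts that a flat set of terms would drop.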