markharw00d wrote:
I was thinking along the lines of wrapping some core classes such as IndexReader to somehow observe the query matching process and deduce from that what to highlight (avoiding the need for MemoryIndex) but I'm not sure that is viable. It would be nice to get some more match info out of the main query logic as it runs to aid highlighting rather than reverse engineering the basis of a match after the event.
I have been thinking about a way to pursue this, and I do not see a nice solution. Even if you could wrap Query objects or other classes to observe matched tokens (nontrivial, since a Query is only concerned with whether it matches a doc, not which tokens it matches at which positions), you would still have the major problem of deciding which matches to keep information for. It does not seem practical to save enough information to highlight *any* doc after a search, and it also seems unlikely that you would know which docs you wanted to highlight before the search. The only compromise that I can see is storing info to highlight just the first n docs, but even here, while scoring is occurring you do not yet know the return order. Also, there is probably little value in knowing which Tokens matched for highlighting unless you have stored offsets as well.
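To make the "first n docs" compromise concrete, here is a rough sketch of what it might look like. Everything in it is hypothetical: TopNMatchInfo, Hit, and collect are illustrative names, not Lucene classes, and the list of matched terms stands in for whatever match data a wrapped Query could report. The point it shows is that match info has to be gathered for every doc as it is scored, and most of it is thrown away once a better doc pushes it out of the top n.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not a Lucene API: keeps match info only for the
// current top-n docs while scoring is still in progress.
final class TopNMatchInfo {

    // Hypothetical holder for one scored doc and the match data observed for it.
    static final class Hit {
        final int docId;
        final float score;
        final List<String> matchedTerms;

        Hit(int docId, float score, List<String> matchedTerms) {
            this.docId = docId;
            this.score = score;
            this.matchedTerms = matchedTerms;
        }
    }

    private final int n;
    // Min-heap on score, so the weakest of the current top-n is at the head.
    private final PriorityQueue<Hit> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));

    TopNMatchInfo(int n) {
        this.n = n;
    }

    // Called once per scored doc. The match data must already have been
    // collected by the caller, even if the doc is discarded right here.
    void collect(int docId, float score, List<String> matchedTerms) {
        if (heap.size() < n) {
            heap.add(new Hit(docId, score, matchedTerms));
        } else if (n > 0 && score > heap.peek().score) {
            heap.poll(); // evict the weakest hit along with its match info
            heap.add(new Hit(docId, score, matchedTerms));
        }
    }

    // Returns the retained hits, best score first.
    List<Hit> topHits() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return hits;
    }
}
```

Memory stays bounded by n, but the cost of observing matches is still paid for every doc the search scores, and without stored offsets the retained terms would still not say where to highlight.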
Unless someone has any suggestions on how to accomplish this, I think time would be better spent improving the existing Highlighter framework.
Perhaps Ronnie's Highlighter should be added as an alternate Highlighter that is less feature-rich but much faster on large docs. It looks to me like there is unlikely to be a faster highlighting method for simple, non-position-aware highlighting.
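For anyone unsure what "non-position-aware" means here, a minimal sketch follows. It is not Ronnie's Highlighter and uses no Lucene classes; SimpleTermHighlighter, its regex-based word splitting, and the <b> markup are all assumptions made for illustration. It marks every word that appears in the set of query terms and ignores phrase and span constraints entirely, which is why this style of highlighting can be done in a single pass over the text.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: not Ronnie's Highlighter, not a Lucene API.
final class SimpleTermHighlighter {

    // Crude stand-in for an analyzer: a "token" is any run of word characters.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Wraps each word whose lowercased form is in queryTerms with <b>...</b>.
    // Term positions are never consulted, so a phrase query would have each
    // of its words highlighted wherever they occur.
    static String highlight(String text, Set<String> queryTerms) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        Matcher m = WORD.matcher(text);
        int last = 0; // end of the last region already copied to out
        while (m.find()) {
            String word = m.group();
            if (queryTerms.contains(word.toLowerCase(Locale.ROOT))) {
                out.append(text, last, m.start());
                out.append("<b>").append(word).append("</b>");
                last = m.end();
            }
        }
        out.append(text, last, text.length());
        return out.toString();
    }
}
```

The query terms are expected to be lowercased already. A real implementation would run the same analyzer that was used at index time rather than a regex.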
- Mark