Adding support for phrases could be tricky. So far I have deliberately avoided reimplementing specialized highlighting logic for each of the different query types, e.g. understanding the nuances of the "slop factor" in phrase queries. I may be wrong, but adding specialized support for different query types just feels like the start of a slippery slope.
If people are keen to add such support though, here are some pointers to bear in mind...

Remember that the highlighter is also designed to summarize docs by selecting the best fragments. One decision to be made up front is whether a special "Fragmenter" implementation is required that uses the query to influence the way it breaks the doc into fragments, i.e. it ensures that matching words in phrase queries or span queries remain in the same fragment. If phrase matches are allowed to span fragments, thought needs to be given to how the fragments are scored.

Do phrases/spans get marked up with one tag, e.g. <B>My Phrase</B>, or many, e.g. <B>My</B> <B>Phrase</B>? I expect "many" is the answer, given the possibility of other query terms appearing intermingled in a phrase with a high slop factor or in a span.

The position of terms in the phrases will need to be known by the Formatter implementation before attempting to mark up the text. This could/should be done using position info in the Lucene index rather than requiring a separate analyzer pass over the original text.

Most of this should be achievable using specialized implementations of Formatter, Fragmenter and Scorer, so the main Highlighter code should be untouched.

These are just some of the "gotchas" off the top of my head. I'm sure there will be several more issues waiting to be revealed...

Hope this helps anyway.

Cheers
Mark
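
To make the fragmenter point a little more concrete, here is a rough sketch in plain Java of the "keep phrase matches in one fragment" decision. It is only an illustration under stated assumptions: the class and method names (PhraseAwareFragmenterSketch, tokenStarts, fragmentStarts, insideMatch) are hypothetical and are not part of the Lucene highlighter API, the whitespace tokenizer stands in for whatever supplies real token offsets, and the phrase matches are assumed to be already known as inclusive token-index ranges.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch only; none of these names come from Lucene. */
class PhraseAwareFragmenterSketch {

    /** Start offset of each whitespace-delimited token (stand-in for real token offsets). */
    static int[] tokenStarts(String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile("\\S+").matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        int[] out = new int[starts.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = starts.get(i);
        }
        return out;
    }

    /**
     * Returns the token indexes at which a new fragment starts.
     * A break is made once the current fragment covers at least targetSize
     * characters, unless the break would split a match, in which case it is
     * deferred to the first token after that match.
     * Each entry of matches is {firstTokenIndex, lastTokenIndex}, inclusive.
     */
    static List<Integer> fragmentStarts(int[] tokenStarts, int[][] matches, int targetSize) {
        List<Integer> starts = new ArrayList<>();
        if (tokenStarts.length == 0) {
            return starts;
        }
        starts.add(0);
        int fragmentStartOffset = tokenStarts[0];
        for (int i = 1; i < tokenStarts.length; i++) {
            boolean longEnough = tokenStarts[i] - fragmentStartOffset >= targetSize;
            if (longEnough && !insideMatch(i, matches)) {
                starts.add(i);
                fragmentStartOffset = tokenStarts[i];
            }
        }
        return starts;
    }

    /** True if breaking before token i would separate it from an earlier token of the same match. */
    static boolean insideMatch(int i, int[][] matches) {
        for (int[] m : matches) {
            if (i > m[0] && i <= m[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumps over the lazy dog";
        int[] starts = tokenStarts(text);
        int[][] phraseMatches = { {2, 3} }; // tokens 2..3 = "brown fox"
        System.out.println(fragmentStarts(starts, phraseMatches, 15));
    }
}
```

With a target size of 15 characters, the break that would otherwise fall on "fox" is pushed back to "jumps", so "brown fox" stays in one fragment. The cost is uneven fragment sizes, and a match with a very high slop factor could stretch a fragment a long way, which is one reason the scoring question above still needs an answer.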