I need to take an HTML page that I retrieve from my Lucene search and 
highlight all of the terms that are part of the search.  I need to skip over 
the HTML tags themselves, since I don't want tag or attribute text that 
happens to match the search to be highlighted.

Note that I don't want fragments of the document.  I need to highlight all 
matching terms in the document (with a <span> or something similar) and get 
back the entire document (with the new <span>s) so it can be displayed in 
full with the search terms highlighted.
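
To make the requirement concrete, here is a rough sketch of the behavior I'm after, in plain Java with no Lucene calls at all. The class name HtmlTermHighlighter, the highlight method, and the "hit" CSS class are made up for illustration, and the regex is a simplification: it assumes well-formed tags and does not handle comments, script blocks, a '>' inside an attribute value, or character entities.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: wraps matching words in <span>
// while copying HTML tags through untouched.
class HtmlTermHighlighter {
    // Group 1 matches a whole tag; group 2 matches a run of word characters.
    // The tag alternative comes first, so words inside tags are never seen alone.
    private static final Pattern TAG_OR_WORD = Pattern.compile("(<[^>]*>)|(\\w+)");

    // 'terms' is assumed to hold lowercase search terms.
    static String highlight(String html, Set<String> terms) {
        Matcher m = TAG_OR_WORD.matcher(html);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy whatever sits between the previous match and this one.
            out.append(html, last, m.start());
            String word = m.group(2);
            if (word != null && terms.contains(word.toLowerCase(Locale.ROOT))) {
                out.append("<span class=\"hit\">").append(word).append("</span>");
            } else {
                // A tag or a non-matching word: emit unchanged.
                out.append(m.group());
            }
            last = m.end();
        }
        // Copy the tail after the final match.
        out.append(html, last, html.length());
        return out.toString();
    }
}
```

With the search term "lucene", the input <p class="lucene">Lucene rocks</p> should come back with only the visible word wrapped and the class attribute left alone. This does plain word matching only; it does not apply the analyzer used at index time, which is part of why I'm asking.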

Last time I did this (in the days of Lucene 1.4.2, so a while ago), I had to 
write a custom tokenizer that skipped over the HTML tags so that I didn't 
accidentally highlight them.  I'm hoping that there is an easier way to do 
this now.

Suggestions?
