I've just released an open source highlighter for XML documents, written in Java, that works great for Lucene search results. You can download a copy at http://www.iandallas.com/projects/xmlhighlighter/.
The program uses regular expressions to search a set of DOM nodes and transparently handles highlighting matches that span multiple elements. For example, if you had: <LINE>I am as vigilant</LINE> <STAGE-DIRECTION>Enter MESSENGER</STAGE-DIRECTION> <LINE>as a cat to steal cream</LINE> You could extract just the <LINE> nodes and the highlighter would correctly match the phrase "vigilant as a cat". Highlight events are passed to a user defined highlighter for processing, and events are generated for each node affected, which makes it easy to avoid problems with interleaving tags. For example, the XMLHighlightListenerImpl class included in the release inserts "<B>" tags around highlighted text, which would produce: <LINE>I am as <B>vigilant</B></LINE> <STAGE-DIRECTION>Enter MESSENGER</STAGE-DIRECTION> <LINE><B>as a cat</B> to steal cream</LINE> The current version is 0.8. I'm hoping to release version 1.0 in about a month, so if anyone has any feature requests, bug reports, etc, I'd love to hear them. Thanks, --ian / [EMAIL PROTECTED] -- To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@;jakarta.apache.org> For additional commands, e-mail: <mailto:lucene-user-help@;jakarta.apache.org>
