I've just released an open-source highlighter for XML documents, written in
Java, that works great for Lucene search results. You can download a copy at
http://www.iandallas.com/projects/xmlhighlighter/.

The program uses regular expressions to search a set of DOM nodes and
transparently handles highlighting matches that span multiple elements.

For example, if you had:
<LINE>I am as vigilant</LINE>
  <STAGE-DIRECTION>Enter MESSENGER</STAGE-DIRECTION>
<LINE>as a cat to steal cream</LINE>

You could extract just the <LINE> nodes and the highlighter would correctly
match the phrase "vigilant as a cat".
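
To make the idea concrete, here is a minimal sketch of matching across
extracted nodes using only the JDK's DOM and regex classes. It is not the
highlighter's own code or API: the class and method names are hypothetical,
the <SPEECH> wrapper element is added only so the fragment parses, and
joining node texts with a single space is an assumption about how node
boundaries might be treated.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the released highlighter.
class SpanningMatchSketch {

    /** Returns the text of every element named tagName, in document order. */
    static List<String> extractText(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /**
     * Joins the node texts with a single space (an assumption of this sketch)
     * and returns the first regex match, or null if there is none.
     */
    static String firstMatch(List<String> texts, String regex) {
        String joined = String.join(" ", texts);
        Matcher m = Pattern.compile(regex).matcher(joined);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String xml = "<SPEECH>"
                + "<LINE>I am as vigilant</LINE>"
                + "<STAGE-DIRECTION>Enter MESSENGER</STAGE-DIRECTION>"
                + "<LINE>as a cat to steal cream</LINE>"
                + "</SPEECH>";
        // Only the <LINE> nodes are searched, so the stage direction
        // cannot interrupt the phrase.
        System.out.println(firstMatch(extractText(xml, "LINE"), "vigilant as a cat"));
    }
}
```

Because the stage direction is never extracted, the phrase is found even
though its two halves live in different elements.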

Highlight events are passed to a user-defined highlighter for processing.
One event is generated for each node a match touches, which makes it easy
to avoid problems with interleaving tags. For example, the
XMLHighlightListenerImpl class included in the release inserts "<B>" tags
around highlighted text, which would produce:

<LINE>I am as <B>vigilant</B></LINE>
  <STAGE-DIRECTION>Enter MESSENGER</STAGE-DIRECTION>
<LINE><B>as a cat</B> to steal cream</LINE>
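
The per-node behaviour can be sketched roughly as follows. This is an
illustration of the general technique, not the release's implementation:
SegmentListener, BOLD and highlight are hypothetical names (the real
listener interface may look quite different), node texts are joined with a
single space by assumption, and only the first match is handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not the API of the released highlighter.
class PerNodeHighlightSketch {

    /** Hypothetical callback: one call per node that a match touches. */
    interface SegmentListener {
        String onHighlight(String nodeText, int start, int end);
    }

    /** Listener that wraps the matched range of a node's text in <B> tags. */
    static final SegmentListener BOLD = (text, start, end) ->
            text.substring(0, start) + "<B>" + text.substring(start, end)
                    + "</B>" + text.substring(end);

    /** Highlights the first match of regex, splitting it at node boundaries. */
    static List<String> highlight(List<String> nodeTexts, String regex,
                                  SegmentListener listener) {
        // Join the texts with single spaces, remembering where each node starts.
        StringBuilder joined = new StringBuilder();
        int[] starts = new int[nodeTexts.size()];
        for (int i = 0; i < nodeTexts.size(); i++) {
            if (i > 0) {
                joined.append(' ');
            }
            starts[i] = joined.length();
            joined.append(nodeTexts.get(i));
        }

        List<String> result = new ArrayList<>(nodeTexts);
        Matcher m = Pattern.compile(regex).matcher(joined);
        if (!m.find()) {
            return result; // no match: texts are returned unchanged
        }

        // Clip the match to each node's range; fire one event per overlap.
        for (int i = 0; i < nodeTexts.size(); i++) {
            int nodeStart = starts[i];
            int nodeEnd = nodeStart + nodeTexts.get(i).length();
            int from = Math.max(m.start(), nodeStart);
            int to = Math.min(m.end(), nodeEnd);
            if (from < to) {
                result.set(i, listener.onHighlight(
                        nodeTexts.get(i), from - nodeStart, to - nodeStart));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("I am as vigilant");
        lines.add("as a cat to steal cream");
        for (String line : highlight(lines, "vigilant as a cat", BOLD)) {
            System.out.println("<LINE>" + line + "</LINE>");
        }
    }
}
```

Because each node gets its own start/end pair, the inserted tags always
open and close inside a single element and never straddle a boundary.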

The current version is 0.8. I'm hoping to release version 1.0 in about a
month, so if anyone has feature requests, bug reports, etc., I'd love to
hear them.

Thanks,
--ian / [EMAIL PROTECTED]


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>