I work with a lot of XML data sources and have needed to implement an
analysis chain for Solr/Lucene that accepts XML. While doing that, I
found I needed something very much like HTMLStripCharFilter, but one that
does standard XML parsing (for example, it understands XML entities
defined in an internal or external DTD). So I wrote XmlCharFilter, which
uses the Woodstox XML parser (already used by Solr). I think this could
be useful for others, and it would be nice for me if it were committed
here, so I'd like to contribute it. Should I open a JIRA issue for this? Is
there anybody who can spare the time to review it? It is basically one
class (plus a factory class) and has a fairly complete set of tests.
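
As a rough illustration of the parsing behavior described above (this is
not the XmlCharFilter code itself), here is a minimal sketch that uses
the standard StAX API, which Woodstox implements, to pull the character
text out of a document while expanding an entity declared in an internal
DTD. The class name XmlTextSketch and the method extractText are
hypothetical, the sketch runs against whatever StAX implementation is on
the classpath, and it leaves out the offset correction that a real
CharFilter has to perform.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only: not the contributed XmlCharFilter.
public class XmlTextSketch {

    // Returns the concatenated character content of the document,
    // with entity references replaced by their declared values.
    static String extractText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Parse the DTD so that entities declared in it are known.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        // Assumption for this sketch: external entities are not fetched.
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        // Merge adjacent text and CDATA into single CHARACTERS events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        StringBuilder text = new StringBuilder();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<!DOCTYPE doc [<!ENTITY co \"iFactory\">]>"
                + "<doc><title>Made by &co;</title></doc>";
        System.out.println(extractText(xml));
    }
}
```

A filter that only strips tags would leave the &co; reference
unresolved, which is the gap a real XML parser closes.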
-Mike Sokolov
Engineering Director
iFactory.com