I work with a lot of XML data sources and have needed to implement an analysis chain for Solr/Lucene that accepts XML. In the course of doing that, I found I needed something very much like HTMLStripCharFilter, but one that does standard XML parsing (for example, it understands XML entities defined in an internal or external DTD). So I wrote XmlCharFilter, which uses the Woodstox XML parser (already used by Solr).

I think this could be useful to others, and it would be nice for me if it were committed here, so I'd like to contribute it. Should I open a JIRA issue for this? Is there anybody who can spare the time to review it? It is basically one class (plus a factory class) and has a fairly complete set of tests.
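To illustrate the kind of parsing meant here, the following is a minimal sketch that uses the standard StAX API (javax.xml.stream, which Woodstox also implements) to pull the character data out of a document while expanding an entity declared in an internal DTD. It is not the XmlCharFilter code: the class name XmlTextSketch, the method extractText, and the sample document are hypothetical, and the sketch does none of the offset correction that a real Lucene CharFilter has to perform.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only; not the contributed XmlCharFilter.
final class XmlTextSketch {

    // Returns the concatenated character data of the document, with
    // markup removed and entity references replaced by their values.
    static String extractText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to expand entity references (such as those
        // declared in an internal DTD) instead of reporting them.
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        // Merge adjacent text and CDATA into single CHARACTERS events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample input with one entity defined in an internal DTD subset.
        String xml = "<!DOCTYPE doc [<!ENTITY co \"iFactory\">]>"
                + "<doc>Built at <b>&co;</b> &amp; indexed</doc>";
        System.out.println(extractText(xml));
    }
}
```

An HTML-oriented stripper has no way to know what &co; stands for, whereas a real XML parser reads the declaration from the DTD and substitutes the value. The sketch only shows that difference; it discards the mapping between output and input positions, which is exactly what a char filter needs to keep so that highlighting offsets stay correct.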

-Mike Sokolov
Engineering Director
iFactory.com


