Just out of curiosity, why does LUCENE-1377 have a minor priority? https://issues.apache.org/jira/browse/LUCENE-1377
Don't people index, filter, and search HTML, perhaps more than any other format? It looks like Solr has moved from a Reader to a CharFilter:
http://lucene.apache.org/solr/api/org/apache/solr/analysis/HTMLStripCharFilter.html
https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/java/org/apache/solr/analysis/HTMLStripCharFilter.java

If the Lucene developers prefer a contrib package, or prefer to keep such an extension in Solr, the issue should probably be closed. I'm not sure I follow the discussion in JIRA, since Solr developers can choose whether or not to use any class added to Lucene at any time after its addition.

Thanks for any feedback,
Justin
