[ https://issues.apache.org/jira/browse/LUCENE-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193105#comment-13193105 ]

Yonik Seeley commented on LUCENE-3721:
--------------------------------------

OK, so it looks like all CharFilters in Solr were broken by LUCENE-3396 (since
last September!).
I just checked in a fix and added a test.
Thanks for bringing this to our attention, Mike!
                
> HTMLStripCharFilterFactory behavior is different in Solr4 than it was in Solr 3.x
> ---------------------------------------------------------------------------------
>
>                 Key: LUCENE-3721
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3721
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Mike Hugo
>         Attachments: htmlstripfilter_test.patch
>
>
> In Solr3, with the attached configuration, HTML entities like trademark and
> registered were stripped (and NOT indexed) by the
> HTMLStripCharFilterFactory. In Solr4, those values appear to still make it
> through to the index and then show up in faceted results (we'd like them
> not to).
> See
> http://lucene.472066.n3.nabble.com/HTMLStripCharFilterFactory-not-working-in-Solr4-td3685599.html
> for background.
> Possibly related to https://issues.apache.org/jira/browse/LUCENE-3690
