[ https://issues.apache.org/jira/browse/SOLR-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096839#comment-13096839 ]
Shem M commented on SOLR-1343:
------------------------------
Is there a reason why the filter replaces inline text tags such as <b> or <i> with a space?
From these comments in the code, it looks like it did not behave this way in the past:
//break;//was
//return whitespace from
It makes life a lot harder when I have, for example, this text:
Some t<b>ex</b>t here
and I want to find "text"
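To make the difference concrete, here is a minimal sketch of the two behaviours. It does not use the real HTMLStripCharFilter; the class TagStripDemo, its two methods, and the simplified regex are hypothetical stand-ins, assumed only for illustration:

```java
import java.util.regex.Pattern;

// Hypothetical illustration only: this is NOT the HTMLStripCharFilter code.
class TagStripDemo {
    // Simplified tag matcher (an assumption); real HTML needs a proper parser.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Behaviour described above: every tag becomes a single space.
    static String stripWithSpace(String html) {
        return TAG.matcher(html).replaceAll(" ");
    }

    // Alternative behaviour: tags are removed without leaving a gap.
    static String stripWithoutSpace(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "Some t<b>ex</b>t here";
        System.out.println(stripWithSpace(input));    // Some t ex t here
        System.out.println(stripWithoutSpace(input)); // Some text here
    }
}
```

With the first variant a whitespace-based tokenizer would see "t", "ex" and "t" as separate tokens, so a search for "text" finds nothing; with the second it would see "text".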
> HTMLStripCharFilter
> -------------------
>
> Key: SOLR-1343
> URL: https://issues.apache.org/jira/browse/SOLR-1343
> Project: Solr
> Issue Type: Improvement
> Components: Schema and Analysis
> Affects Versions: 1.4
> Reporter: Koji Sekiguchi
> Assignee: Koji Sekiguchi
> Priority: Trivial
> Fix For: 1.4
>
> Attachments: SOLR-1343.patch
>
>
> Introducing HTMLStripCharFilter:
> * move html strip logic from HTMLStripReader to HTMLStripCharFilter
> * make HTMLStripReader deprecated
> * make HTMLStrip*TokenizerFactory deprecated