[ https://issues.apache.org/jira/browse/SOLR-12808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628521#comment-16628521 ]
Federico Grillini commented on SOLR-12808:
------------------------------------------
I filed this bug because the official documentation says:
{quote}
CharFilters can be chained like Token Filters and placed in front of a
Tokenizer. CharFilters can add, change, or remove characters while preserving
the original character offsets to support features like highlighting.
{quote}
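As a rough illustration of what "preserving the original character offsets" would mean for this report, the sketch below applies the two charFilter patterns from the fieldType quoted further down to a sample string and compares token offsets in the filtered text with the offsets the same tokens have in the original text. Several things here are assumptions and not taken from Solr: the sample text is simply the query terms reused as hypothetical field content, Python's re module stands in for Java regular expressions, a whitespace split stands in for StandardTokenizer, and the helper expected_spans is made up for the illustration. It is not a diagnosis of what the highlighter does internally.

```python
import re

# Hypothetical field content: the query terms from the report reused as text.
original = "00031665 0035 9"

# The two charFilter patterns from the fieldType, applied in order.
# Python's re stands in for java.util.regex; r"\1" plays the role of "$1".
step1 = re.sub(r"^0*([0-9]+\s+[0-9]+\s+[0-9]+)$", r" \1", original)
filtered = re.sub(r"\s+", " ", step1)

# A whitespace split approximates StandardTokenizer for this digits-only input.
tokens = filtered.split()


def expected_spans(text, tokens):
    """Locate each token in text, left to right, as (token, start, end)."""
    spans = []
    pos = 0
    for tok in tokens:
        start = text.index(tok, pos)
        end = start + len(tok)
        spans.append((tok, start, end))
        pos = end
    return spans


# Offsets in the filtered text (what the tokenizer sees) ...
filtered_spans = expected_spans(filtered, tokens)
# ... versus offsets in the original text (what a highlighter needs).
original_spans = expected_spans(original, tokens)

for (tok, fs, fe), (_, os_, oe) in zip(filtered_spans, original_spans):
    print(f"{tok!r}: filtered [{fs},{fe})  original [{os_},{oe})")
```

For this sample the two sets of offsets differ by two characters, because the first pattern replaces three leading zeros with a single space. Any offsets that are not mapped back correctly to the original text would therefore put the highlight markers around the wrong characters.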
> Wrong highlighting using PatternReplaceCharFilterFactory
> --------------------------------------------------------
>
> Key: SOLR-12808
> URL: https://issues.apache.org/jira/browse/SOLR-12808
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: highlighter
> Affects Versions: 7.2.1, 7.4, 7.5
> Environment: Java: Oracle Corporation Java HotSpot(TM) 64-Bit Server
> VM 1.8.0_162 25.162-b12
> OS: Linux Debian 8.11
> Reporter: Federico Grillini
> Priority: Major
> Attachments: text_analysis.png
>
>
> Hi,
> the default highlighter seems to produce wrong highlights when used with
> PatternReplaceCharFilterFactory.
> My query is: {{verb_esame_num_tnv:(00031665 0035 9)}}
> The field type used by the field "verb_esame_num_tnv" is:
> {code:xml}
> <fieldType name="text_num_verbale" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 pattern="^0*([0-9]+\s+[0-9]+\s+[0-9]+)$" replacement=" $1"/>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 pattern="\s+" replacement=" "/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>   </analyzer>
> </fieldType>
> {code}
> I've attached a screenshot of the text analysis.
> It seems that the highlighter applies the wrong offsets to the original text
> when highlighting the matched tokens.
> Hope this helps.
> Regards.