[
https://issues.apache.org/jira/browse/SOLR-8212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002447#comment-15002447
]
David Smiley commented on SOLR-8212:
------------------------------------
Do the Postings or FastVector highlighters work properly for you? I know they
don't have this specific deficiency, but I'm wondering whether they highlight
NGram-based analysis the same way as the Standard highlighter.
https://cwiki.apache.org/confluence/display/solr/Highlighting
Note that the Postings highlighter effectively only supports
{{hl.usePhraseHighlighter=false}} at this time.
> Standard Highlighter Inconsistent with NGram Tokenizer
> ------------------------------------------------------
>
> Key: SOLR-8212
> URL: https://issues.apache.org/jira/browse/SOLR-8212
> Project: Solr
> Issue Type: Bug
> Reporter: Esther Quansah
> Priority: Minor
> Attachments: SOLR-8212.patch
>
>
> The Standard Highlighter behaves inconsistently on terms in fields that
> use the NGram Tokenizer. Example:
> I created a field called "title_contains" which uses the NGram Tokenizer and
> I indexed the term "bronchoscopy". Querying "co" on the title_contains field
> should return "bronchos<em>co</em>py", but the Standard highlighter returns
> "bronchoscopy" without the highlighting information.
> I created a test called testNgram() which tests the above example using (1)
> the Standard Highlighter on the ngram field type and (2) the Fast Vector
> Highlighter on the ngram field type. The first fails and the second passes.
> Problem identified: MAX_NUM_TOKENS_PER_GROUP = 50 (in TokenGroup.java), and
> for some terms numTokens >= 50. This causes incorrect match start and end
> offsets, and therefore no highlighting on the found term.
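
To make the token-count arithmetic in the report concrete, here is a minimal
sketch in plain Java with no Lucene dependency. It is not the code from
TokenGroup.java or from the attached patch. The class name, the helper
methods, the gram sizes (1 up to the term length), and the emission order
(by start offset, then by length) are all assumptions made for illustration;
the issue does not state how title_contains is configured. Only the cap of
50 and the example term come from the report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene or Solr.
class NGramGroupSketch {
    // Value quoted in the issue for TokenGroup's per-group cap.
    static final int MAX_NUM_TOKENS_PER_GROUP = 50;

    /**
     * Returns [start, end) offsets for every n-gram of text whose length is
     * in [minGram, maxGram], ordered by start offset and then by length.
     * This ordering is an assumption, not a statement about NGramTokenizer.
     */
    static List<int[]> ngramOffsets(String text, int minGram, int maxGram) {
        List<int[]> offsets = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= text.length(); len++) {
                offsets.add(new int[] {start, start + len});
            }
        }
        return offsets;
    }

    /** Returns the 0-based position of the first n-gram equal to query, or -1. */
    static int indexOfGram(String text, List<int[]> offsets, String query) {
        for (int i = 0; i < offsets.size(); i++) {
            int[] o = offsets.get(i);
            if (text.substring(o[0], o[1]).equals(query)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String term = "bronchoscopy";
        // Assumed gram sizes: 1 up to the full term length.
        List<int[]> grams = ngramOffsets(term, 1, term.length());
        int hit = indexOfGram(term, grams, "co");
        System.out.println("n-grams produced: " + grams.size());
        System.out.println("position of \"co\": " + hit);
        System.out.println("past the cap: " + (hit >= MAX_NUM_TOKENS_PER_GROUP));
    }
}
```

Under these assumptions the 12-character term yields 78 overlapping n-grams,
and "co" (offsets 8 to 10) is the 70th of them, so it falls past a 50-token
cap. With smaller gram sizes (for example 1 to 2) the total is only 23, which
would fit with the report that only some terms are affected.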