[ 
https://issues.apache.org/jira/browse/SOLR-8212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005982#comment-15005982
 ] 

Esther Quansah commented on SOLR-8212:
--------------------------------------

FastVector works properly with NGram, but it does not follow the same process 
as the Standard highlighter (i.e., FastVector has no 
MAX_NUM_TOKENS_PER_GROUP limit). The Postings highlighter isn't working 
properly, though: it returns the same result as the Standard highlighter (the 
full term without highlight formatting), and it goes through the exact same 
process as the Standard highlighter (with the token count per group equalling 
or exceeding 50). 
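
To illustrate how a single short term can reach the limit of 50, here is a 
minimal sketch that enumerates the n-grams of "bronchoscopy" in plain Java, 
without Lucene. The gram sizes (minGram = 2, maxGram = 15) are assumptions, 
since the field configuration isn't given in this issue, and the NGramCount 
class and ngrams() helper are hypothetical names used only for this 
illustration. It also assumes that the overlapping n-grams of one term are 
counted in the same token group.

```java
import java.util.ArrayList;
import java.util.List;

public class NGramCount {
    // Limit quoted in this issue (defined in TokenGroup.java).
    static final int MAX_NUM_TOKENS_PER_GROUP = 50;

    /** Returns every substring of term whose length is in [minGram, maxGram]. */
    static List<String> ngrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < term.length(); start++) {
            for (int len = minGram;
                 len <= maxGram && start + len <= term.length();
                 len++) {
                grams.add(term.substring(start, start + len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Hypothetical gram sizes; the real field configuration may differ.
        int minGram = 2;
        int maxGram = 15;
        List<String> grams = ngrams("bronchoscopy", minGram, maxGram);
        System.out.println("n-grams produced: " + grams.size());
        System.out.println("reaches the limit: "
                + (grams.size() >= MAX_NUM_TOKENS_PER_GROUP));
    }
}
```

With these assumed sizes, the 12-character term produces 66 n-grams, which is 
already past the limit of 50. 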

> Standard Highlighter Inconsistent with NGram Tokenizer
> ------------------------------------------------------
>
>                 Key: SOLR-8212
>                 URL: https://issues.apache.org/jira/browse/SOLR-8212
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Esther Quansah
>            Priority: Minor
>         Attachments: SOLR-8212.patch
>
>
> I'm noticing inconsistent behavior when the Standard Highlighter is used on 
> fields that use the NGram Tokenizer. Example: 
> I created a field called "title_contains" which uses the NGram Tokenizer and 
> I indexed the term "bronchoscopy". Querying "co" on the title_contains field 
> should return "bronchos<em>co</em>py", but the Standard highlighter returns 
> "bronchoscopy" without the highlighting information.
> I created a test called testNgram() which tests the above example using (1) 
> the Standard Highlighter on the ngram field type and (2) the Fast Vector 
> Highlighter on the ngram field type. The first fails and the second passes. 
> Problem identified: MAX_NUM_TOKENS_PER_GROUP = 50 (in TokenGroup.java), and 
> for some terms numTokens >= 50. This causes incorrect match start and end 
> offsets and therefore no highlighting on the found term. 



