[ 
https://issues.apache.org/jira/browse/SOLR-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401070#comment-16401070
 ] 

David Smiley commented on SOLR-9935:
------------------------------------

Yes, it's resolved. What's not obvious is that the semantics are not identical 
between the UH and the original Highlighter (OH), and I forget what the FVH 
does here. The OH breaks at the word break following {{hl.fragsize}} chars, 
whereas the UH breaks at the sentence (not word) break. Technically the UH's 
choice is configurable via {{hl.bs.type}}, but as a practical matter it 
probably doesn't make sense to use {{WORD}} or {{CHAR}}, since then the 
highlights would never contain any words to the left of the highlighted word 
(based on how the UH uses the underlying BreakIterator).
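
To make the word-break vs. sentence-break difference concrete, here is a 
minimal sketch that uses only the JDK's java.text.BreakIterator. It is not the 
code of either highlighter; the class name, method names, and sample text are 
made up for illustration, and it only shows where the first word or sentence 
boundary falls after a given number of chars.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Illustration only: this is NOT the highlighters' actual code.
// All names here are hypothetical.
class FragmentBreakSketch {

    // Returns the first boundary reported by `bi` strictly after `fragsize`
    // chars, or the end of the text if there is no such boundary.
    static int firstBreakAfter(BreakIterator bi, String text, int fragsize) {
        if (fragsize >= text.length()) {
            return text.length();
        }
        bi.setText(text);
        int end = bi.following(fragsize);
        return end == BreakIterator.DONE ? text.length() : end;
    }

    // Roughly the behavior described for the OH: next word break.
    static int wordBreakAfter(String text, int fragsize) {
        return firstBreakAfter(BreakIterator.getWordInstance(Locale.ROOT), text, fragsize);
    }

    // Roughly the behavior described for the UH: next sentence break.
    static int sentenceBreakAfter(String text, int fragsize) {
        return firstBreakAfter(BreakIterator.getSentenceInstance(Locale.ROOT), text, fragsize);
    }

    public static void main(String[] args) {
        String text = "Solr supports highlighting. The unified highlighter is fast.";
        int fragsize = 10; // offset 10 falls inside the word "supports"
        System.out.println("[" + text.substring(0, wordBreakAfter(text, fragsize)) + "]");
        System.out.println("[" + text.substring(0, sentenceBreakAfter(text, fragsize)) + "]");
    }
}
```

With the same fragsize, the word-based fragment stops at the end of the 
current word, while the sentence-based fragment runs on to the end of the 
sentence, so it can be noticeably longer than the requested size.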

Fragmenting is a rather difficult problem, I've found.  It's hard to satisfy 
everyone's desires.

> When hl.method=unified add support for hl.fragsize param
> --------------------------------------------------------
>
>                 Key: SOLR-9935
>                 URL: https://issues.apache.org/jira/browse/SOLR-9935
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: highlighter
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>             Fix For: 6.4
>
>         Attachments: SOLR_9935_UH_fragsize.patch, SOLR_9935_UH_fragsize.patch
>
>
> In LUCENE-7620 the UnifiedHighlighter is getting a BreakIterator that allows 
> it to support the equivalent of Solr's {{hl.fragsize}}.  So let's support this 
> on the Solr side.



