[
https://issues.apache.org/jira/browse/SOLR-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401070#comment-16401070
]
David Smiley commented on SOLR-9935:
------------------------------------
Yes, it's resolved. What's not obvious is that the semantics are not identical
between the UH and the original Highlighter (and I forget what the FVH does
here). The OH breaks at the word break following {{hl.fragsize}} chars, whereas
the UH does so at the sentence (not word) break. Technically the UH's choice is
configurable via {{hl.bs.type}}, but as a practical matter it probably doesn't
make sense to use {{WORD}} or {{CHAR}}, since then the highlights would never
contain any words to the left of the highlighted word (based on how the UH uses
the underlying BreakIterator).
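
To make the word-break versus sentence-break difference concrete, here is a minimal sketch using only the JDK's {{java.text.BreakIterator}}. It is not Solr or Lucene code, and it does not reproduce how either highlighter actually builds fragments. The class and method names ({{FragmentBreakSketch}}, {{cut}}, {{wordCut}}, {{sentenceCut}}) are hypothetical, and the sketch assumes a fragment that starts at offset 0 and ends at the first boundary after {{fragsize}} chars.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Illustrative only: NOT Solr/Lucene code. All names here are hypothetical.
class FragmentBreakSketch {

    // Returns the text from offset 0 up to the first boundary that the given
    // BreakIterator reports after `fragsize` characters (trailing space trimmed).
    static String cut(String text, int fragsize, BreakIterator bi) {
        int goal = Math.max(0, fragsize);
        if (goal >= text.length()) {
            return text; // shorter than the goal: nothing to cut
        }
        bi.setText(text);
        int end = bi.following(goal); // first boundary strictly after `goal`
        if (end == BreakIterator.DONE) {
            end = text.length();
        }
        return text.substring(0, end).trim();
    }

    // Assumed analogue of "break at the word break following fragsize chars".
    static String wordCut(String text, int fragsize) {
        return cut(text, fragsize, BreakIterator.getWordInstance(Locale.ROOT));
    }

    // Assumed analogue of "break at the sentence break following fragsize chars".
    static String sentenceCut(String text, int fragsize) {
        return cut(text, fragsize, BreakIterator.getSentenceInstance(Locale.ROOT));
    }

    public static void main(String[] args) {
        String text = "One two three. Four five six. Seven.";
        System.out.println(wordCut(text, 5));     // One two
        System.out.println(sentenceCut(text, 5)); // One two three.
    }
}
```

With the same size goal of 5 chars, the word-based cut stops mid-sentence while the sentence-based cut runs to the end of the first sentence, which is why the two highlighters can return fragments of quite different lengths for the same {{hl.fragsize}}.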
Fragmenting is a rather difficult problem, I've found. It's hard to satisfy
everyone's desires.
> When hl.method=unified add support for hl.fragsize param
> --------------------------------------------------------
>
> Key: SOLR-9935
> URL: https://issues.apache.org/jira/browse/SOLR-9935
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: highlighter
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Major
> Fix For: 6.4
>
> Attachments: SOLR_9935_UH_fragsize.patch, SOLR_9935_UH_fragsize.patch
>
>
> In LUCENE-7620 the UnifiedHighlighter is getting a BreakIterator that allows
> it to support the equivalent of Solr's {{hl.fragsize}}. So let's support this
> on the Solr side.