David Smiley created LUCENE-8446:
------------------------------------
Summary: UnifiedHighlighter DefaultPassageFormatter should merge
overlapping offsets
Key: LUCENE-8446
URL: https://issues.apache.org/jira/browse/LUCENE-8446
Project: Lucene - Core
Issue Type: Improvement
Components: modules/highlighter
Reporter: David Smiley
Assignee: David Smiley
The UnifiedHighlighter's DefaultPassageFormatter (mostly unchanged from the old
PostingsHighlighter) formats overlapping matches by closing a tag and
immediately opening a new one. I think this is a bit ugly structurally; it
ought to continue the tag as if the matches were merged. This is extremely
rare in practice today since a match is always a word, and thus we'd only see
this behavior if multiple words at the same position but with different
offsets are highlighted. The advent of matches representing phrases will
increase the probability of this, and indeed the problem was discovered while
working on LUCENE-8286.
Additionally, and related, OffsetsEnums should internally be ordered by the end
offset if the start offset is the same.
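
To illustrate the intended behavior, here is a minimal standalone sketch. It
is not the Lucene code and uses no Lucene classes: MergedHighlightSketch,
Match, BY_START_THEN_END and format are hypothetical names invented for this
example, the <b> tags are just an assumed choice of markup, and offsets are
assumed to be half-open [start, end) character ranges within the text. It
sorts matches by start offset and then by end offset, and emits one tag pair
per run of overlapping matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MergedHighlightSketch {

    /** Hypothetical stand-in for one match: [start, end) character offsets. */
    public static final class Match {
        public final int start;
        public final int end;

        public Match(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Order by start offset; when starts are equal, order by end offset. */
    public static final Comparator<Match> BY_START_THEN_END =
        Comparator.<Match>comparingInt(m -> m.start).thenComparingInt(m -> m.end);

    /** Wraps each run of overlapping matches in a single pair of tags. */
    public static String format(String text, List<Match> matches) {
        List<Match> sorted = new ArrayList<>(matches);
        sorted.sort(BY_START_THEN_END);

        StringBuilder sb = new StringBuilder();
        int pos = 0; // next character of text that has not been emitted yet
        int i = 0;
        while (i < sorted.size()) {
            int start = sorted.get(i).start;
            int end = sorted.get(i).end;
            i++;
            // Absorb every following match that starts before the current
            // highlighted region ends, extending the region if needed.
            while (i < sorted.size() && sorted.get(i).start < end) {
                end = Math.max(end, sorted.get(i).end);
                i++;
            }
            sb.append(text, pos, start);
            sb.append("<b>").append(text, start, end).append("</b>");
            pos = end;
        }
        sb.append(text, pos, text.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "the quick brown fox";
        // "quick", the phrase "quick brown", and "brown" all overlap.
        List<Match> matches = Arrays.asList(
            new Match(10, 15), new Match(4, 15), new Match(4, 9));
        System.out.println(format(text, matches));
        // prints: the <b>quick brown</b> fox
    }
}
```

Matches that merely touch (one ends exactly where the next starts) are kept
as separate tags in this sketch; whether those should also be merged is a
separate design choice.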