[
https://issues.apache.org/jira/browse/LUCENE-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578808#comment-16578808
]
ASF subversion and git services commented on LUCENE-8446:
---------------------------------------------------------
Commit 8d3f59a47f2a4d6e53ef352e9ce436553f617070 in lucene-solr's branch
refs/heads/master from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8d3f59a ]
LUCENE-8446: DefaultPassageFormatter: merge overlapping matches
> UnifiedHighlighter DefaultPassageFormatter should merge overlapping offsets
> ---------------------------------------------------------------------------
>
> Key: LUCENE-8446
> URL: https://issues.apache.org/jira/browse/LUCENE-8446
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/highlighter
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Minor
> Attachments: LUCENE-8446.patch
>
>
> The UnifiedHighlighter's DefaultPassageFormatter (mostly unchanged from the
> old PostingsHighlighter) will format overlapping matches by closing a tag and
> immediately opening a tag. I think this is a bit ugly structurally and it
> ought to continue the tag as if the matches were merged. This is extremely
> rare in practice today since a match is always a word, and thus we'd only see
> this behavior if multiple words at the same position with different offsets
> are highlighted. The advent of matches representing phrases will increase the
> probability of this; indeed, it was discovered while working on
> LUCENE-8286.
> Additionally, and related, OffsetsEnums should internally be ordered by the
> end offset if the start offset is the same.
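
For illustration, the two ideas in the description (ordering by start offset and
then end offset, and emitting a single tag pair for overlapping matches) can be
sketched as below. This is a minimal, hypothetical sketch and is not taken from
the LUCENE-8446 patch: the MergeOverlapSketch and Match classes, the format
method, and the preTag/postTag parameters are assumptions made up for this
example, and it uses no Lucene API. It assumes half-open [start, end) character
offsets that lie within the text, and treats only strictly overlapping matches
as mergeable (matches that merely touch keep separate tags).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; not Lucene's DefaultPassageFormatter. */
final class MergeOverlapSketch {

    /** A match as a half-open character range [start, end) into the text. */
    static final class Match {
        final int start;
        final int end;

        Match(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Order by start offset; when starts are equal, order by end offset. */
    static final Comparator<Match> BY_START_THEN_END =
        Comparator.comparingInt((Match m) -> m.start).thenComparingInt(m -> m.end);

    /** Wraps each run of overlapping matches in one preTag/postTag pair. */
    static String format(String text, List<Match> matches, String preTag, String postTag) {
        List<Match> sorted = new ArrayList<>(matches);
        sorted.sort(BY_START_THEN_END);

        StringBuilder sb = new StringBuilder();
        int pos = 0; // next character of text not yet copied to the output
        int i = 0;
        while (i < sorted.size()) {
            int start = sorted.get(i).start;
            int end = sorted.get(i).end;
            i++;
            // Absorb every following match that begins before the current run
            // ends, so the tag continues instead of closing and reopening.
            while (i < sorted.size() && sorted.get(i).start < end) {
                end = Math.max(end, sorted.get(i).end);
                i++;
            }
            sb.append(text, pos, start); // plain text before the run
            sb.append(preTag).append(text, start, end).append(postTag);
            pos = end;
        }
        sb.append(text, pos, text.length()); // trailing plain text
        return sb.toString();
    }
}
```

With the text "alpha beta gamma" and matches [0, 10) and [6, 16), which could
arise from two overlapping phrase matches, format(text, matches, "<b>", "</b>")
produces one tag pair around the whole span rather than a close tag immediately
followed by an open tag.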
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)