kdt523 opened a new pull request, #15346: URL: https://github.com/apache/lucene/pull/15346
This PR delivers two minimal, targeted fixes with regression tests:

- Core: Prevent MaxScoreBulkScorer from advancing past a leaf's maxDoc under filtered disjunctions (avoids a potential EOF when norms are accessed after NO_MORE_DOCS).
- Highlighter: Don't merge zero-scored fragments (GH-15333), to avoid producing merged passages that include content with no matches.

Motivation:

- MaxScoreBulkScorer: With a restrictive filter plus a disjunction, the candidate windowing logic could overshoot a segment's maxDoc. If norms were accessed after NO_MORE_DOCS, this could trigger an unexpected EOF.
- Highlighter: Zero-score fragments should not be merged with adjacent fragments; otherwise the final passage can include unrelated content with no matches.

Changes:

Core (lucene/core):
- Clamp candidate advancement at the leaf boundary in MaxScoreBulkScorer (e.g., within nextCandidate) so NO_MORE_DOCS is returned when rangeEnd exceeds maxDoc.
- Added regression test: org.apache.lucene.search.TestMaxScoreBulkScorerFilterBounds.

Highlighter (lucene/highlighter):
- In Highlighter, filter out zero-scored TextFragments before mergeContiguousFragments to prevent unintended merges.
- Added regression test: org.apache.lucene.search.highlight.TestZeroScoreMerging.

Docs:
- Updated CHANGES.txt with both fixes and referenced test names.

Testing:

New tests:
- lucene/core: TestMaxScoreBulkScorerFilterBounds validates that filtered-disjunction execution does not score past maxDoc and does not throw.
- lucene/highlighter: TestZeroScoreMerging ensures zero-score fragments aren't merged.

Both tests pass locally in isolation for their respective modules.

Backwards compatibility:

Behavior is strictly safer/more correct:
- Core: Prevents out-of-bounds progression; no API changes.
- Highlighter: Merge semantics exclude fragments with score == 0; expected/intuitive behavior, no API changes.

Performance: Neutral. The core change is a simple bound check in the candidate advancement logic. The highlighter change is a small pre-filter on fragments.
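For readers unfamiliar with the scorer, the core bound check can be pictured roughly as follows. This is a simplified sketch and not the code in this PR: `LeafBoundSketch`, `clampWindowEnd`, and `clampCandidate` are hypothetical names chosen for illustration. Only the sentinel value mirrors Lucene, where `DocIdSetIterator.NO_MORE_DOCS` is `Integer.MAX_VALUE`.

```java
/** Simplified stand-in for the leaf-boundary check; not the actual MaxScoreBulkScorer code. */
final class LeafBoundSketch {
  /** Same value as Lucene's DocIdSetIterator.NO_MORE_DOCS. */
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  private LeafBoundSketch() {}

  /** Hypothetical helper: clamp a window's exclusive upper bound to the leaf's maxDoc. */
  static int clampWindowEnd(int windowEnd, int maxDoc) {
    return Math.min(windowEnd, maxDoc);
  }

  /**
   * Hypothetical helper: valid doc IDs in a leaf are 0 .. maxDoc - 1, so any
   * candidate at or beyond maxDoc is reported as exhausted instead of being scored.
   */
  static int clampCandidate(int candidate, int maxDoc) {
    return candidate >= maxDoc ? NO_MORE_DOCS : candidate;
  }
}
```

With a leaf of 100 documents, a window end of 4096 would be clamped to 100, and a candidate of 100 or more would be reported as `NO_MORE_DOCS`, so nothing downstream (norms included) is asked about a document that does not exist in the segment.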
Risk: Low. Changes are localized and covered by focused regression tests.

Related: fixes #15333
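To illustrate the highlighter issue tracked in #15333, here is a minimal sketch of why filtering before merging matters. It does not use Lucene's `TextFragment` or `Highlighter` classes: `Fragment`, `dropZeroScored`, and `mergeContiguous` are hypothetical stand-ins, and the merge rule (touching spans are joined, keeping the higher score) is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; the names below are hypothetical and not part of Lucene. */
final class ZeroScoreFilterSketch {
  private ZeroScoreFilterSketch() {}

  /** Hypothetical fragment: a text span [start, end) and its match score. */
  static final class Fragment {
    final int start;
    final int end;
    final float score;

    Fragment(int start, int end, float score) {
      this.start = start;
      this.end = end;
      this.score = score;
    }
  }

  /** Keep only fragments with a positive score, preserving their order. */
  static List<Fragment> dropZeroScored(List<Fragment> fragments) {
    List<Fragment> kept = new ArrayList<>();
    for (Fragment f : fragments) {
      if (f.score > 0f) {
        kept.add(f);
      }
    }
    return kept;
  }

  /** Naive merge of fragments whose spans touch; assumes input is sorted by start. */
  static List<Fragment> mergeContiguous(List<Fragment> fragments) {
    List<Fragment> merged = new ArrayList<>();
    for (Fragment f : fragments) {
      int last = merged.size() - 1;
      if (last >= 0 && merged.get(last).end == f.start) {
        Fragment prev = merged.get(last);
        // Join the two spans and keep the higher of the two scores.
        merged.set(last, new Fragment(prev.start, f.end, Math.max(prev.score, f.score)));
      } else {
        merged.add(f);
      }
    }
    return merged;
  }
}
```

Given three adjacent fragments where the middle one has score 0, merging directly collapses all three into a single passage that includes the unmatched middle text. Calling `dropZeroScored` first leaves two separate passages, because the remaining spans no longer touch.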
