kdt523 opened a new pull request, #15346:
URL: https://github.com/apache/lucene/pull/15346

   This PR delivers two minimal, targeted fixes with regression tests:
   
   Core: Prevent MaxScoreBulkScorer from advancing past a leaf’s maxDoc under 
filtered disjunctions (avoids potential EOF when norms are accessed after 
NO_MORE_DOCS).
   Highlighter: Don’t merge zero-scored fragments (GH-15333) to avoid producing 
merged passages that include content with no matches.
   
   Motivation
   MaxScoreBulkScorer: With a restrictive filter plus a disjunction, the 
candidate windowing logic could overshoot a segment’s maxDoc. If norms were 
then accessed after NO_MORE_DOCS, this could trigger an unexpected EOF.
   Highlighter: Zero-score fragments should not be merged with adjacent 
fragments, otherwise the final passage can include unrelated content with no 
matches.
   
   Changes
   Core (lucene/core)
   Clamp candidate advancement at the leaf boundary in MaxScoreBulkScorer 
(e.g., within nextCandidate) so NO_MORE_DOCS is returned when rangeEnd exceeds 
maxDoc.
   Added regression test: 
org.apache.lucene.search.TestMaxScoreBulkScorerFilterBounds.
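   The clamping idea can be illustrated with a minimal, self-contained sketch. 
This is not the code from the patch: the class CandidateClampSketch and the 
method clampCandidate are hypothetical names, and the only Lucene facts assumed 
are that valid doc IDs in a leaf lie in [0, maxDoc) and that NO_MORE_DOCS equals 
Integer.MAX_VALUE.

```java
/** Hypothetical illustration of the bound check; not the actual MaxScoreBulkScorer code. */
final class CandidateClampSketch {
  /** Same value as DocIdSetIterator.NO_MORE_DOCS in Lucene. */
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  private CandidateClampSketch() {}

  /**
   * Returns the candidate unchanged while it is a valid doc ID for the leaf,
   * i.e. in [0, maxDoc), and NO_MORE_DOCS once it reaches or passes maxDoc.
   */
  static int clampCandidate(int candidate, int maxDoc) {
    return candidate < maxDoc ? candidate : NO_MORE_DOCS;
  }
}
```

With a check of this shape, a caller that sees NO_MORE_DOCS stops before any 
per-document data such as norms is read for an out-of-range doc ID.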
   Highlighter (lucene/highlighter)
   In Highlighter, filter out zero-scored TextFragments before 
mergeContiguousFragments to prevent unintended merges.
   Added regression test: 
org.apache.lucene.search.highlight.TestZeroScoreMerging.
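   The effect of the pre-filter can be shown with a simplified model. This is a 
sketch under stated assumptions, not the Highlighter implementation: Frag, 
dropZeroScored and mergeContiguous are hypothetical names, fragments are 
assumed to be sorted by start offset, and "contiguous" is modelled as one 
fragment ending exactly where the next begins.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of filtering zero-scored fragments before merging. */
final class ZeroScoreFilterSketch {
  /** Minimal stand-in for a text fragment: [start, end) offsets plus a score. */
  static final class Frag {
    final int start;
    final int end;
    final float score;

    Frag(int start, int end, float score) {
      this.start = start;
      this.end = end;
      this.score = score;
    }
  }

  private ZeroScoreFilterSketch() {}

  /** Keeps only fragments with a positive score. */
  static List<Frag> dropZeroScored(List<Frag> frags) {
    List<Frag> kept = new ArrayList<>();
    for (Frag f : frags) {
      if (f.score > 0f) {
        kept.add(f);
      }
    }
    return kept;
  }

  /** Merges neighbours where one ends exactly where the next starts; keeps the higher score. */
  static List<Frag> mergeContiguous(List<Frag> frags) {
    List<Frag> merged = new ArrayList<>();
    for (Frag f : frags) {
      int last = merged.size() - 1;
      if (last >= 0 && merged.get(last).end == f.start) {
        Frag prev = merged.remove(last);
        merged.add(new Frag(prev.start, f.end, Math.max(prev.score, f.score)));
      } else {
        merged.add(f);
      }
    }
    return merged;
  }
}
```

In this model, three adjacent fragments scored 1.0, 0.0 and 2.0 collapse into a 
single passage when merged directly, but stay as two separate passages when 
the zero-scored middle fragment is dropped first.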
   Docs
   Updated CHANGES.txt with entries for both fixes, referencing the test names.
   
   Testing
   New tests:
   lucene/core: TestMaxScoreBulkScorerFilterBounds validates that 
filtered-disjunction execution does not score past maxDoc and does not throw.
   lucene/highlighter: TestZeroScoreMerging ensures that zero-score fragments 
aren’t merged.
   Both tests pass locally in isolation for their respective modules.
   
   Backwards compatibility
   Behavior is strictly safer/more correct:
   Core: Prevents out-of-bounds progression; no API changes.
   Highlighter: Merge semantics exclude fragments with score == 0; 
expected/intuitive behavior, no API changes.
   
   Performance
   Neutral. The core change is a simple bound check in the candidate 
advancement logic. The highlighter change is a small pre-filter on fragments.
   
   Risk
   Low. Changes are localized and covered by focused regression tests.
   
   Related
   Fixes #15333


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

