dubin555 opened a new pull request, #7332:
URL: https://github.com/apache/paimon/pull/7332

   ### Purpose
   
   Linked issue: close #7230
   
   `FileIndexPredicate.getRequiredNames()` calls `child.visit(this)` twice per 
child in its `CompoundPredicate` visitor — once discarding the result, then 
again to collect it. Since `PredicateBuilder.or()` produces right-nested binary 
trees via `reduce()`, this doubles work at each tree level, resulting in O(2^n) 
time complexity.
   
   For an IN clause with 20 values (which produces a nested OR tree of depth 
19), this means ~1,048,576 leaf visits instead of 20. In production, queries 
with moderately sized IN clauses hang indefinitely.
   
   The fix removes the redundant `child.visit(this)` call (line 130), matching 
the correct pattern already used in `PredicateVisitor.FieldNameCollector`.
   
   The bug was introduced in `ebdfa02bd` ("[hotfix] Correct visitors for 
TransformPredicate"), which refactored the visitor to handle 
`TransformPredicate` and accidentally left the duplicate call.
   
   ### Tests
   
   - `FileIndexPredicateTest.testGetRequiredNamesLinearComplexity()` — builds a 
20-element OR chain, counts leaf visits via `AtomicInteger`. Asserts exactly 20 
visits (linear). Before fix: 1,048,575 visits (exponential).
   - `FileIndexPredicateTest.testGetRequiredNamesPerformance()` — builds a 
20-element OR chain, asserts completion within 100ms.
   - `FileIndexPredicateTest.testGetRequiredNamesBasic()` — verifies 
correctness: all field names are collected from a compound predicate.
   - `FileIndexPredicateTest.testGetRequiredNamesSinglePredicate()` — verifies 
single leaf predicate returns the correct field name.
   
   ### API and Format
   
   No.
   
   ### Documentation
   
   No.
   
   ### Generative AI tooling
   
   Generated-by: Claude Code 1.0.33


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to