xumingming opened a new pull request, #56499:
URL: https://github.com/apache/spark/pull/56499

   ### What changes were proposed in this pull request?
   
   When a predicate binds an attribute to a literal (e.g. a.pt = '20260610') 
and another predicate references that attribute (e.g. b.pt >= f(a.pt)), 
Catalyst previously did not exploit the literal binding to derive a simpler, 
pushable predicate.
   
     ```sql
        SELECT *
        FROM a
        LEFT JOIN b
          ON a.key = b.key
         AND b.pt >= f(a.pt)
        WHERE a.pt = '20260610';
      ```
   
   Here table b is fully scanned, which results in very poor performance.
   
   This change extends ConstraintHelper.inferAdditionalConstraints with a 
second pass that:
   
     1. Collects Attribute = Literal bindings from the constraint set.
     2. Substitutes the literal into non-equality predicates that reference 
those attributes.
     3. Adds the resulting deterministic expressions as new inferred 
constraints.
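
   The three steps above can be sketched with a toy expression model. This is 
   only an illustration in Python, not the Catalyst code from this PR: the 
   classes Attr, Lit, Call and Cmp and the function 
   infer_from_literal_bindings are hypothetical stand-ins for Catalyst 
   expressions and for the new pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:            # hypothetical stand-in for a Catalyst Attribute
    name: str


@dataclass(frozen=True)
class Lit:             # hypothetical stand-in for a Catalyst Literal
    value: object


@dataclass(frozen=True)
class Call:            # a function call such as f(a.pt)
    func: str
    args: tuple
    deterministic: bool = True


@dataclass(frozen=True)
class Cmp:             # a binary comparison: "=", ">=", "<", ...
    op: str
    left: object
    right: object


def substitute(expr, bindings):
    """Replace every bound attribute in expr with its literal."""
    if isinstance(expr, Attr):
        return bindings.get(expr, expr)
    if isinstance(expr, Call):
        new_args = tuple(substitute(a, bindings) for a in expr.args)
        return Call(expr.func, new_args, expr.deterministic)
    if isinstance(expr, Cmp):
        return Cmp(expr.op,
                   substitute(expr.left, bindings),
                   substitute(expr.right, bindings))
    return expr


def is_deterministic(expr):
    if isinstance(expr, Call):
        return expr.deterministic and all(is_deterministic(a) for a in expr.args)
    if isinstance(expr, Cmp):
        return is_deterministic(expr.left) and is_deterministic(expr.right)
    return True


def infer_from_literal_bindings(constraints):
    # Step 1: collect Attribute = Literal bindings.
    bindings = {}
    for c in constraints:
        if isinstance(c, Cmp) and c.op == "=":
            if isinstance(c.left, Attr) and isinstance(c.right, Lit):
                bindings[c.left] = c.right
            elif isinstance(c.right, Attr) and isinstance(c.left, Lit):
                bindings[c.right] = c.left

    # Steps 2 and 3: substitute into non-equality predicates and keep
    # only new, deterministic results.
    inferred = set()
    for c in constraints:
        if isinstance(c, Cmp) and c.op != "=":
            candidate = substitute(c, bindings)
            if (candidate != c and candidate not in constraints
                    and is_deterministic(candidate)):
                inferred.add(candidate)
    return inferred


a_pt, b_pt = Attr("a.pt"), Attr("b.pt")
constraints = {
    Cmp("=", a_pt, Lit("20260610")),
    Cmp(">=", b_pt, Call("f", (a_pt,))),
}
inferred = infer_from_literal_bindings(constraints)
```

   For the example query, the sketch derives the single extra constraint 
   b.pt >= f('20260610'), which no longer references table a.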
   
   After constant folding, the inferred predicates can be pushed into scans as 
partition filters, avoiding full-table scans in cases where only a small subset 
of partitions can match.
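
   To illustrate why the folded predicate helps, here is a small Python sketch 
   of partition pruning. It rests on assumptions that are not in this PR: f is 
   taken to mean "seven days earlier", and the partition values and the helper 
   prune_partitions are made up for the example.

```python
from datetime import datetime, timedelta


def f(pt: str) -> str:
    # Hypothetical stand-in for f: the date seven days before pt.
    day = datetime.strptime(pt, "%Y%m%d") - timedelta(days=7)
    return day.strftime("%Y%m%d")


def prune_partitions(partitions, lower_bound):
    # Keep only partitions that can satisfy b.pt >= lower_bound.
    return [p for p in partitions if p >= lower_bound]


# Made-up partition values of table b.
b_partitions = ["20260520", "20260601", "20260603", "20260605", "20260610"]

# Constant folding: f('20260610') is evaluated once, at planning time.
folded = f("20260610")

# The scan then reads only the matching partitions.
kept = prune_partitions(b_partitions, folded)
```

   Under these assumptions the folded bound is '20260603', so two of the five 
   partitions are skipped. Without the inferred predicate, every partition of 
   b would be read.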
   
   ### Why are the changes needed?
   
   Currently, a query of the following pattern causes a full table scan of table b:
   
     ```sql
        SELECT *
        FROM a
        LEFT JOIN b
          ON a.key = b.key
         AND b.pt >= f(a.pt)
        WHERE a.pt = '20260610';
      ```
   
   With this optimization, the inferred predicate on b.pt gives table b effective partition pruning.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Added unit tests.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

