ahshahid opened a new pull request, #53838:
URL: https://github.com/apache/spark/pull/53838

   …romConstraints rule are run as part of the operatorOptimizationRuleSet, so 
that the filter generated on the left leg of a Join is not missed in some 
specific situations
   
   ### What changes were proposed in this pull request?
   The change is to merge the batch 
   ```
   Batch("Infer Filters", Once,
     InferFiltersFromGenerate,
     InferFiltersFromConstraints)
   ```
   into the batch of rules defined by 
   "operatorOptimizationRuleSet".
   
   ### Why are the changes needed?
   The optimizer currently runs these rules in three steps:
   ```
   // **step1**
   Batch("Operator Optimization before Inferring Filters", fixedPoint,
     operatorOptimizationRuleSet: _*),

   // **step2**
   Batch("Infer Filters", Once,
     InferFiltersFromGenerate,
     InferFiltersFromConstraints),

   // **step3**
   Batch("Operator Optimization after Inferring Filters", fixedPoint,
     operatorOptimizationRuleSet: _*)
   ```
   The batch of rules in "operatorOptimizationRuleSet" is where joins such as 
"Left Outer" are converted to Inner.
   After that, "InferFiltersFromConstraints" runs and can create new 
constraints such as IsNotNull, which are pushed to either side of the Inner 
Join.

   Notice that "operatorOptimizationRuleSet" runs twice, before and after 
inferring filters.

   In TPCDS Q5, at least, the conversion of LeftOuter to Inner for one of the 
joins happens in step3.

   Since InferFiltersFromConstraints is not called again after step3, the 
IsNotNull constraints for the left leg of that join are never generated.
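
   The ordering problem can be illustrated with a small toy model. This is 
only a sketch and not Spark code: `Plan`, `eliminateOuterJoin`, 
`inferFilters`, the filter strings, and the assumption that an inferred 
filter is what makes the outer join convertible are all hypothetical 
stand-ins chosen for illustration. It compares the three-step pipeline with 
a single fixed-point batch that also contains the inference rule.

   ```scala
   object InferFiltersOrdering {
     sealed trait JoinType
     case object LeftOuter extends JoinType
     case object Inner extends JoinType

     // Toy plan: one join plus the set of filters known so far (hypothetical).
     final case class Plan(joinType: JoinType, filters: Set[String])
     type Rule = Plan => Plan

     // Stand-in for outer-join elimination: a null-rejecting filter on the
     // right side allows LeftOuter to become Inner.
     val eliminateOuterJoin: Rule = p =>
       if (p.joinType == LeftOuter && p.filters.contains("r.k > 0")) p.copy(joinType = Inner)
       else p

     // Stand-in for constraint inference: derives "r.k > 0" from "s.k > 0"
     // (assumed for the example) and adds IsNotNull filters on both legs
     // of an Inner join.
     val inferFilters: Rule = p => {
       val derived = if (p.filters.contains("s.k > 0")) Set("r.k > 0") else Set.empty[String]
       val notNull =
         if (p.joinType == Inner) Set("l.k IS NOT NULL", "r.k IS NOT NULL")
         else Set.empty[String]
       p.copy(filters = p.filters ++ derived ++ notNull)
     }

     // Apply all rules in order, a single time.
     def once(rules: Seq[Rule])(plan: Plan): Plan =
       rules.foldLeft(plan)((p, r) => r(p))

     // Re-apply the rules until the plan stops changing.
     def fixedPoint(rules: Seq[Rule])(plan: Plan): Plan = {
       val next = once(rules)(plan)
       if (next == plan) plan else fixedPoint(rules)(next)
     }

     val operatorRules: Seq[Rule] = Seq(eliminateOuterJoin)

     // step1 -> step2 -> step3, mirroring the batch layout described above.
     def threeStep(p: Plan): Plan =
       fixedPoint(operatorRules)(once(Seq(inferFilters))(fixedPoint(operatorRules)(p)))

     // Inference merged into the fixed-point rule set.
     def merged(p: Plan): Plan =
       fixedPoint(operatorRules :+ inferFilters)(p)

     def main(args: Array[String]): Unit = {
       val start = Plan(LeftOuter, Set("s.k > 0"))
       println(threeStep(start).filters.contains("l.k IS NOT NULL")) // false
       println(merged(start).filters.contains("l.k IS NOT NULL"))    // true
     }
   }
   ```

   In this toy model both pipelines convert the join to Inner, but only the 
merged one sees the Inner join during inference, so only it adds the 
IsNotNull filter on the left leg.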
   
   Please note that the numerous plan changes are cosmetic (caused by 
reordering of text), except in the case of q5, where two new not-null 
constraints are created for the left leg of the join.
   
   I will try to fix the cosmetic changes, though that might be difficult.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Existing tests pass. 
   A dedicated test for this bug will be added.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   <!--
   If generative AI tooling has been used in the process of authoring this 
patch, please include the
   phrase: 'Generated-by: ' followed by the name of the tool and its version.
   If no, write 'No'.
   Please refer to the [ASF Generative Tooling 
Guidance](https://www.apache.org/legal/generative-tooling.html) for details.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

