Swinky commented on pull request #34062:
URL: https://github.com/apache/spark/pull/34062#issuecomment-925108409


   > do you mean `InferFiltersFromConstraints` can generate static partition predicates and we don't need to trigger DPP in that case?
   
   @cloud-fan correct, examples below:
   
   dimTable `d` has columns (d1, d2, d3...).
   factTable `f` has columns (f1, f2, f3...) and is partitioned on f1, f2.
   
   Example 1:
   ```
   
                        join(d1=f1)
                       /          \
            Filter(d1=100)   FactTable(f)
                    |                           
                dimTable(d)
   ```
   
         PartitionFilters for FactTable: [f1=100, f1 in dpp-subquery] // "f1=100" here is inferred in `InferFiltersFromConstraints`
         After Proposed change: [f1=100]
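
As a rough illustration of why `f1=100` appears in Example 1, here is a toy Python model of the inference step. It is not Spark code and does not reflect how `InferFiltersFromConstraints` is implemented; `infer_static_filters` and its arguments are hypothetical names used only for this sketch. It assumes the dimension-side filter is a conjunction of simple equalities.

```python
# Toy model only: not Spark's InferFiltersFromConstraints.
def infer_static_filters(join_keys, dim_equalities):
    """Propagate dimension-side equalities through equi-join keys.

    join_keys:      list of (dim_col, fact_col) pairs from the join condition
    dim_equalities: dict mapping dim_col -> literal, taken from a conjunctive
                    filter on the dimension side (assumption of this sketch)
    Returns a dict mapping fact_col -> literal implied by transitivity.
    """
    return {
        fact_col: dim_equalities[dim_col]
        for dim_col, fact_col in join_keys
        if dim_col in dim_equalities
    }


# Example 1: join(d1=f1) with Filter(d1=100) on the dimension side.
print(infer_static_filters([("d1", "f1")], {"d1": 100}))  # {'f1': 100}
```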
   
   
   Example 2:
   ```
                join(d1=f1, d2=f2)
                /               \
        Filter(d1=100)    FactTable(f)
                |                               
         dimTable(d)
   ```
   
         PartitionFilters for FactTable now: [f1=100, f1 in (d1 values from dpp-subquery1), f2 in (d2 values from dpp-subquery1)] // "f1=100" here is inferred in `InferFiltersFromConstraints`
         After Proposed change: [f1=100, f2 in (d2 values from dpp-subquery1)]
   
   
   Example 3:
   ```
                                join(d1=f1, d2=f2)
                                /               \
                Filter(d1=100 || d3=200)         FactTable(f)
                            |                           
                       dimTable(d)
   ```
   
         PartitionFilters for FactTable now: [f1 in (d1 values from dpp-subquery1), f2 in (d2 values from dpp-subquery1)]
         After Proposed change: No change in this case. The filter is a disjunction that references both d1 and d3, so its references are a subset of neither {d1} nor {d3}, and no static predicate on f1 is inferred.
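
The effect of the proposed change on all three examples can be sketched with a small Python model. This is only an illustration under stated assumptions, not the code in this PR: `partition_filters` is a hypothetical helper, filters are represented as plain strings, and the static predicates are passed in as an already-inferred dict.

```python
# Toy model only: not the rule implemented in the PR.
def partition_filters(join_keys, static_filters):
    """Build the fact-table partition filters after the proposed change.

    join_keys:      list of (dim_col, fact_col) pairs from the join condition
    static_filters: dict mapping fact_col -> literal for partition columns
                    that already have an inferred static predicate
    A DPP filter is emitted only for join keys without a static predicate.
    """
    filters = [f"{col}={val}" for col, val in static_filters.items()]
    for dim_col, fact_col in join_keys:
        if fact_col not in static_filters:
            filters.append(f"{fact_col} in ({dim_col} values from dpp-subquery)")
    return filters


# Example 1: f1 is pinned by the inferred predicate, so no DPP filter remains.
print(partition_filters([("d1", "f1")], {"f1": 100}))
# ['f1=100']

# Example 2: f1 is pinned, f2 still needs DPP.
print(partition_filters([("d1", "f1"), ("d2", "f2")], {"f1": 100}))
# ['f1=100', 'f2 in (d2 values from dpp-subquery)']

# Example 3: nothing is inferred from the disjunction, so both DPP filters stay.
print(partition_filters([("d1", "f1"), ("d2", "f2")], {}))
# ['f1 in (d1 values from dpp-subquery)', 'f2 in (d2 values from dpp-subquery)']
```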
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


