weixiuli opened a new pull request #34871:
URL: https://github.com/apache/spark/pull/34871


   
   ### What changes were proposed in this pull request?
    Introduce a rule that pushes a dynamic partition pruning filter down from a
child join to its parent join when all of the following conditions are met:
    (1) the pruning side of the parent join is a partitioned table
    (2) the table to prune is filterable by the JOIN key
    (3) the parent join is one of the following types: INNER, LEFT SEMI,
     LEFT OUTER (partitioned on right), or RIGHT OUTER (partitioned on left)
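
The three conditions above can be read as a single eligibility check. The sketch below is only an illustration in Python, not the rule added by this PR: `ParentJoin`, its fields, and `can_push_down` are hypothetical names, and the description gives no side restriction for INNER or LEFT SEMI, so none is modeled here (the actual rule may be stricter).

```python
from dataclasses import dataclass

# Join types listed in condition (3) without a stated side restriction.
# Assumption: the real optimizer rule may restrict these further.
NO_SIDE_RESTRICTION = {"INNER", "LEFT SEMI"}


@dataclass(frozen=True)
class ParentJoin:
    """Hypothetical summary of what the rule would inspect on the parent join."""
    join_type: str                 # e.g. "INNER", "LEFT OUTER"
    pruning_side: str              # "left" or "right": the side to be pruned
    side_is_partitioned: bool      # condition (1)
    filterable_by_join_key: bool   # condition (2)


def can_push_down(join: ParentJoin) -> bool:
    """Return True when conditions (1) to (3) all hold."""
    if not (join.side_is_partitioned and join.filterable_by_join_key):
        return False
    if join.join_type in NO_SIDE_RESTRICTION:
        return True
    if join.join_type == "LEFT OUTER":
        return join.pruning_side == "right"
    if join.join_type == "RIGHT OUTER":
        return join.pruning_side == "left"
    return False
```

For the query example in this description, the parent join is a LEFT OUTER join whose right side (`fact_sk`) is the table to prune, so the check passes as long as `fact_sk` is partitioned and filterable by `store_id`.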
    
   A query example: 
   ```sql
   SELECT f.store_id, k.units_sold
   FROM fact_stats f
   JOIN dim_stats d
       ON f.store_id = d.store_id
           AND d.country = 'NL'
   LEFT JOIN fact_sk k
       ON f.store_id = k.store_id
   ```
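
To see why pruning `fact_sk` with the filter derived from `dim_stats` is safe for this query, the following toy simulation in plain Python (not Spark) may help. All rows are made up, `fact_sk` is assumed to be partitioned by `store_id`, and `run_query` is a hypothetical helper that evaluates the query above over in-memory data.

```python
# Toy data; every value here is invented for illustration.
fact_stats = [{"store_id": 1}, {"store_id": 2}, {"store_id": 3}]
dim_stats = [
    {"store_id": 1, "country": "NL"},
    {"store_id": 2, "country": "DE"},
]
# fact_sk modeled as partitions keyed by store_id (assumed partition column).
fact_sk = {
    1: [{"units_sold": 10}],
    2: [{"units_sold": 20}],
    3: [{"units_sold": 30}],
}


def run_query(sk_partitions):
    """Evaluate the example query against the given fact_sk partitions."""
    # Child join: f JOIN d ON f.store_id = d.store_id AND d.country = 'NL'
    child = [
        f
        for f in fact_stats
        for d in dim_stats
        if f["store_id"] == d["store_id"] and d["country"] == "NL"
    ]
    # Parent join: ... LEFT JOIN k ON f.store_id = k.store_id
    result = []
    for f in child:
        matches = sk_partitions.get(f["store_id"], [])
        if matches:
            result.extend((f["store_id"], k["units_sold"]) for k in matches)
        else:
            result.append((f["store_id"], None))
    return result


# The pruning filter: join keys that survive the dimension predicate.
pruning_keys = {d["store_id"] for d in dim_stats if d["country"] == "NL"}
# Pushing the filter to the parent join means only these partitions are read.
pruned_sk = {sid: rows for sid, rows in fact_sk.items() if sid in pruning_keys}

full_result = run_query(fact_sk)
pruned_result = run_query(pruned_sk)
```

Both runs return the same rows, but the pruned run reads one `fact_sk` partition instead of three. Rows of `fact_sk` whose `store_id` is filtered out by the child join could never match in the parent LEFT JOIN, which is the intuition behind the pushdown.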
   
   Before the PR:
   
   
![image](https://user-images.githubusercontent.com/39684231/145710866-70b44119-12ba-4afc-8a02-385189712950.png)
   
   After the PR:
   
![image](https://user-images.githubusercontent.com/39684231/145710859-93ed90ae-1b5b-4516-a780-dfb1c14e5e0d.png)
   
   ### Why are the changes needed?
   Pushing a dynamic partition pruning filter down from one join to other joins
prunes the tables scanned by those joins, which improves performance.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   Added unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


