liupc opened a new pull request #25020: [SPARK-28220] Fix foldable join condition not pushed down when parent filter is wholly pushed down
URL: https://github.com/apache/spark/pull/25020

## What changes were proposed in this pull request?

The optimizer rule `PushPredicateThroughJoin` tries to push a parent filter down through the join. However, when the parent filter is wholly pushed down through the join, the join becomes the top node, and the `transform` method then does not apply the rule to that join in the same pass.

Suppose we have two tables, table1 and table2:

```
table1: (a: string, b: string, c: string)
table2: (d: string)
```

and the following SQL:

`select * from table1 left join (select d, 'w1' as r from table2) on a = d and r = 'w2' where b = 2`

Let's focus on the following optimizer rules:

```
PushPredicateThroughJoin
FoldablePropagation
BooleanSimplification
PruneFilters
```

In the above case, on the first iteration of these rules:

- `PushPredicateThroughJoin` -> `select * from table1 where b = 2 left join (select d, 'w1' as r from table2) on a = d and r = 'w2'`
- `FoldablePropagation` -> `select * from table1 where b = 2 left join (select d, 'w1' as r from table2) on a = d and 'w1' = 'w2'`
- `BooleanSimplification` -> `select * from table1 where b = 2 left join (select d, 'w1' as r from table2) on false`
- `PruneFilters` -> no effect

Even after several iterations of these rules, the join condition is never pushed to the right-hand side of the left join. As a result, in some cases (e.g. a large right table), the `BroadcastNestedLoopJoin` may be slow or run out of memory.

This PR fixes the problem.

## How was this patch tested?

Existing unit tests.
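The skipped-join behaviour described under "What changes were proposed" can be illustrated with a small toy model. This is only a sketch in Python, not Spark code: the `Relation`, `Filter`, and `Join` classes, the `right_only_condition` field, and both functions are hypothetical stand-ins, and the model assumes a pre-order rewrite that applies the rule to a node once and then recurses into the children of the result.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical, heavily simplified plan nodes (not Spark's classes).
@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class Filter:
    condition: str
    child: object


@dataclass(frozen=True)
class Join:
    left: object
    right: object
    # A join predicate that references only the right side, if any.
    right_only_condition: Optional[str]


def transform_down(plan, rule):
    """Apply `rule` to this node once, then recurse into the result's children."""
    new_plan = rule(plan)
    if isinstance(new_plan, Filter):
        return Filter(new_plan.condition, transform_down(new_plan.child, rule))
    if isinstance(new_plan, Join):
        return Join(
            transform_down(new_plan.left, rule),
            transform_down(new_plan.right, rule),
            new_plan.right_only_condition,
        )
    return new_plan


def push_predicate_through_join(plan):
    """Toy rule with two cases, loosely modelled on the description above."""
    # Case 1: a Filter directly above a Join. The toy assumes the filter
    # references only the left side, so it is wholly pushed down.
    if isinstance(plan, Filter) and isinstance(plan.child, Join):
        join = plan.child
        return Join(Filter(plan.condition, join.left), join.right,
                    join.right_only_condition)
    # Case 2: a bare Join whose condition can be pushed to the right side.
    if isinstance(plan, Join) and plan.right_only_condition is not None:
        return Join(plan.left,
                    Filter(plan.right_only_condition, plan.right), None)
    return plan


plan = Filter("b = 2", Join(Relation("table1"), Relation("table2"), "r = 'w2'"))

# First pass: case 1 fires on the Filter and returns a Join. The rule is not
# applied again to that new Join, so its condition stays where it is.
first_pass = transform_down(plan, push_predicate_through_join)

# An immediate second pass would hit case 2 in this toy model.
second_pass = transform_down(first_pass, push_predicate_through_join)
```

In the toy model an immediate second pass does push the condition. In the scenario described above, however, `FoldablePropagation` and `BooleanSimplification` run before the next pass, and by then the join condition has already been folded to `false`.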
