GitHub user viirya commented on the issue: https://github.com/apache/spark/pull/14452

@hvanhovell Got it. Thanks for the cue. I think the idea is the same: reuse the results of plans that produce the same results. The performance gain of this PR comes from reusing the Spark plan of the subqueries in a CTE; it is not limited to exchange plans. For example, TPC-DS query 64 has two CTEs with long join chains over many tables. If we materialize a CTE subquery every time it is referenced in the query, performance is bad. With this de-duplication, we can run the long join once and reuse its results.
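To make the de-duplication idea concrete, here is a minimal toy sketch in Python. It is not the code in this PR and does not use any Spark API: `Scan`, `Join`, `Union`, `Executor`, and `run` are hypothetical names invented for illustration, and the "join" is a simple value intersection. It only shows that when a structurally identical subplan (standing in for a CTE) is referenced twice, caching results by plan lets the join chain run once instead of twice.

```python
from dataclasses import dataclass

# Hypothetical toy plan nodes (not Spark classes). Frozen dataclasses are
# hashable and compare structurally, so equal subplans share one cache key.
@dataclass(frozen=True)
class Scan:
    table: str

@dataclass(frozen=True)
class Join:
    left: object
    right: object

@dataclass(frozen=True)
class Union:
    left: object
    right: object

class Executor:
    """Toy executor that optionally reuses results of identical subplans."""

    def __init__(self, tables, reuse):
        self.tables = tables
        self.reuse = reuse
        self.cache = {}
        self.join_runs = 0  # counts how many joins were actually executed

    def execute(self, plan):
        # Reuse a previously computed result for a structurally equal plan.
        if self.reuse and plan in self.cache:
            return self.cache[plan]
        if isinstance(plan, Scan):
            rows = list(self.tables[plan.table])
        elif isinstance(plan, Join):
            left = self.execute(plan.left)
            right = self.execute(plan.right)
            self.join_runs += 1
            # Toy join: keep left values that also appear on the right.
            rows = [value for value in left if value in right]
        elif isinstance(plan, Union):
            rows = self.execute(plan.left) + self.execute(plan.right)
        else:
            raise TypeError(f"unknown plan node: {plan!r}")
        if self.reuse:
            self.cache[plan] = rows
        return rows

def run(reuse):
    tables = {"a": [1, 2, 3, 4], "b": [2, 3, 4], "c": [3, 4, 5]}
    # Stand-in for a CTE with a join chain, referenced twice by the query.
    cte = Join(Join(Scan("a"), Scan("b")), Scan("c"))
    query = Union(cte, cte)
    executor = Executor(tables, reuse)
    rows = executor.execute(query)
    return rows, executor.join_runs

if __name__ == "__main__":
    print(run(reuse=False))
    print(run(reuse=True))
```

In this sketch both runs return the same rows; only the number of executed joins differs, which is the effect described for the long join chains in TPC-DS query 64.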