[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution in CTE by ...

viirya Wed, 10 Aug 2016 06:20:59 -0700

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/14452
  
    @hvanhovell Got it. Thanks for the cue.
    
    I think the idea is the same: to reuse the results of plans that produce 
the same results.
    
    The performance gain of this PR comes from reusing the Spark plan of the 
subqueries in a CTE. It is not limited to exchange plans.
    
    For example, TPC-DS query 64 has two CTEs, each with a long join chain over 
many tables. If we materialize a CTE subquery every time it is referenced in 
the query, performance is bad. With this de-duplication, we can run the long 
join just once and reuse its results.
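
To illustrate the de-duplication idea outside of Spark, here is a minimal sketch in plain Python. It is not the code in this PR: `ReusePlanCache`, `long_join`, and the key `"cte_long_join"` are hypothetical names, and the string key stands in for whatever the planner would use to decide that two subquery plans produce the same result.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[int, str, int]


class ReusePlanCache:
    """Toy model: run each distinct subquery plan once and reuse its rows."""

    def __init__(self) -> None:
        self._results: Dict[str, List[Row]] = {}
        self.executions = 0  # how many times a plan was actually run

    def execute(self, plan_key: str, run: Callable[[], List[Row]]) -> List[Row]:
        # Assumption: plan_key identifies plans that yield the same result.
        if plan_key not in self._results:
            self.executions += 1
            self._results[plan_key] = run()
        return self._results[plan_key]


def long_join() -> List[Row]:
    """Stand-in for an expensive join chain inside a CTE."""
    orders = [(1, "a"), (2, "b")]
    items = [(1, 10), (2, 20)]
    return [
        (order_id, name, qty)
        for order_id, name in orders
        for item_id, qty in items
        if order_id == item_id
    ]


cache = ReusePlanCache()
# The same CTE is referenced twice in the outer query.
first = cache.execute("cte_long_join", long_join)
second = cache.execute("cte_long_join", long_join)
```

In this sketch the second reference returns the cached rows, so the join runs once no matter how many times the CTE is referenced.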



