[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

jerryshao Fri, 19 May 2017 01:53:42 -0700

Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/17936
  
    @viirya, this is slightly different from caching an RDD. It is more like 
broadcasting: in the final state, each executor holds all the data of RDD2. 
The difference is that the sync is executor-to-executor, not 
driver-to-executor.
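
    To make the comparison concrete, here is a rough pure-Python sketch (not 
Spark code and not the PR's implementation) that counts how often an RDD2 
partition is read. The function names, the round-robin task placement, and the 
partition counts are all assumptions for illustration, and the naive count 
treats every task read as a fetch, so it is an upper bound.

```python
from itertools import product


def count_fetches_naive(num_parts_1, num_parts_2):
    """One task per (p1, p2) pair; every task reads its RDD2 partition again."""
    return sum(1 for _ in product(range(num_parts_1), range(num_parts_2)))


def count_fetches_per_executor(num_parts_1, num_parts_2, num_executors):
    """Each executor keeps the RDD2 partitions it has already fetched.

    Task placement (p1 modulo num_executors) is a hypothetical stand-in
    for the scheduler, chosen only to keep the sketch deterministic.
    """
    held = {e: set() for e in range(num_executors)}
    fetches = 0
    for p1, p2 in product(range(num_parts_1), range(num_parts_2)):
        executor = p1 % num_executors
        if p2 not in held[executor]:
            # First time this executor needs p2: fetch it once and keep it.
            held[executor].add(p2)
            fetches += 1
    return fetches, held


naive = count_fetches_naive(4, 4)
shared, held = count_fetches_per_executor(4, 4, 2)
print(naive, shared)
```

    In this toy setup every executor ends up holding all RDD2 partitions, 
which is the broadcast-like final state described above, and the number of 
reads drops from one per task to one per executor per RDD2 partition.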
    
    I have a similar concern. Performance can vary by workload, so we'd 
better test several different workloads to see whether the improvement is 
general.



