[ https://issues.apache.org/jira/browse/SPARK-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-1762:
-----------------------------
Target Version/s: (was: 1.2.0)
> Add functionality to pin RDDs in cache
> --------------------------------------
>
> Key: SPARK-1762
> URL: https://issues.apache.org/jira/browse/SPARK-1762
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 1.0.0
> Reporter: Andrew Or
>
> Right now, all cached RDDs are treated equally, and there is no mechanism to
> mark a certain RDD as more important than the rest. This is a problem if the
> fraction of memory reserved for RDD storage is small, because caching just a
> few RDDs can evict more important ones.
> A side effect of this feature is that we could more safely set a smaller
> spark.storage.memoryFraction when we know how large our important RDDs are,
> without having to worry about them being evicted. This would leave more
> memory for shuffles, for instance, and help avoid disk spills.
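
The eviction problem described above can be illustrated with a small toy
model. This is only a sketch under stated assumptions, not Spark code: the
class PinnableLRUCache, its methods, and the entry names are all hypothetical,
sizes are abstract units, and eviction is assumed to be plain
least-recently-used. It shows how an unpinned "important" entry is evicted by
later inserts, while a pinned one survives.

```python
from collections import OrderedDict


class PinnableLRUCache:
    """Toy LRU cache with pinning (hypothetical; not Spark's block manager)."""

    def __init__(self, capacity):
        self.capacity = capacity      # total size units available
        self.used = 0                 # size units currently occupied
        self.entries = OrderedDict()  # name -> size, least recently used first
        self.pinned = set()           # names that must never be evicted

    def put(self, name, size, pinned=False):
        """Cache an entry, evicting unpinned entries in LRU order if needed.

        Returns the list of evicted names, or None if the entry cannot fit.
        """
        if name in self.entries:
            self.remove(name)  # re-inserting replaces the old copy
        # Choose victims in LRU order, skipping pinned entries.
        victims, freed = [], 0
        for other, other_size in self.entries.items():
            if self.used - freed + size <= self.capacity:
                break
            if other not in self.pinned:
                victims.append(other)
                freed += other_size
        if self.used - freed + size > self.capacity:
            return None  # not enough unpinned space; nothing is evicted
        for victim in victims:
            self.remove(victim)
        self.entries[name] = size
        self.used += size
        if pinned:
            self.pinned.add(name)
        return victims

    def get(self, name):
        """Mark an entry as recently used; return True on a hit."""
        if name not in self.entries:
            return False
        self.entries.move_to_end(name)
        return True

    def remove(self, name):
        """Drop an entry and release its space."""
        self.used -= self.entries.pop(name)
        self.pinned.discard(name)


# Without pinning: the important entry is the oldest, so it is evicted first.
plain = PinnableLRUCache(capacity=10)
plain.put("important", 6)
plain.put("scratch1", 4)
evicted_plain = plain.put("scratch2", 4)

# With pinning: the important entry is skipped and a scratch entry goes instead.
pinning = PinnableLRUCache(capacity=10)
pinning.put("important", 6, pinned=True)
pinning.put("scratch1", 4)
evicted_pinning = pinning.put("scratch2", 4)
```

In this model the pinned entries bound how much space the remaining entries
can compete for, which is the sense in which a smaller storage fraction
becomes safer once the sizes of the important RDDs are known.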