[ https://issues.apache.org/jira/browse/SPARK-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993909#comment-13993909 ]
Sandy Ryza commented on SPARK-1767:
-----------------------------------
Currently, RDDs support only a single level of location preference, through
RDD#preferredLocations(split), which returns a sequence of strings. To prefer
cached replicas, this needs to be extended in some way. We could deprecate
preferredLocations and add a preferredLocations(split, storageType), where
storageType is one of MEMORY, DISK, and eventually FLASH. Alternatively, and
more hackily, we could give the location strings a prefix like "inmem:" that
specifies the storage type.
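
To make the two options concrete, here is a rough, standalone sketch. Nothing
in it is existing Spark API: the StorageType hierarchy, the
StorageAwareLocations trait, and the decode/memoryFirst helpers are all
hypothetical names chosen for illustration. Treating an unprefixed location
as disk-backed is also just an assumption.

```scala
// Hypothetical sketch only; none of these names exist in Spark.
object LocationPreferenceSketch {
  // Storage tiers mentioned in the comment.
  sealed trait StorageType
  case object Memory extends StorageType
  case object Disk extends StorageType
  case object Flash extends StorageType

  // Option 1: a storage-aware overload next to the existing single-level method.
  trait StorageAwareLocations[Split] {
    def preferredLocations(split: Split): Seq[String]
    def preferredLocations(split: Split, storageType: StorageType): Seq[String]
  }

  // Option 2: carry the storage type as a prefix on the location string.
  val InMemPrefix = "inmem:"

  // Split a location string into (host, storage type).
  // Assumption: a string without the prefix refers to an on-disk replica.
  def decode(location: String): (String, StorageType) =
    if (location.startsWith(InMemPrefix)) (location.stripPrefix(InMemPrefix), Memory)
    else (location, Disk)

  // Order hosts so that those holding an in-memory replica come first,
  // keeping the original relative order within each group.
  def memoryFirst(locations: Seq[String]): Seq[String] = {
    val (mem, rest) = locations.map(decode).partition(_._2 == Memory)
    (mem ++ rest).map(_._1).distinct
  }
}
```

With the prefix approach, a scheduler-side caller could run something like
memoryFirst(Seq("host1", "inmem:host2", "host3")) to try host2 before the
disk-only hosts, at the cost of every consumer of the strings having to know
about the prefix. The overload keeps the strings clean but changes the RDD
interface.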
> Prefer HDFS-cached replicas when scheduling data-local tasks
> ------------------------------------------------------------
>
> Key: SPARK-1767
> URL: https://issues.apache.org/jira/browse/SPARK-1767
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 1.0.0
> Reporter: Sandy Ryza
>