[ 
https://issues.apache.org/jira/browse/SPARK-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chopra resolved SPARK-15352.
------------------------------------
    Resolution: Fixed

> Topology aware block replication
> --------------------------------
>
>                 Key: SPARK-15352
>                 URL: https://issues.apache.org/jira/browse/SPARK-15352
>             Project: Spark
>          Issue Type: New Feature
>          Components: Block Manager, Mesos, Spark Core, YARN
>            Reporter: Shubham Chopra
>            Assignee: Shubham Chopra
>
> Cached RDDs let Spark serve online analytics workloads, where it responds 
> to online queries. In such use cases, losing RDD partitions to node or 
> executor failures can cause long delays, because the lost data has to be 
> regenerated.
> Cached RDDs, even with multiple replicas per block, are not currently 
> resilient to node failures when multiple executors are started on the same 
> node. Block replication currently chooses a peer at random, and that peer 
> may be on the same host as the original block. 
> This effort would add topology-aware replication to Spark, enabled through 
> pluggable strategies. For ease of development and review, the work is 
> broken down into three major efforts:
> 1.    Making peer selection for replication pluggable
> 2.    Providing pluggable implementations for supplying topology information 
> and for topology-aware replication
> 3.    Proactive replenishment of lost blocks
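
The difference between random peer selection and host-aware selection described above can be sketched as follows. This is only an illustrative model in Python, not Spark's actual API: the `Peer` type and the functions `random_peers` and `host_aware_peers` are hypothetical names introduced here, and the "topology" is assumed to be nothing more than the host name. Both functions share one signature, which is roughly what making peer selection pluggable would look like.

```python
import random
from typing import List, NamedTuple, Sequence


class Peer(NamedTuple):
    """Hypothetical stand-in for a block manager's identity."""
    executor_id: str
    host: str


def random_peers(source: Peer, peers: Sequence[Peer],
                 num_replicas: int, rng: random.Random) -> List[Peer]:
    """Pick replication targets uniformly at random.

    Models the behaviour the issue describes: a chosen peer may
    live on the same host as the source block.
    """
    candidates = [p for p in peers if p != source]
    return rng.sample(candidates, min(num_replicas, len(candidates)))


def host_aware_peers(source: Peer, peers: Sequence[Peer],
                     num_replicas: int, rng: random.Random) -> List[Peer]:
    """Prefer peers on hosts that do not yet hold a copy of the block."""
    candidates = [p for p in peers if p != source]
    rng.shuffle(candidates)

    chosen: List[Peer] = []
    used_hosts = {source.host}

    # First pass: take at most one peer per host not used so far.
    for peer in candidates:
        if len(chosen) == num_replicas:
            break
        if peer.host not in used_hosts:
            chosen.append(peer)
            used_hosts.add(peer.host)

    # Second pass: if distinct hosts ran out, fall back to any remaining peer.
    for peer in candidates:
        if len(chosen) == num_replicas:
            break
        if peer not in chosen:
            chosen.append(peer)

    return chosen
```

With a source executor on one host and two replicas requested, `host_aware_peers` returns peers on two other hosts whenever they exist, while `random_peers` may return the second executor on the source's own host. The fallback pass is a design choice in this sketch: it favours reaching the requested replica count over strict host separation.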



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
