[
https://issues.apache.org/jira/browse/SPARK-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074521#comment-14074521
]
Josh Rosen commented on SPARK-2387:
-----------------------------------
{quote}
For example, in a push-style shuffle, the pushed data won't have to be stored
to disk if the reducers have started on the destination node.
{quote}
One of the reasons to materialize shuffle outputs to disk on the map side is
fault-tolerance: an individual reducer depends on all mappers, so without
materialization the failure of any reducer would cause re-computation of all
mappers.
> Remove the stage barrier for better resource utilization
> --------------------------------------------------------
>
> Key: SPARK-2387
> URL: https://issues.apache.org/jira/browse/SPARK-2387
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Rui Li
>
> DAGScheduler divides a Spark job into multiple stages according to RDD
> dependencies. Whenever there’s a shuffle dependency, DAGScheduler creates a
> shuffle map stage on the map side and a downstream stage that depends on it.
> Currently, the downstream stage cannot start until all of the stages it
> depends on have finished. This barrier between stages leaves slots idle while
> the last few upstream tasks finish, wasting cluster resources.
> We therefore propose to remove the barrier and pre-start the reduce stage
> once there are free slots. This achieves better resource utilization and
> improves overall job performance, especially when many executors are granted
> to the application.
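As an illustration of the barrier described above, the following sketch (partition counts, app name and local master are chosen only for demonstration) builds a two-stage job; today the downstream reduce stage is submitted only after every task of the shuffle map stage has finished, so slots freed by early map tasks sit idle while the last stragglers complete.
{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

object StageBarrierSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("stage-barrier-sketch").setMaster("local[8]"))

    // 100 map tasks feeding 10 reduce tasks through a shuffle dependency.
    val mapSide = sc.parallelize(1L to 100000L, 100).map(i => (i % 10, i))

    // DAGScheduler splits the job at the shuffle: a shuffle map stage with 100
    // tasks and a downstream stage with 10 reduce tasks. The downstream stage
    // is submitted only once *all* 100 map tasks have finished, so slots freed
    // by early map tasks stay idle while the last stragglers run; this is the
    // barrier the issue proposes to relax by pre-starting reduce tasks whenever
    // free slots are available.
    val sums = mapSide.reduceByKey(_ + _, 10)
    println(sums.count())

    sc.stop()
  }
}
{code}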
--
This message was sent by Atlassian JIRA
(v6.2#6252)