[ https://issues.apache.org/jira/browse/SPARK-17522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495243#comment-15495243 ]

Saisai Shao commented on SPARK-17522:
-------------------------------------

[~sunrui] I think performance depends on the workload. For example, if your 
workloads are mainly ETL-like, spreading executors out better leverages the 
network and IO bandwidth. But in other cases such as ML, where CPU plays the 
dominant role and the input data is not large, it is better to co-locate 
executors for fast data exchange and iteration. You could refer to Slider for 
affinity and anti-affinity resource allocation, which could be done either in 
the cluster manager or in upstream frameworks.
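To make the trade-off concrete, here is a toy sketch (not Spark or Slider code; the names `place` and `perHostCap` are purely illustrative) contrasting the two placement policies: round-robin spreading versus packing executors onto as few hosts as possible:

```scala
// Toy sketch: "spread out" vs "consolidate" placement of executors.
object PlacementSketch {
  // Assign `executors` across `hosts`: round-robin when spreadOut is true,
  // otherwise pack each host up to `perHostCap` before moving to the next.
  def place(executors: Int, hosts: Seq[String], spreadOut: Boolean,
            perHostCap: Int): Map[String, Int] = {
    val counts = scala.collection.mutable.Map(hosts.map(_ -> 0): _*)
    var next = 0
    for (_ <- 0 until executors) {
      if (spreadOut) {
        // Round-robin: each executor goes to the next host in turn.
        counts(hosts(next % hosts.size)) += 1
        next += 1
      } else {
        // Consolidate: fill the first host that still has capacity.
        val h = hosts.find(counts(_) < perHostCap).getOrElse(hosts.last)
        counts(h) += 1
      }
    }
    counts.toMap
  }

  def main(args: Array[String]): Unit = {
    val hosts = Seq("h1", "h2", "h3")
    println(place(6, hosts, spreadOut = true, perHostCap = 4))  // even spread
    println(place(6, hosts, spreadOut = false, perHostCap = 4)) // packed
  }
}
```

An ETL job reading from many nodes benefits from the first placement; an iterative ML job with small input and heavy shuffles may prefer the second.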

> [MESOS] More even distribution of executors on Mesos cluster
> ------------------------------------------------------------
>
>                 Key: SPARK-17522
>                 URL: https://issues.apache.org/jira/browse/SPARK-17522
>             Project: Spark
>          Issue Type: Improvement
>          Components: Mesos
>    Affects Versions: 2.0.0
>            Reporter: Sun Rui
>
> The MesosCoarseGrainedSchedulerBackend launches executors in a round-robin 
> way among the accepted offers that are received at once, but it is observed 
> that executors typically end up on only a small number of slaves.
> It turns out that on a cluster composed of many nodes, 
> MesosCoarseGrainedSchedulerBackend mostly receives only one offer at a time, 
> so the round-robin assignment of executors among offers does not have the 
> expected effect: executors land on fewer slave nodes than expected, which 
> leads to poor data locality.
> An experimental slight change to 
> MesosCoarseGrainedSchedulerBackend::buildMesosTasks() shows better executor 
> distribution among nodes:
> {code}
>     while (launchTasks) {
>       launchTasks = false
>       for (offer <- offers) {
>         ...
>       }
> +      if (conf.getBoolean("spark.deploy.spreadOut", true)) {
> +        launchTasks = false
> +      }
>     }
>     tasks.toMap
> {code}
> One of my Spark programs runs 30% faster with this change because of better 
> data locality.
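The clustering effect described above can be reproduced with a toy simulation (not Spark code; `launch` and `capPerOffer` are illustrative names): round-robin only spreads executors within a single batch of offers, so when each batch contains just one offer, every executor launched in that round lands on the same slave.

```scala
// Toy simulation: round-robin launching over one batch of Mesos offers.
object OfferBatches {
  // Launch up to `toLaunch` executors round-robin over `batch`; each
  // offer (slave) can host at most `capPerOffer` executors in this batch.
  def launch(batch: Seq[String], toLaunch: Int,
             capPerOffer: Int): Map[String, Int] = {
    val counts = scala.collection.mutable.Map(batch.map(_ -> 0): _*)
    var remaining = toLaunch
    var launched = true
    while (launched && remaining > 0) {
      launched = false
      // One round-robin pass over the offers in this batch.
      for (slave <- batch if remaining > 0 && counts(slave) < capPerOffer) {
        counts(slave) += 1
        remaining -= 1
        launched = true
      }
    }
    counts.toMap
  }

  def main(args: Array[String]): Unit = {
    // A batch containing all slaves: executors spread evenly.
    println(launch(Seq("s1", "s2", "s3"), toLaunch = 3, capPerOffer = 4))
    // A batch with a single offer: all executors pile onto one slave.
    println(launch(Seq("s1"), toLaunch = 3, capPerOffer = 4))
  }
}
```

The proposed spark.deploy.spreadOut check effectively stops after one pass per batch, leaving the remaining executors to be launched against future offers from other slaves.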



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
