[
https://issues.apache.org/jira/browse/SPARK-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-13803.
-------------------------------
Resolution: Fixed
Fix Version/s: 1.6.2, 1.4.2, 1.5.3, 2.0.0
Issue resolved by pull request 11702
[https://github.com/apache/spark/pull/11702]
> Standalone master does not balance cluster-mode drivers across workers
> ----------------------------------------------------------------------
>
> Key: SPARK-13803
> URL: https://issues.apache.org/jira/browse/SPARK-13803
> Project: Spark
> Issue Type: Bug
> Components: Deploy, Spark Core
> Affects Versions: 1.6.1
> Reporter: Brian Wongchaowart
> Fix For: 2.0.0, 1.5.3, 1.4.2, 1.6.2
>
>
> The Spark standalone cluster master does not balance drivers running in
> cluster mode across all the available workers. Instead, it assigns each
> submitted driver to the first available worker. The schedule() method
> attempts to randomly shuffle the HashSet of workers before launching drivers,
> but that shuffle has no effect: its result is another unordered HashSet, whose
> iteration order is determined by the element hashes rather than by the shuffle.
> This behavior is a regression introduced by SPARK-1706: previously, the workers
> were copied into an ordered list before the random shuffle was performed.
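> For illustration, here is a minimal, self-contained sketch (not the actual
> Master.schedule() code; the worker names are made up) of why shuffling the
> HashSet is a no-op and why copying it into an ordered Seq first restores the
> randomization:
> {code:scala}
> import scala.collection.immutable.HashSet
> import scala.util.Random
>
> object ShuffleDemo {
>   def main(args: Array[String]): Unit = {
>     val workers = HashSet("worker-1", "worker-2", "worker-3", "worker-4")
>
>     // Shuffling the HashSet directly just builds another HashSet, so the
>     // random ordering is discarded: iteration order depends on the element
>     // hashes and is identical on every run.
>     println(Random.shuffle(workers).toList)
>
>     // Copying into an ordered Seq first preserves the shuffled order,
>     // so the leading worker varies between runs.
>     println(Random.shuffle(workers.toSeq))
>   }
> }
> {code}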
> I am able to reproduce this bug in all releases of Spark from 1.4.0 to 1.6.1
> using the following steps:
> # Start a standalone master and two workers
> # Repeatedly submit applications to the master in cluster mode (--deploy-mode
> cluster)
> Observe that all the drivers are scheduled on only one of the two workers as
> long as resources are available on that worker. The expected behavior is that
> the master randomly assigns drivers to both workers.
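> For example (the host name and jar path below are placeholders, not from the
> original report), repeatedly running a submission like the following against
> a two-worker standalone cluster shows every driver landing on the same worker:
> {code}
> ./bin/spark-submit \
>   --master spark://<master-host>:7077 \
>   --deploy-mode cluster \
>   --class org.apache.spark.examples.SparkPi \
>   /path/to/spark-examples.jar 100
> {code}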