[ https://issues.apache.org/jira/browse/SPARK-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-13803.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.6.2
                   1.4.2
                   1.5.3
                   2.0.0

Issue resolved by pull request 11702
[https://github.com/apache/spark/pull/11702]

> Standalone master does not balance cluster-mode drivers across workers
> ----------------------------------------------------------------------
>
>                 Key: SPARK-13803
>                 URL: https://issues.apache.org/jira/browse/SPARK-13803
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Spark Core
>    Affects Versions: 1.6.1
>            Reporter: Brian Wongchaowart
>             Fix For: 2.0.0, 1.5.3, 1.4.2, 1.6.2
>
>
> The Spark standalone cluster master does not balance drivers running in
> cluster mode across all the available workers. Instead, it assigns each
> submitted driver to the first available worker. The schedule() method
> attempts to randomly shuffle the HashSet of workers before launching
> drivers, but that operation has no effect because the Scala HashSet is an
> unordered data structure. This behavior is a regression introduced by
> SPARK-1706: previously, the workers were copied into an ordered list
> before the random shuffle was performed.
> I am able to reproduce this bug in all releases of Spark from 1.4.0 to
> 1.6.1 using the following steps:
> # Start a standalone master and two workers
> # Repeatedly submit applications to the master in cluster mode
> (--deploy-mode cluster)
> Observe that all the drivers are scheduled on only one of the two workers
> as long as resources are available on that worker. The expected behavior
> is that the master randomly assigns drivers to both workers.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
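The no-op shuffle described above can be seen in isolation. A minimal sketch (the worker names are hypothetical stand-ins for the master's set of WorkerInfo objects, not Spark code):

```scala
import scala.collection.immutable.HashSet
import scala.util.Random

// Hypothetical worker IDs standing in for the master's worker set.
val workers = HashSet("worker-1", "worker-2", "worker-3", "worker-4")

// Random.shuffle on a HashSet builds another HashSet, whose iteration
// order depends only on the elements' hashes, so the shuffle is invisible:
// the "shuffled" set iterates in exactly the same order as the original.
val shuffledSet = Random.shuffle(workers)
println(shuffledSet.toList == workers.toList) // true on every run

// Copying into an ordered Seq first (the pre-SPARK-1706 behavior the
// report describes) makes the shuffle effective: same elements, but the
// resulting order now varies from run to run.
val shuffledSeq = Random.shuffle(workers.toSeq)
println(shuffledSeq.sorted == workers.toList.sorted) // same elements
```

Because `schedule()` only ever iterated the set in its fixed hash order, the first worker in that order received every driver until it ran out of resources, which is the behavior reported above.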