[ https://issues.apache.org/jira/browse/SPARK-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618362#comment-14618362 ]
Nishkam Ravi commented on SPARK-8881:
-------------------------------------
There's more to it. Consider three workers with num_cores (8, 8, 2),
spark.cores.max = 12 and spark.executor.cores = 4. The per-worker core
allocation would be (5, 5, 2). Here num_executors = 12 / 4 = 3 = num_workers,
and nothing gets launched!

The problem isn't that num_workers > num_executors (that's just where it
manifests in practice). The problem is that we allocate one core at a time and
ignore spark.executor.cores during allocation.
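
To make the arithmetic concrete, here is a minimal Python sketch of the two allocation strategies. It is a simplified model and not the code in Master.scala: the function names are hypothetical, memory limits and worker filtering are ignored, and the second function is only one possible way to hand out cores in spark.executor.cores-sized chunks.

```python
def allocate_one_core_at_a_time(free_cores, cores_max):
    """Spread up to cores_max cores over the workers round-robin, one core per step."""
    assigned = [0] * len(free_cores)
    to_assign = min(cores_max, sum(free_cores))
    pos = 0
    while to_assign > 0:
        # Give this worker one more core if it still has a free one.
        if free_cores[pos] - assigned[pos] > 0:
            assigned[pos] += 1
            to_assign -= 1
        pos = (pos + 1) % len(free_cores)
    return assigned


def allocate_per_executor(free_cores, cores_max, executor_cores):
    """Hypothetical alternative: round-robin in chunks of executor_cores."""
    assigned = [0] * len(free_cores)
    to_assign = min(cores_max, sum(free_cores))
    progressed = True
    # Stop when less than one executor's worth is left or no worker can fit one.
    while to_assign >= executor_cores and progressed:
        progressed = False
        for pos, free in enumerate(free_cores):
            if to_assign >= executor_cores and free - assigned[pos] >= executor_cores:
                assigned[pos] += executor_cores
                to_assign -= executor_cores
                progressed = True
    return assigned


workers = [8, 8, 2]  # free cores per worker, as in the example above
print(allocate_one_core_at_a_time(workers, 12))  # [5, 5, 2]
print(allocate_per_executor(workers, 12, 4))     # [8, 4, 0]
```

In the first result none of the per-worker totals is a multiple of spark.executor.cores = 4. In the second, every assigned core belongs to a whole 4-core executor, giving 12 / 4 = 3 executors. How the Master turns the per-worker totals into launched executors is outside this sketch.
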
> Scheduling fails if num_executors < num_workers
> -----------------------------------------------
>
> Key: SPARK-8881
> URL: https://issues.apache.org/jira/browse/SPARK-8881
> Project: Spark
> Issue Type: Bug
> Components: Deploy
> Affects Versions: 1.4.0, 1.5.0
> Reporter: Nishkam Ravi
>
> Current scheduling algorithm (in Master.scala) has two issues:
> 1. cores are allocated one at a time instead of spark.executor.cores at a time
> 2. when spark.cores.max/spark.executor.cores < num_workers, executors are not
> launched and the app hangs (due to 1)