[ 
https://issues.apache.org/jira/browse/SPARK-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617732#comment-14617732
 ] 

Nishkam Ravi commented on SPARK-8881:
-------------------------------------

No, that's not the problem.

You have 4 workers with 16 cores each. You request 3 executors (spark.cores.max 
= 48, spark.executor.cores = 16). The app hangs because the following condition 
is never satisfied: while (coresLeft >= coresPerExecutor && worker.memoryFree >= 
memoryPerExecutor). Cores are handed out one at a time across all 4 workers, so 
each worker ends up with 48 / 4 = 12 cores, which is fewer than the 16 a single 
executor needs. You will have to stare at the scheduling algorithm for a good 5 
minutes to understand what's happening. Try to simulate the case stated above.
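
The case above can be simulated with a small sketch. This is a simplified 
Python model, not the actual Master.scala code: the function names are 
hypothetical, memory is assumed to be sufficient on every worker, and only the 
one-core-at-a-time round-robin behaviour described in the issue is modelled.

```python
def spread_out_one_core_at_a_time(cores_free, cores_max):
    """Hand out cores round-robin, one core per step (simplified model)."""
    assigned = [0] * len(cores_free)
    # Never try to assign more cores than the cluster has free.
    to_assign = min(cores_max, sum(cores_free))
    pos = 0
    while to_assign > 0:
        if cores_free[pos] - assigned[pos] > 0:
            assigned[pos] += 1
            to_assign -= 1
        pos = (pos + 1) % len(cores_free)
    return assigned


def executors_launched(assigned, cores_per_executor):
    """Apply the quoted loop condition per worker (memory assumed sufficient)."""
    launched = []
    for cores_left in assigned:
        count = 0
        while cores_left >= cores_per_executor:
            count += 1
            cores_left -= cores_per_executor
        launched.append(count)
    return launched


# 4 workers with 16 cores each, spark.cores.max = 48, spark.executor.cores = 16
assigned = spread_out_one_core_at_a_time([16, 16, 16, 16], cores_max=48)
launched = executors_launched(assigned, cores_per_executor=16)
print(assigned)  # [12, 12, 12, 12]
print(launched)  # [0, 0, 0, 0]
```

Every worker is given 12 cores, none reaches 16, and so no executor is ever 
launched in this model.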

> Scheduling fails if num_executors < num_workers
> -----------------------------------------------
>
>                 Key: SPARK-8881
>                 URL: https://issues.apache.org/jira/browse/SPARK-8881
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Nishkam Ravi
>
> Current scheduling algorithm (in Master.scala) has two issues:
> 1. cores are allocated one at a time instead of spark.executor.cores at a time
> 2. when spark.cores.max/spark.executor.cores < num_workers, executors are not 
> launched and the app hangs (due to 1)
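
For contrast, here is a minimal sketch of what allocating 
spark.executor.cores at a time (point 1) could look like. It is a hypothetical 
Python model under the same assumptions as the issue's example (memory is 
ignored, and the function name is made up for illustration). It is not taken 
from Master.scala and is not necessarily the fix that was adopted.

```python
def spread_out_per_executor(cores_free, cores_max, cores_per_executor):
    """Round-robin over workers, assigning one executor's worth of cores per step."""
    assigned = [0] * len(cores_free)
    to_assign = min(cores_max, sum(cores_free))
    progress = True
    # Stop when fewer cores than one executor needs remain,
    # or when no worker can fit another executor.
    while to_assign >= cores_per_executor and progress:
        progress = False
        for pos in range(len(cores_free)):
            fits = cores_free[pos] - assigned[pos] >= cores_per_executor
            if to_assign >= cores_per_executor and fits:
                assigned[pos] += cores_per_executor
                to_assign -= cores_per_executor
                progress = True
    return assigned


# Same scenario: 4 workers x 16 cores, cores.max = 48, executor.cores = 16
chunked = spread_out_per_executor([16, 16, 16, 16], 48, 16)
print(chunked)                      # [16, 16, 16, 0]
print([c // 16 for c in chunked])   # [1, 1, 1, 0]
```

With executor-sized steps, three workers each receive a full 16 cores, so the 
three requested executors can start, and the fourth worker stays idle.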



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
