Thomas Graves created SPARK-30299:
-------------------------------------

             Summary: Dynamic allocation with Standalone mode calculates too 
many executors needed
                 Key: SPARK-30299
                 URL: https://issues.apache.org/jira/browse/SPARK-30299
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.4, 3.0.0
            Reporter: Thomas Graves


While making some changes in the executor allocation manager, I realized 
there is a bug with dynamic allocation in standalone mode. 

The issue is that if you run standalone mode with the default settings, where 
the executor gets all the cores of the worker, Spark core (the allocation 
manager) doesn't know the number of cores per executor, so it cannot calculate 
how many tasks fit on an executor.

It therefore falls back to the default EXECUTOR_CORES, which is 1, and thus 
can calculate that it needs far more executors than it actually does.
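
Roughly, the target it computes looks like the following simplified sketch 
(illustrative names, not the exact ExecutorAllocationManager code; it divides 
the outstanding task count by EXECUTOR_CORES / CPUS_PER_TASK):

{code:scala}
// Simplified sketch of how the allocation manager sizes its target.
// Names are illustrative, not the exact ExecutorAllocationManager code.
// spark.executor.cores (EXECUTOR_CORES) and spark.task.cpus both default to 1.
def maxExecutorsNeeded(runningOrPendingTasks: Int,
                       executorCores: Int = 1,
                       cpusPerTask: Int = 1): Int = {
  val tasksPerExecutor = executorCores / cpusPerTask
  // Ceiling division: how many executors are needed to fit all the tasks.
  (runningOrPendingTasks + tasksPerExecutor - 1) / tasksPerExecutor
}
{code}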

For instance, I have a worker with 12 cores. That means by default when I start 
an executor on it, it gets 12 cores and can fit 12 tasks.  The allocation 
manager would use the default of 1 core per executor and say it needs 12 
executors when it only needs 1.
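
Plugging the numbers from this example into the sketch above:

{code:scala}
// Standalone default: spark.executor.cores is unset, so the allocation
// manager assumes 1 core per executor, even though the single executor
// actually received all 12 worker cores.
val target = maxExecutorsNeeded(runningOrPendingTasks = 12)  // => 12
// In reality one 12-core executor can already run all 12 tasks.
{code}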

The fix for this isn't trivial, since the allocation manager would need to 
know how many cores each executor has, and I assume it would also need to 
handle heterogeneous nodes.  I could start workers on nodes with different 
numbers of cores - one with 24 cores and one with 16 cores.  How do we 
estimate the number of executors in that case?  We could choose the minimum 
core count among the existing executors, or something like that, as an 
estimate; that would be closer, unless of course the next executor you got 
didn't actually have that many cores. 
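
As a purely hypothetical illustration of that min-based estimate (assuming the 
allocation manager could see the core counts of the executors that have 
already registered):

{code:scala}
// Hypothetical heuristic, not existing Spark code: guess cores per executor
// from the smallest executor seen so far. It is only an approximation; the
// next executor granted may have more or fewer cores than this guess.
def estimatedTasksPerExecutor(registeredExecutorCores: Seq[Int],
                              cpusPerTask: Int = 1): Int = {
  if (registeredExecutorCores.isEmpty) 1 // no executors yet, fall back to 1
  else math.max(registeredExecutorCores.min / cpusPerTask, 1)
}
{code}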


