abhishekd0907 commented on pull request #35858: URL: https://github.com/apache/spark/pull/35858#issuecomment-1068737949
> Why does spark driver need to be aware of what the cluster size is? It should ask for the resources it requires to run the application, and it is for the resource manager to handle cross application/cluster wide requirements.

@mridulm In YARN clusters, starting a new executor container on an already existing node has very small latency (a few seconds), but bringing up a new node can take much longer (on the order of a few hundred seconds). Dynamic Allocation can factor this information in when requesting executors from the resource manager.

For example, suppose Spark is running on a single one-core executor, there is one active stage with 100 pending tasks, and the average runtime of the tasks completed so far is 1 second. The expected time to finish the stage with the single executor is then 100 seconds. If the latency to bring up a new node is 2 minutes (120 seconds), it doesn't make sense to request more executors, because all tasks will have finished before the second executor is added. However, if a free node is already present in the cluster, a new executor can be started on it immediately and some of the pending tasks can be scheduled on that executor.
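
To make the idea concrete, here is a rough sketch of the decision described above. All names and parameters (`shouldRequestExecutor`, `newNodeLatencySec`, etc.) are illustrative only, not actual Spark APIs or configs from this PR:

```scala
// Sketch of a node-aware allocation heuristic; names are hypothetical.
object NodeAwareAllocationSketch {

  /**
   * Decide whether to request an additional executor, given:
   *  - pendingTasks: tasks still waiting to be scheduled in the active stage
   *  - avgTaskTimeSec: average runtime of tasks completed so far
   *  - currentCores: total cores across the executors we already have
   *  - freeNodeAvailable: whether the cluster already has a node with spare capacity
   *  - newNodeLatencySec: typical time to bring up a brand-new node
   */
  def shouldRequestExecutor(
      pendingTasks: Int,
      avgTaskTimeSec: Double,
      currentCores: Int,
      freeNodeAvailable: Boolean,
      newNodeLatencySec: Double): Boolean = {

    // Expected time to drain the backlog with the executors we already have.
    val expectedRemainingSec = pendingTasks * avgTaskTimeSec / currentCores

    if (freeNodeAvailable) {
      // An executor on an existing node comes up in a few seconds,
      // so requesting one is worthwhile whenever tasks are still pending.
      pendingTasks > 0
    } else {
      // Only pay the node start-up cost if the backlog will outlast it.
      expectedRemainingSec > newNodeLatencySec
    }
  }
}
```

With the numbers from the example (100 pending tasks, 1 s average task time, one core, no free node, 120 s node start-up), the expected remaining work is 100 s < 120 s, so no new executor would be requested; with a free node already in the cluster, the request would be made immediately.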
