linzebing commented on a change in pull request #27223: [SPARK-30511][CORE]
Spark marks intentionally killed speculative tasks as pending leads to holding
idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203775
##########
File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
##########
@@ -263,9 +263,15 @@ private[spark] class ExecutorAllocationManager(
*/
private def maxNumExecutorsNeeded(): Int = {
val numRunningOrPendingTasks = listener.totalPendingTasks +
listener.totalRunningTasks
- math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
- tasksPerExecutorForFullParallelism)
- .toInt
+ val maxNeeded = math.ceil(numRunningOrPendingTasks *
executorAllocationRatio /
+ tasksPerExecutorForFullParallelism).toInt
+ if (listener.pendingSpeculativeTasks > 0 &&
tasksPerExecutorForFullParallelism > 1) {
+ // If we have pending speculative tasks, allocate one more executor to
satisfy the
+ // locality requirements of speculative tasks
+ maxNeeded + 1
Review comment:
As specified in the comments, this is to satisfy the locality requirements
of speculative tasks. Let's say we have 1 normal task and 1 speculative task,
in this case we should allocate 2 executors instead of 1.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]