Github user dvogelbacher commented on a diff in the pull request:
https://github.com/apache/spark/pull/21366#discussion_r194176787
--- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala ---
@@ -105,14 +120,15 @@ private[spark] class ExecutorPodsAllocator(
.endSpec()
.build()
kubernetesClient.pods().create(podWithAttachedContainer)
- pendingExecutors += newExecutorId
+ newlyCreatedExecutors(newExecutorId) = clock.getTimeMillis()
}
} else if (currentRunningExecutors >= currentTotalExpectedExecutors) {
+ // TODO handle edge cases if we end up with more running executors than expected.
logDebug("Current number of running executors is equal to the number of requested" +
  " executors. Not scaling up further.")
- } else if (pendingExecutors.nonEmpty) {
- logDebug(s"Still waiting for ${pendingExecutors.size} executors to begin running before" +
-   " requesting for more executors.")
+ } else if (newlyCreatedExecutors.nonEmpty || currentPendingExecutors != 0) {
+ logDebug(s"Still waiting for ${newlyCreatedExecutors.size + currentPendingExecutors}" +
--- End diff --
Can we make this debug statement more verbose, so it distinguishes between
`newlyCreatedExecutors` and `currentPendingExecutors`? It might help if anyone
ever needs to debug executors that never reach the pending state, or something
like that.
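For example, something along these lines (just a sketch, assuming the field
names from the surrounding code in `ExecutorPodsAllocator`; `logDebug` comes
from Spark's `Logging` trait):

```scala
// Sketch only: newlyCreatedExecutors is the map populated above when pods are
// created, currentPendingExecutors is the count taken from the latest snapshot.
logDebug(s"Still waiting for ${newlyCreatedExecutors.size} newly created executors" +
  s" (not yet seen in a snapshot) and $currentPendingExecutors pending executors" +
  " to begin running before requesting more executors.")
```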
---