Holden Karau created SPARK-42260:
------------------------------------
Summary: Log when the K8s Exec Pods Allocator Stalls
Key: SPARK-42260
URL: https://issues.apache.org/jira/browse/SPARK-42260
Project: Spark
Issue Type: Improvement
Components: Kubernetes
Affects Versions: 3.4.0, 3.4.1
Reporter: Holden Karau
Assignee: Holden Karau
Sometimes, when the K8s APIs are slow, the ExecutorPods allocator can stall.
It would be good for us to log this (and how long we've been stalled) so
folks can tell more clearly why Spark is unable to reach the desired target
number of executors.
This is _somewhat_ related to SPARK-36664, which logs the time spent waiting for
executor allocation, but this goes a step further for K8s and logs when we've
stalled because we have too many pending pods.
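
To make the idea concrete, here is a rough, language-neutral sketch of the kind of
stall tracking described above. It is not Spark's code (the real allocator is
written in Scala), and every name in it (StallTracker, check, max_pending_pods)
is hypothetical. It assumes a stall means "fewer executors running than the
target while the number of pending pods is at or above some limit", and it
reports how long that condition has held.

```python
from typing import Optional


class StallTracker:
    """Illustrative only: tracks how long allocation has been blocked by pending pods."""

    def __init__(self, max_pending_pods: int) -> None:
        # Hypothetical limit on pending pods; not read from any real Spark config here.
        self.max_pending_pods = max_pending_pods
        # Timestamp (seconds) at which the current stall began, or None if not stalled.
        self.stalled_since: Optional[float] = None

    def check(self, pending: int, running: int, target: int, now: float) -> Optional[str]:
        """Return a log message if allocation is stalled, otherwise None."""
        stalled = running < target and pending >= self.max_pending_pods
        if not stalled:
            # Condition cleared: reset so the next stall is timed from its own start.
            self.stalled_since = None
            return None
        if self.stalled_since is None:
            self.stalled_since = now
        stalled_for = now - self.stalled_since
        return (
            f"Executor allocation stalled for {stalled_for:.1f}s: "
            f"{pending} pending pods (limit {self.max_pending_pods}), "
            f"{running}/{target} executors running"
        )
```

A caller would invoke check once per allocation round, passing the current pod
counts and a monotonic timestamp, and log the returned message when it is not None.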