[
https://issues.apache.org/jira/browse/SPARK-49061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Youngkwang (YK) Lee updated SPARK-49061:
----------------------------------------
Summary: Emit Kubernetes event when driver fails to request executor (was:
Emit Kubernetes events when driver fails to request executor)
> Emit Kubernetes event when driver fails to request executor
> -----------------------------------------------------------
>
> Key: SPARK-49061
> URL: https://issues.apache.org/jira/browse/SPARK-49061
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 3.5.3
> Reporter: Youngkwang (YK) Lee
> Priority: Major
>
> In Kubernetes, when a driver pod fails to request executor pods (e.g., due to
> an exhausted resource quota), the failure is visible only in the driver
> logs.
> We would like to surface this failure as a Kubernetes event on the driver
> pod to make debugging easier. A possible solution is to add event emission
> logic in ExecutorPodsAllocator.scala for the case where an executor request
> fails:
> [https://bbgithub.dev.bloomberg.com/dnaspark/apache-spark-internal/blob/develop-3.4/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala#L439-L463]
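A minimal sketch of the proposed behavior, not the actual Spark code. It
assumes the executor request can be wrapped in a try/catch and that a Warning
event is recorded when the request throws. The names DriverEvent, EventSink,
InMemoryEventSink, ExecutorRequestSketch, and the reason string
"FailedExecutorRequest" are hypothetical and exist only for illustration. A
real implementation would build a Kubernetes Event object through the
Kubernetes client that Spark already uses, and would keep whatever cleanup the
existing failure path in ExecutorPodsAllocator.scala performs.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical, simplified stand-in for a Kubernetes Event object.
final case class DriverEvent(
    involvedPod: String, // pod the event is attached to (the driver pod)
    eventType: String,   // Kubernetes event types are "Normal" or "Warning"
    reason: String,      // short machine-readable reason (name is an assumption)
    message: String)     // human-readable detail

// Hypothetical sink; a real one would call the Kubernetes API.
trait EventSink {
  def emit(event: DriverEvent): Unit
}

// In-memory sink so the sketch runs without a cluster.
final class InMemoryEventSink extends EventSink {
  val events: ArrayBuffer[DriverEvent] = ArrayBuffer.empty
  override def emit(event: DriverEvent): Unit = events += event
}

object ExecutorRequestSketch {
  // Runs `createPod`; on failure, records a Warning event against the driver
  // pod and returns false. Returning a Boolean is a simplification made for
  // this sketch only.
  def requestExecutor(driverPod: String, sink: EventSink)(
      createPod: () => Unit): Boolean = {
    try {
      createPod()
      true
    } catch {
      case NonFatal(e) =>
        try {
          sink.emit(DriverEvent(
            involvedPod = driverPod,
            eventType = "Warning",
            reason = "FailedExecutorRequest",
            message = s"Failed to request executor pod: ${e.getMessage}"))
        } catch {
          // Event emission is best effort: a failure to emit should not
          // mask or replace the original allocation failure.
          case NonFatal(_) => ()
        }
        false
    }
  }
}
```

With a sink backed by the Kubernetes API, an event emitted this way would be
expected to show up in the events listed for the driver pod (for example via
kubectl describe pod), so a quota failure could be diagnosed without reading
the driver logs.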
--
This message was sent by Atlassian Jira
(v8.20.10#820010)