[ 
https://issues.apache.org/jira/browse/SPARK-49061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Youngkwang (YK) Lee updated SPARK-49061:
----------------------------------------
    Summary: Emit Kubernetes event when driver fails to request executor  (was: 
Emit Kubernetes events when driver fails to request executor)

> Emit Kubernetes event when driver fails to request executor
> -----------------------------------------------------------
>
>                 Key: SPARK-49061
>                 URL: https://issues.apache.org/jira/browse/SPARK-49061
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.5.3
>            Reporter: Youngkwang (YK) Lee
>            Priority: Major
>
> In Kubernetes, when a driver pod fails to request executor pods (e.g., because 
> the resource quota is exhausted), the failure is visible only in the driver 
> logs. 
> We would like to expose this failure as a Kubernetes event for the driver pod 
> to make debugging easier. A possible solution is to add event emission logic 
> in ExecutorPodsAllocator.scala at the point where the executor request fails:
> [https://bbgithub.dev.bloomberg.com/dnaspark/apache-spark-internal/blob/develop-3.4/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala#L439-L463]
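
For illustration only, the proposed behaviour could be sketched roughly as below. This is a dependency-free sketch, not the code at the link above and not Spark's actual allocator: DriverEvent, EventSink, RecordingEventSink, requestExecutor and the reason string "FailedExecutorRequest" are all hypothetical names, and attaching the event to the driver pod is an assumption. A real change would presumably post the event through the Kubernetes client that Spark already uses rather than an in-memory sink.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical, simplified stand-in for a Kubernetes Event object.
final case class DriverEvent(
    involvedPod: String, // pod the event is attached to (assumed: the driver pod)
    eventType: String,   // Kubernetes event types are "Normal" and "Warning"
    reason: String,      // short machine-readable reason (hypothetical value)
    message: String)     // human-readable detail

// Hypothetical sink; a real one would send the event to the API server.
trait EventSink {
  def emit(event: DriverEvent): Unit
}

// In-memory sink so the sketch runs without a cluster.
final class RecordingEventSink extends EventSink {
  val events: ArrayBuffer[DriverEvent] = ArrayBuffer.empty[DriverEvent]
  override def emit(event: DriverEvent): Unit = events.append(event)
}

object ExecutorRequestSketch {
  // Runs the pod-creation call. On failure, emits a Warning event and then
  // rethrows, so whatever error handling the caller already has is unchanged.
  def requestExecutor(driverPod: String, executorId: Long, sink: EventSink)(
      createPod: => Unit): Unit = {
    try {
      createPod
    } catch {
      case NonFatal(e) =>
        try {
          sink.emit(DriverEvent(
            involvedPod = driverPod,
            eventType = "Warning",
            reason = "FailedExecutorRequest",
            message = s"Failed to request executor $executorId: ${e.getMessage}"))
        } catch {
          // Event emission is best-effort; never mask the original failure.
          case NonFatal(_) => ()
        }
        throw e
    }
  }
}
```

With something like this in place, the failure would show up next to the driver pod's other events (for example in `kubectl describe pod` output) instead of only in the driver logs. Rethrowing the original exception keeps the sketch additive: it only adds visibility and does not change how the failure is handled.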



