Andrea Tosatto created SPARK-54197:
--------------------------------------

             Summary: Do not call delete requests if the deletionTimestamp is 
already set on the Pod
                 Key: SPARK-54197
                 URL: https://issues.apache.org/jira/browse/SPARK-54197
             Project: Spark
          Issue Type: Improvement
          Components: k8s
    Affects Versions: 4.0.0, 3.4.3, 3.4.0
            Reporter: Andrea Tosatto


The current code handling deletion of Failed or Succeeded driver Pods keeps 
calling the Kubernetes API to delete the objects until the Kubelet has 
started the termination of the Pod (the status of the object is terminating).

However, depending on configuration, the ExecutorPodsLifecycleManager loop 
might run multiple times before the Kubelet starts the deletion of the Pod 
object, resulting in unnecessary DELETE calls to the Kubernetes API, which are 
particularly expensive since they are served from etcd.

Following the Kubernetes API specifications in 
https://kubernetes.io/docs/reference/using-api/api-concepts/

> When a client first sends a delete to request the removal of a resource, the 
> .metadata.deletionTimestamp is set to the current time. Once the 
> .metadata.deletionTimestamp is set, external controllers that act on 
> finalizers may start performing their cleanup work at any time, in any order.

we can assume that whenever the deletionTimestamp is set on a Pod, the Pod will 
eventually be terminated without the need for additional DELETE calls.
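
As a rough illustration of the proposed check (not the actual Spark code, which 
is written in Scala against the Kubernetes client), the sketch below filters 
finished Pods so that a DELETE is only sent when no deletionTimestamp is set. 
PodSnapshot, pods_needing_delete, delete_finished_pods and the send_delete 
callback are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class PodSnapshot:
    """Hypothetical, simplified view of a Pod (not a Spark or client type)."""
    name: str
    phase: str  # e.g. "Running", "Succeeded", "Failed"
    # Mirrors .metadata.deletionTimestamp; None means no delete was accepted yet.
    deletion_timestamp: Optional[str] = None


def pods_needing_delete(pods: Iterable[PodSnapshot]) -> List[PodSnapshot]:
    """Return finished Pods that have not been marked for deletion yet."""
    return [
        p for p in pods
        if p.phase in ("Succeeded", "Failed") and p.deletion_timestamp is None
    ]


def delete_finished_pods(
    pods: Iterable[PodSnapshot],
    send_delete: Callable[[str], None],
) -> int:
    """Call send_delete once per Pod that still needs it; return the call count.

    send_delete stands in for the real DELETE request to the Kubernetes API.
    """
    to_delete = pods_needing_delete(pods)
    for pod in to_delete:
        send_delete(pod.name)
    return len(to_delete)
```

With this filter, running the loop again over a Pod whose deletionTimestamp is 
already set sends no further DELETE request for it; only finished Pods without 
a deletionTimestamp trigger a call.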



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
