JCoder01 commented on code in PR #35800:
URL: https://github.com/apache/airflow/pull/35800#discussion_r1404509646
##########
airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py:
##########
@@ -642,39 +641,6 @@ def adopt_launched_task(
             del tis_to_flush_by_key[ti_key]
             self.running.add(ti_key)
-    def _adopt_completed_pods(self, kube_client: client.CoreV1Api) -> None:
-        """
-        Patch completed pods so that the KubernetesJobWatcher can delete them.
-
-        :param kube_client: kubernetes client for speaking to kube API
-        """
-        if TYPE_CHECKING:
-            assert self.scheduler_job_id
-
-        new_worker_id_label = self._make_safe_label_value(self.scheduler_job_id)
-        query_kwargs = {
-            "field_selector": "status.phase=Succeeded",
-            "label_selector": (
-                "kubernetes_executor=True,"
-                f"airflow-worker!={new_worker_id_label},{POD_EXECUTOR_DONE_KEY}!=True"
-            ),
-        }
-        pod_list = self._list_pods(query_kwargs)
-        for pod in pod_list:
-            self.log.info("Attempting to adopt pod %s", pod.metadata.name)
-            from kubernetes.client.rest import ApiException
-
-            try:
-                kube_client.patch_namespaced_pod(
-                    name=pod.metadata.name,
-                    namespace=pod.metadata.namespace,
-                    body={"metadata": {"labels": {"airflow-worker": new_worker_id_label}}},
-                )
-            except ApiException as e:
-                self.log.info("Failed to adopt pod %s. Reason: %s", pod.metadata.name, e)
Review Comment:
Apologies. I had read your comment, but due to the hazards of reading too
fast too early in the morning, I missed that you had already tried the
`continue` approach. Reviewing your comment again, it seems adding the
`continue` solves the majority of the issues. That leaves the remaining
condition, which, if I understand it correctly, is that it's possible for two
schedulers to adopt the same completed pod, which means it's added to each
scheduler's `running` set. Then, once the pod is deleted by one scheduler,
there is nothing to signal to the other scheduler(s) that the pod needs to be
removed from their `running` sets? So we need to either:
A. only adopt pods not owned by a running executor, or
B. clean up all schedulers when a pod is deleted.
The proper approach seems to be what @dstandish recommended: check whether
the pod is owned by a live scheduler before adopting it. Alternatively,
perhaps `KubernetesJobWatcher` could react to `DELETE` events rather than
ignoring them, but that seems far less desirable than simply not adopting
the pods in the first place when running multiple schedulers.
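To illustrate what I mean by option A, here is a minimal, executor-agnostic sketch. The function name `pods_safe_to_adopt`, the `(pod_name, worker_label)` tuple representation, and the `alive_scheduler_ids` set are all hypothetical stand-ins for the real `V1Pod` objects and however the executor would determine which scheduler jobs are still heartbeating; this is not the actual executor API:

```python
def pods_safe_to_adopt(pods, alive_scheduler_ids):
    """Filter out pods whose owning scheduler is still alive.

    pods: iterable of (pod_name, worker_label) tuples, standing in for
          V1Pod objects and their metadata.labels["airflow-worker"] value.
    alive_scheduler_ids: set of airflow-worker label values belonging to
          schedulers known to still be running (e.g. derived from
          heartbeating scheduler job rows in the metadata DB).
    """
    return [
        (name, label)
        for name, label in pods
        # A live scheduler still owns this pod, so adopting it would put
        # the same pod in two schedulers' `running` sets -- skip it.
        if label not in alive_scheduler_ids
    ]
```

With that filter in place, only pods orphaned by a dead scheduler are ever re-labeled, so the double-adoption case above cannot arise.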
The CronJob approach just seems like the wrong direction. I don't think the
option to create it exists in charts other than the official chart, and even
in the official chart it's disabled by default.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]