Re: [PR] Fix race condition in KubernetesExecutor with concurrently running schedulers [airflow]

via GitHub Thu, 21 Dec 2023 03:19:43 -0800


potiuk commented on code in PR #35800:
URL: https://github.com/apache/airflow/pull/35800#discussion_r1433937755



##########
airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py:
##########
@@ -642,38 +642,37 @@ def adopt_launched_task(
         del tis_to_flush_by_key[ti_key]
         self.running.add(ti_key)
 
-    def _adopt_completed_pods(self, kube_client: client.CoreV1Api) -> None:
+    @provide_session
+    def _delete_orphaned_completed_pods(self, session: Session = NEW_SESSION) 
-> None:
         """
-        Patch completed pods so that the KubernetesJobWatcher can delete them.
+        Delete orphaned completed pods with completed TaskInstances.
 
-        :param kube_client: kubernetes client for speaking to kube API
+        Pods that have reached the Completed status are usually deleted by the 
scheduler to which
+        they are attached. In case when the scheduler crashes, there is no one 
to delete these
+        pods. Therefore, they are deleted from another scheduler using this 
function.
         """
+        from airflow.jobs.job import Job, JobState
+
         if TYPE_CHECKING:
-            assert self.scheduler_job_id
+            assert self.kube_scheduler
 
-        new_worker_id_label = 
self._make_safe_label_value(self.scheduler_job_id)
-        query_kwargs = {
-            "field_selector": "status.phase=Succeeded",
-            "label_selector": (
-                "kubernetes_executor=True,"
-                
f"airflow-worker!={new_worker_id_label},{POD_EXECUTOR_DONE_KEY}!=True"
-            ),
-        }
+        alive_schedulers_ids = session.scalars(
+            select(Job.id).where(Job.job_type == "SchedulerJob", Job.state == 
JobState.RUNNING)
+        ).all()
+        labels = ["kubernetes_executor=True", f"{POD_EXECUTOR_DONE_KEY}!=True"]
+        for alive_scheduler_id in alive_schedulers_ids:
+            
labels.append(f"airflow-worker!={self._make_safe_label_value(str(alive_scheduler_id))}")

Review Comment:
   Does it matter which scheduler deletes completed pods? I think this is the 
last stage of a POD and whoever deletes it, is doing a good thing. I believe 
the other schedulers will simply ignore if they see such completed PODs  have 
been deleted, so it does not matter which of the A, B, C schedulers deletes 
such a POD IMHO. - and that also gives the opportunity of cleaning them up 
faster, as the three scheduling loops will generally happen 3 times more often 
and completed pods will be deleted more frequently (which is a good thing).
   
   IMHO there is no particular reason to tie completed POD deletion to the 
scheduler that created it. But maybe there is a reason and I miss it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Fix race condition in KubernetesExecutor with concurrently running schedulers [airflow]

Reply via email to