[GitHub] [airflow] kaxil commented on a change in pull request #15336: Fail task when containers inside a pod fails

GitBox Mon, 12 Apr 2021 16:01:26 -0700


kaxil commented on a change in pull request #15336:
URL: https://github.com/apache/airflow/pull/15336#discussion_r612007522




##########
File path: airflow/executors/kubernetes_executor.py
##########
@@ -218,6 +239,34 @@ def process_status(
                 resource_version,
             )
 
+    def process_container_statuses(
+        self,
+        pod_id: str,
+        statuses: List[Any],
+        namespace: str,
+        annotations: Dict[str, str],
+        resource_version: str,
+    ):
+        """Monitor pod container statuses"""
+        for container_status in statuses:
+            terminated = container_status.state.terminated
+            waiting = container_status.state.waiting
+            if terminated:
+                self.log.debug(
+                    "A container in the pod %s has terminated, reason: %s, message: %s",
+                    pod_id,
+                    terminated.reason,
+                    terminated.message,
+                )
+                self.watcher_queue.put((pod_id, namespace, State.FAILED, annotations, resource_version))

Review comment:
   Should we short-circuit and return here, since we want to mark the task as failed as soon as any container in the pod fails?
   
   We could also probably do
   
   ```python
   any(container_status.state.terminated for container_status in statuses)
   ```
   
   However, "a terminated container" != "a failed container":
   
   >A container in the Terminated state began execution and then either ran to completion or failed for some reason. When you use kubectl to query a Pod with a container that is Terminated, you see a reason, an exit code, and the start and finish time for that container's period of execution.
   
   From: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-state-terminated
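
To make the distinction concrete, here is a minimal sketch of a check that short-circuits on the first failed container instead of on the first terminated one. It assumes a non-zero `exit_code` on the terminated state is what counts as a failure, which this PR has not settled. `any_container_failed` and `make_status` are hypothetical names used only for illustration, and the `SimpleNamespace` objects stand in for the Kubernetes client's container status objects so the snippet runs without a cluster.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def any_container_failed(statuses: Iterable[Any]) -> bool:
    """Return True as soon as one container terminated with a non-zero exit code.

    Assumption: "failed" means terminated with exit_code != 0. A container
    that ran to completion (exit_code == 0) is terminated but not failed.
    """
    for container_status in statuses:
        terminated = container_status.state.terminated
        if terminated is not None and terminated.exit_code != 0:
            # Short-circuit: one failed container is enough.
            return True
    return False


def make_status(exit_code: Optional[int]) -> SimpleNamespace:
    """Hypothetical stand-in for a container status (None = still running)."""
    terminated = None
    if exit_code is not None:
        terminated = SimpleNamespace(exit_code=exit_code, reason=None, message=None)
    return SimpleNamespace(state=SimpleNamespace(terminated=terminated, waiting=None))


if __name__ == "__main__":
    completed_and_failed = [make_status(0), make_status(1)]
    completed_only = [make_status(0), make_status(None)]
    print(any_container_failed(completed_and_failed))
    print(any_container_failed(completed_only))
```

With a check like this, `process_container_statuses` could put `State.FAILED` on the watcher queue once and return, rather than once per terminated container.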
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
