jedcunningham commented on code in PR #36749:
URL: https://github.com/apache/airflow/pull/36749#discussion_r1453905335
##########
airflow/providers/cncf/kubernetes/operators/pod.py:
##########
@@ -573,10 +575,17 @@ def execute_sync(self, context: Context):
self.pod, istio_enabled, self.base_container_name
)
finally:
- self.cleanup(
- pod=self.pod or self.pod_request_obj,
- remote_pod=self.remote_pod,
- )
+ try:
+ self.cleanup(
+ pod=self.pod or self.pod_request_obj,
+ remote_pod=self.remote_pod,
+ )
+ except Exception:
+ # If task got marked as failed, it should not raise exception
(which might cause retry).
+ # https://github.com/apache/airflow/issues/36471
+ if not self._killed:
+ raise
Review Comment:
I almost wonder if we should swallow the exception in `process_pod_deletion`
instead, so we are a little more surgical with what we are ignoring.
##########
airflow/providers/cncf/kubernetes/operators/pod.py:
##########
@@ -826,7 +842,11 @@ def on_kill(self) -> None:
}
if self.termination_grace_period is not None:
kwargs.update(grace_period_seconds=self.termination_grace_period)
- self.client.delete_namespaced_pod(**kwargs)
+
+ try:
+ self.client.delete_namespaced_pod(**kwargs)
+ except kubernetes.client.exceptions.ApiException:
+ self.log.warning("Pod %s no longer exists",
self.pod.metadata.name)
Review Comment:
```suggestion
except kubernetes.client.exceptions.ApiException:
self.log.exception("Unable to delete pod %s",
self.pod.metadata.name)
```
This can't be the only cause of an ApiException... Maybe this instad?
##########
airflow/providers/cncf/kubernetes/operators/pod.py:
##########
@@ -670,10 +679,16 @@ def write_logs(self, pod: k8s.V1Pod):
def post_complete_action(self, *, pod, remote_pod, **kwargs):
"""Actions that must be done after operator finishes logic of the
deferrable_execution."""
- self.cleanup(
- pod=pod,
- remote_pod=remote_pod,
- )
+ try:
+ self.cleanup(
+ pod=pod,
+ remote_pod=remote_pod,
+ )
+ except Exception:
+ # If one task got marked as failed, it should not raise exception
which might cause retry.
+ # https://github.com/apache/airflow/issues/36471
Review Comment:
```suggestion
```
##########
airflow/providers/cncf/kubernetes/operators/pod.py:
##########
@@ -573,10 +575,17 @@ def execute_sync(self, context: Context):
self.pod, istio_enabled, self.base_container_name
)
finally:
- self.cleanup(
- pod=self.pod or self.pod_request_obj,
- remote_pod=self.remote_pod,
- )
+ try:
+ self.cleanup(
+ pod=self.pod or self.pod_request_obj,
+ remote_pod=self.remote_pod,
+ )
+ except Exception:
+ # If task got marked as failed, it should not raise exception
(which might cause retry).
+ # https://github.com/apache/airflow/issues/36471
Review Comment:
nit: git history is sufficient here imo
```suggestion
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]