sean-rose commented on issue #49466: URL: https://github.com/apache/airflow/issues/49466#issuecomment-2819423475
> When using the `KubernetesPodOperator` with `on_finish_action` set to `"keep_pod"`, the pod is left running if the task fails while the pod is still running (e.g. the task's specified `execution_timeout` is exceeded, or the pod takes longer than `startup_timeout_seconds` to start).
>
> It's worth noting that [if `TaskInstance` sees an `AirflowTaskTimeout` it will call the operator's `on_kill()` method](https://github.com/apache/airflow/blob/12a124af3d0483ae85c01874bcb45effa6834e26/airflow/models/taskinstance.py#L764-L765), which for `KubernetesPodOperator` would [delete the pod](https://github.com/apache/airflow/blob/12a124af3d0483ae85c01874bcb45effa6834e26/airflow/providers/cncf/kubernetes/operators/pod.py#L1026). However, because [`KubernetesPodOperator.cleanup()` gets called for all exceptions](https://github.com/apache/airflow/blob/12a124af3d0483ae85c01874bcb45effa6834e26/airflow/providers/cncf/kubernetes/operators/pod.py#L642-L645) and then [raises its own exception if the pod didn't end in the "Succeeded" state](https://github.com/apache/airflow/blob/12a124af3d0483ae85c01874bcb45effa6834e26/airflow/providers/cncf/kubernetes/operators/pod.py#L912-L923), the `TaskInstance` code never sees the original `AirflowTaskTimeout` in that case, so `on_kill()` is not invoked and the pod is never deleted.
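
The masking described in the comment can be reproduced without Airflow. The following is a minimal sketch, not the provider's actual code: `TaskTimeout`, `PodFailed`, `FakePodOperator`, and `run_task` are hypothetical stand-ins, and it assumes the operator calls `cleanup()` from a `finally` block. It shows that once `cleanup()` raises its own exception, a caller that only reacts to the timeout type never reaches its `on_kill()` branch.

```python
class TaskTimeout(Exception):
    """Hypothetical stand-in for AirflowTaskTimeout."""


class PodFailed(Exception):
    """Hypothetical stand-in for the exception raised by cleanup()."""


class FakePodOperator:
    """Toy operator that mimics the keep_pod behaviour described above."""

    def __init__(self):
        self.pod_phase = "Running"  # the pod is still running when the task times out
        self.pod_deleted = False

    def cleanup(self):
        # With keep_pod the pod is not deleted here; a pod that did not
        # reach "Succeeded" makes cleanup raise its own exception.
        if self.pod_phase != "Succeeded":
            raise PodFailed(f"Pod ended in phase {self.pod_phase}")

    def execute(self):
        try:
            # Simulate execution_timeout firing while the pod is running.
            raise TaskTimeout("execution_timeout exceeded")
        finally:
            # Assumed shape: cleanup runs for every exception.
            self.cleanup()

    def on_kill(self):
        self.pod_deleted = True


def run_task(op):
    """Toy caller that only calls on_kill() when it sees the timeout type."""
    try:
        op.execute()
    except TaskTimeout:
        op.on_kill()
        return "TaskTimeout"
    except Exception as exc:
        # The timeout survives only as exc.__context__, not as the raised type.
        return type(exc).__name__
    return "success"


op = FakePodOperator()
outcome = run_task(op)
print(outcome, op.pod_deleted)
```

In this sketch the caller observes `PodFailed` rather than `TaskTimeout`, so `on_kill()` is skipped and the simulated pod stays alive. The original timeout is still reachable through the raised exception's `__context__`, which is one place a fix could look.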
