nclaeys opened a new issue, #58936: URL: https://github.com/apache/airflow/issues/58936
### Apache Airflow version

Other Airflow 2/3 version (please specify below)

### If "Other Airflow 2/3 version" selected, which one?

3.0.6

### What happened?

Since upgrading to Airflow 3, we have noticed that spot interruptions and other container interruptions are no longer handled correctly. As a result, the `on_kill` of Airflow operators is not triggered, and resources are left running when the pod gets deleted. The root cause is that the new Airflow SDK does not propagate the interruption signals to the subprocess executing the task.

### What you think should happen instead?

When an Airflow worker pod gets a SIGTERM or SIGINT, it should shut down correctly. This means triggering the `on_kill` function of the operator running at that time. Now it just shuts down. In our case this means, for example, that the pods launched by the Airflow worker are not cleaned up. The interruptions are handled correctly on Airflow 2.

### How to reproduce

Run an Airflow task with a custom operator that triggers a pod that sleeps for 15 minutes. We use something similar to the KubernetesPodOperator, but that operator has both a cleanup and an `on_kill` function that partly do the same thing. Then interrupt (gracefully kill) the worker pod and notice that the `on_kill` function of the operator does not get called. Looking through the code, calling it is what I would expect from the Task SDK: `task_runner.py` has code to trigger the `on_kill` for a task on an interrupt, but this is not happening.
### Operating System

Kubernetes

### Versions of Apache Airflow Providers

apache-airflow-providers-cncf-kubernetes==10.6.0
apache-airflow-providers-common-compat==1.7.3
apache-airflow-providers-common-io==1.6.2
apache-airflow-providers-common-sql==1.27.5
apache-airflow-providers-opsgenie==4.0.0
apache-airflow-providers-postgres==6.2.3
apache-airflow-providers-slack==7.3.2
apache-airflow-providers-smtp==2.2.0
apache-airflow-providers-standard==1.4.1

### Deployment

Other 3rd-party Helm chart

### Deployment details

We run Airflow on Kubernetes. It is a custom setup, but the deployment is similar to the official Helm chart. We use our own operators to abstract away some logic for users.

### Anything else?

/

### Are you willing to submit PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
