ryandutton opened a new issue, #57553:
URL: https://github.com/apache/airflow/issues/57553

   ### Apache Airflow Provider(s)
   
   cncf-kubernetes
   
   ### Versions of Apache Airflow Providers
   
   ``` bash
   Using Python 3.12.11 environment at: 
   apache-airflow-providers-amazon==9.2.0
   apache-airflow-providers-cncf-kubernetes==10.1.0
   apache-airflow-providers-common-compat==1.3.0
   apache-airflow-providers-common-io==1.5.0
   apache-airflow-providers-common-sql==1.21.0
   apache-airflow-providers-databricks==7.0.0
   apache-airflow-providers-elasticsearch==6.0.0
   apache-airflow-providers-fab==1.5.2
   apache-airflow-providers-ftp==3.12.0
   apache-airflow-providers-google==12.0.0
   apache-airflow-providers-http==5.0.0
   apache-airflow-providers-imap==3.8.0
   apache-airflow-providers-openlineage==2.0.0
   apache-airflow-providers-postgres==6.0.0
   apache-airflow-providers-sftp==5.0.0
   apache-airflow-providers-smtp==1.9.0
   apache-airflow-providers-sqlite==4.0.0
   apache-airflow-providers-ssh==4.0.0
   ```
   
   ### Apache Airflow version
   
   2.10.5
   
   ### Operating System
   
   Rocky Linux
   
   ### Deployment
   
   Other 3rd-party Helm chart
   
   ### Deployment details
   
   Airflow Scheduler deployment with a single replica in an Airflow namespace. 
Executor pods are created in the same namespace.
   
   ### What happened
   
   When a task starts, an executor pod is created. If the scheduler pod 
restarts while the task is running and the task finishes before the new 
scheduler pod becomes ready, the executor pod completes but is never cleaned 
up. The completed pod remains orphaned: it is not adopted by the new scheduler 
instance and stays in the namespace indefinitely.
   
   ### What you think should happen instead
   
   The orphaned, completed pods should be adopted by the new scheduler and 
deleted.
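
   As a rough illustration of the expected behaviour, the selection step could 
look like the sketch below. It is not the provider's implementation. The 
`PodInfo` type and the function name are hypothetical, and the label keys 
`kubernetes_executor` and `airflow-worker` are assumptions that should be 
checked against the provider version in use.

```python
from dataclasses import dataclass, field


@dataclass
class PodInfo:
    """Minimal stand-in for the pod metadata relevant here (hypothetical type)."""
    name: str
    phase: str  # Kubernetes pod phase, e.g. "Running" or "Succeeded"
    labels: dict = field(default_factory=dict)


# Assumed label keys; verify them against the installed provider.
EXECUTOR_LABEL = "kubernetes_executor"
OWNER_LABEL = "airflow-worker"


def completed_pods_to_adopt(pods, current_scheduler_id):
    """Return names of completed executor pods owned by a previous scheduler."""
    return [
        pod.name
        for pod in pods
        if pod.labels.get(EXECUTOR_LABEL) == "True"
        and pod.phase == "Succeeded"
        and pod.labels.get(OWNER_LABEL) != current_scheduler_id
    ]


pods = [
    PodInfo("task-a", "Succeeded", {EXECUTOR_LABEL: "True", OWNER_LABEL: "41"}),
    PodInfo("task-b", "Running", {EXECUTOR_LABEL: "True", OWNER_LABEL: "41"}),
    PodInfo("task-c", "Succeeded", {EXECUTOR_LABEL: "True", OWNER_LABEL: "42"}),
    PodInfo("other", "Succeeded", {}),
]

# Scheduler "42" is the new instance; only task-a finished under the old one.
print(completed_pods_to_adopt(pods, current_scheduler_id="42"))
```

   Pods selected this way are the ones that, per this report, currently stay 
in the namespace; a new scheduler would take ownership of them and then delete 
them.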
   
   ### How to reproduce
   
   1. Deploy an Airflow scheduler as a pod in an Airflow namespace
   2. Create a short-running task
   3. Restart the scheduler while the task is executing
   4. Confirm that the task completes before the new scheduler has started
   
   At this point the task pod should remain in a completed state 
indefinitely, never adopted by the new scheduler pod.
   
   It may take a try or two to get the timing right so that the task completes 
during the restart.
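
   To check for leftover pods after a reproduction attempt, something like the 
following sketch may help. It assumes an object that behaves like 
`kubernetes.client.CoreV1Api` from the official Kubernetes Python client, and 
it assumes executor pods carry a `kubernetes_executor=True` label. `FakeApi` 
and the pod name are hypothetical and exist only so the sketch runs without a 
cluster.

```python
from types import SimpleNamespace


def list_completed_executor_pods(api, namespace):
    """Return names of pods in phase Succeeded that carry the executor label.

    `api` is expected to behave like kubernetes.client.CoreV1Api.
    """
    result = api.list_namespaced_pod(
        namespace=namespace,
        field_selector="status.phase=Succeeded",
        label_selector="kubernetes_executor=True",  # assumed label
    )
    return [pod.metadata.name for pod in result.items]


class FakeApi:
    """Hypothetical stand-in that records the query and returns one pod."""

    def __init__(self):
        self.calls = []

    def list_namespaced_pod(self, namespace, field_selector=None, label_selector=None):
        self.calls.append((namespace, field_selector, label_selector))
        pod = SimpleNamespace(metadata=SimpleNamespace(name="example-task-pod"))
        return SimpleNamespace(items=[pod])


fake = FakeApi()
leftover = list_completed_executor_pods(fake, "airflow")
print(leftover)
```

   Against a real cluster, `api` would be a `CoreV1Api()` instance created 
after loading the kubeconfig or in-cluster configuration. Any pod that still 
appears in the result well after the new scheduler is ready is a candidate 
for the orphaning described here.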
   
   ### Anything else
   
   We deploy several times a day while hundreds of tasks are being scheduled. 
As a result, we see tens of pods orphaned daily.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


