hterik opened a new issue, #25021:
URL: https://github.com/apache/airflow/issues/25021

   ### Apache Airflow version
   
   2.3.2
   
   ### What happened
   
   The scheduler was restarted.
   After the restart, it began resetting some running tasks as orphaned.
   
   I have seen https://github.com/apache/airflow/issues/20982, which lists this 
as a known issue _for manually started tasks_, but we also see it occasionally 
for scheduled tasks.
   
   This issue appears to be a regression after the 2.3 upgrade. I don't recall 
ever seeing it in 2.2, but now we experience it almost every time the scheduler 
restarts, which happens almost once per day due to crashes caused by flaky 
connections to the Kubernetes API or PostgreSQL.
   
   ### What you think should happen instead
   
   Running tasks should be adopted by the restarted scheduler instead of being reset as orphaned.
   
   ### How to reproduce
   
   Start some tasks running on Kubernetes with CeleryKubernetesExecutor.
   Restart the scheduler.
   
   The scheduler logs show the following:
   ```
   2022-07-13 09:59:19 {scheduler_job.py:353} INFO tasks up for execution:
                <TaskInstance: XXXXX scheduled__2022-07-12T16:00:00+00:00 [scheduled]>
        
   2022-07-13 09:59:19 {scheduler_job.py:504} INFO Setting the following tasks to queued state:
        <TaskInstance: XXXXX scheduled__2022-07-12T16:00:00+00:00 [scheduled]>
   2022-07-13 09:59:20  {scheduler_job.py:633} INFO Setting external_id for <TaskInstance: XXXXX scheduled__2022-07-12T16:00:00+00:00 [queued]> to 38321
   
   ...
   Scheduler crashes and restarts here
   ...
   
   2022-07-13 10:42:59  {scheduler_job.py:1285} Reset the following 8 orphaned TaskInstances:
       <TaskInstance: XXXXX scheduled__2022-07-12T16:00:00+00:00 [running]>
       ....
   2022-07-13 10:43:00  {scheduler_job.py:353} Level=INFO Message=10 tasks up for execution:
       <TaskInstance: XXXXX scheduled__2022-07-12T16:00:00+00:00 [scheduled]>
       ....
   ....
   2022-07-13 10:43:00  {scheduler_job.py:504} INFO Setting the following tasks to queued state:
        <TaskInstance: XXXXX scheduled__2022-07-12T16:00:00+00:00 [scheduled]>
       ...
   ```
   I don't know which other logs might be relevant.
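
To quantify how often the resets occur, a small script along these lines could scan a scheduler log for the orphan-reset message. This is only a sketch: the regular expression is an assumption based on the message wording in the excerpt above (it may differ in other Airflow versions or log formats), and `find_orphan_resets` is a hypothetical helper, not part of Airflow.

```python
import re
from datetime import datetime

# Assumed pattern, derived from the log excerpt in this report:
#   "<timestamp> {scheduler_job.py:NNNN} Reset the following N orphaned TaskInstances:"
ORPHAN_RESET = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+.*?"
    r"Reset the following (?P<count>\d+) orphaned"
)


def find_orphan_resets(lines):
    """Return a list of (timestamp, count) for each orphan-reset log line."""
    resets = []
    for line in lines:
        match = ORPHAN_RESET.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            resets.append((ts, int(match.group("count"))))
    return resets


if __name__ == "__main__":
    # Two lines copied from the excerpt above (unwrapped onto single lines).
    sample = [
        "2022-07-13 10:42:59  {scheduler_job.py:1285} "
        "Reset the following 8 orphaned TaskInstances:",
        "2022-07-13 10:43:00  {scheduler_job.py:353} "
        "Level=INFO Message=10 tasks up for execution:",
    ]
    for ts, count in find_orphan_resets(sample):
        print(ts.isoformat(), count)
```

Passing an open log file instead of `sample` works the same way, since the function only iterates over lines.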
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow==2.3.2
   apache-airflow-client==2.1.0
   apache-airflow-providers-celery==3.0.0
   apache-airflow-providers-cncf-kubernetes==4.0.2
   apache-airflow-providers-docker==3.0.0
   apache-airflow-providers-ftp==2.1.2
   apache-airflow-providers-http==2.1.2
   apache-airflow-providers-imap==2.2.3
   apache-airflow-providers-postgres==5.0.0
   apache-airflow-providers-sqlite==2.1.3
   
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   PostgreSQL as the database.
   
   ### Anything else
   
   This happens almost every time the scheduler restarts.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

