ldacey commented on issue #9975:
URL: https://github.com/apache/airflow/issues/9975#issuecomment-844122190


   This issue affected me as well recently (I cleared 450 historical tasks to 
reprocess data).
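   
   (For context, a bulk clear like that can also be done programmatically 
instead of through the UI. A minimal sketch, assuming Airflow 2.x; the dag id 
and date range below are placeholders:)
   
   ```python
   from datetime import datetime

   from airflow.models import DagBag

   # Load the parsed DAGs and pick one (placeholder dag id).
   dag = DagBag().get_dag("my_dag")

   # Clear every task instance in the window so the scheduler re-runs them.
   # With depends_on_past=True and concurrency=1 they re-execute strictly
   # one at a time, oldest run first.
   dag.clear(
       start_date=datetime(2020, 1, 1),
       end_date=datetime(2021, 5, 1),
   )
   ```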
   
   Is there any way to keep the scheduler from deadlocking when you clear a 
lot of tasks at once? In my case, on 2.0.2:
   
   1) None of my DAGs would actually complete, even when all of their tasks 
succeeded
   2) Many of my DAGs would not start at all
   
   The root cause was of course clearing 450 tasks all at once, but I had 
depends_on_past=True and concurrency=1 set, and each task took 20 minutes to 
complete, so I was stuck. This was exacerbated by all of my other DAGs failing 
to complete (their state stayed "running" even when every task had finished). 
I ended up having to mark all of those DAG runs successful in the UI, and then 
restart Airflow, before DAG runs would be marked complete again.
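   
   (To illustrate the configuration, here is a minimal sketch of the kind of 
DAG I mean; the dag id, schedule, and task are placeholders. `concurrency=1` 
caps the DAG at one running task instance, and `depends_on_past=True` makes 
each run wait for the previous one:)
   
   ```python
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.bash import BashOperator

   with DAG(
       dag_id="my_dag",                  # placeholder
       schedule_interval="@daily",
       start_date=datetime(2020, 1, 1),
       concurrency=1,                    # at most one running task instance for this DAG
       default_args={
           "depends_on_past": True,      # each run waits on the previous run's task
       },
   ) as dag:
       BashOperator(
           task_id="process_data",
           bash_command="echo 'stand-in for a ~20 minute processing step'",
       )
   ```
   
   With that setup, 450 cleared runs at ~20 minutes each come to roughly 150 
hours of strictly sequential catch-up, which is why everything appeared wedged.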
   
   I will refrain from clearing so many tasks at once next time, but perhaps 
Airflow could handle this situation better? Maybe a "queued" state at the DAG 
level, where a DAG run is only considered running once one or more of its 
tasks are actually running? A deadlock like this is frustrating to deal with.



