rrbarbosa commented on issue #13542:
URL: https://github.com/apache/airflow/issues/13542#issuecomment-844789569


   We are also experiencing tasks getting stuck in Airflow 2.0.1 with the 
Kubernetes Executor. We created a "heartbeat" DAG that runs every 10m, when it 
doesn't, we get an alert.
   
   In the last occurrence, the scheduler seem to think the heartbeat DAG 
reached its `max_active_runs (1 of 1)`, so it refused to schedule a new task. 
As far as I can tell, the previous execution succeeded and the pod correctly 
was killed. 
   
   That task remained in the `scheduled` state, and I the message `Reset the 
following X orphaned TaskInstances` for the only task in this DAG every 5min. 
After ~1h hour, the DAG was finally executed again.
   
   Our current workaround is a cronjob to restart the scheduler pod every few 
hours.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to