rrbarbosa edited a comment on issue #13542: URL: https://github.com/apache/airflow/issues/13542#issuecomment-844789569
We are also experiencing tasks getting stuck in Airflow 2.0.1 with the Kubernetes Executor. We created a "heartbeat" DAG that runs every 10m, when it doesn't, we get an alert. In the last occurrence, the scheduler seem to think the heartbeat DAG reached its `max_active_runs (1 of 1)`, so it refused to schedule a new task. As far as I can tell, the previous execution succeeded and the pod correctly was killed. That task remained in the `scheduled` state, and the log `Reset the following X orphaned TaskInstances` listed the only task in this DAG every 5min. After ~1h hour, the DAG was finally executed again. Our current workaround is a cronjob to restart the scheduler pod every few hours. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
