Gollum999 opened a new issue, #25653: URL: https://github.com/apache/airflow/issues/25653
### Apache Airflow version 2.3.3 ### What happened If you try to backfill a DAG that uses any [deferrable operators](https://airflow.apache.org/docs/apache-airflow/stable/concepts/deferring.html), those tasks will get indefinitely stuck in a "scheduled" state. If I watch the Grid View, I can see the task state change: "scheduled" (or sometimes "queued") -> "deferred" -> "scheduled". I've tried leaving in this state for over an hour, but there are no further state changes. When the task is stuck like this, the log appears as empty in the web UI. The corresponding log file *does* exist on the worker, but it does not contain any errors or warnings that might point to the source of the problem. Ctrl-C-ing the backfill at this point seems to hang on "Shutting down LocalExecutor; waiting for running tasks to finish." **Force-killing and restarting the backfill will "unstick" the stuck tasks.** However, any deferrable operators downstream of the first will get back into that stuck state, requiring multiple restarts to get everything to complete successfully. ### What you think should happen instead Deferrable operators should work as normal when backfilling. ### How to reproduce ``` #!/usr/bin/env python3 import datetime import logging import pendulum from airflow.decorators import dag, task from airflow.sensors.time_sensor import TimeSensorAsync logger = logging.getLogger(__name__) @dag( schedule_interval='@daily', start_date=datetime.datetime(2022, 8, 10), ) def test_backfill(): time_sensor = TimeSensorAsync( task_id='time_sensor', target_time=datetime.time(0).replace(tzinfo=pendulum.UTC), # midnight - should succeed immediately when the trigger first runs ) @task def some_task(): logger.info('hello') time_sensor >> some_task() dag = test_backfill() if __name__ == '__main__': dag.cli() ``` `airflow dags backfill test_backfill -s 2022-08-01 -e 2022-08-04` ### Operating System CentOS Stream 8 ### Versions of Apache Airflow Providers None ### Deployment Other ### Deployment details Self-hosted/standalone ### Anything else I was able to reproduce this with the following configurations: * `standalone` mode + SQLite backend + `SequentialExecutor` * `standalone` mode + Postgres backend + `LocalExecutor` * Production deployment (self-hosted) + Postgres backend + `CeleryExecutor` I have not yet found anything telling in any of the backend logs. Possibly related: * #23693 * #23145 * #13542 - A modified version of the workaround mentioned in [this comment](https://github.com/apache/airflow/issues/13542#issuecomment-1011598836) works to unstick the first stuck task. However if you run it multiple times to try to unstick any downstream tasks, it causes the backfill command to crash. ### Are you willing to submit PR? - [ ] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
