Gollum999 opened a new issue, #25653:
URL: https://github.com/apache/airflow/issues/25653

   ### Apache Airflow version
   
   2.3.3
   
   ### What happened
   
   If you try to backfill a DAG that uses any [deferrable 
operators](https://airflow.apache.org/docs/apache-airflow/stable/concepts/deferring.html),
 those tasks will get indefinitely stuck in a "scheduled" state.
   
   If I watch the Grid View, I can see the task state change: "scheduled" (or sometimes "queued") -> "deferred" -> "scheduled".  I've tried leaving it in this state for over an hour, but there are no further state changes.
   
   When the task is stuck like this, the log appears empty in the web UI.  The corresponding log file *does* exist on the worker, but it does not contain any errors or warnings that might point to the source of the problem.
   
   Ctrl-C-ing the backfill at this point seems to hang on "Shutting down 
LocalExecutor; waiting for running tasks to finish."  **Force-killing and 
restarting the backfill will "unstick" the stuck tasks.**  However, any 
deferrable operators downstream of the first will get back into that stuck 
state, requiring multiple restarts to get everything to complete successfully.
   
   ### What you think should happen instead
   
   Deferrable operators should work as normal when backfilling.
   
   ### How to reproduce
   
   ```python
   #!/usr/bin/env python3
   import datetime
   import logging
   
   import pendulum
   from airflow.decorators import dag, task
   from airflow.sensors.time_sensor import TimeSensorAsync
   
   
   logger = logging.getLogger(__name__)
   
   
   @dag(
       schedule_interval='@daily',
       start_date=datetime.datetime(2022, 8, 10),
   )
   def test_backfill():
       time_sensor = TimeSensorAsync(
           task_id='time_sensor',
        target_time=datetime.time(0).replace(tzinfo=pendulum.UTC),  # midnight - should succeed immediately when the trigger first runs
       )
   
       @task
       def some_task():
           logger.info('hello')
   
       time_sensor >> some_task()
   
   
   dag = test_backfill()
   
   
   if __name__ == '__main__':
       dag.cli()
   ```
   `airflow dags backfill test_backfill -s 2022-08-01 -e 2022-08-04`
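   For context on why every one of these sensors should succeed on its trigger's first poll: the condition TimeSensorAsync defers on reduces to comparing the current time against today's instance of `target_time`, and midnight UTC is always already in the past. A minimal stdlib-only sketch of that comparison (this mirrors the idea, not Airflow's actual trigger implementation):

   ```python
   import datetime

   # Stand-in for the deferred condition: the trigger fires once the
   # current time passes target_time on the current date (midnight UTC here).
   target_time = datetime.time(0, tzinfo=datetime.timezone.utc)
   now = datetime.datetime.now(datetime.timezone.utc)
   moment = datetime.datetime.combine(now.date(), target_time)

   # Midnight of the current day has always already passed, so the
   # trigger's condition is satisfied immediately.
   print(moment <= now)  # True
   ```

   So the triggers themselves complete almost instantly; the hang happens after the "deferred" -> "scheduled" transition, when the backfill never re-runs the task.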
   
   ### Operating System
   
   CentOS Stream 8
   
   ### Versions of Apache Airflow Providers
   
   None
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   Self-hosted/standalone
   
   ### Anything else
   
   I was able to reproduce this with the following configurations:
   * `standalone` mode + SQLite backend + `SequentialExecutor`
   * `standalone` mode + Postgres backend + `LocalExecutor`
   * Production deployment (self-hosted) + Postgres backend + `CeleryExecutor`
   
   I have not yet found anything telling in any of the backend logs.
   
   Possibly related:
   * #23693
   * #23145
   * #13542
     - A modified version of the workaround mentioned in [this comment](https://github.com/apache/airflow/issues/13542#issuecomment-1011598836) works to unstick the first stuck task.  However, if you run it multiple times to try to unstick any downstream tasks, it causes the backfill command to crash.
   
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

