frsann opened a new issue, #30737:
URL: https://github.com/apache/airflow/issues/30737

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   An error in the Celery worker regularly leads to a situation where the 
scheduler "finishes" a task while it is still in the `queued` state, leaving it 
queued indefinitely. Clearing the task fixes the situation. It shows up in the 
logs like this:
   
   ```
   Apr 19 03:02:06.586 
   scheduler_job.py:550} INFO - Sending 
TaskInstanceKey(dag_id='weather_forecasts_pipeline', 
task_id='extract_weatherbitio_forecast_daily', 
run_id='scheduled__2023-04-18T23:00:00+00:00', try_number=1, map_index=-1) to 
executor with priority 4 and queue high
   
   Apr 19 03:02:06.586
   executor.py:95} INFO - Adding to queue: ['airflow', 'tasks', 'run', 
'weather_forecasts_pipeline', 'extract_weatherbitio_forecast_daily', 
'scheduled__2023-04-18T23:00:00+00:00', '--local', '--subdir', 
'DAGS_FOLDER/weather_forecasts_pipeline.py']
   
   Apr 19 03:03:33.517
   [2023-04-19 00:03:32,972: WARNING/ForkPoolWorker-4] Dag 
'weather_forecasts_pipeline' not found in path 
/usr/local/airflow/dags/weather_forecasts_pipeline.py; trying path 
/usr/local/airflow/dags/weather_forecasts_pipeline.py
   
   Apr 19 03:04:34.550
   [2023-04-19 00:04:33,611: ERROR/ForkPoolWorker-4] 
[21299cda-c9ed-4340-b778-a684943f09c4] Failed to execute task Dag 
'weather_forecasts_pipeline' could not be found; either it does not exist or it 
failed to parse..
   
   Apr 19 03:04:34.550
   airflow.exceptions.AirflowException: Dag 'weather_forecasts_pipeline' could 
not be found; either it does not exist or it failed to parse.
   
   Apr 19 03:04:35.681
   uler_job.py:645} INFO - TaskInstance Finished: 
dag_id=weather_forecasts_pipeline, task_id=extract_weatherbitio_forecast_daily, 
run_id=scheduled__2023-04-18T23:00:00+00:00, map_index=-1, run_start_date=None, 
run_end_date=None, run_duration=None, state=queued, executor_state=failed, 
try_number=1, max_tries=2, job_id=None, pool=default_pool, queue=high, 
priority_weight=4, operator=ExtractWeatherbitIoData, queued_dttm=2023-04-19 
00:02:05.820271+00:00, queued_by_job_id=19365049, pid=None
   ```
   
   The fact that the DAG file is not found is not the main point here; rather, 
it is that the scheduler marks the task as finished with 
`state=queued, executor_state=failed` and is not able to recover from this by 
retrying. 
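   To illustrate what "clearing the task" amounts to, here is a minimal sketch 
(not part of Airflow; the function name and the one-hour threshold are 
assumptions) of a maintenance snippet that resets task instances stuck in 
`queued` so the scheduler picks them up again:
   
   ```python
   # Hypothetical maintenance snippet: find task instances stuck in "queued"
   # longer than a threshold and reset their state, mirroring a manual clear.
   from datetime import timedelta

   from airflow.models import TaskInstance
   from airflow.utils import timezone
   from airflow.utils.session import create_session
   from airflow.utils.state import TaskInstanceState

   STUCK_AFTER = timedelta(hours=1)  # assumed threshold, tune as needed


   def reset_stuck_queued_tasks():
       cutoff = timezone.utcnow() - STUCK_AFTER
       with create_session() as session:
           stuck = (
               session.query(TaskInstance)
               .filter(
                   TaskInstance.state == TaskInstanceState.QUEUED,
                   TaskInstance.queued_dttm < cutoff,
               )
               .all()
           )
           for ti in stuck:
               # Setting the state back to None is effectively what clearing
               # does; the scheduler re-queues the task on its next loop.
               ti.state = None
   ```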
    
   
   ### What you think should happen instead
   
   I would expect the task to be set to the "failed" state and eventually 
retried. 
   
   ### How to reproduce
   
   In our case we are able to reproduce it by causing an AirflowException as 
the task is starting, for example by making the DAG file unavailable. 
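   As a sketch, a reproduction could look like the DAG below (the dag_id, 
schedule, and EmptyOperator task are assumptions; any DAG works): let a run get 
queued, then remove or rename the file from the worker's DAGS_FOLDER so the 
Celery worker fails with `Dag ... could not be found` while starting the task.
   
   ```python
   # Hypothetical minimal DAG for reproducing the stuck-queued behaviour.
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.empty import EmptyOperator

   with DAG(
       dag_id="stuck_queued_repro",  # assumed name, any dag_id works
       start_date=datetime(2023, 4, 1),
       schedule="@hourly",
       catchup=False,
   ) as dag:
       # retries > 0, so the expected behaviour would be a retry, not a stuck task
       EmptyOperator(task_id="noop", retries=2)
   ```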
   
   ### Operating System
   
   Debian GNU/Linux 11 (bullseye)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other Docker-based deployment
   
   ### Deployment details
   
   Airflow version: 2.5.0
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

