alex-astronomer commented on issue #35843:
URL: https://github.com/apache/airflow/issues/35843#issuecomment-1863289082

   I think that I'm having the same issue here:
   
   Celery Executor
   
   A task was stuck in the queued state. The scheduler logs show that the task 
was first queued at 12:55, at which point it was assigned an `external_id`:
   
   ```
   [2023-11-30T12:55:16.877+0000] {scheduler_job_runner.py:411} INFO - 1 tasks 
up for execution:
   <TaskInstance: DAG_A.TG_1.TASK_1 manual__2023-11-30T12:54:48.282524+00:00 
[scheduled]>
   [2023-11-30T12:55:16.878+0000] {scheduler_job_runner.py:587} INFO - Setting 
the following tasks to queued state:
   <TaskInstance: DAG_A.TG_1.TASK_1 manual__2023-11-30T12:54:48.282524+00:00 
[scheduled]>
   [2023-11-30T12:55:16.951+0000] {scheduler_job_runner.py:625} INFO - Sending 
TaskInstanceKey(dag_id='DAG_A', task_id='TG_1.TASK_1', 
run_id='manual__2023-11-30T12:54:48.282524+00:00', try_number=1, map_index=-1) 
to executor with priority 2 and queue default
   [2023-11-30T12:55:16.951+0000] {base_executor.py:147} INFO - Adding to 
queue: ['airflow', 'tasks', 'run', 'DAG_A', 'TG_1.TASK_1', 
'manual__2023-11-30T12:54:48.282524+00:00', '--local', '--subdir', 
'DAGS_FOLDER/fund_admin/DAG_FILE.py']
   [2023-11-30T12:55:17.167+0000] {scheduler_job_runner.py:677} INFO - Received 
executor event with state queued for task instance 
TaskInstanceKey(dag_id='DAG_A', task_id='TG_1.TASK_1', 
run_id='manual__2023-11-30T12:54:48.282524+00:00', try_number=1, map_index=-1)
   [2023-11-30T12:55:17.256+0000] {scheduler_job_runner.py:703} INFO - Setting 
external_id for <TaskInstance: DAG_A.TG_1.TASK_1 
manual__2023-11-30T12:54:48.282524+00:00 [queued]> to 
96c9f186-97a4-4bbf-a826-cffbd5ba9b98
   ```
   
   At this point the task is queued and has that `external_id` assigned. My 
understanding is that it is now in the executor's queue but has not yet reached 
a worker for execution.
   
   From here, the task was cleared from the UI because it appeared stuck in the 
queued state for a long time. However, when the cleared task tried to re-queue, 
the logs showed that the executor still considered it running:
   
   ```
   [2023-11-30T13:04:23.267+0000] {scheduler_job_runner.py:411} INFO - 2 tasks 
up for execution:
   <TaskInstance: 
pb_goldmansachs_swap.process_as_of_date_minus1.get_files.should_load 
scheduled__2023-11-30T12:40:00+00:00 [scheduled]>
   <TaskInstance: DAG_A.TG_1.TASK_1 manual__2023-11-30T12:54:48.282524+00:00 
[scheduled]>
   [2023-11-30T13:04:23.268+0000] {scheduler_job_runner.py:587} INFO - Setting 
the following tasks to queued state:
   <TaskInstance: 
pb_goldmansachs_swap.process_as_of_date_minus1.get_files.should_load 
scheduled__2023-11-30T12:40:00+00:00 [scheduled]>
   <TaskInstance: DAG_A.TG_1.TASK_1 manual__2023-11-30T12:54:48.282524+00:00 
[scheduled]>
   [2023-11-30T13:04:23.272+0000] {scheduler_job_runner.py:625} INFO - Sending 
TaskInstanceKey(dag_id='DAG_A', task_id='TG_1.TASK_1', 
run_id='manual__2023-11-30T12:54:48.282524+00:00', try_number=1, map_index=-1) 
to executor with priority 2 and queue default
   [2023-11-30T13:04:23.272+0000] {base_executor.py:147} INFO - Adding to 
queue: ['airflow', 'tasks', 'run', 'DAG_A', 'TG_1.TASK_1', 
'manual__2023-11-30T12:54:48.282524+00:00', '--local', '--subdir', 
'DAGS_FOLDER/fund_admin/DAG_FILE.py']
   [2023-11-30T13:04:23.275+0000] {base_executor.py:269} INFO - queued but 
still running; attempt=1 task=TaskInstanceKey(dag_id='DAG_A', 
task_id='TG_1.TASK_1', run_id='manual__2023-11-30T12:54:48.282524+00:00', 
try_number=1, map_index=-1)
   [2023-11-30T13:04:28.068+0000] {base_executor.py:269} INFO - queued but 
still running;  attempt=2 task=TaskInstanceKey(dag_id='DAG_A', 
task_id='TG_1.TASK_1', run_id='manual__2023-11-30T12:54:48.282524+00:00', 
try_number=1, map_index=-1)
   [2023-11-30T13:04:32.571+0000] {base_executor.py:269} INFO - queued but 
still running; attempt=3 task=TaskInstanceKey(dag_id='DAG_A', 
task_id='TG_1.TASK_1', run_id='manual__2023-11-30T12:54:48.282524+00:00', 
try_number=1, map_index=-1)
   [2023-11-30T13:04:37.755+0000] {base_executor.py:269} INFO - queued but 
still running; attempt=4 task=TaskInstanceKey(dag_id='DAG_A', 
task_id='TG_1.TASK_1', run_id='manual__2023-11-30T12:54:48.282524+00:00', 
try_number=1, map_index=-1)
   [2023-11-30T13:04:44.255+0000] {base_executor.py:272} ERROR - could not 
queue task TaskInstanceKey(dag_id='DAG_A', task_id='TG_1.TASK_1', 
run_id='manual__2023-11-30T12:54:48.282524+00:00', try_number=1, map_index=-1) 
(still running after 4 attempts)
   ```
   
   This repeated a few times until the task finally ran at around 13:28 (~30 
minutes after it was first queued). During that window a new worker was spawned 
(KEDA and cluster-autoscaler), and only then did the task run.
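
The `queued but still running` loop in those logs suggests the executor refuses to re-queue a task key that is still in its running set, and gives up after a fixed number of attempts. A rough, hypothetical sketch of that bookkeeping follows; this is a simplification for illustration, not Airflow's actual `BaseExecutor` code, and the attempt limit of 4 is assumed from the "still running after 4 attempts" log line.

```python
from collections import defaultdict

QUEUE_ATTEMPT_LIMIT = 4  # assumed, matching the "after 4 attempts" log line


class ToyExecutor:
    """Hypothetical simplification of an executor's queue/running bookkeeping."""

    def __init__(self):
        self.running = set()              # task keys believed to be on a worker
        self.queued = []                  # keys accepted for (re-)queueing
        self.attempts = defaultdict(int)  # per-key re-queue attempt counter

    def try_queue(self, key):
        if key not in self.running:
            self.queued.append(key)
            return True
        # Key is still marked running: refuse and count the attempt.
        self.attempts[key] += 1
        if self.attempts[key] > QUEUE_ATTEMPT_LIMIT:
            print(f"could not queue task {key} "
                  f"(still running after {QUEUE_ATTEMPT_LIMIT} attempts)")
            del self.attempts[key]  # give up; the scheduler will retry later
            return False
        print(f"queued but still running; "
              f"attempt={self.attempts[key]} task={key}")
        return False


executor = ToyExecutor()
executor.running.add("DAG_A.TG_1.TASK_1")  # stale entry from the stuck task
results = [executor.try_queue("DAG_A.TG_1.TASK_1") for _ in range(5)]
print(results)
```

Under this model, the cleared task can never re-queue until the stale key leaves the running set, which matches the observed behavior: the task only ran once worker turnover shook the stale state loose.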
   
   Let me know if you have further questions!

