alex-astronomer commented on issue #35843:
URL: https://github.com/apache/airflow/issues/35843#issuecomment-1863289082
I think that I'm having the same issue here:
Celery Executor
There was a task that was stuck in a queued state. The scheduler logs state
that the task was first queued at 12:55. At this point it's assigned an
`external_id`
```
[2023-11-30T12:55:16.877+0000] {scheduler_job_runner.py:411} INFO - 1 tasks
up for execution:
<TaskInstance: DAG_A.TG_1.TASK_1 manual__2023-11-30T12:54:48.282524+00:00
[scheduled]>
[2023-11-30T12:55:16.878+0000] {scheduler_job_runner.py:587} INFO - Setting
the following tasks to queued state:
<TaskInstance: DAG_A.TG_1.TASK_1 manual__2023-11-30T12:54:48.282524+00:00
[scheduled]>
[2023-11-30T12:55:16.951+0000] {scheduler_job_runner.py:625} INFO - Sending
TaskInstanceKey(dag_id='DAG_A', task_id='TG_1.TASK_1',
run_id='manual__2023-11-30T12:54:48.282524+00:00', try_number=1, map_index=-1)
to executor with priority 2 and queue default
[2023-11-30T12:55:16.951+0000] {base_executor.py:147} INFO - Adding to
queue: ['airflow', 'tasks', 'run', 'DAG_A', 'TG_1.TASK_1',
'manual__2023-11-30T12:54:48.282524+00:00', '--local', '--subdir',
'DAGS_FOLDER/fund_admin/DAG_FILE.py']
[2023-11-30T12:55:17.167+0000] {scheduler_job_runner.py:677} INFO - Received
executor event with state queued for task instance
TaskInstanceKey(dag_id='DAG_A', task_id='TG_1.TASK_1',
run_id='manual__2023-11-30T12:54:48.282524+00:00', try_number=1, map_index=-1)
[2023-11-30T12:55:17.256+0000] {scheduler_job_runner.py:703} INFO - Setting
external_id for <TaskInstance: DAG_A.TG_1.TASK_1
manual__2023-11-30T12:54:48.282524+00:00 [queued]> to
96c9f186-97a4-4bbf-a826-cffbd5ba9b98
```
At this point, the task is queued and has that `external_id` assigned. My
understanding is that, at this point, it's in the executor queue but hasn't
made it to the worker yet for execution.
From here, the task is cleared from the UI since it was appearing stuck in
queued for a long period of time, however when the task was cleared and it
tried to re-queue the logs showed that the task was in a running state.
```
[2023-11-30T13:04:23.267+0000] {scheduler_job_runner.py:411} INFO - 2 tasks
up for execution:
<TaskInstance:
pb_goldmansachs_swap.process_as_of_date_minus1.get_files.should_load
scheduled__2023-11-30T12:40:00+00:00 [scheduled]>
<TaskInstance: DAG_A.TG_1.TASK_1 manual__2023-11-30T12:54:48.282524+00:00
[scheduled]>
[2023-11-30T13:04:23.268+0000] {scheduler_job_runner.py:587} INFO - Setting
the following tasks to queued state:
<TaskInstance:
pb_goldmansachs_swap.process_as_of_date_minus1.get_files.should_load
scheduled__2023-11-30T12:40:00+00:00 [scheduled]>
<TaskInstance: DAG_A.TG_1.TASK_1 manual__2023-11-30T12:54:48.282524+00:00
[scheduled]>
[2023-11-30T13:04:23.272+0000] {scheduler_job_runner.py:625} INFO - Sending
TaskInstanceKey(dag_id='DAG_A', task_id='TG_1.TASK_1',
run_id='manual__2023-11-30T12:54:48.282524+00:00', try_number=1, map_index=-1)
to executor with priority 2 and queue default
[2023-11-30T13:04:23.272+0000] {base_executor.py:147} INFO - Adding to
queue: ['airflow', 'tasks', 'run', 'DAG_A', 'TG_1.TASK_1',
'manual__2023-11-30T12:54:48.282524+00:00', '--local', '--subdir',
'DAGS_FOLDER/fund_admin/DAG_FILE.py']
[2023-11-30T13:04:23.275+0000] {base_executor.py:269} INFO - queued but
still running; attempt=1 task=TaskInstanceKey(dag_id='DAG_A',
task_id='TG_1.TASK_1', run_id='manual__2023-11-30T12:54:48.282524+00:00',
try_number=1, map_index=-1)
[2023-11-30T13:04:28.068+0000] {base_executor.py:269} INFO - queued but
still running; attempt=2 task=TaskInstanceKey(dag_id='DAG_A',
task_id='TG_1.TASK_1', run_id='manual__2023-11-30T12:54:48.282524+00:00',
try_number=1, map_index=-1)
[2023-11-30T13:04:32.571+0000] {base_executor.py:269} INFO - queued but
still running; attempt=3 task=TaskInstanceKey(dag_id='DAG_A',
task_id='TG_1.TASK_1', run_id='manual__2023-11-30T12:54:48.282524+00:00',
try_number=1, map_index=-1)
[2023-11-30T13:04:37.755+0000] {base_executor.py:269} INFO - queued but
still running; attempt=4 task=TaskInstanceKey(dag_id='DAG_A',
task_id='TG_1.TASK_1', run_id='manual__2023-11-30T12:54:48.282524+00:00',
try_number=1, map_index=-1)
[2023-11-30T13:04:44.255+0000] {base_executor.py:272} ERROR - could not
queue task TaskInstanceKey(dag_id='DAG_A', task_id='TG_1.TASK_1',
run_id='manual__2023-11-30T12:54:48.282524+00:00', try_number=1, map_index=-1)
(still running after 4 attempts)
```
This repeats a few times until the task finally runs. The task finally runs
around 13:28 (~30 minutes from when it was first queued). During that time, a
new worker was spawned (KEDA and cluster-autoscaler) and the task finally runs.
Let me know if you have further questions!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]