Bowrna commented on PR #39398:
URL: https://github.com/apache/airflow/pull/39398#issuecomment-2268244502

   > > I have been working on this PR after taking a long break due to personal 
commitments, and I am now stuck on it. From what I have learnt so far, I can 
see the following.
   > > 
   > > 1. There is a configurable parameter, `task_queued_timeout`, that sets 
the maximum time a task may remain stuck in the queue. This timeout is used in 
`scheduler_job_runner.py`, where a timer (event_scheduler) periodically 
executes the function `_fail_tasks_stuck_in_queued`.
   > > 
   > > In this function, we pick the TIs from the DB that are stuck in the 
queue, collect the executors and their task instances, and ask each executor to 
clean up its stuck tasks. So the cleanup work is delegated to the executor, 
which decides how to handle it. So far this cleanup is implemented only in the 
Celery, CeleryKubernetes, Kubernetes, and LocalKubernetes executors.
   > > In the case of the Celery Executor, the task is marked as failed and 
popped from the active task dict. In the case of the Kubernetes Executor, only 
the pods associated with the task instance are deleted. In the case of 
CeleryKubernetes, each TI is checked to see whether its queue is the Celery or 
the k8s one, and one (or both) of the above two cleanup paths is invoked 
accordingly. In the case of LocalKubernetes, the TIs are filtered for k8s and 
only the k8s executor's cleanup is invoked, as the local executor does not 
implement any cleanup method.
   > > I have a question about the k8s executor: if a task is stuck in the 
queue for a long time, the cleanup implementation only deletes the pod. I don't 
see any place where the task is marked as failed (or moved to any other state). 
Can anyone help me figure out how this works?
   > > cc: @potiuk
   > > 
https://github.com/apache/airflow/blob/3805050f34dcb575aaad690c6ad1e37f75f3b2cf/airflow/jobs/scheduler_job_runner.py#L1626-L1660
   > 
   > @collinmcnulty this is the reason for enquiring about the type of executor.
   
   @potiuk A gentle reminder on this, as I am stuck here. Thanks!
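
To make the question concrete, here is a rough toy model of the flow described in the quoted comment. It is only a sketch under stated assumptions, not Airflow's actual code: every class, function, and field name in it (`TaskInstance`, `find_stuck`, `cleanup_stuck`, the `*Like` executors, the `"kubernetes"` queue name) is hypothetical and stands in for the real scheduler and executor objects.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TaskInstance:
    """Hypothetical stand-in for a task instance row in the DB."""
    key: str
    queue: str
    queued_at: datetime
    state: str = "queued"


def find_stuck(tis, now, timeout):
    """Return the queued TIs that have waited longer than the timeout."""
    return [ti for ti in tis if ti.state == "queued" and now - ti.queued_at > timeout]


class CeleryLikeExecutor:
    """Models the Celery behaviour described above: fail the TI and forget it."""

    def __init__(self):
        self.active = {}  # key -> TaskInstance

    def cleanup_stuck(self, tis):
        for ti in tis:
            ti.state = "failed"
            self.active.pop(ti.key, None)
        return [ti.key for ti in tis]


class KubernetesLikeExecutor:
    """Models the k8s behaviour described above: only the pod is deleted."""

    def __init__(self):
        self.pods = {}  # key -> pod name

    def cleanup_stuck(self, tis):
        for ti in tis:
            self.pods.pop(ti.key, None)  # note: ti.state is left untouched
        return [ti.key for ti in tis]


class CeleryKubernetesLikeExecutor:
    """Routes each TI to one of the two executors based on its queue."""

    def __init__(self, celery, k8s, k8s_queue="kubernetes"):
        self.celery = celery
        self.k8s = k8s
        self.k8s_queue = k8s_queue  # assumed queue name, for illustration only

    def cleanup_stuck(self, tis):
        k8s_tis = [ti for ti in tis if ti.queue == self.k8s_queue]
        celery_tis = [ti for ti in tis if ti.queue != self.k8s_queue]
        handled = []
        if celery_tis:
            handled += self.celery.cleanup_stuck(celery_tis)
        if k8s_tis:
            handled += self.k8s.cleanup_stuck(k8s_tis)
        return handled
```

In this model, a TI routed to the k8s-like executor comes back with its pod removed but still in the `queued` state, which is exactly the gap the question is about: something outside the executor's cleanup would have to change the state.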


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
