dirrao opened a new issue, #31958: URL: https://github.com/apache/airflow/issues/31958
### Description

The function `clear_not_launched_queued_tasks` is slow when many tasks are queued (a few hundred). The latency comes from the `list_namespaced_pod` Kubernetes API call, which is triggered once for each queued task, and it delays the scheduler heartbeat. Improve this function by calling `list_namespaced_pod` with pagination so that all the required pods are fetched with fewer calls.

### Use case/motivation

Running Airflow at a large scale, we have found that `clear_not_launched_queued_tasks` can take several minutes (> 5 minutes). This delays the scheduler heartbeat and leads to the scheduler instance being restarted or killed. To avoid this issue, list the namespaced pods with pagination and fetch all the worker pods with fewer calls.

### Related issues

_No response_

### Are you willing to submit a PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
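
For illustration, the paginated listing proposed in the description could look roughly like the sketch below. It is not the executor's actual code. It assumes `kube_client` behaves like the Kubernetes Python client's `CoreV1Api`, whose `list_namespaced_pod` accepts `label_selector`, `limit`, and `_continue`. The helper names `iter_worker_pods` and `launched_task_keys`, the page size, and the label names `dag_id`, `task_id`, and `run_id` are hypothetical choices made for the example.

```python
from typing import Any, Iterator, Optional, Set, Tuple


def iter_worker_pods(
    kube_client: Any,
    namespace: str,
    label_selector: str,
    page_size: int = 100,
) -> Iterator[Any]:
    """Yield every matching pod, fetching at most `page_size` pods per API call.

    `kube_client` is assumed to expose `list_namespaced_pod` like CoreV1Api.
    """
    continue_token: Optional[str] = None
    while True:
        kwargs = {"label_selector": label_selector, "limit": page_size}
        if continue_token:
            # Resume the listing where the previous page stopped.
            kwargs["_continue"] = continue_token
        pod_list = kube_client.list_namespaced_pod(namespace, **kwargs)
        yield from pod_list.items
        continue_token = pod_list.metadata._continue
        if not continue_token:
            # No token means the server has returned the last page.
            break


def launched_task_keys(pods) -> Set[Tuple[str, str, str]]:
    """Collect (dag_id, task_id, run_id) label tuples from the listed pods.

    The label names are assumptions made for this sketch.
    """
    keys = set()
    for pod in pods:
        labels = pod.metadata.labels or {}
        keys.add((labels.get("dag_id"), labels.get("task_id"), labels.get("run_id")))
    return keys
```

With a set like this built once per scheduler loop, each queued task can be checked with an in-memory lookup instead of its own API request, so the number of calls grows with the number of pages rather than the number of queued tasks.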
