dirrao opened a new pull request, #36092: URL: https://github.com/apache/airflow/pull/36092
What happened

The _list_pods function uses the Kubernetes client functions list_namespaced_pod and list_pod_for_all_namespaces. Right now, these functions fetch the entire pod spec even though we are interested in the pod metadata alone. _list_pods is called from the clear_not_launched_queued_tasks, try_adopt_task_instances and _adopt_completed_pods functions. When we run Airflow at large scale (more than 500 worker pods), the _list_pods function takes a significant amount of time (15 to 30 seconds with 500 worker pods) due to unnecessary data transfer (a V1PodList of up to a few tens of MB) and JSON deserialization overhead. This is blocking us from scaling Airflow to run at large scale.

What you think should happen instead

Request the pod metadata instead of the entire pod payload. This significantly reduces both the network data transfer and the JSON deserialization overhead.

More details at https://github.com/apache/airflow/issues/35599
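
For illustration, here is a minimal sketch of what a metadata-only pod listing could look like at the HTTP level. It is not the code in this pull request. It assumes the Kubernetes API server's content negotiation for PartialObjectMetadataList (selected through the Accept header) and uses only the Python standard library. The helper names build_pod_metadata_request and pod_names, the API server URL and the label selector are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Optional

# Accept header asking the API server for object metadata only
# (PartialObjectMetadataList) instead of full Pod objects.
METADATA_ONLY_ACCEPT = (
    "application/json;as=PartialObjectMetadataList;g=meta.k8s.io;v=v1"
)


def build_pod_metadata_request(
    api_server: str,
    namespace: Optional[str] = None,
    label_selector: Optional[str] = None,
    token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a pod list request that asks for metadata only.

    Hypothetical helper: with a namespace it targets the namespaced pod list
    endpoint, without one it targets the all-namespaces endpoint.
    """
    if namespace:
        path = f"/api/v1/namespaces/{namespace}/pods"
    else:
        path = "/api/v1/pods"
    url = api_server.rstrip("/") + path
    if label_selector:
        url += "?" + urllib.parse.urlencode({"labelSelector": label_selector})
    headers = {"Accept": METADATA_ONLY_ACCEPT}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


def pod_names(payload: str) -> List[str]:
    """Extract pod names from a PartialObjectMetadataList JSON document."""
    body = json.loads(payload)
    return [item["metadata"]["name"] for item in body.get("items", [])]
```

Each item in such a response carries only the object metadata (name, namespace, labels, annotations and so on), which is the part the callers of _list_pods are described as needing. Sending the request against a real cluster would additionally require TLS and credential setup that the sketch leaves out.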
