GitHub user kskalski added a comment to the discussion: Starting several 
EksPodOperator tasks clogs the worker making it slow/unresponsive to liveness 
probe

With debug logging enabled in Airflow, I can see that the last operation logged
before the task stalls is loading `kube_config`:
```
[2024-11-23, 08:35:27 UTC] {pod.py:1139} INFO - Building pod pcap-o0nfng46 with 
labels: {'dag_id': 'pcap', 'task_id': 'parse0', 'run_id': 
'scheduled__2024-11-22T0048000000-040080e83', 'kubernetes_pod_operator': 
'True', 'try_number': '6'}
[2024-11-23, 08:35:27 UTC] {kubernetes.py:241} DEBUG - loading kube_config 
from: /tmp/tmpwy0ztbk0
[2024-11-23, 08:36:05 UTC] {retries.py:95} DEBUG - Running Job._fetch_from_db 
with retries. Try 1 of 3
[2024-11-23, 08:36:08 UTC] {retries.py:95} DEBUG - Running 
Job._update_heartbeat with retries. Try 1 of 3
[2024-11-23, 08:36:08 UTC] {job.py:234} DEBUG - [heartbeat]
[2024-11-23, 08:36:48 UTC] {retries.py:95} DEBUG - Running Job._fetch_from_db 
with retries. Try 1 of 3
[2024-11-23, 08:36:48 UTC] {retries.py:95} DEBUG - Running 
Job._update_heartbeat with retries. Try 1 of 3
[2024-11-23, 08:36:50 UTC] {job.py:234} DEBUG - [heartbeat]
```
On a successful run, this is normally followed by a response from the EKS API:
```
[2024-11-23, 08:34:09 UTC] {kubernetes.py:241} DEBUG - loading kube_config 
from: /tmp/tmpirxnul08
[2024-11-23, 08:34:20 UTC] {rest.py:235} DEBUG - response body: 
{"kind":"PodList","apiVersion":
```
Even this one shows that EKS response times are very long (about 11 seconds
between loading `kube_config` and the response), so I suppose this could just
be EKS's fault for responding slowly or never responding at all.
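
To put numbers on those delays, the gaps between log timestamps can be computed
directly. This is only a small sketch: `gap_seconds` is a hypothetical helper
(not part of Airflow), and it assumes the timestamp format shown in the log
excerpts above.

```python
from datetime import datetime

# Timestamp layout as it appears in the Airflow task log excerpts,
# e.g. "2024-11-23, 08:34:09 UTC" (the trailing " UTC" is matched literally).
LOG_TS_FORMAT = "%Y-%m-%d, %H:%M:%S UTC"


def gap_seconds(earlier: str, later: str) -> float:
    """Return the number of seconds between two log timestamps."""
    start = datetime.strptime(earlier, LOG_TS_FORMAT)
    end = datetime.strptime(later, LOG_TS_FORMAT)
    return (end - start).total_seconds()


# Successful run: kube_config loaded, then the API response body logged.
ok_gap = gap_seconds("2024-11-23, 08:34:09 UTC", "2024-11-23, 08:34:20 UTC")

# Stalled run: kube_config loaded, then only the next heartbeat-related line.
stalled_gap = gap_seconds("2024-11-23, 08:35:27 UTC", "2024-11-23, 08:36:05 UTC")
```

For the excerpts above this gives 11 seconds for the successful run and 38
seconds of silence in the stalled run before the next heartbeat-related line.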

However, the impact on Airflow tasks and workers is terrible: after a few
minutes the whole worker seems to fail its liveness probe and gets killed.
A separate issue is whether `EksPodOperator` should set a timeout for its
requests (and retry them once the timeout is exceeded).
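
To illustrate the kind of behavior I mean, here is a minimal sketch of a
timeout-plus-retry wrapper. It is not how `EksPodOperator` works today, and
`call_with_timeout_and_retries` and its parameters are hypothetical names. It
assumes the wrapped request accepts a timeout in seconds and raises the
built-in `TimeoutError` when that timeout is exceeded.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_timeout_and_retries(
    request: Callable[[float], T],
    timeout_s: float = 10.0,
    max_tries: int = 3,
    backoff_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(timeout_s), retrying when it raises TimeoutError.

    `request` is a hypothetical stand-in for a single API call that
    enforces the given timeout itself.
    """
    last_error: Optional[TimeoutError] = None
    for attempt in range(1, max_tries + 1):
        try:
            return request(timeout_s)
        except TimeoutError as err:
            last_error = err
            if attempt < max_tries:
                # Exponential backoff between tries: 1x, 2x, 4x, ...
                sleep(backoff_s * 2 ** (attempt - 1))
    raise TimeoutError(f"gave up after {max_tries} tries") from last_error
```

With the Kubernetes Python client, the generated API methods accept a
`_request_timeout` argument, so `request` could be something like
`lambda t: v1.list_namespaced_pod(namespace, _request_timeout=t)`. The client
raises its own exception types on timeout rather than the built-in
`TimeoutError`, so the `except` clause would need adjusting, and whether the
operator exposes a way to pass such a timeout through is something I have not
checked.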

GitHub link: 
https://github.com/apache/airflow/discussions/44169#discussioncomment-11356020
