arkadiuszbach commented on issue #10860:
URL: https://github.com/apache/airflow/issues/10860#issuecomment-886515641
This is related to Azure load balancers, whose TCP idle timeout is 10 minutes
by default. If your scheduler makes no requests to the Kubernetes API for more
than 10 minutes, the connection is killed.
- In Airflow 2.x there is an option:
```
# Enables TCP keepalive mechanism. This prevents Kubernetes API requests to hang indefinitely
# when idle connection is time-outed on services like cloud load balancers or firewalls.
enable_tcp_keepalive = True
```
This enables TCP keepalive in the Python urllib3 library, which is used by
requests and the kubernetes library. By default, TCP keepalive is disabled.
- In Airflow 1.10.x you can copy the content of the airflow start script into
another Python script, for example airflow_start_custom.py, and add the TCP
keepalive options before the `parser = CLIFactory.get_parser()` line:
```
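import socket

from urllib3 import connection

# Extend urllib3's default socket options so new connections use TCP keepalive.
connection.HTTPConnection.default_socket_options += [
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),     # turn keepalive on
    (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60),   # idle seconds before the first probe
    (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60),  # seconds between probes
    (socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3),     # unanswered probes before giving up
]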
```
Then start the Airflow scheduler from airflow_start_custom.py instead of from
the airflow.py file.
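The snippet above references `socket.TCP_KEEPIDLE` and related constants directly, and not every platform or Python build exposes all of them. As a rough sketch only, the option list could be built defensively and checked on a plain socket before it is appended to urllib3's defaults. The helper names `keepalive_socket_options` and `apply_options` are hypothetical and are not part of Airflow or urllib3.

```python
import socket


def keepalive_socket_options(idle=60, interval=60, count=3):
    """Return (level, optname, value) tuples for TCP keepalive.

    Hypothetical helper: constants missing on this platform are skipped
    instead of raising AttributeError.
    """
    options = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    tcp_settings = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in tcp_settings:
        optname = getattr(socket, name, None)
        if optname is not None:
            options.append((socket.IPPROTO_TCP, optname, value))
    return options


def apply_options(sock, options):
    """Apply each option to a socket, as a quick sanity check."""
    for level, optname, value in options:
        sock.setsockopt(level, optname, value)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        apply_options(sock, keepalive_socket_options())
        enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
        print("keepalive enabled:", enabled)
```

With an idle value of 60 seconds, the first probe goes out after one minute of silence, well below the 10-minute idle timeout mentioned above. The returned list has the same shape as the one added to `connection.HTTPConnection.default_socket_options` in the 1.10.x workaround.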
PS: Neither of the above solutions will work if you are using Istio sidecar
proxies. In that case you will need to enable TCP keepalive on the Istio
sidecars.