kiaradlf opened a new issue, #44285: URL: https://github.com/apache/airflow/issues/44285
### Description

One of the factors that makes Python's `requests` library so flexible is its use of [transport adapters](https://docs.python-requests.org/en/latest/user/advanced/#transport-adapters), which let the user further configure its behavior. Unfortunately, Airflow's built-in `HttpHook` does not currently expose this functionality to the user.

So far, I've worked around this by subclassing the hook to add it; see below. It would be nice, though, to see the actual `HttpHook` extended with an extra parameter that exposes this functionality, removing the need for such subclasses.

```
# mountable_http_hook.py
from typing import Any
from urllib.parse import urlparse

from airflow.providers.http.hooks.http import HttpHook
from requests import Session
from requests.adapters import BaseAdapter
from requests_toolbelt.adapters.socket_options import (  # type: ignore[import-untyped]
    TCPKeepAliveAdapter,
)


class MountableHttpHook(HttpHook):
    """Version of Airflow's HttpHook that allows mounting custom `requests` adapters."""

    _adapter: BaseAdapter | None

    def __init__(
        self,
        *args,
        adapter: BaseAdapter | None = None,
        tcp_keep_alive: bool = True,
        tcp_keep_alive_idle: int = 120,
        tcp_keep_alive_count: int = 20,
        tcp_keep_alive_interval: int = 30,
        **kwargs,
    ) -> None:
        super().__init__(*args, **kwargs)
        # Ensure HttpHook won't override our mounts.
        # Set manually instead of through the constructor, in case overrides
        # pass all params on to a class that doesn't know this parameter,
        # as is the case with Oauth2HttpHook for requests.Session.
        self.tcp_keep_alive = False
        if adapter is None and tcp_keep_alive:
            # Default to the HttpHook's adapter.
            adapter = TCPKeepAliveAdapter(
                idle=tcp_keep_alive_idle,
                count=tcp_keep_alive_count,
                interval=tcp_keep_alive_interval,
            )
        self._adapter = adapter

    def get_conn(self, headers: dict[Any, Any] | None = None) -> Session:
        """Add our adapter to the `requests.Session`."""
        session = super().get_conn(headers=headers)
        if self._adapter:
            scheme = urlparse(self.base_url).scheme if self.base_url else "https"
            # Note: this replaces every adapter mounted on the session,
            # leaving only ours, keyed by the connection's scheme.
            session.adapters = {scheme: self._adapter}  # type: ignore
        return session
```

### Use case/motivation

Adapters may help handle business logic surrounding HTTP requests, including:

- retries
- timeouts
- HTTP status codes, e.g. to specify which error codes to retry (potentially using, say, exponential back-off) before marking the task as failed
- SSL version
- ...

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
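To make the retry and status-code use cases under "Use case/motivation" more concrete, here is a minimal sketch of a transport adapter that retries selected status codes with exponential back-off, using plain `requests` and `urllib3`. The helper name `build_retrying_session`, the status codes, and the back-off factor are illustrative assumptions, not part of Airflow or of the proposal itself.

```python
from requests import Session
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_retrying_session(total_retries: int = 3) -> Session:
    """Return a Session that retries selected status codes (hypothetical helper)."""
    # Retry policy: the status codes and back-off factor are example values only.
    retry = Retry(
        total=total_retries,
        backoff_factor=0.5,  # sleep grows exponentially between attempts
        status_forcelist=(429, 502, 503, 504),
    )
    adapter = HTTPAdapter(max_retries=retry)

    session = Session()
    # Mount the same adapter for both schemes so redirects keep the policy.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
```

An `HTTPAdapter` built this way is the kind of object that could be passed as the `adapter` argument of the `MountableHttpHook` subclass above, or to an equivalent parameter on `HttpHook` if one were added.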
