dabla commented on PR #46582:
URL: https://github.com/apache/airflow/pull/46582#issuecomment-2664971853
@Dawnpool really good catch, as indeed this will speed up the
store_directory and retrieve_directory functions as only one connection will be
created, but personally, and that's a personal opinion I don't like the wrapper
that much even though it solves the performance issue. Wouldn't it be possible
to refactor the get_conn method so that it caches the connection so when
invoked a second time it just returns the cached instance and once it's
finished removed the cached connection, then the code using the context manager
stays as it was.
This is how it could be implemented, I've tested it locally and it works:
```
@contextmanager
def get_conn(self) -> Generator[SFTPClient, None, None]:
"""Context manager that reuses an SFTP connection and ensures proper
cleanup."""
if self._sftp_conn is None:
self._ssh_conn = super(SFTPHook, self).get_conn() # Get the base
SSH connection
self._sftp_conn = self._ssh_conn.open_sftp()
self._conn_count += 1
try:
print(f"_conn_count: {self._conn_count}")
yield self._sftp_conn
finally:
self._conn_count -= 1
if self._conn_count == 0: # Only close when last reference is done
self._sftp_conn.close()
self._sftp_conn = None
self._ssh_conn.close()
self._ssh_conn = None
```
The hook would have 3 new fields, namely self._ssh_conn, self._sftp_conn and
self._conn_count.
WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]