dabla commented on PR #46582:
URL: https://github.com/apache/airflow/pull/46582#issuecomment-2664971853

   @Dawnpool really good catch, as indeed this will speed up the 
store_directory and retrieve_directory functions as only one connection will be 
created, but personally, and that's a personal opinion I don't like the wrapper 
that much even though it solves the performance issue.  Wouldn't it be possible 
to refactor the get_conn method so that it caches the connection so when 
invoked a second time it just returns the cached instance and once it's 
finished removed the cached connection, then the code using the context manager 
stays as it was.
   
   This is how it could be implemented, I've tested it locally and it works:
   
   ```
   @contextmanager
   def get_conn(self) -> Generator[SFTPClient, None, None]:
       """Context manager that reuses an SFTP connection and ensures proper 
cleanup."""
       if self._sftp_conn is None:
           self._ssh_conn = super(SFTPHook, self).get_conn()  # Get the base 
SSH connection
           self._sftp_conn = self._ssh_conn.open_sftp()
   
       self._conn_count += 1
       try:
           print(f"_conn_count: {self._conn_count}")
           yield self._sftp_conn
       finally:
           self._conn_count -= 1
           if self._conn_count == 0:  # Only close when last reference is done
               self._sftp_conn.close()
               self._sftp_conn = None
               self._ssh_conn.close()
               self._ssh_conn = None
   ```
   
   
   The hook would have 3 new fields, namely self._ssh_conn, self._sftp_conn and 
 self._conn_count.
   
   WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to