dabla commented on issue #47201: URL: https://github.com/apache/airflow/issues/47201#issuecomment-2691361663
> > > > Duplicate of [#47129](https://github.com/apache/airflow/issues/47129) ? > > > > > > > > > Yes. I think this will continue to break for the users depending upon how they are using it at the moment. For example we have a wrapper around SFTPHook that will allow us to do multiple interactions with SFTP server with one open connection. For instance, check if certain file matches the pattern, send the file to different server(and additionally create directory on the destination server) and then delete the file from source. > > > ``` > > > def my_file_transfer_function(): > > > srcConnectionWrapper = FTPConnection(source) > > > destConnectionWrapper = FTPConnection(dest) > > > srcConnectionWrapper.file_exsist(src_file_path) > > > srcConnectionWrapper.sftp_client.getfo(src_file_path, destConnectionWrapper.sftp_client.open( > > > dest_file_path, 'wb')) > > > srcConnectionWrapper.delete_file(src_file_path) > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > So now we need to open three connections to source and 2 connections to destination server if we were to do the same using 5.1.0. I know this is very specific implementation but I am trying to say that we can have all kinds of implementation which could break if we change the get_conn() in sftphook > > > > > > The reason why this [PR](https://github.com/apache/airflow/pull/44247) changed the implementation was that we discovered connection leaking due to the caching of the connection in the SFTPHook. Also even if the connection is closed using the close_conn method of the SFTPHook, the underlying SSHClient isn't closed and caused connections leaks. In you example above, I would then opt to get the context managed connection and call the underlying SFTPClient methods directly. > > While I understand the need for the change, I am just thinking if there could be any other way of providing this functionality instead of enforcing to change the existing implementation. > > Does it really not close the underlying connection even if we call the close method that it is offering right now? Although i did not showcase in my code, we do use that method to close the connection once we are done with all the operations. > > I mean If SFTPHook.close_conn which calls the underlying close function of paramiko then how adding a closing statement solves it? If I i am not wrong the closing function from contextlib will do the same thing right? The original close_conn only closes the SFTPClient, not the SSHClient, hence why the new implementation had 2 closing statements, one for SFTPClient and one for SSHClient. Closing the SFTPClient doesn't close the SSHClient, which ultimately the FTP server we connected to was out of connections after a while. We used the SFTPHook in multi threaded environment with lots of connections, which where leaking. The PR I've made solved that issue, as we didn't experience connection leaking after the fix. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
