Hi Arpit, Thank you, will contribute the fix shortly. On Wed, 22 Aug 2018 at 00:52, Arpit Agarwal <aagar...@hortonworks.com> wrote:
> Hi Mikhail, > > > > There’s two ways to contribute a fix: > > 1. Attach a patch file to the Jira and click “Submit Patch”. I’ve made > you a contributor and assigned it to you. > 2. Submit a GitHub pull request with the Jira key in the title. Not > tried this in a while and not sure it still works. > > > > > > *From: *Mikhail Pryakhin <m.prya...@gmail.com> > *Date: *Tuesday, August 21, 2018 at 2:40 PM > *To: *<user@hadoop.apache.org> > *Subject: *SFTPConnectionPool connections leakage > > > > Hi, > > I’ve come across a connection leakage while using SFTPFileSystem. Methods > of SFTPFileSystem operate on poolable ChannelSftp instances, thus some > methods of SFTPFileSystem are chained together resulting in establishing > multiple connections to the SFTP server to accomplish one compound action, > those methods are: > > > > *mkdirs method*[1]*. *The public mkdirs method acquires a new ChannelSftp > from the pool and then recursively creates directories, checking for the > directory existence beforehand by calling the method exists[2] which > delegates to the getFileStatus(ChannelSftp channel, Path file) method [3] > and so on until it ends up in returning the FilesStatus instance [4]. The > resource leakage occurs in the method getWorkingDirectory which calls the > getHomeDirectory method [5] which in turn establishes a new connection to > the sftp server instead of using an already created connection. As the > mkdirs method is recursive this results in creating a huge number of > connections. > > > > *open method* [6]. This method returns an instance of FSDataInputStream > which consumes SFTPInputStream instance which doesn't return an acquired > ChannelSftp instance back to the pool but instead it closes it[7]. This > leads to establishing another connection to an SFTP server when the next > method is called on the FileSystem instance. > > > > I’ve issued a Jira ticket > https://issues.apache.org/jira/browse/HADOOP-15358, and fixed the > connection leakage issue. > > Could I create a pull request to merge the fix? > > > > [1] > https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658 > > [2] > https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321 > > [3] > https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202 > > [4] > https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290 > > [5] > https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640 > > [6] > https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504 > > [7] > https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123 > > > > > > Kind Regards, > > Mike Pryakhin > > > > -- Regards, Mikhail Pryakhin.