[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868666#comment-16868666 ] Sebastien Barnoud commented on HDFS-11280: -- I was using swebhdfs. I just tried with webhdfs ... still having to many connections. When using curl as client, i use a single connection: the server supports keep-alive. Did you have the same behavior ? Did you verify with strace (or a network trace) that the client effectively uses a single connection when it can? > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao >Priority: Major > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868524#comment-16868524 ] Sebastien Barnoud commented on HDFS-11280: -- Ignore the previous message: my active NN switched in between. I still have the issue. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao >Priority: Major > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868519#comment-16868519 ] Sebastien Barnoud commented on HDFS-11280: -- Sorry, when adding -Dhttp.keepAlive=true (which should be the default), it uses only 2 connections. {code:java} $ strace -T -tt -f /usr/bin/hdfs dfs -Dhttp.keepAlive=true -Djavax.net.debug=all -put -f dummy.txt swebhdfs:///user/dtlprd05/test_from_prod/$(hostname)-8u192 2>&1 | grep "sin_port=htons(50470)" [pid 177932] 15:09:26.741407 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000118> [pid 177932] 15:09:26.912454 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 {code} > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao >Priority: Major > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867717#comment-16867717 ] Sebastien Barnoud commented on HDFS-11280: -- This patch doesn't work as the connection is a local variable. {code:java} $ strace -T -tt -f /usr/bin/hdfs dfs -Djavax.net.debug=all -put -f dummy.txt swebhdfs://:50470/user/seb/test_from_prod/$(hostname) 2>&1 | grep "sin_port=htons(50470)" [pid 62905] 14:51:34.917426 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000101> [pid 62905] 14:51:35.107079 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 [pid 62905] 14:51:35.340149 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 [pid 62905] 14:51:35.411417 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 [pid 62905] 14:51:35.477833 connect(388, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 [pid 62905] 14:51:35.571895 connect(388, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000127> [pid 62905] 14:51:35.719054 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 [pid 62905] 14:51:35.770048 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 [pid 63108] 14:51:35.821029 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 [pid 63108] 14:51:35.895525 connect(386, {sa_family=AF_INET, sin_port=htons(50470), sin_addr=inet_addr("184.6.241.2")}, 16 [bash-4.2.46][j:0|h:4914|?:0][2019-06-19 14:51:35][dtlprd05@nazare:~/test-hdfs] ${code} As you you can see, 10 physical connect for a single put of one file of 10 bytes ... The same issue for HDFS. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao >Priority: Major > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800550#comment-15800550 ] Haohui Mai commented on HDFS-11280: --- Committed all the way down to branch-2.7. Many thanks everyone for reporting and validating the issues. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800394#comment-15800394 ] Hudson commented on HDFS-11280: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11074/]) HDFS-11280. Allow WebHDFS to reuse HTTP connections to NN. Contributed (wheat9: rev a605ff36a53a3d1283c3f6d81eb073e4a2942143) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800337#comment-15800337 ] Haohui Mai commented on HDFS-11280: --- Validate that the the latest patch passes the unit tests mentioned in the jira. Committing. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796738#comment-15796738 ] Brahma Reddy Battula commented on HDFS-11280: - As changes are only in hdfs-client ,Tests did not run on hadoop-hdfs project.hence jenkins did not catch here.. Actually I suggested earlier,atleast we should run parent project( cd to parent project) testcases..but it did not happen.. {noformat} unit test pre-reqs: cd /testptch/hadoop/hadoop-common-project/hadoop-common mvn -Dmaven.repo.local=/home/jenkins/yetus-m2/hadoop-trunk-patch-0 install -DskipTests -Pnative -Drequire.libwebhdfs -Drequire.snappy -Drequire.openssl -Drequire.fuse -Drequire.test.libhadoop -Pyarn-ui > /testptch/hadoop/patchprocess/maven-unit-prereq-hadoop-common-project_hadoop-common-install.txt 2>&1 cd /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-client mvn -Dmaven.repo.local=/home/jenkins/yetus-m2/hadoop-trunk-patch-0 -Ptest-patch -Pparallel-tests -P!shelltest -Pnative -Drequire.libwebhdfs -Drequire.snappy -Drequire.openssl -Drequire.fuse -Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/hadoop/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt 2>&1 Elapsed: 1m 1s hadoop-hdfs-client in the patch passed. {noformat} > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796571#comment-15796571 ] Hanisha Koneru commented on HDFS-11280: --- Verified that the following unit tests are passing now. - TestWebHDFSXAttr - TestWebHDFS - TestWebHdfsTokens - TestWebHdfsWithRestCsrfPreventionFilter - TestAuditLogs > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796417#comment-15796417 ] Hadoop QA commented on HDFS-11280: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 14s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 1 new + 62 unchanged - 0 fixed = 63 total (was 62) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 25m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-11280 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845438/HDFS-11280.for.2.8.and.beyond.5.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux aba63058763e 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / ebdd2e0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/18011/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18011/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18011/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project:
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796369#comment-15796369 ] Zheng Shao commented on HDFS-11280: --- The Test Failures were caused by the calling of "conn.getInputStream()" in the case of HTTP Redirect. Since HTTP Redirect will be slow and won't cause the ephemeral port problem, I will skip that change for now. I've uploaded a new patch that passes all the earlier failed tests. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.5.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796310#comment-15796310 ] Hadoop QA commented on HDFS-11280: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HDFS-11280 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-11280 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12845434/HDFS-11280.for.2.7.and.below.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18009/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.7.and.below.patch, HDFS-11280.for.2.8.and.beyond.2.patch, > HDFS-11280.for.2.8.and.beyond.3.patch, HDFS-11280.for.2.8.and.beyond.4.patch, > HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795967#comment-15795967 ] Haohui Mai commented on HDFS-11280: --- Any ideas why Jenkins is still happy? > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795189#comment-15795189 ] Hudson commented on HDFS-11280: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11062 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11062/]) Revert "HDFS-11280. Allow WebHDFS to reuse HTTP connections to NN. (brahma: rev b31e1951e044b2c6f6e88a007a8c175941ddd674) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794341#comment-15794341 ] Brahma Reddy Battula commented on HDFS-11280: - [~zshao] thanks for your reply and confirmation.. I will revert this patch today.. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794319#comment-15794319 ] Zheng Shao commented on HDFS-11280: --- Sorry guys, I think we need to revert this patch. My initial understanding is that the HTTP Keep-Alive is breaking some assumptions in the WebHDFS code. [~wheat9] can you help revert this? I will take a second look at the breaking Tests. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794199#comment-15794199 ] John Zhuge commented on HDFS-11280: --- Same here [~cheersyang]. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794115#comment-15794115 ] Weiwei Yang commented on HDFS-11280: Hi [~jzhuge] I did not try all failed tests, but at least this patch is breaking {{TestWebHDFS#testCreateWithNoDN}}, I could get this test pass if I revert the changes this patch has made. Please check. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15793588#comment-15793588 ] John Zhuge commented on HDFS-11280: --- These unit test failures may be caused by this patch: TestWebHDFSXAttr, TestWebHDFS, TestWebHdfsTokens, TestWebHdfsWithRestCsrfPreventionFilter, and TestAuditLogs. After reverting the trunk to "165d01a YARN-5931. Document timeout interfaces CLI and REST APIs (Contributed by Rohith Sharma K S via Daniel Templeton)", these unit tests all passed. https://builds.apache.org/job/PreCommit-HADOOP-Build/11338/artifact/patchprocess/patch-unit-root.txt > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15792232#comment-15792232 ] Brahma Reddy Battula commented on HDFS-11280: - *There are so many testcases are failing after this commit..Following is the stacktrace.* will you look into this..? {noformat} java.io.IOException: localhost:52693: Server returned HTTP response code: 403 for URL: http://localhost:52693/webhdfs/v1/srcdat/three/1925354346738733427?op=OPEN=bob=4096=0 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1926) at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1921) at java.security.AccessController.doPrivileged(Native Method) at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1920) at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1490) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:664) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.connect(WebHdfsFileSystem.java:1997) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:740) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:584) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:615) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:611) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.read(WebHdfsFileSystem.java:1946) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.read(WebHdfsFileSystem.java:1804) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.read(WebHdfsFileSystem.java:1799) at java.io.FilterInputStream.read(FilterInputStream.java:83) at org.apache.hadoop.hdfs.server.namenode.TestAuditLogs.testAuditWebHdfsDenied(TestAuditLogs.java:249) {noformat} *FYR* https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/273/testReport/junit/ > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: >
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15789216#comment-15789216 ] Haohui Mai commented on HDFS-11280: --- Commit to trunk, branch-2, branch-2.8 and branch-2.7. Thanks [~zshao] for the contributions. > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2 > > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11280) Allow WebHDFS to reuse HTTP connections to NN
[ https://issues.apache.org/jira/browse/HDFS-11280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15789107#comment-15789107 ] Hudson commented on HDFS-11280: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11060 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11060/]) HDFS-11280. Allow WebHDFS to reuse HTTP connections to NN. Contributed (wheat9: rev b811a1c14d00ab236158ab75fad1fe41364045a4) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java > Allow WebHDFS to reuse HTTP connections to NN > - > > Key: HDFS-11280 > URL: https://issues.apache.org/jira/browse/HDFS-11280 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.3, 2.6.5, 3.0.0-alpha1 >Reporter: Zheng Shao >Assignee: Zheng Shao > Attachments: HDFS-11280.for.2.7.and.below.patch, > HDFS-11280.for.2.8.and.beyond.2.patch, HDFS-11280.for.2.8.and.beyond.3.patch, > HDFS-11280.for.2.8.and.beyond.4.patch, HDFS-11280.for.2.8.and.beyond.patch > > > WebHDFSClient calls "conn.disconnect()", which disconnects from the NameNode. > When we use webhdfs as the source in distcp, this used up all ephemeral > ports on the client side since all closed connections continue to occupy the > port with TIME_WAIT status for some time. > According to http://tinyurl.com/java7-http-keepalive, we should call > conn.getInputStream().close() instead to make sure the connection is kept > alive. This will get rid of the ephemeral port problem. > Manual steps used to verify the bug fix: > 1. Build original hadoop jar. > 2. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows a big number (100s). > 3. Build hadoop jar with this diff. > 4. Try out distcp from webhdfs as source, and "netstat -n | grep TIME_WAIT | > grep -c 50070" on the local machine shows 0. > 5. The explanation: distcp's client side does a lot of directory scanning, > which would create and close a lot of connections to the namenode HTTP port. > Reference: > 2.7 and below: > https://github.com/apache/hadoop/blob/branch-2.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L743 > 2.8 and above: > https://github.com/apache/hadoop/blob/branch-2.8/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java#L898 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org