[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
[ https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469912#comment-16469912 ] Ajay Kumar commented on HADOOP-15250:
-
Attached a patch for the 3.1 branch.

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc, net
> Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihomed cluster with split DNS and rDNS lookup of localhost returning a non-routable IP address
> Reporter: Greg Senia
> Assignee: Ajay Kumar
> Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HADOOP-15250-branch-3.1.patch, HADOOP-15250.00.patch, HADOOP-15250.01.patch, HADOOP-15250.02.patch, HADOOP-15250.patch
>
> We run our Hadoop clusters with two networks attached to each node. These networks are: a server network that is firewalled with firewalld, allowing inbound traffic only for SSH and services such as Knox, HiveServer2, the HTTP YARN RM/ATS, and the MR History Server; and a cluster network on the second network interface, which uses jumbo frames, is open with no restrictions, and allows all cluster traffic to flow between nodes.
>
> To resolve DNS within the Hadoop cluster we use DNS views via BIND, so if traffic originates from nodes on the cluster network we return the internal DNS record for the nodes. This all works fine with the multi-homing features added in Hadoop 2.x.
> Some logic around views:
> a. The internal view is used by cluster machines when performing lookups, so hosts on the cluster network should get answers from the internal view in DNS.
> b. The external view is used by non-local-cluster machines when performing lookups, so hosts not on the cluster network should get answers from the external view in DNS.
>
> So this brings me to our problem. We created firewall rules to allow inbound traffic from each cluster's server network so that distcp could occur. But we noticed almost immediately that when YARN attempted to talk to the remote cluster, it bound outgoing traffic to the cluster network interface, which IS NOT routable. After researching the code, we noticed the following in NetUtils.java and Client.java.
> Basically, Client.java takes the hostname and attempts to bind to whatever address that hostname resolves to. This is not valid in a multi-homed network with one routable interface and one non-routable interface. After reading through the java.net.Socket documentation, it is valid to perform socket.bind(null), which lets the OS routing table and DNS send the traffic out the correct interface. I will also attach the network traces and a test patch for the 2.7.x and 3.x code bases. I have this test fix running in my Hadoop test cluster.
> Client.java:
>
> {code:java}
> /*
>  * Bind the socket to the host specified in the principal name of the
>  * client, to ensure Server matching address of the client connection
>  * to host name in principal passed.
>  */
> InetSocketAddress bindAddr = null;
> if (ticket != null && ticket.hasKerberosCredentials()) {
>   KerberosInfo krbInfo =
>       remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>   if (krbInfo != null) {
>     String principal = ticket.getUserName();
>     String host = SecurityUtil.getHostFromPrincipal(principal);
>     // If host name is a valid local address then bind socket to it
>     InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>     if (localAddr != null) {
>       this.socket.setReuseAddress(true);
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Binding " + principal + " to " + localAddr);
>       }
>       bindAddr = new InetSocketAddress(localAddr, 0);
>     }
>   }
> }
> {code}
>
> So in my Hadoop 2.7.x cluster I made the following changes, and traffic flows out the correct interfaces:
>
> {code}
> diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> @@ -305,6 +305,9 @@
> public static final String IP
> {code}
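For context on the socket.bind(null) claim above, here is a minimal, self-contained sketch (the host and port are placeholders, not from the patch) showing that binding to null defers the local-address choice to the OS routing table:

{code:java}
import java.net.InetSocketAddress;
import java.net.Socket;

public class WildcardBindDemo {
  public static void main(String[] args) throws Exception {
    try (Socket socket = new Socket()) {
      // bind(null) asks the OS to pick an ephemeral port and a valid
      // local address, so the routing table (not an rDNS lookup of the
      // local hostname) decides which interface the connection uses.
      socket.bind(null);
      socket.connect(new InetSocketAddress("example.org", 443), 10000);
      System.out.println("Bound to: " + socket.getLocalSocketAddress());
    }
  }
}
{code}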
[jira] [Updated] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
[ https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-15250:
-
Attachment: HADOOP-15250-branch-3.1.patch

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc, net
> Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihomed cluster with split DNS and rDNS lookup of localhost returning a non-routable IP address
> Reporter: Greg Senia
> Assignee: Ajay Kumar
> Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HADOOP-15250-branch-3.1.patch, HADOOP-15250.00.patch, HADOOP-15250.01.patch, HADOOP-15250.02.patch, HADOOP-15250.patch
>
> We run our Hadoop clusters with two networks attached to each node. These networks are: a server network that is firewalled with firewalld, allowing inbound traffic only for SSH and services such as Knox, HiveServer2, the HTTP YARN RM/ATS, and the MR History Server; and a cluster network on the second network interface, which uses jumbo frames, is open with no restrictions, and allows all cluster traffic to flow between nodes.
>
> To resolve DNS within the Hadoop cluster we use DNS views via BIND, so if traffic originates from nodes on the cluster network we return the internal DNS record for the nodes. This all works fine with the multi-homing features added in Hadoop 2.x.
> Some logic around views:
> a. The internal view is used by cluster machines when performing lookups, so hosts on the cluster network should get answers from the internal view in DNS.
> b. The external view is used by non-local-cluster machines when performing lookups, so hosts not on the cluster network should get answers from the external view in DNS.
>
> So this brings me to our problem. We created firewall rules to allow inbound traffic from each cluster's server network so that distcp could occur. But we noticed almost immediately that when YARN attempted to talk to the remote cluster, it bound outgoing traffic to the cluster network interface, which IS NOT routable. After researching the code, we noticed the following in NetUtils.java and Client.java.
> Basically, Client.java takes the hostname and attempts to bind to whatever address that hostname resolves to. This is not valid in a multi-homed network with one routable interface and one non-routable interface. After reading through the java.net.Socket documentation, it is valid to perform socket.bind(null), which lets the OS routing table and DNS send the traffic out the correct interface. I will also attach the network traces and a test patch for the 2.7.x and 3.x code bases. I have this test fix running in my Hadoop test cluster.
> Client.java:
>
> {code:java}
> /*
>  * Bind the socket to the host specified in the principal name of the
>  * client, to ensure Server matching address of the client connection
>  * to host name in principal passed.
>  */
> InetSocketAddress bindAddr = null;
> if (ticket != null && ticket.hasKerberosCredentials()) {
>   KerberosInfo krbInfo =
>       remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>   if (krbInfo != null) {
>     String principal = ticket.getUserName();
>     String host = SecurityUtil.getHostFromPrincipal(principal);
>     // If host name is a valid local address then bind socket to it
>     InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>     if (localAddr != null) {
>       this.socket.setReuseAddress(true);
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Binding " + principal + " to " + localAddr);
>       }
>       bindAddr = new InetSocketAddress(localAddr, 0);
>     }
>   }
> }
> {code}
>
> So in my Hadoop 2.7.x cluster I made the following changes, and traffic flows out the correct interfaces:
>
> {code}
> diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> @@ -305,6 +305,9 @@
> public static final String IPC_CLIENT_FALLBACK_TO_SIMPLE_AUTH_ALLOWED_KEY
> {code}
[jira] [Created] (HADOOP-15453) hadoop fs -count -v report "-count: Illegal option -v"
zhoutai.zt created HADOOP-15453:
---
Summary: hadoop fs -count -v report "-count: Illegal option -v"
Key: HADOOP-15453
URL: https://issues.apache.org/jira/browse/HADOOP-15453
Project: Hadoop Common
Issue Type: Bug
Components: fs
Affects Versions: 2.7.2
Reporter: zhoutai.zt

{code}
[hadoop@Hadoop1 bin]$ ./hadoop fs -count -q -h -v SparkHis*
-count: Illegal option -v
{code}
Reading the source code, I can't find the -v option:
{code:java}
private static final String OPTION_QUOTA = "q";
private static final String OPTION_HUMAN = "h";

public static final String NAME = "count";
public static final String USAGE =
    "[-" + OPTION_QUOTA + "] [-" + OPTION_HUMAN + "] <path> ...";
{code}
But the documentation describes the -v option:
[http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/FileSystemShell.html#count]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
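The -v (print a header line) option was added to the count command in a later release; as a rough sketch, following the naming style of the snippet above, this is how it could plug into the command's option parsing. The OPTION_HEADER name and the processOptions body are illustrative assumptions, not the actual Hadoop source.

{code:java}
// Fragment in the style of org.apache.hadoop.fs.shell.Count.
import java.util.LinkedList;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.shell.CommandFormat;

private static final String OPTION_QUOTA = "q";
private static final String OPTION_HUMAN = "h";
private static final String OPTION_HEADER = "v";   // the missing option

public static final String USAGE =
    "[-" + OPTION_QUOTA + "] [-" + OPTION_HUMAN + "] [-" + OPTION_HEADER
        + "] <path> ...";

@Override
protected void processOptions(LinkedList<String> args) {
  // Accept -v in addition to -q and -h, instead of rejecting it.
  CommandFormat cf = new CommandFormat(1, Integer.MAX_VALUE,
      OPTION_QUOTA, OPTION_HUMAN, OPTION_HEADER);
  cf.parse(args);
  if (cf.getOpt(OPTION_HEADER)) {
    // Print the column header row before the per-path counts.
    out.println(ContentSummary.getHeader(cf.getOpt(OPTION_QUOTA)));
  }
}
{code}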
[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469857#comment-16469857 ] genericqa commented on HADOOP-14444:
-
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 35 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 | mvninstall | 26m 46s | trunk passed |
| +1 | compile | 29m 43s | trunk passed |
| +1 | checkstyle | 3m 17s | trunk passed |
| +1 | mvnsite | 4m 0s | trunk passed |
| +1 | shadedclient | 10m 47s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools |
| +1 | findbugs | 1m 42s | trunk passed |
| +1 | javadoc | 3m 4s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 21s | the patch passed |
| +1 | compile | 32m 59s | the patch passed |
| +1 | javac | 32m 59s | the patch passed |
| -0 | checkstyle | 3m 39s | root: The patch generated 8 new + 7 unchanged - 0 fixed = 15 total (was 7) |
| +1 | mvnsite | 5m 54s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | shelldocs | 0m 29s | There were no new shelldocs issues. |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | xml | 0m 7s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 6s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools |
| +1 | findbugs | 3m 9s | the patch passed |
| +1 | javadoc | 4m 0s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 29s | hadoop-project in the patch passed. |
| +1 | unit | 9m 54s | hadoop-common in the patch passed. |
| -1 | unit | 68m 20s | hadoop-ftp in the patch failed. |
| -1 | unit | 117m 34s | hadoop-tools in the patch failed
[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector
[ https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469554#comment-16469554 ] Hudson commented on HADOOP-15356:
-
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14152 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14152/])
HADOOP-15356. Make HTTP timeout configurable in ADLS connector. (mackrorysd: rev 1cfe7506f7e9aff808af4ec0e57639130a6d0f35)
* (edit) hadoop-tools/hadoop-azure-datalake/src/test/java/org/apache/hadoop/fs/adl/live/AdlStorageConfiguration.java
* (edit) hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/fs/adl/AdlFileSystem.java
* (edit) hadoop-tools/hadoop-azure-datalake/src/site/markdown/troubleshooting_adl.md
* (add) hadoop-tools/hadoop-azure-datalake/src/test/java/org/apache/hadoop/fs/adl/live/TestAdlSdkConfiguration.java
* (edit) hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* (edit) hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/fs/adl/AdlConfKeys.java

> Make HTTP timeout configurable in ADLS Connector
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/adl
> Reporter: Atul Sikaria
> Assignee: Atul Sikaria
> Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>
> Currently the HTTP timeout for connections to ADLS is not configurable in Hadoop. This patch makes the timeout configurable via a core-site config setting. It also bumps the ADLS SDK version to 2.2.8, which has a default timeout of 60 seconds; any tuning of that setting can now be done in Hadoop through core-site.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
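As a usage sketch: the adl.http.timeout key is the one discussed on this issue, while the millisecond unit and the placeholder account URI are assumptions here, not confirmed by the patch text.

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class AdlTimeoutExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Raise the per-request HTTP timeout to 120s (assumed unit: ms).
    // Per the discussion elsewhere on this issue, values of zero or
    // less are ignored and the SDK default (60s in ADLS SDK 2.2.8)
    // applies instead.
    conf.setInt("adl.http.timeout", 120000);
    FileSystem fs = FileSystem.get(
        URI.create("adl://youraccount.azuredatalakestore.net/"), conf);
    System.out.println("ADL filesystem initialized: " + fs.getUri());
  }
}
{code}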
[jira] [Updated] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector
[ https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15356:
---
Resolution: Fixed
Status: Resolved (was: Patch Available)

> Make HTTP timeout configurable in ADLS Connector
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/adl
> Reporter: Atul Sikaria
> Assignee: Atul Sikaria
> Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>
> Currently the HTTP timeout for connections to ADLS is not configurable in Hadoop. This patch makes the timeout configurable via a core-site config setting. It also bumps the ADLS SDK version to 2.2.8, which has a default timeout of 60 seconds; any tuning of that setting can now be done in Hadoop through core-site.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
[ https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469414#comment-16469414 ] Arpit Agarwal commented on HADOOP-15250:
-
[~billie.rinaldi] please go ahead, thanks.

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc, net
> Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihomed cluster with split DNS and rDNS lookup of localhost returning a non-routable IP address
> Reporter: Greg Senia
> Assignee: Ajay Kumar
> Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HADOOP-15250.00.patch, HADOOP-15250.01.patch, HADOOP-15250.02.patch, HADOOP-15250.patch
>
> We run our Hadoop clusters with two networks attached to each node. These networks are: a server network that is firewalled with firewalld, allowing inbound traffic only for SSH and services such as Knox, HiveServer2, the HTTP YARN RM/ATS, and the MR History Server; and a cluster network on the second network interface, which uses jumbo frames, is open with no restrictions, and allows all cluster traffic to flow between nodes.
>
> To resolve DNS within the Hadoop cluster we use DNS views via BIND, so if traffic originates from nodes on the cluster network we return the internal DNS record for the nodes. This all works fine with the multi-homing features added in Hadoop 2.x.
> Some logic around views:
> a. The internal view is used by cluster machines when performing lookups, so hosts on the cluster network should get answers from the internal view in DNS.
> b. The external view is used by non-local-cluster machines when performing lookups, so hosts not on the cluster network should get answers from the external view in DNS.
>
> So this brings me to our problem. We created firewall rules to allow inbound traffic from each cluster's server network so that distcp could occur. But we noticed almost immediately that when YARN attempted to talk to the remote cluster, it bound outgoing traffic to the cluster network interface, which IS NOT routable. After researching the code, we noticed the following in NetUtils.java and Client.java.
> Basically, Client.java takes the hostname and attempts to bind to whatever address that hostname resolves to. This is not valid in a multi-homed network with one routable interface and one non-routable interface. After reading through the java.net.Socket documentation, it is valid to perform socket.bind(null), which lets the OS routing table and DNS send the traffic out the correct interface. I will also attach the network traces and a test patch for the 2.7.x and 3.x code bases. I have this test fix running in my Hadoop test cluster.
> Client.java:
>
> {code:java}
> /*
>  * Bind the socket to the host specified in the principal name of the
>  * client, to ensure Server matching address of the client connection
>  * to host name in principal passed.
>  */
> InetSocketAddress bindAddr = null;
> if (ticket != null && ticket.hasKerberosCredentials()) {
>   KerberosInfo krbInfo =
>       remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>   if (krbInfo != null) {
>     String principal = ticket.getUserName();
>     String host = SecurityUtil.getHostFromPrincipal(principal);
>     // If host name is a valid local address then bind socket to it
>     InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>     if (localAddr != null) {
>       this.socket.setReuseAddress(true);
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Binding " + principal + " to " + localAddr);
>       }
>       bindAddr = new InetSocketAddress(localAddr, 0);
>     }
>   }
> }
> {code}
>
> So in my Hadoop 2.7.x cluster I made the following changes, and traffic flows out the correct interfaces:
>
> {code}
> diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> @@ -305,6 +305,9 @@
> public static final String IPC_CLIENT_FALLBA
> {code}
[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector
[ https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469413#comment-16469413 ] Aaron Fabbri commented on HADOOP-15356:
---
Thanks for adding the extra validation of the timeout config value. +1, ship it.

> Make HTTP timeout configurable in ADLS Connector
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/adl
> Reporter: Atul Sikaria
> Assignee: Atul Sikaria
> Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>
> Currently the HTTP timeout for connections to ADLS is not configurable in Hadoop. This patch makes the timeout configurable via a core-site config setting. It also bumps the ADLS SDK version to 2.2.8, which has a default timeout of 60 seconds; any tuning of that setting can now be done in Hadoop through core-site.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukas Waldmann updated HADOOP-14444:
-
Attachment: (was: HADOOP-14444.15.patch)

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-14444
> URL: https://issues.apache.org/jira/browse/HADOOP-14444
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.8.0
> Reporter: Lukas Waldmann
> Assignee: Lukas Waldmann
> Priority: Major
> Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, HADOOP-14444.patch
>
> The current implementations of the FTP and SFTP filesystems have severe limitations and performance issues when dealing with a high number of files. My patch solves those issues and integrates both filesystems in such a way that most of the core functionality is common to both, thereby simplifying maintainability.
> The core features:
> * Support for HTTP/SOCKS proxies
> * Support for passive FTP
> * Support for explicit FTPS (SSL/TLS)
> * Support for connection pooling - a new connection is not created for every single command but is reused from the pool. For a huge number of files this shows an order-of-magnitude performance improvement over non-pooled connections.
> * Caching of directory trees. For FTP you always need to list the whole directory whenever you ask for information about a particular file. Again, for a huge number of files this shows an order-of-magnitude performance improvement over non-cached connections.
> * Support for keep-alive (NOOP) messages to avoid connection drops
> * Support for Unix-style or regexp wildcard globs - useful for listing particular files across a whole directory tree
> * Support for re-establishing broken FTP data transfers - which can happen surprisingly often
> * Support for SFTP private keys (including pass phrase)
> * Support for keeping passwords, private keys, and pass phrases in JCEKS key stores

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukas Waldmann updated HADOOP-14444:
-
Attachment: HADOOP-14444.15.patch

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-14444
> URL: https://issues.apache.org/jira/browse/HADOOP-14444
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.8.0
> Reporter: Lukas Waldmann
> Assignee: Lukas Waldmann
> Priority: Major
> Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, HADOOP-14444.patch
>
> The current implementations of the FTP and SFTP filesystems have severe limitations and performance issues when dealing with a high number of files. My patch solves those issues and integrates both filesystems in such a way that most of the core functionality is common to both, thereby simplifying maintainability.
> The core features:
> * Support for HTTP/SOCKS proxies
> * Support for passive FTP
> * Support for explicit FTPS (SSL/TLS)
> * Support for connection pooling - a new connection is not created for every single command but is reused from the pool. For a huge number of files this shows an order-of-magnitude performance improvement over non-pooled connections.
> * Caching of directory trees. For FTP you always need to list the whole directory whenever you ask for information about a particular file. Again, for a huge number of files this shows an order-of-magnitude performance improvement over non-cached connections.
> * Support for keep-alive (NOOP) messages to avoid connection drops
> * Support for Unix-style or regexp wildcard globs - useful for listing particular files across a whole directory tree
> * Support for re-establishing broken FTP data transfers - which can happen surprisingly often
> * Support for SFTP private keys (including pass phrase)
> * Support for keeping passwords, private keys, and pass phrases in JCEKS key stores

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
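A minimal sketch of the connection-pooling and keep-alive ideas from the feature list above, using Apache Commons Net's FTPClient. The pool structure and all names here are illustrative assumptions, not the classes from the attached patches.

{code:java}
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.apache.commons.net.ftp.FTPClient;

public class FtpClientPool {
  private final BlockingQueue<FTPClient> pool = new LinkedBlockingQueue<>();
  private final String host;
  private final String user;
  private final String password;

  public FtpClientPool(String host, String user, String password) {
    this.host = host;
    this.user = user;
    this.password = password;
  }

  // Reuse a pooled connection instead of opening a new one per command.
  public FTPClient borrow() throws IOException {
    FTPClient client = pool.poll();
    if (client != null) {
      try {
        if (client.sendNoOp()) {   // NOOP doubles as the keep-alive probe
          return client;
        }
      } catch (IOException stale) {
        // Connection was dropped; fall through and reconnect.
      }
    }
    client = new FTPClient();
    client.connect(host);
    client.login(user, password);
    client.enterLocalPassiveMode();  // passive FTP, as in the feature list
    return client;
  }

  public void release(FTPClient client) {
    if (!pool.offer(client)) {
      try { client.disconnect(); } catch (IOException ignored) { }
    }
  }
}
{code}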
[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector
[ https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469354#comment-16469354 ] genericqa commented on HADOOP-15356:
-
(/) +1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 45s | Maven dependency ordering for branch |
| +1 | mvninstall | 27m 5s | trunk passed |
| +1 | compile | 29m 40s | trunk passed |
| +1 | checkstyle | 3m 12s | trunk passed |
| +1 | mvnsite | 2m 17s | trunk passed |
| +1 | shadedclient | 16m 32s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 15s | trunk passed |
| +1 | javadoc | 1m 24s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 6s | the patch passed |
| +1 | compile | 27m 50s | the patch passed |
| +1 | javac | 27m 50s | the patch passed |
| +1 | checkstyle | 3m 8s | the patch passed |
| +1 | mvnsite | 1m 37s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 10m 26s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 26s | the patch passed |
| +1 | javadoc | 1m 23s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 38s | hadoop-common in the patch passed. |
| +1 | unit | 0m 55s | hadoop-azure-datalake in the patch passed. |
| +1 | asflicense | 0m 39s | The patch does not generate ASF License warnings. |
| | | 140m 6s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15356 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922654/HADOOP-15356.005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle |
| uname | Linux 4cfcb516161c 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 343b51d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1
[jira] [Commented] (HADOOP-15441) After HADOOP-14987, encryption zone operations print unnecessary INFO logs
[ https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469312#comment-16469312 ] Gabor Bota commented on HADOOP-15441:
-
Sure [~shahrs87], I've updated both.

> After HADOOP-14987, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
> It looks like after HADOOP-14987, any encryption zone operation prints extra INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15441) After HADOOP-14987, encryption zone operations print unnecessary INFO logs
[ https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HADOOP-15441:
-
Description:
It looks like after HADOOP-14987, any encryption zone operation prints extra INFO log messages as follows:
{code:java}
$ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: kms://ht...@hadoop3-1.example.com:16000/kms created.
{code}
It might make sense to make it a DEBUG message instead.

was:
It looks like after HADOOP-14445, any encryption zone operation prints extra INFO log messages as follows:
{code:java}
$ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: kms://ht...@hadoop3-1.example.com:16000/kms created.
{code}
It might make sense to make it a DEBUG message instead.

> After HADOOP-14987, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
> It looks like after HADOOP-14987, any encryption zone operation prints extra INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15441) After HADOOP-14987, encryption zone operations print unnecessary INFO logs
[ https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HADOOP-15441:
-
Summary: After HADOOP-14987, encryption zone operations print unnecessary INFO logs (was: After HADOOP-14445, encryption zone operations print unnecessary INFO logs)

> After HADOOP-14987, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
> It looks like after HADOOP-14445, any encryption zone operation prints extra INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
[ https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469306#comment-16469306 ] Billie Rinaldi commented on HADOOP-15250:
-
I'd be interested in getting this patch committed to branch-3.1 as well. Would anyone have an objection to this?

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc, net
> Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihomed cluster with split DNS and rDNS lookup of localhost returning a non-routable IP address
> Reporter: Greg Senia
> Assignee: Ajay Kumar
> Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HADOOP-15250.00.patch, HADOOP-15250.01.patch, HADOOP-15250.02.patch, HADOOP-15250.patch
>
> We run our Hadoop clusters with two networks attached to each node. These networks are: a server network that is firewalled with firewalld, allowing inbound traffic only for SSH and services such as Knox, HiveServer2, the HTTP YARN RM/ATS, and the MR History Server; and a cluster network on the second network interface, which uses jumbo frames, is open with no restrictions, and allows all cluster traffic to flow between nodes.
>
> To resolve DNS within the Hadoop cluster we use DNS views via BIND, so if traffic originates from nodes on the cluster network we return the internal DNS record for the nodes. This all works fine with the multi-homing features added in Hadoop 2.x.
> Some logic around views:
> a. The internal view is used by cluster machines when performing lookups, so hosts on the cluster network should get answers from the internal view in DNS.
> b. The external view is used by non-local-cluster machines when performing lookups, so hosts not on the cluster network should get answers from the external view in DNS.
>
> So this brings me to our problem. We created firewall rules to allow inbound traffic from each cluster's server network so that distcp could occur. But we noticed almost immediately that when YARN attempted to talk to the remote cluster, it bound outgoing traffic to the cluster network interface, which IS NOT routable. After researching the code, we noticed the following in NetUtils.java and Client.java.
> Basically, Client.java takes the hostname and attempts to bind to whatever address that hostname resolves to. This is not valid in a multi-homed network with one routable interface and one non-routable interface. After reading through the java.net.Socket documentation, it is valid to perform socket.bind(null), which lets the OS routing table and DNS send the traffic out the correct interface. I will also attach the network traces and a test patch for the 2.7.x and 3.x code bases. I have this test fix running in my Hadoop test cluster.
> Client.java:
>
> {code:java}
> /*
>  * Bind the socket to the host specified in the principal name of the
>  * client, to ensure Server matching address of the client connection
>  * to host name in principal passed.
>  */
> InetSocketAddress bindAddr = null;
> if (ticket != null && ticket.hasKerberosCredentials()) {
>   KerberosInfo krbInfo =
>       remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>   if (krbInfo != null) {
>     String principal = ticket.getUserName();
>     String host = SecurityUtil.getHostFromPrincipal(principal);
>     // If host name is a valid local address then bind socket to it
>     InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>     if (localAddr != null) {
>       this.socket.setReuseAddress(true);
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Binding " + principal + " to " + localAddr);
>       }
>       bindAddr = new InetSocketAddress(localAddr, 0);
>     }
>   }
> }
> {code}
>
> So in my Hadoop 2.7.x cluster I made the following changes, and traffic flows out the correct interfaces:
>
> {code}
> diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> {code}
[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469288#comment-16469288 ] genericqa commented on HADOOP-14444:
-
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 35 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 | mvninstall | 25m 15s | trunk passed |
| +1 | compile | 29m 50s | trunk passed |
| +1 | checkstyle | 3m 20s | trunk passed |
| +1 | mvnsite | 4m 7s | trunk passed |
| +1 | shadedclient | 10m 50s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools |
| +1 | findbugs | 1m 42s | trunk passed |
| +1 | javadoc | 2m 55s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 23s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 41s | the patch passed |
| +1 | compile | 26m 55s | the patch passed |
| +1 | javac | 26m 55s | the patch passed |
| -0 | checkstyle | 3m 18s | root: The patch generated 6 new + 7 unchanged - 0 fixed = 13 total (was 7) |
| +1 | mvnsite | 4m 59s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | shelldocs | 0m 28s | There were no new shelldocs issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 8s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 10m 29s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools |
| +1 | findbugs | 2m 47s | the patch passed |
| +1 | javadoc | 3m 25s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 27s | hadoop-project in the patch passed. |
| +1 | unit | 10m 5s | hadoop-common in the patch passed. |
| -1 | unit | 10m 15s | hadoop-ftp in the patch failed. |
| -1 | unit | 60m 26s | hadoop-tools in the patch failed
[jira] [Commented] (HADOOP-15450) Avoid fsync storm triggered by DiskChecker and handle disk full situation
[ https://issues.apache.org/jira/browse/HADOOP-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469246#comment-16469246 ] genericqa commented on HADOOP-15450:
-
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 26s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test file. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 25m 40s | trunk passed |
| +1 | compile | 28m 29s | trunk passed |
| +1 | checkstyle | 0m 49s | trunk passed |
| +1 | mvnsite | 1m 7s | trunk passed |
| +1 | shadedclient | 12m 3s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 31s | trunk passed |
| +1 | javadoc | 0m 54s | trunk passed |
|| || || || Patch Compile Tests ||
| -1 | mvninstall | 0m 37s | hadoop-common in the patch failed. |
| -1 | compile | 1m 0s | root in the patch failed. |
| -1 | javac | 1m 0s | root in the patch failed. |
| +1 | checkstyle | 0m 38s | the patch passed |
| -1 | mvnsite | 0m 40s | hadoop-common in the patch failed. |
| -1 | whitespace | 0m 0s | The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| -1 | shadedclient | 0m 39s | patch has errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 25s | hadoop-common in the patch failed. |
| +1 | javadoc | 0m 44s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 0m 44s | hadoop-common in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 75m 52s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15450 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922548/HADOOP-15450.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 6672782dca89 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 343b51d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| mvninstall | https://builds.apache.org/job/PreCommit-HADOOP-Build/14604/artifact/out/patch-mvninstall-hadoop-common-project_hadoop-common.txt |
| compile | https://builds.apache.org/job/PreCommit-HADOOP-Build/14604/artifact/out/patch-compile-root.txt |
| javac | https://builds.apache.org/job/PreCommit-HADOOP-Build/14604/artifact/out/patch-compile-root.txt |
| mvnsite | https://builds.apache.org/job/PreCommit-HADOOP-Build/14604/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-com
[jira] [Commented] (HADOOP-15441) After HADOOP-14445, encryption zone operations print unnecessary INFO logs
[ https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469124#comment-16469124 ] Xiaoyu Yao commented on HADOOP-15441:
-
Thanks [~shahrs87], +1 for the v1 patch.

> After HADOOP-14445, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
> It looks like after HADOOP-14445, any encryption zone operation prints extra INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector
[ https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468997#comment-16468997 ] Sean Mackrory commented on HADOOP-15356:
-
{quote}what if someone sets adl.http.timeout to 0?{quote}
My guess is absolutely nothing :) Immediate timeouts? I just noticed we're silent when not setting the timeout; we should probably log that the config is invalid in some way and not getting used. So I'm adding a LOG.info for that, and doing the same for values of zero or less.

> Make HTTP timeout configurable in ADLS Connector
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/adl
> Reporter: Atul Sikaria
> Assignee: Atul Sikaria
> Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>
> Currently the HTTP timeout for connections to ADLS is not configurable in Hadoop. This patch makes the timeout configurable via a core-site config setting. It also bumps the ADLS SDK version to 2.2.8, which has a default timeout of 60 seconds; any tuning of that setting can now be done in Hadoop through core-site.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector
[ https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Mackrory updated HADOOP-15356: --- Attachment: HADOOP-15356.005.patch > Make HTTP timeout configurable in ADLS Connector > > > Key: HADOOP-15356 > URL: https://issues.apache.org/jira/browse/HADOOP-15356 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/adl >Reporter: Atul Sikaria >Assignee: Atul Sikaria >Priority: Major > Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, > HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch > > > Currently the HTTP timeout for connections to ADLS is not configurable > in Hadoop. This patch makes the timeouts configurable through a > core-site config setting. Also, bump the ADLS SDK version to 2.2.8, which has > a default value of 60 seconds - any optimizations to that setting can now be > done in Hadoop through core-site. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15450) Avoid fsync storm triggered by DiskChecker and handle disk full situation
[ https://issues.apache.org/jira/browse/HADOOP-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HADOOP-15450: --- Status: Patch Available (was: Open) > Avoid fsync storm triggered by DiskChecker and handle disk full situation > - > > Key: HADOOP-15450 > URL: https://issues.apache.org/jira/browse/HADOOP-15450 > Project: Hadoop Common > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Arpit Agarwal >Priority: Blocker > Attachments: HADOOP-15450.01.patch > > > Fix disk checker issues reported by [~kihwal] in HADOOP-13738. There are non-HDFS users of DiskChecker who use it proactively, not just on failures. This was fine before, but now it incurs heavy I/O due to the introduction of fsync() in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
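For readers unfamiliar with why a proactive disk check became expensive: a check that writes a probe file and forces it to the device pays a full fsync every time it runs. The sketch below illustrates that pattern only; it is not the actual DiskChecker code, and the file naming is made up:
{code:java}
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class DiskProbeSketch {
  /** Write a small file and force it to disk; the sync is the costly part. */
  static void probe(File dir) throws IOException {
    File probe = new File(dir, ".probe-" + System.nanoTime());
    try (FileOutputStream out = new FileOutputStream(probe)) {
      out.write(1);
      // fsync: blocks until the byte is on the device. Called proactively
      // by many non-HDFS users, this is what adds up to an fsync storm.
      out.getFD().sync();
    } finally {
      if (!probe.delete()) {
        throw new IOException("Could not delete probe file " + probe);
      }
    }
  }
}
{code}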
[jira] [Comment Edited] (HADOOP-15441) After HADOOP-14445, encryption zone operations print unnecessary INFO logs
[ https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468974#comment-16468974 ] Rushabh S Shah edited comment on HADOOP-15441 at 5/9/18 3:41 PM: - Thanks [~gabor.bota] for the updated patch. {quote}Since we are using slf4j, we don't really need {{if(LOG.isDebugEnabled())}}. {quote} Thanks [~xyao] for pointing this out. I am happy with v1 of the patch. If everyone is ok, I will go ahead and commit v1 of the patch this evening. Gabor: can you please update the description and title of the jira? was (Author: shahrs87): Thanks [~gabor.bota] for the updated patch. {quote}Since we are using slf4j, we don't really need {{if(LOG.isDebugEnabled())}}. {quote} Thanks [~xyao] for pointing this out. I am happy with v1 of the patch. If everyone is ok, I will go ahead and commit v1 of the patch this evening. Gabor: can you please update the description and title of the patch? > After HADOOP-14445, encryption zone operations print unnecessary INFO logs > -- > > Key: HADOOP-15441 > URL: https://issues.apache.org/jira/browse/HADOOP-15441 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Minor > Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch > > > It looks like after HADOOP-14445, any encryption zone operation prints extra > INFO log messages as follows: > {code:java} > $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/ > 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: > https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: > kms://ht...@hadoop3-1.example.com:16000/kms created. > {code} > It might make sense to make it a DEBUG message instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15441) After HADOOP-14445, encryption zone operations print unnecessary INFO logs
[ https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468974#comment-16468974 ] Rushabh S Shah commented on HADOOP-15441: - Thanks [~gabor.bota] for the updated patch. {quote}Since we are using slf4j, we don't really need {{if(LOG.isDebugEnabled())}}. {quote} Thanks [~xyao] for pointing this out. I am happy with v1 of the patch. If everyone is ok, I will go ahead and commit v1 of the patch this evening. Gabor: can you please update the description and title of the patch? > After HADOOP-14445, encryption zone operations print unnecessary INFO logs > -- > > Key: HADOOP-15441 > URL: https://issues.apache.org/jira/browse/HADOOP-15441 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Minor > Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch > > > It looks like after HADOOP-14445, any encryption zone operation prints extra > INFO log messages as follows: > {code:java} > $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/ > 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: > https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: > kms://ht...@hadoop3-1.example.com:16000/kms created. > {code} > It might make sense to make it a DEBUG message instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-12631) Can't use Windows network drive
[ https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468964#comment-16468964 ] Jimmy Lu edited comment on HADOOP-12631 at 5/9/18 3:32 PM: --- Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call {{java.net.URI#normalize}}, which is known to be buggy and destroys UNC paths. was (Author: yuhta): Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call {{ java.net.URI#normalize}}, which is known to be buggy and destroys UNC paths. > Can't use Windows network drive > --- > > Key: HADOOP-12631 > URL: https://issues.apache.org/jira/browse/HADOOP-12631 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.6.0 >Reporter: tian >Priority: Minor > > When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be > normalised to a single slash. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-12631) Can't use Windows network drive
[ https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468964#comment-16468964 ] Jimmy Lu edited comment on HADOOP-12631 at 5/9/18 3:31 PM: --- Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call {{ java.net.URI#normalize}}, which is known to be buggy and destroys UNC paths. was (Author: yuhta): Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call {{URI#normalize}}, which is known to be buggy and destroys UNC paths. > Can't use Windows network drive > --- > > Key: HADOOP-12631 > URL: https://issues.apache.org/jira/browse/HADOOP-12631 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.6.0 >Reporter: tian >Priority: Minor > > When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be > normalised to a single slash. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12631) Can't use Windows network drive
[ https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468964#comment-16468964 ] Jimmy Lu commented on HADOOP-12631: --- Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call {{URI#normalize}}, which is known to be buggy and destroys UNC paths. > Can't use Windows network drive > --- > > Key: HADOOP-12631 > URL: https://issues.apache.org/jira/browse/HADOOP-12631 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.6.0 >Reporter: tian >Priority: Minor > > When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be > normalised to a single slash. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-12631) Can't use Windows network drive
[ https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468956#comment-16468956 ] Jimmy Lu edited comment on HADOOP-12631 at 5/9/18 3:17 PM: --- Hi [~cnauroth], I need to access the same URI from both Windows and Linux, which requires a leading double slash in the path. Is there a way to work around this bug? was (Author: yuhta): Hi [~cnauroth], I need to access the same URI from both Windows and Linux, which requires a leading double slash in the path. Is there a way to work around this issue? > Can't use Windows network drive > --- > > Key: HADOOP-12631 > URL: https://issues.apache.org/jira/browse/HADOOP-12631 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.6.0 >Reporter: tian >Priority: Minor > > When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be > normalised to a single slash. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12631) Can't use Windows network drive
[ https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468956#comment-16468956 ] Jimmy Lu commented on HADOOP-12631: --- Hi [~cnauroth], I need to access the same URI from both Windows and Linux, which requires a leading double slash in the path. Is there a way to work around this issue? > Can't use Windows network drive > --- > > Key: HADOOP-12631 > URL: https://issues.apache.org/jira/browse/HADOOP-12631 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.6.0 >Reporter: tian >Priority: Minor > > When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be > normalised to a single slash. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
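The normalize() behavior Jimmy Lu refers to is easy to reproduce. Below is a minimal, self-contained demonstration; the share name is taken from the issue description, and the exact output depends on the JDK version in use:
{code:java}
import java.net.URI;

public class UncNormalizeDemo {
  public static void main(String[] args) {
    // A UNC path like \\SIMPLESHARE\MyHome$ maps to a file: URI whose
    // path component begins with a double slash.
    URI unc = URI.create("file:////SIMPLESHARE/MyHome$");
    System.out.println(unc.getPath());             // //SIMPLESHARE/MyHome$
    // normalize() collapses the leading double slash on affected JDKs,
    // so the UNC host component is lost.
    System.out.println(unc.normalize().getPath()); // /SIMPLESHARE/MyHome$
  }
}
{code}
Hence the suggestion above to avoid calling normalize() in Path#initialize rather than trying to repair the result afterwards.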
[jira] [Commented] (HADOOP-15441) After HADOOP-14445, encryption zone operations print unnecessary INFO logs
[ https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468944#comment-16468944 ] Xiaoyu Yao commented on HADOOP-15441: - Thanks [~gabor.bota] for the fix. The patch v2 looks good to me. Since we are using slf4j, we don't really need {{if(LOG.isDebugEnabled())}}. > After HADOOP-14445, encryption zone operations print unnecessary INFO logs > -- > > Key: HADOOP-15441 > URL: https://issues.apache.org/jira/browse/HADOOP-15441 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Minor > Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch > > > It looks like after HADOOP-14445, any encryption zone operation prints extra > INFO log messages as follows: > {code:java} > $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/ > 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: > https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: > kms://ht...@hadoop3-1.example.com:16000/kms created. > {code} > It might make sense to make it a DEBUG message instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
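To illustrate the slf4j point: with parameterized logging the message is only formatted when the level is enabled, so the guard adds nothing unless the arguments themselves are expensive to compute. A minimal sketch (class and method names are illustrative, not the patch's code):
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Slf4jGuardSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(Slf4jGuardSketch.class);

  void onProviderCreated(String kmsUrl, String tokenService) {
    // No if (LOG.isDebugEnabled()) needed: the {} placeholders are only
    // substituted when DEBUG is actually enabled, so a disabled logger
    // costs almost nothing here.
    LOG.debug("KMSClientProvider for KMS url: {} delegation token service:"
        + " {} created.", kmsUrl, tokenService);
  }
}
{code}
The guard is still worth keeping when building an argument requires real work (say, serializing a large object), since slf4j cannot skip that computation.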
[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukas Waldmann updated HADOOP-14444: Attachment: (was: HADOOP-14444.15.patch) > New implementation of ftp and sftp filesystems > -- > > Key: HADOOP-14444 > URL: https://issues.apache.org/jira/browse/HADOOP-14444 > Project: Hadoop Common > Issue Type: New Feature > Components: fs >Affects Versions: 2.8.0 >Reporter: Lukas Waldmann >Assignee: Lukas Waldmann >Priority: Major > Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, > HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, > HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, > HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, > HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, > HADOOP-14444.patch > > > The current implementations of the FTP and SFTP filesystems have severe > limitations and performance issues when dealing with a high number of files. > My patch solves those issues and integrates both filesystems in such a way > that most of the core functionality is common to both, simplifying > maintainability. > The core features: > * Support for HTTP/SOCKS proxies > * Support for passive FTP > * Support for explicit FTPS (SSL/TLS) > * Support for connection pooling - a new connection is not created for every > single command but is reused from the pool. > For a huge number of files this shows an order-of-magnitude performance > improvement over non-pooled connections. > * Caching of directory trees. For FTP you always need to list the whole > directory whenever you ask for information about a particular file. > Again, for a huge number of files this shows an order-of-magnitude > performance improvement over non-cached connections. > * Support for keep-alive (NOOP) messages to avoid connection drops > * Support for Unix-style or regexp wildcard globs - useful for listing > particular files across the whole directory tree > * Support for reestablishing broken FTP data transfers - this can happen > surprisingly often > * Support for SFTP private keys (including pass phrases) > * Support for keeping passwords, private keys and pass phrases in jceks > key stores -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
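As a usage illustration only - the scheme wiring and credential-store path below are hypothetical and depend on how the patch registers the filesystems - the new implementations would still be driven through the standard Hadoop FileSystem API:
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SftpListingExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Per the feature list above, passwords/keys can come from a jceks
    // credential store instead of plain config (store path is illustrative).
    conf.set("hadoop.security.credential.provider.path",
        "jceks://file/etc/hadoop/sftp.jceks");

    try (FileSystem fs = FileSystem.get(URI.create("sftp://user@host/"), conf)) {
      // One listing call; the patch's directory-tree cache should make
      // repeated metadata lookups cheap after this.
      for (FileStatus status : fs.listStatus(new Path("/data"))) {
        System.out.println(status.getPath() + " " + status.getLen());
      }
    }
  }
}
{code}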
[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukas Waldmann updated HADOOP-14444: Attachment: HADOOP-14444.15.patch > New implementation of ftp and sftp filesystems > -- > > Key: HADOOP-14444 > URL: https://issues.apache.org/jira/browse/HADOOP-14444 > Project: Hadoop Common > Issue Type: New Feature > Components: fs >Affects Versions: 2.8.0 >Reporter: Lukas Waldmann >Assignee: Lukas Waldmann >Priority: Major > Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, > HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, > HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, > HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, > HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, > HADOOP-14444.patch > > > The current implementations of the FTP and SFTP filesystems have severe > limitations and performance issues when dealing with a high number of files. > My patch solves those issues and integrates both filesystems in such a way > that most of the core functionality is common to both, simplifying > maintainability. > The core features: > * Support for HTTP/SOCKS proxies > * Support for passive FTP > * Support for explicit FTPS (SSL/TLS) > * Support for connection pooling - a new connection is not created for every > single command but is reused from the pool. > For a huge number of files this shows an order-of-magnitude performance > improvement over non-pooled connections. > * Caching of directory trees. For FTP you always need to list the whole > directory whenever you ask for information about a particular file. > Again, for a huge number of files this shows an order-of-magnitude > performance improvement over non-cached connections. > * Support for keep-alive (NOOP) messages to avoid connection drops > * Support for Unix-style or regexp wildcard globs - useful for listing > particular files across the whole directory tree > * Support for reestablishing broken FTP data transfers - this can happen > surprisingly often > * Support for SFTP private keys (including pass phrases) > * Support for keeping passwords, private keys and pass phrases in jceks > key stores -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468826#comment-16468826 ] genericqa commented on HADOOP-14444: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 35 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 58s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 52s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-tools {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 47s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 53s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 19s{color} | {color:orange} root: The patch generated 6 new + 7 unchanged - 0 fixed = 13 total (was 7) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 0s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 30s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 6s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-tools {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 28s{color} | {color:green} hadoop-project in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 56s{color} | {color:red} hadoop-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m 10s{color} | {color:red} hadoop-ftp in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 41s{color} | {color:red} hadoop-tools in the patch failed. {color
[jira] [Updated] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data
[ https://issues.apache.org/jira/browse/HADOOP-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jincheng updated HADOOP-15452: -- Description: As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) {code} Does anyone have ideas? was: As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) {code} Does anyone have ideas? 
> Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch > map output data > - > > Key: HADOOP-15452 > URL: https://issues.apache.org/jira/browse/HADOOP-15452 > Project: Hadoop Common > Issue Type: Bug > Components: common >Affects Versions: 2.6.0 > Environment: hadoop-2.6.0-cdh5.4.4 >Reporter: jincheng >Priority: Major > > As the title says, when reducer tasks fetch data from mapper tasks, they hit an > ArrayIndexOutOfBoundsException; here is the stack trace: > {code:java} > org.apache.hadoop.mapred.YarnChild: Exception running child : > org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in > shuffle in fetcher#1 > at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) > Caused by: java.lang.ArrayIndexOutOfBoundsException > at > org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) > at > org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) > at > org.apache.ha
[jira] [Updated] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data
[ https://issues.apache.org/jira/browse/HADOOP-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jincheng updated HADOOP-15452: -- Description: As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) {code} Does anyone have ideas? was: As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) {code} Does anyone have ideas? 
> Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch > map output data > - > > Key: HADOOP-15452 > URL: https://issues.apache.org/jira/browse/HADOOP-15452 > Project: Hadoop Common > Issue Type: Bug > Components: common >Affects Versions: 2.6.0 > Environment: hadoop-2.6.0-cdh5.4.4 >Reporter: jincheng >Priority: Major > > As the title says, when reducer tasks fetch data from mapper tasks, they hit an > ArrayIndexOutOfBoundsException; here is the stack trace: > {code:java} > Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in > shuffle in fetcher#1 > at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) > Caused by: java.lang.ArrayIndexOutOfBoundsException > at > org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) > > at > org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) > > at > org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) > > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) > at > org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) > > at > org.ap
[jira] [Updated] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data
[ https://issues.apache.org/jira/browse/HADOOP-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jincheng updated HADOOP-15452: -- Description: As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) {code} Does anyone have ideas? was: As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) {code} Does anyone have ideas? 
> Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch > map output data > - > > Key: HADOOP-15452 > URL: https://issues.apache.org/jira/browse/HADOOP-15452 > Project: Hadoop Common > Issue Type: Bug > Components: common >Affects Versions: 2.6.0 > Environment: hadoop-2.6.0-cdh5.4.4 >Reporter: jincheng >Priority: Major > > As the title says, when reducer tasks fetch data from mapper tasks, they hit an > ArrayIndexOutOfBoundsException; here is the stack trace: > {code:java} > Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in > shuffle in fetcher#1 > at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at > java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) > Caused by: java.lang.ArrayIndexOutOfBoundsException > at > org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) > > at > org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) > > at > org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) > > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) > at > org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) > > at > org.apac
[jira] [Updated] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data
[ https://issues.apache.org/jira/browse/HADOOP-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jincheng updated HADOOP-15452: -- Description: As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) {code} Does anyone have ideas? was: As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} // code placeholder {code} Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) Does anyone have ideas? 
> Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch > map output data > - > > Key: HADOOP-15452 > URL: https://issues.apache.org/jira/browse/HADOOP-15452 > Project: Hadoop Common > Issue Type: Bug > Components: common >Affects Versions: 2.6.0 > Environment: hadoop-2.6.0-cdh5.4.4 >Reporter: jincheng >Priority: Major > > As the title says, when reducer tasks fetch data from mapper tasks, they hit an > ArrayIndexOutOfBoundsException; here is the stack trace: > {code:java} > Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in > shuffle in fetcher#1 at > org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at > java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) > Caused by: java.lang.ArrayIndexOutOfBoundsException > at > org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) > at > org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) > at > org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at > org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) > at > org.apach
[jira] [Created] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data
jincheng created HADOOP-15452: - Summary: Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data Key: HADOOP-15452 URL: https://issues.apache.org/jira/browse/HADOOP-15452 Project: Hadoop Common Issue Type: Bug Components: common Affects Versions: 2.6.0 Environment: hadoop-2.6.0-cdh5.4.4 Reporter: jincheng As the title says, when reducer tasks fetch data from mapper tasks, they hit an ArrayIndexOutOfBoundsException; here is the stack trace: {code:java} // code placeholder {code} Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111) at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202) Does anyone have ideas? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
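For anyone chasing this: the frame at SnappyDecompressor.setInput(SnappyDecompressor.java:111) corresponds to argument validation on the (buffer, offset, length) triple. The following is a self-contained sketch of that style of check - paraphrased, not the actual Hadoop source - showing how a truncated or corrupted map output segment can trip it:
{code:java}
/**
 * Sketch of the bounds-check pattern used by Hadoop's decompressors
 * (paraphrased, not the actual SnappyDecompressor source).
 */
public class SetInputBoundsSketch {
  public static void setInput(byte[] b, int off, int len) {
    if (b == null) {
      throw new NullPointerException();
    }
    // Rejects any (off, len) pair that does not fit inside the buffer.
    if (off < 0 || len < 0 || off > b.length - len) {
      throw new ArrayIndexOutOfBoundsException();
    }
    // a real decompressor would now stage the compressed bytes
  }

  public static void main(String[] args) {
    byte[] buf = new byte[16];
    setInput(buf, 0, 16); // fine: exactly fills the buffer
    try {
      setInput(buf, 8, 16); // offset + length overruns the buffer
    } catch (ArrayIndexOutOfBoundsException expected) {
      System.out.println("rejected: off=8 len=16 for a 16-byte buffer");
    }
  }
}
{code}
In the shuffle path this typically means the decompressor was handed a length that exceeds what remains in the in-memory buffer, which points at corrupt or truncated intermediate data rather than at snappy itself.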
[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukas Waldmann updated HADOOP-14444: Attachment: HADOOP-14444.15.patch > New implementation of ftp and sftp filesystems > -- > > Key: HADOOP-14444 > URL: https://issues.apache.org/jira/browse/HADOOP-14444 > Project: Hadoop Common > Issue Type: New Feature > Components: fs >Affects Versions: 2.8.0 >Reporter: Lukas Waldmann >Assignee: Lukas Waldmann >Priority: Major > Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, > HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, > HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, > HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, > HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, > HADOOP-14444.patch > > > The current implementations of the FTP and SFTP filesystems have severe > limitations and performance issues when dealing with a high number of files. > My patch solves those issues and integrates both filesystems in such a way > that most of the core functionality is common to both, simplifying > maintainability. > The core features: > * Support for HTTP/SOCKS proxies > * Support for passive FTP > * Support for explicit FTPS (SSL/TLS) > * Support for connection pooling - a new connection is not created for every > single command but is reused from the pool. > For a huge number of files this shows an order-of-magnitude performance > improvement over non-pooled connections. > * Caching of directory trees. For FTP you always need to list the whole > directory whenever you ask for information about a particular file. > Again, for a huge number of files this shows an order-of-magnitude > performance improvement over non-cached connections. > * Support for keep-alive (NOOP) messages to avoid connection drops > * Support for Unix-style or regexp wildcard globs - useful for listing > particular files across the whole directory tree > * Support for reestablishing broken FTP data transfers - this can happen > surprisingly often > * Support for SFTP private keys (including pass phrases) > * Support for keeping passwords, private keys and pass phrases in jceks > key stores -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems
[ https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukas Waldmann updated HADOOP-14444: Attachment: (was: HADOOP-14444.15.patch) > New implementation of ftp and sftp filesystems > -- > > Key: HADOOP-14444 > URL: https://issues.apache.org/jira/browse/HADOOP-14444 > Project: Hadoop Common > Issue Type: New Feature > Components: fs >Affects Versions: 2.8.0 >Reporter: Lukas Waldmann >Assignee: Lukas Waldmann >Priority: Major > Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, > HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, > HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, > HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, > HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, > HADOOP-14444.patch > > > The current implementations of the FTP and SFTP filesystems have severe > limitations and performance issues when dealing with a high number of files. > My patch solves those issues and integrates both filesystems in such a way > that most of the core functionality is common to both, simplifying > maintainability. > The core features: > * Support for HTTP/SOCKS proxies > * Support for passive FTP > * Support for explicit FTPS (SSL/TLS) > * Support for connection pooling - a new connection is not created for every > single command but is reused from the pool. > For a huge number of files this shows an order-of-magnitude performance > improvement over non-pooled connections. > * Caching of directory trees. For FTP you always need to list the whole > directory whenever you ask for information about a particular file. > Again, for a huge number of files this shows an order-of-magnitude > performance improvement over non-cached connections. > * Support for keep-alive (NOOP) messages to avoid connection drops > * Support for Unix-style or regexp wildcard globs - useful for listing > particular files across the whole directory tree > * Support for reestablishing broken FTP data transfers - this can happen > surprisingly often > * Support for SFTP private keys (including pass phrases) > * Support for keeping passwords, private keys and pass phrases in jceks > key stores -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org