[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

2018-05-09 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469912#comment-16469912
 ] 

Ajay Kumar commented on HADOOP-15250:
-

Attached a patch for 3.1 branch.

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, net
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihome cluster with split DNS and rDNS lookup of 
> localhost returning non-routable IPAddr
>Reporter: Greg Senia
>Assignee: Ajay Kumar
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HADOOP-15250-branch-3.1.patch, HADOOP-15250.00.patch, 
> HADOOP-15250.01.patch, HADOOP-15250.02.patch, HADOOP-15250.patch
>
>
> We run our Hadoop clusters with two networks attached to each node. These 
> networks are as follows: a server network that is firewalled with firewalld, 
> allowing inbound traffic only for SSH and services such as Knox, HiveServer2, 
> the HTTP YARN RM/ATS, and the MR History Server; and a second network, the 
> cluster network, on the second network interface, which uses jumbo frames, 
> has no restrictions, and allows all cluster traffic to flow between nodes. 
>  
> To resolve DNS within the Hadoop cluster we use DNS views via BIND, so if 
> traffic originates from nodes on the cluster network we return the internal 
> DNS record for the nodes. This all works fine with the multi-homing features 
> added in Hadoop 2.x.
>  Some logic around views:
> a. The internal view is used by cluster machines when performing lookups, so 
> hosts on the cluster network should get answers from the internal view in DNS.
> b. The external view is used by non-local-cluster machines when performing 
> lookups, so hosts not on the cluster network should get answers from the 
> external view in DNS.
>  
> So this brings me to our problem. We created some firewall rules to allow 
> inbound traffic from each cluster's server network so that distcp could 
> occur. But we noticed almost immediately that when YARN attempted to talk to 
> the remote cluster, it bound outgoing traffic to the cluster network 
> interface, which is NOT routable. After researching the code we noticed the 
> following in NetUtils.java and Client.java.
> Basically, Client.java takes the hostname and attempts to bind to whatever 
> that hostname resolves to. This is not valid in a multi-homed network with 
> one routable interface and one non-routable interface. After reading through 
> the java.net.Socket documentation, it is valid to perform socket.bind(null), 
> which lets the OS routing table and DNS send the traffic out the correct 
> interface. I will also attach the network traces and a test patch for the 
> 2.7.x and 3.x code bases. I have this test fix below in my Hadoop test 
> cluster.
> Client.java:
> {code:java}
> /*
>  * Bind the socket to the host specified in the principal name of the
>  * client, to ensure Server matching address of the client connection
>  * to host name in principal passed.
>  */
> InetSocketAddress bindAddr = null;
> if (ticket != null && ticket.hasKerberosCredentials()) {
>   KerberosInfo krbInfo =
>       remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>   if (krbInfo != null) {
>     String principal = ticket.getUserName();
>     String host = SecurityUtil.getHostFromPrincipal(principal);
>     // If host name is a valid local address then bind socket to it
>     InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>     if (localAddr != null) {
>       this.socket.setReuseAddress(true);
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Binding " + principal + " to " + localAddr);
>       }
>       bindAddr = new InetSocketAddress(localAddr, 0);
>     }
>   }
> }
> {code}
>  
> So in my Hadoop 2.7.x cluster I made the following changes, and traffic now 
> flows out of the correct interfaces:
>  
> diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> @@ -305,6 +305,9 @@
>    public static final String  IP

[jira] [Updated] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

2018-05-09 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-15250:

Attachment: HADOOP-15250-branch-3.1.patch

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, net
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihome cluster with split DNS and rDNS lookup of 
> localhost returning non-routable IPAddr
>Reporter: Greg Senia
>Assignee: Ajay Kumar
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HADOOP-15250-branch-3.1.patch, HADOOP-15250.00.patch, 
> HADOOP-15250.01.patch, HADOOP-15250.02.patch, HADOOP-15250.patch
>

[jira] [Created] (HADOOP-15453) hadoop fs -count -v report "-count: Illegal option -v"

2018-05-09 Thread zhoutai.zt (JIRA)
zhoutai.zt created HADOOP-15453:
---

 Summary: hadoop fs -count -v report "-count: Illegal option -v"
 Key: HADOOP-15453
 URL: https://issues.apache.org/jira/browse/HADOOP-15453
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.2
Reporter: zhoutai.zt


[hadoop@Hadoop1 bin]$ ./hadoop fs -count -q -h -v SparkHis*
-count: Illegal option -v

 

Reading the source code, I can't find the -v option:
{code:java}
private static final String OPTION_QUOTA = "q";
private static final String OPTION_HUMAN = "h";
public static final String NAME = "count";
public static final String USAGE =
    "[-" + OPTION_QUOTA + "] [-" + OPTION_HUMAN + "] <path> ...";
{code}
But the documentation describes a -v option:

[http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/FileSystemShell.html#count]
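For comparison, here is a hedged sketch of how releases that do support -v 
declare the option; this is reconstructed from memory of the 2.8+ code, so 
names and details may differ:
{code:java}
// Sketch, not verbatim Count.java from 2.8+: the header option behind -v.
private static final String OPTION_QUOTA = "q";
private static final String OPTION_HUMAN = "h";
private static final String OPTION_HEADER = "v"; // print a header line first

public static final String NAME = "count";
public static final String USAGE =
    "[-" + OPTION_QUOTA + "] [-" + OPTION_HUMAN + "] [-" + OPTION_HEADER
        + "] <path> ...";
{code}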

 






[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469857#comment-16469857
 ] 

genericqa commented on HADOOP-14444:


| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 35 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 | mvninstall | 26m 46s | trunk passed |
| +1 | compile | 29m 43s | trunk passed |
| +1 | checkstyle | 3m 17s | trunk passed |
| +1 | mvnsite | 4m 0s | trunk passed |
| +1 | shadedclient | 10m 47s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools |
| +1 | findbugs | 1m 42s | trunk passed |
| +1 | javadoc | 3m 4s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 21s | the patch passed |
| +1 | compile | 32m 59s | the patch passed |
| +1 | javac | 32m 59s | the patch passed |
| -0 | checkstyle | 3m 39s | root: The patch generated 8 new + 7 unchanged - 0 fixed = 15 total (was 7) |
| +1 | mvnsite | 5m 54s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | shelldocs | 0m 29s | There were no new shelldocs issues. |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | xml | 0m 7s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 6s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools |
| +1 | findbugs | 3m 9s | the patch passed |
| +1 | javadoc | 4m 0s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 29s | hadoop-project in the patch passed. |
| +1 | unit | 9m 54s | hadoop-common in the patch passed. |
| -1 | unit | 68m 20s | hadoop-ftp in the patch failed. |
| -1 | unit | 117m 34s | hadoop-tools in the patch failed

[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector

2018-05-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469554#comment-16469554
 ] 

Hudson commented on HADOOP-15356:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14152 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14152/])
HADOOP-15356. Make HTTP timeout configurable in ADLS connector. (mackrorysd: 
rev 1cfe7506f7e9aff808af4ec0e57639130a6d0f35)
* (edit) 
hadoop-tools/hadoop-azure-datalake/src/test/java/org/apache/hadoop/fs/adl/live/AdlStorageConfiguration.java
* (edit) 
hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/fs/adl/AdlFileSystem.java
* (edit) 
hadoop-tools/hadoop-azure-datalake/src/site/markdown/troubleshooting_adl.md
* (add) 
hadoop-tools/hadoop-azure-datalake/src/test/java/org/apache/hadoop/fs/adl/live/TestAdlSdkConfiguration.java
* (edit) hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* (edit) 
hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/fs/adl/AdlConfKeys.java


> Make HTTP timeout configurable in ADLS Connector
> 
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
>Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, 
> HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>
>
> Currently the HTTP timeout for connections to ADLS is not configurable in 
> Hadoop. This patch makes the timeouts configurable through a core-site 
> config setting. It also bumps the ADLS SDK version to 2.2.8, which has a 
> default timeout of 60 seconds; any tuning of that setting can now be done in 
> Hadoop through core-site.
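>
> A hedged example of supplying the new setting from client code; the key name 
> adl.http.timeout comes from the discussion on this issue, while the 
> millisecond unit and the account URI are assumptions:
> {code:java}
> import java.net.URI;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
>
> public class AdlTimeoutExample {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     // Assumed unit: milliseconds. Per the discussion, values <= 0 are
>     // ignored and the SDK default of 60 seconds stays in effect.
>     conf.setInt("adl.http.timeout", 30000);
>     FileSystem fs = FileSystem.get(
>         URI.create("adl://example.azuredatalakestore.net/"), conf);
>     System.out.println("FileSystem for " + fs.getUri() + " created.");
>   }
> }
> {code}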






[jira] [Updated] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector

2018-05-09 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-15356:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Make HTTP timeout configurable in ADLS Connector
> 
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
>Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, 
> HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>






[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

2018-05-09 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469414#comment-16469414
 ] 

Arpit Agarwal commented on HADOOP-15250:


[~billie.rinaldi] please go ahead, thanks.

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, net
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihome cluster with split DNS and rDNS lookup of 
> localhost returning non-routable IPAddr
>Reporter: Greg Senia
>Assignee: Ajay Kumar
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HADOOP-15250.00.patch, HADOOP-15250.01.patch, 
> HADOOP-15250.02.patch, HADOOP-15250.patch
>

[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector

2018-05-09 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469413#comment-16469413
 ] 

Aaron Fabbri commented on HADOOP-15356:
---

Thanks for adding the extra validation of the timeout config value.  +1, ship 
it.

> Make HTTP timeout configurable in ADLS Connector
> 
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
>Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, 
> HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>






[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread Lukas Waldmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Waldmann updated HADOOP-14444:

Attachment: (was: HADOOP-14444.15.patch)

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-14444
> URL: https://issues.apache.org/jira/browse/HADOOP-14444
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, 
> HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, 
> HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, 
> HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, 
> HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, 
> HADOOP-14444.patch
>
>
> The current implementations of the FTP and SFTP filesystems have severe 
> limitations and performance issues when dealing with a high number of files. 
> My patch solves those issues and integrates both filesystems in such a way 
> that most of the core functionality is common to both, which simplifies 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support for connection pooling: a new connection is not created for every 
> single command but is reused from the pool (see the sketch after this list). 
>  For a huge number of files this shows an order-of-magnitude performance 
> improvement over non-pooled connections.
>  * Caching of directory trees. For FTP you always need to list the whole 
> directory whenever you ask for information about a particular file. 
>  Again, for a huge number of files this shows an order-of-magnitude 
> performance improvement over non-cached connections.
>  * Support for keep-alive (NOOP) messages to avoid connection drops
>  * Support for Unix-style or regexp wildcard globs, useful for listing 
> particular files across a whole directory tree
>  * Support for re-establishing broken FTP data transfers, which can happen 
> surprisingly often
>  * Support for SFTP private keys (including passphrases)
>  * Support for keeping passwords, private keys, and passphrases in JCEKS key 
> stores
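>
> A rough, generic sketch of the pooling idea; the class and method names here 
> are hypothetical, not the patch's actual code:
> {code:java}
> import java.util.concurrent.ConcurrentLinkedQueue;
> import java.util.function.Supplier;
>
> // Illustrative only: reuse idle connections instead of opening a new one
> // per FTP command, which is where the order-of-magnitude win comes from.
> class ConnectionPool<C> {
>   private final ConcurrentLinkedQueue<C> idle = new ConcurrentLinkedQueue<>();
>
>   C acquire(Supplier<C> factory) {
>     C conn = idle.poll();      // reuse an idle connection if one exists
>     return conn != null ? conn : factory.get();
>   }
>
>   void release(C conn) {
>     idle.offer(conn);          // return to the pool instead of closing
>   }
> }
> {code}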






[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread Lukas Waldmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Waldmann updated HADOOP-14444:

Attachment: HADOOP-14444.15.patch

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-14444
> URL: https://issues.apache.org/jira/browse/HADOOP-14444
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, 
> HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, 
> HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, 
> HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, 
> HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, 
> HADOOP-14444.patch
>






[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector

2018-05-09 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469354#comment-16469354
 ] 

genericqa commented on HADOOP-15356:


| (/) +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 45s | Maven dependency ordering for branch |
| +1 | mvninstall | 27m 5s | trunk passed |
| +1 | compile | 29m 40s | trunk passed |
| +1 | checkstyle | 3m 12s | trunk passed |
| +1 | mvnsite | 2m 17s | trunk passed |
| +1 | shadedclient | 16m 32s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 15s | trunk passed |
| +1 | javadoc | 1m 24s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 6s | the patch passed |
| +1 | compile | 27m 50s | the patch passed |
| +1 | javac | 27m 50s | the patch passed |
| +1 | checkstyle | 3m 8s | the patch passed |
| +1 | mvnsite | 1m 37s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 10m 26s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 26s | the patch passed |
| +1 | javadoc | 1m 23s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 38s | hadoop-common in the patch passed. |
| +1 | unit | 0m 55s | hadoop-azure-datalake in the patch passed. |
| +1 | asflicense | 0m 39s | The patch does not generate ASF License warnings. |
| | | 140m 6s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15356 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922654/HADOOP-15356.005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle |
| uname | Linux 4cfcb516161c 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 343b51d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1

[jira] [Commented] (HADOOP-15441) After HADOOP-14987, encryption zone operations print unnecessary INFO logs

2018-05-09 Thread Gabor Bota (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469312#comment-16469312
 ] 

Gabor Bota commented on HADOOP-15441:
-

Sure [~shahrs87], I've updated both.

> After HADOOP-14987, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
>
> It looks like after HADOOP-14987, any encryption zone operation prints extra 
> INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: 
> https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: 
> kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.
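>
> A hedged sketch of the proposed change; the variable names are placeholders 
> and the actual log site in KMSClientProvider may differ:
> {code:java}
> // Illustrative only: downgrade the construction message from INFO to DEBUG
> // so routine encryption zone operations stay quiet by default.
> LOG.debug("KMSClientProvider for KMS url: {} delegation token service: {} created.",
>     kmsUrl, dtService);
> {code}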






[jira] [Updated] (HADOOP-15441) After HADOOP-14987, encryption zone operations print unnecessary INFO logs

2018-05-09 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-15441:

Description: 
It looks like after HADOOP-14987, any encryption zone operations prints extra 
INFO log messages as follows:
{code:java}
$ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: 
https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: 
kms://ht...@hadoop3-1.example.com:16000/kms created.
{code}

It might make sense to make it a DEBUG message instead.

  was:
It looks like after HADOOP-14445, any encryption zone operations prints extra 
INFO log messages as follows:
{code:java}
$ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: 
https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: 
kms://ht...@hadoop3-1.example.com:16000/kms created.
{code}

It might make sense to make it a DEBUG message instead.


> After HADOOP-14987, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>






[jira] [Updated] (HADOOP-15441) After HADOOP-14987, encryption zone operations print unnecessary INFO logs

2018-05-09 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-15441:

Summary: After HADOOP-14987, encryption zone operations print unnecessary 
INFO logs  (was: After HADOOP-14445, encryption zone operations print 
unnecessary INFO logs)

> After HADOOP-14987, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>






[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

2018-05-09 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469306#comment-16469306
 ] 

Billie Rinaldi commented on HADOOP-15250:
-

I'd be interested in getting this patch committed to branch-3.1 as well. Would 
anyone have an objection to this?

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, net
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihome cluster with split DNS and rDNS lookup of 
> localhost returning non-routable IPAddr
>Reporter: Greg Senia
>Assignee: Ajay Kumar
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HADOOP-15250.00.patch, HADOOP-15250.01.patch, 
> HADOOP-15250.02.patch, HADOOP-15250.patch
>

[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469288#comment-16469288
 ] 

genericqa commented on HADOOP-14444:


| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 35 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 | mvninstall | 25m 15s | trunk passed |
| +1 | compile | 29m 50s | trunk passed |
| +1 | checkstyle | 3m 20s | trunk passed |
| +1 | mvnsite | 4m 7s | trunk passed |
| +1 | shadedclient | 10m 50s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools |
| +1 | findbugs | 1m 42s | trunk passed |
| +1 | javadoc | 2m 55s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 23s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 41s | the patch passed |
| +1 | compile | 26m 55s | the patch passed |
| +1 | javac | 26m 55s | the patch passed |
| -0 | checkstyle | 3m 18s | root: The patch generated 6 new + 7 unchanged - 0 fixed = 13 total (was 7) |
| +1 | mvnsite | 4m 59s | the patch passed |
| +1 | shellcheck | 0m 0s | There were no new shellcheck issues. |
| +1 | shelldocs | 0m 28s | There were no new shelldocs issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 8s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 10m 29s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools |
| +1 | findbugs | 2m 47s | the patch passed |
| +1 | javadoc | 3m 25s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 27s | hadoop-project in the patch passed. |
| +1 | unit | 10m 5s | hadoop-common in the patch passed. |
| -1 | unit | 10m 15s | hadoop-ftp in the patch failed. |
| -1 | unit | 60m 26s | hadoop-tools in the patch failed

[jira] [Commented] (HADOOP-15450) Avoid fsync storm triggered by DiskChecker and handle disk full situation

2018-05-09 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469246#comment-16469246
 ] 

genericqa commented on HADOOP-15450:


| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 26s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 25m 40s | trunk passed |
| +1 | compile | 28m 29s | trunk passed |
| +1 | checkstyle | 0m 49s | trunk passed |
| +1 | mvnsite | 1m 7s | trunk passed |
| +1 | shadedclient | 12m 3s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 31s | trunk passed |
| +1 | javadoc | 0m 54s | trunk passed |
|| || || || Patch Compile Tests ||
| -1 | mvninstall | 0m 37s | hadoop-common in the patch failed. |
| -1 | compile | 1m 0s | root in the patch failed. |
| -1 | javac | 1m 0s | root in the patch failed. |
| +1 | checkstyle | 0m 38s | the patch passed |
| -1 | mvnsite | 0m 40s | hadoop-common in the patch failed. |
| -1 | whitespace | 0m 0s | The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| -1 | shadedclient | 0m 39s | patch has errors when building and testing our client artifacts. |
| -1 | findbugs | 0m 25s | hadoop-common in the patch failed. |
| +1 | javadoc | 0m 44s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 0m 44s | hadoop-common in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 75m 52s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15450 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922548/HADOOP-15450.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 6672782dca89 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 343b51d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| mvninstall | https://builds.apache.org/job/PreCommit-HADOOP-Build/14604/artifact/out/patch-mvninstall-hadoop-common-project_hadoop-common.txt |
| compile | https://builds.apache.org/job/PreCommit-HADOOP-Build/14604/artifact/out/patch-compile-root.txt |
| javac | https://builds.apache.org/job/PreCommit-HADOOP-Build/14604/artifact/out/patch-compile-root.txt |
| mvnsite | https://builds.apache.org/job/PreCommit-HADOOP-Build/14604/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-com

[jira] [Commented] (HADOOP-15441) After HADOOP-14445, encryption zone operations print unnecessary INFO logs

2018-05-09 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469124#comment-16469124
 ] 

Xiaoyu Yao commented on HADOOP-15441:
-

Thanks [~shahrs87], +1 for v1 patch. 

> After HADOOP-14445, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
>
> It looks like after HADOOP-14445, any encryption zone operations prints extra 
> INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: 
> https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: 
> kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.
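
The change itself is tiny; a hedged sketch of what the proposed INFO-to-DEBUG move could look like ({{kmsUrl}} and {{dtService}} are illustrative names, not necessarily the actual fields in KMSClientProvider):

{code:java}
// Before: emitted at INFO every time a KMSClientProvider is constructed.
LOG.info("KMSClientProvider for KMS url: {} delegation token service: {} created.",
    kmsUrl, dtService);

// After: same message, only visible when DEBUG logging is enabled.
LOG.debug("KMSClientProvider for KMS url: {} delegation token service: {} created.",
    kmsUrl, dtService);
{code}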



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector

2018-05-09 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468997#comment-16468997
 ] 

Sean Mackrory commented on HADOOP-15356:


{quote}what if someone sets adl.http.timeout to 0?{quote}

My guess is absolutely nothing :) Immediate timeouts? I just noticed we're 
silent when the timeout isn't being set; we should probably log that the 
config value is invalid and not being used. So I'm adding a LOG.info for 
that, and doing the same for values of zero or less.
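
A minimal sketch of that guard, assuming a Hadoop {{Configuration}} and an 
slf4j logger; the key name comes from the quoted question, while the class 
name and the "-1 means keep the SDK default" convention are made up for 
illustration:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AdlTimeoutGuard {
  private static final Logger LOG = LoggerFactory.getLogger(AdlTimeoutGuard.class);

  /** Returns the configured timeout in ms, or -1 to mean "keep the SDK default". */
  static int effectiveTimeout(Configuration conf) {
    int timeout = conf.getInt("adl.http.timeout", -1);
    if (timeout <= 0) {
      // Zero or negative values are not applied; say so instead of staying silent.
      LOG.info("Ignoring invalid adl.http.timeout value {}; keeping the SDK default.",
          timeout);
      return -1;
    }
    return timeout;
  }
}
{code}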

> Make HTTP timeout configurable in ADLS Connector
> 
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
>Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, 
> HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>
>
> Currently the HTTP timeout for connections to ADLS is not configurable in 
> Hadoop. This patch makes the timeout configurable via a core-site setting. 
> It also bumps the ADLS SDK version to 2.2.8, which has a default timeout of 
> 60 seconds; any tuning of that setting can now be done in Hadoop through 
> core-site.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector

2018-05-09 Thread Sean Mackrory (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-15356:
---
Attachment: HADOOP-15356.005.patch

> Make HTTP timeout configurable in ADLS Connector
> 
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
>Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch, 
> HADOOP-15356.003.patch, HADOOP-15356.004.patch, HADOOP-15356.005.patch
>
>
> Currently the HTTP timeout for connections to ADLS is not configurable in 
> Hadoop. This patch makes the timeout configurable via a core-site setting. 
> It also bumps the ADLS SDK version to 2.2.8, which has a default timeout of 
> 60 seconds; any tuning of that setting can now be done in Hadoop through 
> core-site.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15450) Avoid fsync storm triggered by DiskChecker and handle disk full situation

2018-05-09 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HADOOP-15450:
---
Status: Patch Available  (was: Open)

> Avoid fsync storm triggered by DiskChecker and handle disk full situation
> -
>
> Key: HADOOP-15450
> URL: https://issues.apache.org/jira/browse/HADOOP-15450
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Arpit Agarwal
>Priority: Blocker
> Attachments: HADOOP-15450.01.patch
>
>
> Fix disk checker issues reported by [~kihwal] in HADOOP-13738.
> There are non-HDFS users of DiskChecker who use it proactively, not just on 
> failures. This was fine before, but it now incurs heavy I/O due to the 
> introduction of fsync() in the code.
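
One way to damp such a storm is to remember when each directory was last 
verified and skip the expensive probe inside that window. A sketch only, 
assuming a fixed 60-second window; not necessarily what the attached patch 
does:

{code:java}
import java.io.File;
import java.util.concurrent.ConcurrentHashMap;

public class ThrottledDiskCheck {
  private static final long MIN_GAP_MS = 60_000L; // assumed throttle window
  private final ConcurrentHashMap<String, Long> lastCheck = new ConcurrentHashMap<>();

  /** Runs the expensive probe at most once per MIN_GAP_MS per directory. */
  boolean checkDir(File dir) {
    long now = System.currentTimeMillis();
    Long prev = lastCheck.get(dir.getPath());
    if (prev != null && now - prev < MIN_GAP_MS) {
      return true; // verified recently; skip the fsync-heavy probe
    }
    // Cheap checks stand in here for the real write-and-fsync probe.
    boolean ok = dir.isDirectory() && dir.canRead() && dir.canWrite();
    if (ok) {
      lastCheck.put(dir.getPath(), now);
    }
    return ok;
  }
}
{code}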



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15441) After HADOOP-14445, encryption zone operations print unnecessary INFO logs

2018-05-09 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468974#comment-16468974
 ] 

Rushabh S Shah edited comment on HADOOP-15441 at 5/9/18 3:41 PM:
-

Thanks [~gabor.bota] for the updated patch.
{quote}Since we are using slf4j, we don't really need 
{{if(LOG.isDebugEnabled())}}.
 {quote}
Thanks [~xyao] for pointing this out.
I am happy with v1 of the patch. If everyone is OK with it, I will go ahead 
and commit v1 of the patch this evening.
Gabor: can you please update the description and title of the JIRA?


was (Author: shahrs87):
Thanks [~gabor.bota] for the updated patch.
{quote}Since we are using slf4j, we don't really need 
{{if(LOG.isDebugEnabled())}}.
 {quote}
Thanks [~xyao] for pointing this out.
I am happy with v1 of the patch. If everyone is OK with it, I will go ahead 
and commit v1 of the patch this evening.
Gabor: can you please update the description and title of the patch?

> After HADOOP-14445, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
>
> It looks like after HADOOP-14445, any encryption zone operation prints extra 
> INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: 
> https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: 
> kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15441) After HADOOP-14445, encryption zone operations print unnecessary INFO logs

2018-05-09 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468974#comment-16468974
 ] 

Rushabh S Shah commented on HADOOP-15441:
-

Thanks [~gabor.bota] for the updated patch.
{quote}Since we are using slf4j, we don't really need 
{{if(LOG.isDebugEnabled())}}.
 {quote}
Thanks [~xyao] for pointing this out.
I am happy with v1 of the patch. If everyone is OK with it, I will go ahead 
and commit v1 of the patch this evening.
Gabor: can you please update the description and title of the patch?

> After HADOOP-14445, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
>
> It looks like after HADOOP-14445, any encryption zone operation prints extra 
> INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: 
> https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: 
> kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-12631) Can't use windows network drive

2018-05-09 Thread Jimmy Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468964#comment-16468964
 ] 

Jimmy Lu edited comment on HADOOP-12631 at 5/9/18 3:32 PM:
---

Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call 
{{java.net.URI#normalize}}, which is known to be buggy and destroys UNC paths.


was (Author: yuhta):
Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call 
{{ java.net.URI#normalize}}, which is known to be buggy and destroys UNC paths.

> Can't use windows network drive
> ---
>
> Key: HADOOP-12631
> URL: https://issues.apache.org/jira/browse/HADOOP-12631
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.6.0
>Reporter: tian
>Priority: Minor
>
> When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be 
> normalised to a single slash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-12631) Can't use windows network drive

2018-05-09 Thread Jimmy Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468964#comment-16468964
 ] 

Jimmy Lu edited comment on HADOOP-12631 at 5/9/18 3:31 PM:
---

Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call 
{{ java.net.URI#normalize}}, which is known to be buggy and destroys UNC paths.


was (Author: yuhta):
Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call 
{{URI#normalize}}, which is known to be buggy and destroys UNC paths.

> Can't use windows network drive
> ---
>
> Key: HADOOP-12631
> URL: https://issues.apache.org/jira/browse/HADOOP-12631
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.6.0
>Reporter: tian
>Priority: Minor
>
> When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be 
> normalised to a single slash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12631) Can't use windows network drive

2018-05-09 Thread Jimmy Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468964#comment-16468964
 ] 

Jimmy Lu commented on HADOOP-12631:
---

Hi [~cnauroth], in {{org.apache.hadoop.fs.Path#initialize}} you shouldn't call 
{{URI#normalize}}, which is known to be buggy and destroys UNC paths.
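
One flavor of the slash loss is reproducible with plain {{java.net.URI}}: 
when a path beginning with "//" is recomposed into a URI string, the first 
segment is reparsed as the authority, so the UNC-style double slash is gone 
regardless of what {{normalize()}} does afterwards. A minimal illustration, 
not Hadoop's exact code path in {{Path#initialize}}:

{code:java}
import java.net.URI;
import java.net.URISyntaxException;

public class UncPathDemo {
  public static void main(String[] args) throws URISyntaxException {
    // The UNC share \\SIMPLESHARE\MyHome$ written with forward slashes.
    String path = "//SIMPLESHARE/MyHome$";
    // The multi-arg constructor composes "file://SIMPLESHARE/MyHome$" and
    // reparses it, so "SIMPLESHARE" becomes the authority, not part of the path.
    URI u = new URI("file", null, path, null, null);
    System.out.println(u.getAuthority()); // SIMPLESHARE
    System.out.println(u.getPath());      // /MyHome$  (the double slash is gone)
  }
}
{code}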

> Can't use windows network drive
> ---
>
> Key: HADOOP-12631
> URL: https://issues.apache.org/jira/browse/HADOOP-12631
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.6.0
>Reporter: tian
>Priority: Minor
>
> When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be 
> normalised to a single slash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-12631) Can't use windows network drive

2018-05-09 Thread Jimmy Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468956#comment-16468956
 ] 

Jimmy Lu edited comment on HADOOP-12631 at 5/9/18 3:17 PM:
---

Hi [~cnauroth], I need to access the same URI from both Windows and Linux, 
which requires a leading double slash in the path. Is there a way to work 
around this bug?


was (Author: yuhta):
Hi [~cnauroth], I need to access the same URI from both Windows and Linux, 
which requires a leading double slash in the path. Is there a way to work 
around this issue?

> Can't use windows network drive
> ---
>
> Key: HADOOP-12631
> URL: https://issues.apache.org/jira/browse/HADOOP-12631
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.6.0
>Reporter: tian
>Priority: Minor
>
> When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be 
> normalised to a single slash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12631) Can't use windows network drive

2018-05-09 Thread Jimmy Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468956#comment-16468956
 ] 

Jimmy Lu commented on HADOOP-12631:
---

Hi [~cnauroth], I need to access the same URI from both Windows and Linux, 
which requires a leading double slash in the path. Is there a way to work 
around this issue?

> Can't use windows network drive
> ---
>
> Key: HADOOP-12631
> URL: https://issues.apache.org/jira/browse/HADOOP-12631
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.6.0
>Reporter: tian
>Priority: Minor
>
> When we create a Path like "\\SIMPLESHARE\MyHome$", the double slash will be 
> normalised to a single slash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15441) After HADOOP-14445, encryption zone operations print unnecessary INFO logs

2018-05-09 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468944#comment-16468944
 ] 

Xiaoyu Yao commented on HADOOP-15441:
-

Thanks [~gabor.bota] for the fix. The patch v2 looks good to me.

Since we are using slf4j, we don't really need {{if(LOG.isDebugEnabled())}}.
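
For reference, a minimal sketch of why the guard is redundant with slf4j's 
parameterized logging ({{kmsUrl}} and {{expensiveDump()}} are illustrative 
names):

{code:java}
// Redundant: slf4j defers message formatting until the level check passes.
if (LOG.isDebugEnabled()) {
  LOG.debug("KMSClientProvider for KMS url: {} created.", kmsUrl);
}

// Equivalent and simpler: the message is only formatted if DEBUG is enabled.
LOG.debug("KMSClientProvider for KMS url: {} created.", kmsUrl);

// The guard still pays off only when computing an argument is itself expensive:
if (LOG.isDebugEnabled()) {
  LOG.debug("Current state: {}", expensiveDump());
}
{code}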

> After HADOOP-14445, encryption zone operations print unnecessary INFO logs
> --
>
> Key: HADOOP-15441
> URL: https://issues.apache.org/jira/browse/HADOOP-15441
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Minor
> Attachments: HADOOP-15441.001.patch, HADOOP-15441.002.patch
>
>
> It looks like after HADOOP-14445, any encryption zone operation prints extra 
> INFO log messages as follows:
> {code:java}
> $ hdfs dfs -copyFromLocal /etc/krb5.conf /scale/
> 18/05/02 11:54:55 INFO kms.KMSClientProvider: KMSClientProvider for KMS url: 
> https://hadoop3-1.example.com:16000/kms/v1/ delegation token service: 
> kms://ht...@hadoop3-1.example.com:16000/kms created.
> {code}
> It might make sense to make it a DEBUG message instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread Lukas Waldmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Waldmann updated HADOOP-14444:

Attachment: (was: HADOOP-14444.15.patch)

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-14444
> URL: https://issues.apache.org/jira/browse/HADOOP-14444
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, 
> HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, 
> HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, 
> HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, 
> HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, 
> HADOOP-14444.patch
>
>
> The current implementations of the FTP and SFTP filesystems have severe 
> limitations and performance issues when dealing with a high number of files. 
> My patch solves those issues and integrates both filesystems in such a way 
> that most of the core functionality is common to both, thereby simplifying 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support for connection pooling - a new connection is not created for 
> every single command but is reused from the pool (a minimal sketch follows 
> below).
>  For a huge number of files this shows an order-of-magnitude performance 
> improvement over unpooled connections.
>  * Caching of directory trees. For FTP you always need to list the whole 
> directory whenever you ask for information about a particular file.
>  Again, for a huge number of files this shows an order-of-magnitude 
> performance improvement over uncached listings.
>  * Support for keep-alive (NOOP) messages to avoid connection drops
>  * Support for Unix-style or regexp wildcard globs - useful for listing 
> particular files across the whole directory tree
>  * Support for re-establishing broken FTP data transfers - this can happen 
> surprisingly often
>  * Support for SFTP private keys (including passphrases)
>  * Support for keeping passwords, private keys and passphrases in JCEKS 
> key stores
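
On the connection-pooling bullet above: a deliberately small sketch of the 
idea using Apache Commons Net, assuming a single host and shared credentials; 
the patch's actual pool is certainly more involved (keep-alive, eviction, 
per-URI keying):

{code:java}
import java.io.IOException;
import java.util.ArrayDeque;
import org.apache.commons.net.ftp.FTPClient;

public class FtpConnectionPool {
  private final ArrayDeque<FTPClient> idle = new ArrayDeque<>();
  private final String host, user, password;

  FtpConnectionPool(String host, String user, String password) {
    this.host = host; this.user = user; this.password = password;
  }

  /** Reuse an idle connection if one is still alive, else open a new one. */
  synchronized FTPClient borrow() throws IOException {
    FTPClient client = idle.poll();
    if (client != null) {
      try {
        if (client.sendNoOp()) {
          return client; // NOOP doubles as a liveness probe
        }
      } catch (IOException stale) {
        // dead connection; fall through and open a fresh one
      }
    }
    client = new FTPClient();
    client.connect(host);
    client.login(user, password);
    return client;
  }

  /** Return a connection to the pool instead of disconnecting it. */
  synchronized void release(FTPClient client) {
    idle.push(client);
  }
}
{code}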



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread Lukas Waldmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Waldmann updated HADOOP-14444:

Attachment: HADOOP-14444.15.patch

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-14444
> URL: https://issues.apache.org/jira/browse/HADOOP-14444
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, 
> HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, 
> HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, 
> HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, 
> HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, 
> HADOOP-14444.patch
>
>
> The current implementations of the FTP and SFTP filesystems have severe 
> limitations and performance issues when dealing with a high number of files. 
> My patch solves those issues and integrates both filesystems in such a way 
> that most of the core functionality is common to both, thereby simplifying 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support for connection pooling - a new connection is not created for 
> every single command but is reused from the pool.
>  For a huge number of files this shows an order-of-magnitude performance 
> improvement over unpooled connections.
>  * Caching of directory trees. For FTP you always need to list the whole 
> directory whenever you ask for information about a particular file.
>  Again, for a huge number of files this shows an order-of-magnitude 
> performance improvement over uncached listings.
>  * Support for keep-alive (NOOP) messages to avoid connection drops
>  * Support for Unix-style or regexp wildcard globs - useful for listing 
> particular files across the whole directory tree
>  * Support for re-establishing broken FTP data transfers - this can happen 
> surprisingly often
>  * Support for SFTP private keys (including passphrases)
>  * Support for keeping passwords, private keys and passphrases in JCEKS 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468826#comment-16468826
 ] 

genericqa commented on HADOOP-14444:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 35 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
58s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 52s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-tools {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 
53s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 19s{color} | {color:orange} root: The patch generated 6 new + 7 unchanged - 
0 fixed = 13 total (was 7) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
30s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
6s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-tools {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 56s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m 10s{color} 
| {color:red} hadoop-ftp in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 41s{color} 
| {color:red} hadoop-tools in the patch failed. {color

[jira] [Updated] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data

2018-05-09 Thread jincheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jincheng updated HADOOP-15452:
--
Description: 
As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
org.apache.hadoop.mapred.YarnChild: Exception running child : 
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle 
in fetcher#1
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199)
at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549)
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)
{code}
 

Does anyone have ideas?

  was:
As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
shuffle in fetcher#1 
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
 Caused by: java.lang.ArrayIndexOutOfBoundsException 
at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
 
at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
 
at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
 
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) 
at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) 
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)

{code}
 

Does anyone have ideas?


> Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks 
> fetch map output data
> -
>
> Key: HADOOP-15452
> URL: https://issues.apache.org/jira/browse/HADOOP-15452
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.0
> Environment: hadoop-2.6.0-cdh5.4.4
>Reporter: jincheng
>Priority: Major
>
> As the title says, when reducer tasks fetch data from mapper tasks, they hit 
> an ArrayIndexOutOfBoundsException; here is the stack trace:
> {code:java}
> org.apache.hadoop.mapred.YarnChild: Exception running child : 
> org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
> shuffle in fetcher#1
>   at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
>   at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
>   at 
> org.apache.ha

[jira] [Updated] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data

2018-05-09 Thread jincheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jincheng updated HADOOP-15452:
--
Description: 
As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
shuffle in fetcher#1 
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
 Caused by: java.lang.ArrayIndexOutOfBoundsException 
at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
 
at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
 
at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
 
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) 
at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) 
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)

{code}
 

Does anyone have ideas?

  was:
As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
shuffle in fetcher#1 
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at 
java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
 Caused by: java.lang.ArrayIndexOutOfBoundsException 
at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
 
at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
 
at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
 
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) 
at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) 
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)

{code}
 

Does anyone have ideas?


> Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks 
> fetch map output data
> -
>
> Key: HADOOP-15452
> URL: https://issues.apache.org/jira/browse/HADOOP-15452
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.0
> Environment: hadoop-2.6.0-cdh5.4.4
>Reporter: jincheng
>Priority: Major
>
> As the title says, when reducer tasks fetch data from mapper tasks, they hit 
> an ArrayIndexOutOfBoundsException; here is the stack trace:
> {code:java}
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
> shuffle in fetcher#1 
> at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:422) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>  Caused by: java.lang.ArrayIndexOutOfBoundsException 
> at 
> org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
>  
> at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
>  
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>  
> at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) 
> at 
> org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
>  
> at 
> org.ap

[jira] [Updated] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data

2018-05-09 Thread jincheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jincheng updated HADOOP-15452:
--
Description: 
As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
shuffle in fetcher#1 
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at 
java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
 Caused by: java.lang.ArrayIndexOutOfBoundsException 
at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
 
at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
 
at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
 
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) 
at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) 
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)

{code}
 

Does anyone have ideas?

  was:
As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
shuffle in fetcher#1 at 
org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at 
java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
 Caused by: java.lang.ArrayIndexOutOfBoundsException 
at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
 at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
 at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
 at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
 at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) 
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)

{code}
 

Does anyone have ideas?


> Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks 
> fetch map output data
> -
>
> Key: HADOOP-15452
> URL: https://issues.apache.org/jira/browse/HADOOP-15452
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.0
> Environment: hadoop-2.6.0-cdh5.4.4
>Reporter: jincheng
>Priority: Major
>
> As the title says, when reducer tasks fetch data from mapper tasks, they hit 
> an ArrayIndexOutOfBoundsException; here is the stack trace:
> {code:java}
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
> shuffle in fetcher#1 
> at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at 
> java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:422) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>  Caused by: java.lang.ArrayIndexOutOfBoundsException 
> at 
> org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
>  
> at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
>  
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>  
> at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) 
> at 
> org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
>  
> at 
> org.apac

[jira] [Updated] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data

2018-05-09 Thread jincheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jincheng updated HADOOP-15452:
--
Description: 
As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
shuffle in fetcher#1 at 
org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at 
java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
 Caused by: java.lang.ArrayIndexOutOfBoundsException 
at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
 at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
 at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
 at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
 at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) 
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)

{code}
 

Does anyone have ideas?

  was:
As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
// code placeholder
{code}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
shuffle in fetcher#1 at 
org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at 
org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at 
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at 
java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: 
java.lang.ArrayIndexOutOfBoundsException at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
 at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
 at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
 at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
 at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) 
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)

 

Does anyone have ideas?


> Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks 
> fetch map output data
> -
>
> Key: HADOOP-15452
> URL: https://issues.apache.org/jira/browse/HADOOP-15452
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.0
> Environment: hadoop-2.6.0-cdh5.4.4
>Reporter: jincheng
>Priority: Major
>
> As the title says, when reducer tasks fetch data from mapper tasks, they hit 
> an ArrayIndexOutOfBoundsException; here is the stack trace:
> {code:java}
> Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
> shuffle in fetcher#1 at 
> org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
>  at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) 
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at 
> java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:422) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160)
>  Caused by: java.lang.ArrayIndexOutOfBoundsException 
> at 
> org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
>  at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>  at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at 
> org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
>  at 
> org.apach

[jira] [Created] (HADOOP-15452) Snappy Decompressor met ArrayIndexOutOfBoundsException when reduce tasks fetch map output data

2018-05-09 Thread jincheng (JIRA)
jincheng created HADOOP-15452:
-

 Summary: Snappy Decompressor met ArrayIndexOutOfBoundsException 
when reduce tasks fetch map output data
 Key: HADOOP-15452
 URL: https://issues.apache.org/jira/browse/HADOOP-15452
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Affects Versions: 2.6.0
 Environment: hadoop-2.6.0-cdh5.4.4
Reporter: jincheng


As the title says, when reducer tasks fetch data from mapper tasks, they hit 
an ArrayIndexOutOfBoundsException; here is the stack trace:
{code:java}
// code placeholder
{code}
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in 
shuffle in fetcher#1 at 
org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at 
org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379) at 
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165) at 
java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160) Caused by: 
java.lang.ArrayIndexOutOfBoundsException at 
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.setInput(SnappyDecompressor.java:111)
 at 
org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
 at 
org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
 at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at 
org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:98)
 at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:549) 
at 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:346) 
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:202)

 

Does anyone have ideas?
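
For what it's worth, the frame at SnappyDecompressor.setInput line 111 is an 
argument sanity check of roughly this shape (paraphrased, not a verbatim copy 
of hadoop-common), so the exception usually means the shuffle handed the 
decompressor an inconsistent (b, off, len) triple, e.g. from a truncated or 
corrupt map output, rather than a bug inside Snappy itself:

{code:java}
// Paraphrase of the guard at the top of SnappyDecompressor#setInput.
public void setInput(byte[] b, int off, int len) {
  if (b == null) {
    throw new NullPointerException();
  }
  if (off < 0 || len < 0 || off > b.length - len) {
    // Fires when the caller's offset/length do not fit inside b.
    throw new ArrayIndexOutOfBoundsException();
  }
  // ... stash the input for the native decompressor ...
}
{code}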



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread Lukas Waldmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Waldmann updated HADOOP-14444:

Attachment: HADOOP-14444.15.patch

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-14444
> URL: https://issues.apache.org/jira/browse/HADOOP-14444
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, 
> HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, 
> HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, 
> HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, 
> HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, 
> HADOOP-14444.patch
>
>
> The current implementations of the FTP and SFTP filesystems have severe 
> limitations and performance issues when dealing with a high number of files. 
> My patch solves those issues and integrates both filesystems in such a way 
> that most of the core functionality is common to both, thereby simplifying 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support for connection pooling - a new connection is not created for 
> every single command but is reused from the pool.
>  For a huge number of files this shows an order-of-magnitude performance 
> improvement over unpooled connections.
>  * Caching of directory trees. For FTP you always need to list the whole 
> directory whenever you ask for information about a particular file.
>  Again, for a huge number of files this shows an order-of-magnitude 
> performance improvement over uncached listings.
>  * Support for keep-alive (NOOP) messages to avoid connection drops
>  * Support for Unix-style or regexp wildcard globs - useful for listing 
> particular files across the whole directory tree
>  * Support for re-establishing broken FTP data transfers - this can happen 
> surprisingly often
>  * Support for SFTP private keys (including passphrases)
>  * Support for keeping passwords, private keys and passphrases in JCEKS 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-05-09 Thread Lukas Waldmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Waldmann updated HADOOP-14444:

Attachment: (was: HADOOP-14444.15.patch)

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-14444
> URL: https://issues.apache.org/jira/browse/HADOOP-14444
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-14444.10.patch, HADOOP-14444.11.patch, 
> HADOOP-14444.12.patch, HADOOP-14444.13.patch, HADOOP-14444.14.patch, 
> HADOOP-14444.15.patch, HADOOP-14444.2.patch, HADOOP-14444.3.patch, 
> HADOOP-14444.4.patch, HADOOP-14444.5.patch, HADOOP-14444.6.patch, 
> HADOOP-14444.7.patch, HADOOP-14444.8.patch, HADOOP-14444.9.patch, 
> HADOOP-14444.patch
>
>
> The current implementations of the FTP and SFTP filesystems have severe 
> limitations and performance issues when dealing with a high number of files. 
> My patch solves those issues and integrates both filesystems in such a way 
> that most of the core functionality is common to both, thereby simplifying 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support for connection pooling - a new connection is not created for 
> every single command but is reused from the pool.
>  For a huge number of files this shows an order-of-magnitude performance 
> improvement over unpooled connections.
>  * Caching of directory trees. For FTP you always need to list the whole 
> directory whenever you ask for information about a particular file.
>  Again, for a huge number of files this shows an order-of-magnitude 
> performance improvement over uncached listings.
>  * Support for keep-alive (NOOP) messages to avoid connection drops
>  * Support for Unix-style or regexp wildcard globs - useful for listing 
> particular files across the whole directory tree
>  * Support for re-establishing broken FTP data transfers - this can happen 
> surprisingly often
>  * Support for SFTP private keys (including passphrases)
>  * Support for keeping passwords, private keys and passphrases in JCEKS 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org