[jira] [Commented] (HADOOP-15398) StagingTestBase uses methods not available in Mockito 1.8.5

2018-04-30 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459462#comment-16459462
 ] 

Akira Ajisaka commented on HADOOP-15398:


LGTM, would you update LICENSE.txt as well?
{noformat:title=LICENSE.txt}
Mockito 1.8.5
{noformat}

> StagingTestBase uses methods not available in Mockito 1.8.5
> ---
>
> Key: HADOOP-15398
> URL: https://issues.apache.org/jira/browse/HADOOP-15398
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Mohammad Arshad
>Assignee: Mohammad Arshad
>Priority: Major
> Attachments: HADOOP-15398.001.patch
>
>
> *Problem:* hadoop trunk compilation is failing.
>  *Root Cause:*
>  The compilation error comes from 
> {{org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase}}. The compilation 
> error is "The method getArgumentAt(int, Class) is 
> undefined for the type InvocationOnMock".
> StagingTestBase uses the getArgumentAt(int, Class) method, 
> which is not available in mockito-all 1.8.5. The getArgumentAt(int, 
> Class) method is only available from version 2.0.0-beta onwards.
> *Expectations:*
>  Either the mockito-all version should be upgraded, or the test case should be 
> rewritten using only the functions available in 1.8.5.
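For illustration, a minimal sketch of the two access patterns (assuming an Answer-style stub like the ones in StagingTestBase; this is not the attached patch):
{code:java}
import org.mockito.invocation.InvocationOnMock;
import org.mockito.stubbing.Answer;

public class ArgumentAccessSketch {
  static Answer<String> echoFirstArgument() {
    return new Answer<String>() {
      @Override
      public String answer(InvocationOnMock invocation) {
        // Works with mockito-all 1.8.5: fetch the raw argument array and cast.
        String key = (String) invocation.getArguments()[0];
        // Requires Mockito >= 2.0.0-beta, so it breaks the 1.8.5 build:
        // String key = invocation.getArgumentAt(0, String.class);
        return "echo:" + key;
      }
    };
  }
}
{code}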






[jira] [Updated] (HADOOP-14178) Move Mockito up to version 2.x

2018-04-30 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HADOOP-14178:
---
Attachment: HADOOP-14178.013.patch

> Move Mockito up to version 2.x
> --
>
> Key: HADOOP-14178
> URL: https://issues.apache.org/jira/browse/HADOOP-14178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-14178.001.patch, HADOOP-14178.002.patch, 
> HADOOP-14178.003.patch, HADOOP-14178.004.patch, HADOOP-14178.005-wip.patch, 
> HADOOP-14178.005-wip2.patch, HADOOP-14178.005-wip3.patch, 
> HADOOP-14178.005-wip4.patch, HADOOP-14178.005-wip5.patch, 
> HADOOP-14178.005-wip6.patch, HADOOP-14178.005.patch, HADOOP-14178.006.patch, 
> HADOOP-14178.007.patch, HADOOP-14178.008.patch, HADOOP-14178.009.patch, 
> HADOOP-14178.010.patch, HADOOP-14178.011.patch, HADOOP-14178.012.patch, 
> HADOOP-14178.013.patch
>
>
> I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 
> since the switch to maven in 2011. 
> Mockito is now at version 2.1, [with lots of Java 8 
> support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. 
> That's not just defining actions as closures, but also support for Optional 
> types, mocking methods in interfaces, etc. 
> It's only used for testing, and, *provided there aren't regressions*, the cost of 
> the upgrade is low. The good news: test tools usually come with good test 
> coverage. The bad: mockito does go deep into java bytecodes.
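As a hedged illustration of that Java 8 support (a made-up List mock, not part of any attached patch):
{code:java}
import static org.mockito.ArgumentMatchers.anyInt;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.List;

public class Mockito2LambdaSketch {
  @SuppressWarnings("unchecked")
  public static void main(String[] args) {
    List<String> list = mock(List.class);
    // Mockito 2.x: the Answer is just a closure, and getArgument(int) gives
    // typed access; on 1.8.5 this needs an anonymous Answer class plus casts.
    when(list.get(anyInt()))
        .thenAnswer(invocation -> "item-" + invocation.getArgument(0));
    System.out.println(list.get(3));  // prints item-3
  }
}
{code}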






[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x

2018-04-30 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459454#comment-16459454
 ] 

Akira Ajisaka commented on HADOOP-14178:


013 patch: rebased

> Move Mockito up to version 2.x
> --
>
> Key: HADOOP-14178
> URL: https://issues.apache.org/jira/browse/HADOOP-14178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-14178.001.patch, HADOOP-14178.002.patch, 
> HADOOP-14178.003.patch, HADOOP-14178.004.patch, HADOOP-14178.005-wip.patch, 
> HADOOP-14178.005-wip2.patch, HADOOP-14178.005-wip3.patch, 
> HADOOP-14178.005-wip4.patch, HADOOP-14178.005-wip5.patch, 
> HADOOP-14178.005-wip6.patch, HADOOP-14178.005.patch, HADOOP-14178.006.patch, 
> HADOOP-14178.007.patch, HADOOP-14178.008.patch, HADOOP-14178.009.patch, 
> HADOOP-14178.010.patch, HADOOP-14178.011.patch, HADOOP-14178.012.patch, 
> HADOOP-14178.013.patch
>
>
> I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 
> since the switch to maven in 2011. 
> Mockito is now at version 2.1, [with lots of Java 8 
> support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. 
> That's not just defining actions as closures, but also support for Optional 
> types, mocking methods in interfaces, etc. 
> It's only used for testing, and, *provided there aren't regressions*, the cost of 
> the upgrade is low. The good news: test tools usually come with good test 
> coverage. The bad: mockito does go deep into java bytecodes.






[jira] [Assigned] (HADOOP-10783) apache-commons-lang.jar 2.6 does not support FreeBSD -upgrade to 3.x needed

2018-04-30 Thread Takanobu Asanuma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma reassigned HADOOP-10783:
-

Assignee: Takanobu Asanuma

> apache-commons-lang.jar 2.6 does not support FreeBSD -upgrade to 3.x needed
> ---
>
> Key: HADOOP-10783
> URL: https://issues.apache.org/jira/browse/HADOOP-10783
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.1
>Reporter: Dmitry Sivachenko
>Assignee: Takanobu Asanuma
>Priority: Major
> Attachments: commons-lang3_1.patch
>
>
> Hadoop-2.4.1 ships with apache-commons.jar version 2.6.
> It does not support FreeBSD (IS_OS_UNIX returns False).
> This is fixed in recent versions of apache-commons.jar.
> Please update apache-commons.jar to a recent version so it correctly recognizes 
> FreeBSD as a UNIX-like system.
> Right now I get in datanode's log:
> 2014-07-04 11:58:10,459 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry: Disabling 
> ShortCircuitRegistry
> java.io.IOException: The OS is not UNIX.
> at 
> org.apache.hadoop.io.nativeio.SharedFileDescriptorFactory.create(SharedFileDescriptorFactory.java:77)
> at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.<init>(ShortCircuitRegistry.java:169)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:583)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:771)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:289)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1931)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1818)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1865)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2041)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2065)
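For context, a hedged sketch of the failing guard (assuming it reduces to commons-lang's SystemUtils check, as the description and stack trace suggest); with commons-lang3 the same check recognizes FreeBSD as UNIX-like:
{code:java}
import java.io.IOException;

import org.apache.commons.lang3.SystemUtils;

public class UnixCheckSketch {
  public static void main(String[] args) throws IOException {
    // commons-lang 2.6's SystemUtils.IS_OS_UNIX does not include FreeBSD;
    // the lang3 version shown here does.
    if (!SystemUtils.IS_OS_UNIX) {
      throw new IOException("The OS is not UNIX.");
    }
    System.out.println("Recognized as UNIX-like: " + SystemUtils.OS_NAME);
  }
}
{code}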






[jira] [Commented] (HADOOP-10783) apache-commons-lang.jar 2.6 does not support FreeBSD -upgrade to 3.x needed

2018-04-30 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459450#comment-16459450
 ] 

Takanobu Asanuma commented on HADOOP-10783:
---

Thanks for working on this, all. I'd like to assign this jira to myself and 
take over your work.

> apache-commons-lang.jar 2.6 does not support FreeBSD -upgrade to 3.x needed
> ---
>
> Key: HADOOP-10783
> URL: https://issues.apache.org/jira/browse/HADOOP-10783
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.1
>Reporter: Dmitry Sivachenko
>Priority: Major
> Attachments: commons-lang3_1.patch
>
>
> Hadoop-2.4.1 ships with apache-commons.jar version 2.6.
> It does not support FreeBSD (IS_OS_UNIX returns False).
> This is fixed in recent versions of apache-commons.jar.
> Please update apache-commons.jar to a recent version so it correctly recognizes 
> FreeBSD as a UNIX-like system.
> Right now I get in datanode's log:
> 2014-07-04 11:58:10,459 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry: Disabling 
> ShortCircuitRegistry
> java.io.IOException: The OS is not UNIX.
> at 
> org.apache.hadoop.io.nativeio.SharedFileDescriptorFactory.create(SharedFileDescriptorFactory.java:77)
> at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.<init>(ShortCircuitRegistry.java:169)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:583)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:771)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:289)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1931)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1818)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1865)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2041)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2065)






[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459354#comment-16459354
 ] 

genericqa commented on HADOOP-15430:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 55s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 
47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
4m 21s{color} | {color:orange} root: The patch generated 1 new + 39 unchanged - 
0 fixed = 40 total (was 39) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
31s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
45s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15430 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12921309/HADOOP-15430-001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e895a4959d83 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fc074a3 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Commented] (HADOOP-15239) S3ABlockOutputStream.flush() be no-op when stream closed

2018-04-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459259#comment-16459259
 ] 

Hudson commented on HADOOP-15239:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14094 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14094/])
HADOOP-15239 S3ABlockOutputStream.flush() be no-op when stream closed.  
(fabbri: rev 919865a34bd5c3c99603993a0410846a97975869)
* (edit) 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java
* (add) 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3ABlockOutputStream.java


> S3ABlockOutputStream.flush() be no-op when stream closed
> 
>
> Key: HADOOP-15239
> URL: https://issues.apache.org/jira/browse/HADOOP-15239
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Trivial
> Fix For: 3.2.0
>
> Attachments: HADOOP-15239.001.patch, HADOOP-15239.002.patch
>
>
> When you call flush() on a closed S3A output stream, you get a stack trace. 
> This can cause problems in code with race conditions across threads, e.g. 
> FLINK-8543. 
> We could make it log "stream closed" at WARN rather than raise an IOE. It's just 
> a hint, after all.
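A minimal sketch of the behaviour described (an illustration of the idea, not the committed S3ABlockOutputStream change):
{code:java}
import java.io.IOException;
import java.io.OutputStream;

public class LenientFlushSketch extends OutputStream {
  private volatile boolean closed = false;

  @Override
  public void write(int b) throws IOException {
    if (closed) {
      throw new IOException("Stream is closed!");  // writes still fail fast
    }
    // a real implementation would buffer the byte here
  }

  @Override
  public void flush() {
    if (closed) {
      // flush() is just a hint: warn instead of raising an IOE
      System.err.println("WARN: flush() called on closed stream; ignoring");
      return;
    }
    // a real implementation would flush buffered data here
  }

  @Override
  public void close() {
    closed = true;
  }
}
{code}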






[jira] [Commented] (HADOOP-15395) DefaultImpersonationProvider fails to parse proxy user config if username has . in it

2018-04-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459250#comment-16459250
 ] 

genericqa commented on HADOOP-15395:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 41m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 40m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 40m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
8m 31s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 44s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}158m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.shell.TestCopyFromLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15395 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12921301/HADOOP-15395.01.patch 
|
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 45c8fbcace74 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fc074a3 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14540/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14540/testReport/ |
| Max. process+thread count | 1517 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
| Console output | 

[jira] [Updated] (HADOOP-15239) S3ABlockOutputStream.flush() be no-op when stream closed

2018-04-30 Thread Aaron Fabbri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri updated HADOOP-15239:
--
   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk after the usual testing. Thank you for the patch 
[~gabor.bota].

> S3ABlockOutputStream.flush() be no-op when stream closed
> 
>
> Key: HADOOP-15239
> URL: https://issues.apache.org/jira/browse/HADOOP-15239
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 2.9.0, 2.8.3, 2.7.5, 3.0.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Trivial
> Fix For: 3.2.0
>
> Attachments: HADOOP-15239.001.patch, HADOOP-15239.002.patch
>
>
> When you call flush() on a closed S3A output stream, you get a stack trace. 
> This can cause problems in code with race conditions across threads, e.g. 
> FLINK-8543. 
> We could make it log "stream closed" at WARN rather than raise an IOE. It's just 
> a hint, after all.






[jira] [Commented] (HADOOP-15428) s3guard bucket-info -unguarded will guard bucket if FS is set to do this automatically

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459191#comment-16459191
 ] 

Steve Loughran commented on HADOOP-15428:
-

Actually I'm confused now. If the bucket has s3guard enabled, then yes, it 
should be an error if this is set but the DDB table is there.

I'll need to do some more CLI experiments. What do others think?

[~lqjack]: while you are looking at the S3Guard code, HADOOP-15430 has just 
surfaced.

> s3guard bucket-info -unguarded will guard bucket if FS is set to do this 
> automatically
> --
>
> Key: HADOOP-15428
> URL: https://issues.apache.org/jira/browse/HADOOP-15428
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Major
>
> If you call hadoop s3guard bucket-info on a bucket where the fs is set to 
> create an s3guard table on demand, then the DDB table is automatically 
> created. As a result, 
> the {{bucket-info -unguarded}} option cannot be used, and the call has 
> significant side effects (i.e. it can run up bills).






[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459177#comment-16459177
 ] 

Steve Loughran commented on HADOOP-15430:
-

Patch submitted.

Test run: nowhere.

This patch doesn't try to fix things; it adds validation of the built-up 
query item that all keys != "". I can't trigger that in an IT test, but on the 
CLI it produces the stack above. 
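For illustration, a hedged sketch of the kind of check described above (a hypothetical helper, not the attached HADOOP-15430-001.patch):
{code:java}
import com.google.common.base.Preconditions;

public class KeyValidationSketch {
  // Reject empty key components before they reach DynamoDB, which refuses
  // empty string attribute values ("An AttributeValue may not contain an
  // empty string").
  static void checkKeyComponents(String parent, String child) {
    Preconditions.checkArgument(parent != null && !parent.isEmpty(),
        "Null or empty parent key");
    Preconditions.checkArgument(child != null && !child.isEmpty(),
        "Null or empty child key");
  }

  public static void main(String[] args) {
    checkKeyComponents("/bucket/dir", "file");   // fine
    try {
      checkKeyComponents("/bucket/dir", "");     // what a trailing "/" can produce
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
{code}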

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15430-001.patch
>
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Updated] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15430:

Status: Patch Available  (was: Open)

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15430-001.patch
>
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

2018-04-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459127#comment-16459127
 ] 

genericqa commented on HADOOP-15250:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 18s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 53s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch 
generated 4 new + 229 unchanged - 0 fixed = 233 total (was 229) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  6s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
10s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15250 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12921290/HADOOP-15250.00.patch 
|
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux f36bc011d84f 4.4.0-121-generic #145-Ubuntu SMP Fri Apr 13 
13:47:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9b09555 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14539/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14539/testReport/ |
| Max. process+thread count | 1371 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: 

[jira] [Commented] (HADOOP-15408) HADOOP-14445 broke Spark.

2018-04-30 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459107#comment-16459107
 ] 

Xiao Chen commented on HADOOP-15408:


I was saying this because in patch 1, TestKMS was changed to let the tests 
pass. Otherwise, in the test you will run into the equivalent of the new jar seeing 
kms-dt, which can't decodeIdentifier. And...

bq. Trying to understand why we want to decode TokenIdentifier ?
Because [it is a public 
API|https://github.com/apache/hadoop/blob/branch-3.0.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java#L169]...
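As a hedged aside on why that API matters (illustration only; the helper below is made up):
{code:java}
import java.io.IOException;

import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

public class DecodeIdentifierSketch {
  // decodeIdentifier() turns the token's opaque identifier bytes back into a
  // typed TokenIdentifier; the class lookup goes through ServiceLoader, which
  // is exactly where the mixed-jar NoSuchFieldError in the description surfaces.
  public static void describe(Token<? extends TokenIdentifier> token)
      throws IOException {
    TokenIdentifier id = token.decodeIdentifier();
    System.out.println(token.getKind() + " -> " + id);
  }
}
{code}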

> HADOOP-14445 broke Spark.
> -
>
> Key: HADOOP-15408
> URL: https://issues.apache.org/jira/browse/HADOOP-15408
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Rushabh S Shah
>Priority: Blocker
> Attachments: HADOOP-15408-trunk.001.patch, split.patch, 
> split.prelim.patch
>
>
> Spark bundles hadoop related jars in their package.
>  Spark expects backwards compatibility between minor versions.
>  Their job failed after we deployed HADOOP-14445 in our test cluster.
> {noformat}
> 2018-04-20 21:09:53,245 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2018-04-20 21:09:53,273 ERROR [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.token.TokenIdentifier: Provider 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$KMSLegacyDelegationTokenIdentifier 
> could not be instantiated
> at java.util.ServiceLoader.fail(ServiceLoader.java:232)
> at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at 
> org.apache.hadoop.security.token.Token.getClassForIdentifier(Token.java:117)
> at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:138)
> at org.apache.hadoop.security.token.Token.identifierToString(Token.java:393)
> at org.apache.hadoop.security.token.Token.toString(Token.java:413)
> at java.lang.String.valueOf(String.java:2994)
> at 
> org.apache.commons.logging.impl.SLF4JLocationAwareLog.info(SLF4JLocationAwareLog.java:155)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1634)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1583)
> Caused by: java.lang.NoSuchFieldError: TOKEN_LEGACY_KIND
> at 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$KMSLegacyDelegationTokenIdentifier.<init>(KMSDelegationToken.java:64)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at java.lang.Class.newInstance(Class.java:442)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> ... 10 more
> 2018-04-20 21:09:53,278 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1
> {noformat}
> Their classpath looks like 
> {{\{...:hadoop-common-pre-HADOOP-14445.jar:.:hadoop-common-with-HADOOP-14445.jar:\}}}
> This is because the container loaded the {{KMSDelegationToken}} class from an 
> older jar and {{KMSLegacyDelegationTokenIdentifier}} from the new jar, and it 
> fails when {{KMSLegacyDelegationTokenIdentifier}} wants to read 
> {{TOKEN_LEGACY_KIND}} from {{KMSDelegationToken}}, which doesn't exist in the older jar.
>  Cc [~xiaochen]






[jira] [Updated] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15430:

Attachment: HADOOP-15430-001.patch

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-15430-001.patch
>
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459095#comment-16459095
 ] 

Steve Loughran commented on HADOOP-15430:
-

Yeah, that test shows we've been close to this. I'm about to submit a patch, 
but I can't replicate it with a unit test which doesn't go near fsshell, and I 
haven't put one together yet.

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Blocker
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

2018-04-30 Thread Greg Senia (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459056#comment-16459056
 ] 

Greg Senia commented on HADOOP-15250:
-

[~ajayydv] Feel free to run with it!!!

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, net
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihome cluster with split DNS and rDNS lookup of 
> localhost returning non-routable IPAddr
>Reporter: Greg Senia
>Priority: Critical
> Attachments: HADOOP-15250.00.patch, HADOOP-15250.patch
>
>
> We run our Hadoop clusters with two networks attached to each node. These 
> networks are as follows: a server network that is firewalled with firewalld, 
> allowing inbound traffic only for SSH and things like Knox, HiveServer2, the 
> HTTP YARN RM/ATS and the MR History Server; and a cluster network on the 
> second network interface, which uses Jumbo frames and is open with no 
> restrictions, allowing all cluster traffic to flow between nodes. 
>  
> To resolve DNS within the Hadoop Cluster we use DNS Views via BIND so if the 
> traffic is originating from nodes with cluster networks we return the 
> internal DNS record for the nodes. This all works fine with all the 
> multi-homing features added to Hadoop 2.x
>  Some logic around views:
> a. The internal view is used by cluster machines when performing lookups. So 
> hosts on the cluster network should get answers from the internal view in DNS
> b. The external view is used by non-local-cluster machines when performing 
> lookups. So hosts not on the cluster network should get answers from the 
> external view in DNS
>  
> So this brings me to our problem. We created some firewall rules to allow 
> inbound traffic from each cluster's server network to allow distcp to occur. 
> But we noticed a problem almost immediately: when YARN attempted to talk 
> to the remote cluster it was binding outgoing traffic to the cluster network 
> interface, which IS NOT routable. So after researching the code we noticed the 
> following in NetUtils.java and Client.java. 
> Basically, in Client.java it looks as if it takes whatever the hostname is and 
> attempts to bind to whatever that hostname resolves to. This is not valid 
> in a multi-homed network with one routable interface and one non-routable 
> interface. After reading through the java.net.Socket documentation, it is 
> valid to perform socket.bind(null), which will allow the OS routing table and 
> DNS to send the traffic to the correct interface. I will also attach the 
> network traces and a test patch for the 2.7.x and 3.x code base. I have this test 
> fix below in my Hadoop Test Cluster.
> Client.java:
> {code:java}
> /*
>  * Bind the socket to the host specified in the principal name of the
>  * client, to ensure Server matching address of the client connection
>  * to host name in principal passed.
>  */
> InetSocketAddress bindAddr = null;
> if (ticket != null && ticket.hasKerberosCredentials()) {
>   KerberosInfo krbInfo =
>       remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>   if (krbInfo != null) {
>     String principal = ticket.getUserName();
>     String host = SecurityUtil.getHostFromPrincipal(principal);
>     // If host name is a valid local address then bind socket to it
>     InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>     if (localAddr != null) {
>       this.socket.setReuseAddress(true);
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Binding " + principal + " to " + localAddr);
>       }
>       bindAddr = new InetSocketAddress(localAddr, 0);
>     }
>   }
> }
> {code}
>  
> So in my Hadoop 2.7.x Cluster I made the following changes and traffic flows 
> correctly out the correct interfaces:
>  
> diff --git 
> a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
>  
> b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- 
> a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ 
> b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> @@ -305,6 +305,9 @@
>    public static final String  IPC_CLIENT_FALLBACK_TO_SIMPLE_AUTH_ALLOWED_KEY 
> = "ipc.client.fallback-to-simple-auth-allowed";
>    public static final boolean 
> 

[jira] [Commented] (HADOOP-15395) DefaultImpersonationProvider fails to parse proxy user config if username has . in it

2018-04-30 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459018#comment-16459018
 ] 

Ajay Kumar commented on HADOOP-15395:
-

[~msingh] thanks for the review; fixed checkstyle issues in patch v1.

> DefaultImpersonationProvider fails to parse proxy user config if username has 
> . in it
> -
>
> Key: HADOOP-15395
> URL: https://issues.apache.org/jira/browse/HADOOP-15395
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HADOOP-15395.00.patch, HADOOP-15395.01.patch
>
>
> DefaultImpersonationProvider fails to parse proxy user config if username has 
> . in it. 
>  
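A hedged illustration of the failing shape (the user name "service.account" is made up, and the setConf/init calls are assumptions about the ImpersonationProvider API, not taken from the patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.authorize.DefaultImpersonationProvider;

public class DottedProxyUserSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // The '.' inside the user portion of these keys is what the parser mis-handles.
    conf.set("hadoop.proxyuser.service.account.hosts", "*");
    conf.set("hadoop.proxyuser.service.account.groups", "*");

    DefaultImpersonationProvider provider = new DefaultImpersonationProvider();
    provider.setConf(conf);
    provider.init("hadoop.proxyuser");  // proxy-user key parsing happens here
  }
}
{code}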






[jira] [Updated] (HADOOP-15395) DefaultImpersonationProvider fails to parse proxy user config if username has . in it

2018-04-30 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-15395:

Attachment: HADOOP-15395.01.patch

> DefaultImpersonationProvider fails to parse proxy user config if username has 
> . in it
> -
>
> Key: HADOOP-15395
> URL: https://issues.apache.org/jira/browse/HADOOP-15395
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HADOOP-15395.00.patch, HADOOP-15395.01.patch
>
>
> DefaultImpersonationProvider fails to parse proxy user config if username has 
> . in it. 
>  






[jira] [Comment Edited] (HADOOP-15427) hadoop shell complains needlessly about "ERROR: Tools helper .. not found"

2018-04-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458977#comment-16458977
 ] 

Allen Wittenauer edited comment on HADOOP-15427 at 4/30/18 8:07 PM:


Did you try turning it off and on?  :)

I just did a fresh install of trunk from the tarball.  No error.  Is your 
install actually correct?  No missing parts? 
${HADOOP_LIBEXEC_DIR}/tools/hadoop-aws.sh exists?


was (Author: aw):
Did you try turning it off and on?  :)

I just did a fresh install of trunk from the tarball.  No error.  Is your 
install actually correct?  No missing parts? 
${HADOOP_HOME}/libexec/tools/hadoop-aws.sh exists?

> hadoop shell complains needlessly about "ERROR: Tools helper .. not found"
> --
>
> Key: HADOOP-15427
> URL: https://issues.apache.org/jira/browse/HADOOP-15427
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Minor
>
> toolshelper.sh prints error messages like
> {code}
> ERROR: Tools helper...hadoop/libexec/tools/hadoop-aws.sh was not found.
> {code}
> even when they aren't needed, here in the case of hadoop s3guard shell 
> commands.
> Can I downgrade these to hadoop_debug?






[jira] [Commented] (HADOOP-15427) hadoop shell complains needlessly about "ERROR: Tools helper .. not found"

2018-04-30 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458977#comment-16458977
 ] 

Allen Wittenauer commented on HADOOP-15427:
---

Did you try turning it off and on?  :)

I just did a fresh install of trunk from the tarball.  No error.  Is your 
install actually correct?  No missing parts? 
${HADOOP_HOME}/libexec/tools/hadoop-aws.sh exists?

> hadoop shell complains needlessly about "ERROR: Tools helper .. not found"
> --
>
> Key: HADOOP-15427
> URL: https://issues.apache.org/jira/browse/HADOOP-15427
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Minor
>
> toolshelper.sh prints error messages like
> {code}
> ERROR: Tools helper...hadoop/libexec/tools/hadoop-aws.sh was not found.
> {code}
> even when they aren't needed, here in the case of hadoop s3guard shell 
> commands.
> Can I downgrade these to hadoop_debug?






[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458970#comment-16458970
 ] 

Aaron Fabbri commented on HADOOP-15430:
---

I guess we need a test case for the CLI mkdir, as a similar case had a 
regression before (HADOOP-14428).  

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Blocker
>
> If you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string".






[jira] [Commented] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

2018-04-30 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458963#comment-16458963
 ] 

Ajay Kumar commented on HADOOP-15250:
-

[~gss2002], if you don't mind I would like to continue with this patch.
[~gss2002], [~ste...@apache.org], [~arpitagarwal] Requesting review of the new 
patch when possible.

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, net
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihome cluster with split DNS and rDNS lookup of 
> localhost returning non-routable IPAddr
>Reporter: Greg Senia
>Priority: Critical
> Attachments: HADOOP-15250.00.patch, HADOOP-15250.patch
>
>
> We run our Hadoop clusters with two networks attached to each node. These 
> networks are as follows: a server network that is firewalled with firewalld, 
> allowing inbound traffic only for SSH and things like Knox, HiveServer2, the 
> HTTP YARN RM/ATS and the MR History Server; and a cluster network on the 
> second network interface, which uses Jumbo frames and is open with no 
> restrictions, allowing all cluster traffic to flow between nodes. 
>  
> To resolve DNS within the Hadoop Cluster we use DNS Views via BIND so if the 
> traffic is originating from nodes with cluster networks we return the 
> internal DNS record for the nodes. This all works fine with all the 
> multi-homing features added to Hadoop 2.x
>  Some logic around views:
> a. The internal view is used by cluster machines when performing lookups. So 
> hosts on the cluster network should get answers from the internal view in DNS
> b. The external view is used by non-local-cluster machines when performing 
> lookups. So hosts not on the cluster network should get answers from the 
> external view in DNS
>  
> So this brings me to our problem. We created some firewall rules to allow 
> inbound traffic from each cluster's server network to allow distcp to occur. 
> But we noticed a problem almost immediately: when YARN attempted to talk 
> to the remote cluster it was binding outgoing traffic to the cluster network 
> interface, which IS NOT routable. So after researching the code we noticed the 
> following in NetUtils.java and Client.java. 
> Basically, in Client.java it looks as if it takes whatever the hostname is and 
> attempts to bind to whatever that hostname resolves to. This is not valid 
> in a multi-homed network with one routable interface and one non-routable 
> interface. After reading through the java.net.Socket documentation, it is 
> valid to perform socket.bind(null), which will allow the OS routing table and 
> DNS to send the traffic to the correct interface. I will also attach the 
> network traces and a test patch for the 2.7.x and 3.x code base. I have this test 
> fix below in my Hadoop Test Cluster.
> Client.java:
> {code:java}
> /*
>  * Bind the socket to the host specified in the principal name of the
>  * client, to ensure Server matching address of the client connection
>  * to host name in principal passed.
>  */
> InetSocketAddress bindAddr = null;
> if (ticket != null && ticket.hasKerberosCredentials()) {
>   KerberosInfo krbInfo =
>       remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>   if (krbInfo != null) {
>     String principal = ticket.getUserName();
>     String host = SecurityUtil.getHostFromPrincipal(principal);
>     // If host name is a valid local address then bind socket to it
>     InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>     if (localAddr != null) {
>       this.socket.setReuseAddress(true);
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Binding " + principal + " to " + localAddr);
>       }
>       bindAddr = new InetSocketAddress(localAddr, 0);
>     }
>   }
> }
> {code}
>  
> So in my Hadoop 2.7.x Cluster I made the following changes and traffic flows 
> correctly out the correct interfaces:
>  
> diff --git 
> a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
>  
> b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- 
> a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ 
> b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> @@ -305,6 +305,9 @@
>    public static final String  

[jira] [Commented] (HADOOP-15414) Job submit not work well on HDFS Federation with Transparent Encryption feature

2018-04-30 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458962#comment-16458962
 ] 

Daryn Sharp commented on HADOOP-15414:
--

I'm chasing KMS load issues, will check this out soon.  I need to reacquaint 
myself with the history of the design (it took many people a lot of work).  I 
do know one of the intentions is that filesystem subclasses should not be directly 
modifying the credentials.  This patch circumvents that goal.

> Job submit not work well on HDFS Federation with Transparent Encryption 
> feature
> ---
>
> Key: HADOOP-15414
> URL: https://issues.apache.org/jira/browse/HADOOP-15414
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Reporter: He Xiaoqiao
>Priority: Major
> Attachments: HADOOP-15414-trunk.001.patch, 
> HADOOP-15414-trunk.002.patch
>
>
> When submitting the sample MapReduce job WordCount, which reads/writes a path under an 
> encryption zone on HDFS Federation in secure mode, to YARN, the task throws an 
> exception as below:
> {code:java}
> 18/04/26 16:07:26 INFO mapreduce.Job: Task Id : attempt_JOBID_m_TASKID_0, 
> Status : FAILED
> Error: java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)
> at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:489)
> at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:776)
> at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
> at 
> org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1468)
> at 
> org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:1538)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:306)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:300)
> at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:161)
> at 
> org.apache.hadoop.fs.viewfs.ChRootedFileSystem.open(ChRootedFileSystem.java:258)
> at 
> org.apache.hadoop.fs.viewfs.ViewFileSystem.open(ViewFileSystem.java:424)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:793)
> at 
> org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:85)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:552)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:823)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1690)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> Caused by: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)
> at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
> at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
> at 
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:322)
> at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:483)
> at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:478)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1690)
> at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:478)
> ... 21 more
> Caused by: GSSException: No valid credentials provided (Mechanism level: 
> Failed to find 

[jira] [Updated] (HADOOP-15250) Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong

2018-04-30 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HADOOP-15250:

Attachment: HADOOP-15250.00.patch

> Split-DNS MultiHomed Server Network Cluster Network IPC Client Bind Addr Wrong
> --
>
> Key: HADOOP-15250
> URL: https://issues.apache.org/jira/browse/HADOOP-15250
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, net
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
> Environment: Multihome cluster with split DNS and rDNS lookup of 
> localhost returning non-routable IPAddr
>Reporter: Greg Senia
>Priority: Critical
> Attachments: HADOOP-15250.00.patch, HADOOP-15250.patch
>
>
> We run our Hadoop clusters with two networks attached to each node. These 
> networks are as follows: a server network that is firewalled with firewalld, 
> allowing inbound traffic only for SSH and things like Knox, HiveServer2, the 
> HTTP YARN RM/ATS, and the MR History Server; and a cluster network on the 
> second network interface, which uses jumbo frames, has no restrictions, and 
> allows all cluster traffic to flow between nodes. 
>  
> To resolve DNS within the Hadoop cluster we use DNS views via BIND, so if the 
> traffic originates from nodes on the cluster network we return the 
> internal DNS record for the nodes. This all works fine with all the 
> multi-homing features added to Hadoop 2.x.
>  Some logic around views:
> a. The internal view is used by cluster machines when performing lookups. So 
> hosts on the cluster network should get answers from the internal view in DNS
> b. The external view is used by non-local-cluster machines when performing 
> lookups. So hosts not on the cluster network should get answers from the 
> external view in DNS
>  
> So this brings me to our problem. We created some firewall rules to allow 
> inbound traffic from each cluster's server network so that distcp could occur. 
> But we noticed a problem almost immediately: when YARN attempted to talk 
> to the remote cluster, it was binding outgoing traffic to the cluster network 
> interface, which IS NOT routable. After researching the code we noticed the 
> following in NetUtils.java and Client.java. 
> Basically, in Client.java it looks as if it takes whatever the hostname is and 
> attempts to bind to whatever that hostname resolves to. This is not valid 
> in a multi-homed network with one routable interface and one non-routable 
> interface. After reading through the java.net.Socket documentation, it is 
> valid to perform socket.bind(null), which will allow the OS routing table and 
> DNS to send the traffic out the correct interface. I will also attach the 
> network traces and a test patch for the 2.7.x and 3.x code base. I have this test 
> fix below in my Hadoop test cluster.
> Client.java:
>       
> /*
>  * Bind the socket to the host specified in the principal name of the
>  * client, to ensure Server matching address of the client connection
>  * to host name in principal passed.
>  */
> InetSocketAddress bindAddr = null;
> if (ticket != null && ticket.hasKerberosCredentials()) {
>   KerberosInfo krbInfo =
>       remoteId.getProtocol().getAnnotation(KerberosInfo.class);
>   if (krbInfo != null) {
>     String principal = ticket.getUserName();
>     String host = SecurityUtil.getHostFromPrincipal(principal);
>     // If host name is a valid local address then bind socket to it
>     InetAddress localAddr = NetUtils.getLocalInetAddress(host);
>     if (localAddr != null) {
>       this.socket.setReuseAddress(true);
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Binding " + principal + " to " + localAddr);
>       }
>       bindAddr = new InetSocketAddress(localAddr, 0);
>     }
>   }
> }
>  
> So in my Hadoop 2.7.x Cluster I made the following changes and traffic flows 
> correctly out the correct interfaces:
>  
> diff --git 
> a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
>  
> b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> index e1be271..c5b4a42 100644
> --- 
> a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> +++ 
> b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
> @@ -305,6 +305,9 @@
>    public static final String  IPC_CLIENT_FALLBACK_TO_SIMPLE_AUTH_ALLOWED_KEY 
> = "ipc.client.fallback-to-simple-auth-allowed";
>    public static final boolean 
> IPC_CLIENT_FALLBACK_TO_SIMPLE_AUTH_ALLOWED_DEFAULT = false;
>  
> 
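
As a standalone illustration of the binding behaviour described above (a hedged sketch, 
not the attached patch): {{java.net.Socket}} accepts {{bind(null)}}, in which case the 
OS routing table picks the source interface, whereas binding to whatever the local 
hostname resolves to can pin traffic to a non-routable interface on a multi-homed host.
{code:java}
// Sketch only: contrast binding the client socket to the resolved local
// hostname with bind(null), where the kernel routing table chooses the
// outgoing interface for the destination.
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BindDemo {
  public static void main(String[] args) throws Exception {
    String remoteHost = args.length > 0 ? args[0] : "example.org"; // illustrative target
    int remotePort = 443;

    // Option A: bind to whatever the local hostname resolves to. On a
    // multi-homed host this may pin traffic to a non-routable interface.
    try (Socket pinned = new Socket()) {
      InetAddress localAddr = InetAddress.getLocalHost();
      pinned.setReuseAddress(true);
      pinned.bind(new InetSocketAddress(localAddr, 0));
      pinned.connect(new InetSocketAddress(remoteHost, remotePort), 10_000);
      System.out.println("pinned local address: " + pinned.getLocalSocketAddress());
    }

    // Option B: bind(null); the javadoc allows it, and the OS then selects a
    // valid local address and ephemeral port for the route to the remote host.
    try (Socket routed = new Socket()) {
      routed.setReuseAddress(true);
      routed.bind(null);
      routed.connect(new InetSocketAddress(remoteHost, remotePort), 10_000);
      System.out.println("routed local address: " + routed.getLocalSocketAddress());
    }
  }
}
{code}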

[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458951#comment-16458951
 ] 

Steve Loughran commented on HADOOP-15430:
-

And with some checking of the query before it is launched, you get:
{code}
java.lang.IllegalStateException: Empty string value for attribute child
at 
com.google.common.base.Preconditions.checkState(Preconditions.java:172)
at 
org.apache.hadoop.fs.s3a.s3guard.PathMetadataDynamoDBTranslation.pathToKey(PathMetadataDynamoDBTranslation.java:284)
at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.innerGet(DynamoDBMetadataStore.java:459)
at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.lambda$get$2(DynamoDBMetadataStore.java:439)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.get(DynamoDBMetadataStore.java:438)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2110)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:2037)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:2007)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2326)
at 
org.apache.hadoop.fs.shell.Mkdir.processNonexistentPath(Mkdir.java:77)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:288)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:270)
at 
org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:120)
at org.apache.hadoop.fs.shell.Command.run(Command.java:177)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
{code}
I can't seem to recreate it in a test case though, only on the CLI
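
For illustration, a minimal sketch of the kind of pre-flight check the stack above points 
at (the names are hypothetical, not the S3Guard translation code): reject an empty child 
component before any DynamoDB call, since DynamoDB rejects empty-string attribute values 
with the ValidationException shown in the earlier comment.
{code:java}
// Sketch only: fail fast on an empty key component instead of sending the
// request to DynamoDB and getting a 400 ValidationException back.
import com.google.common.base.Preconditions;

public final class KeyGuardSketch {
  static void checkKeyComponents(String parent, String child) {
    Preconditions.checkState(parent != null && !parent.isEmpty(),
        "Empty string value for attribute parent");
    Preconditions.checkState(child != null && !child.isEmpty(),
        "Empty string value for attribute child");
  }

  public static void main(String[] args) {
    checkKeyComponents("/bucket/output", "5");   // passes
    checkKeyComponents("/bucket/output/5", "");  // throws IllegalStateException locally
  }
}
{code}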

> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Blocker
>
> if you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15408) HADOOP-14445 broke Spark.

2018-04-30 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458911#comment-16458911
 ] 

Rushabh S Shah commented on HADOOP-15408:
-

bq. In other words, even though the bytes are the same for both kinds, 
Token#decodeIdentifier() is broken by patch 1,
Trying to understand: why do we want to decode the TokenIdentifier?

{quote}
In other words, even though the bytes are the same for both kinds, 
Token#decodeIdentifier() is broken by patch 1, because service loader cannot 
find a class to decode kms-dt.
{quote}
I assume you are saying this because of the stack trace that I pasted in the 
description.
It failed in our build because we had {{KMSLegacyDelegationTokenIdentifier}} in the 
{{hadoop-common-project/hadoop-common/src/main/resources/META-INF/services/org.apache.hadoop.security.token.TokenIdentifier}}
 file and the service loader was unable to find 
{{KMSLegacyDelegationTokenIdentifier}}.
But after removing {{KMSLegacyDelegationTokenIdentifier}} from 
{{hadoop-common-project/hadoop-common/src/main/resources/META-INF/services/org.apache.hadoop.security.token.TokenIdentifier}},
 the job runs successfully.
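
For illustration, a simplified sketch of the lookup that {{Token#decodeIdentifier()}} 
relies on (not the actual Hadoop implementation): the service loader instantiates every 
{{TokenIdentifier}} registered under META-INF/services and matches on {{getKind()}}, so a 
kind with no registered provider, e.g. {{kms-dt}} in the scenario above, cannot be decoded.
{code:java}
// Sketch only: resolve a token kind to an identifier class via ServiceLoader,
// mirroring in simplified form why a registered provider is needed for decoding.
import java.util.ServiceLoader;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.TokenIdentifier;

public final class IdentifierLookupSketch {
  public static Class<? extends TokenIdentifier> classForKind(Text kind) {
    for (TokenIdentifier candidate : ServiceLoader.load(TokenIdentifier.class)) {
      if (candidate.getKind().equals(kind)) {
        return candidate.getClass();
      }
    }
    return null; // no provider registered for this kind, so it cannot be decoded
  }
}
{code}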

> HADOOP-14445 broke Spark.
> -
>
> Key: HADOOP-15408
> URL: https://issues.apache.org/jira/browse/HADOOP-15408
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Rushabh S Shah
>Priority: Blocker
> Attachments: HADOOP-15408-trunk.001.patch, split.patch, 
> split.prelim.patch
>
>
> Spark bundles hadoop related jars in their package.
>  Spark expects backwards compatibility between minor versions.
>  Their job failed after we deployed HADOOP-14445 in our test cluster.
> {noformat}
> 2018-04-20 21:09:53,245 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2018-04-20 21:09:53,273 ERROR [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.token.TokenIdentifier: Provider 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$
> KMSLegacyDelegationTokenIdentifier could not be instantiated
> at java.util.ServiceLoader.fail(ServiceLoader.java:232)
> at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at 
> org.apache.hadoop.security.token.Token.getClassForIdentifier(Token.java:117)
> at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:138)
> at org.apache.hadoop.security.token.Token.identifierToString(Token.java:393)
> at org.apache.hadoop.security.token.Token.toString(Token.java:413)
> at java.lang.String.valueOf(String.java:2994)
> at 
> org.apache.commons.logging.impl.SLF4JLocationAwareLog.info(SLF4JLocationAwareLog.java:155)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1634)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1583)
> Caused by: java.lang.NoSuchFieldError: TOKEN_LEGACY_KIND
> at 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$KMSLegacyDelegationTokenIdentifier.(KMSDelegationToken.java:64)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at java.lang.Class.newInstance(Class.java:442)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> ... 10 more
> 2018-04-20 21:09:53,278 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1
> {noformat}
> Their classpath looks like 
> {{\{...:hadoop-common-pre-HADOOP-14445.jar:.:hadoop-common-with-HADOOP-14445.jar:\}}}
> This is because the container loaded the {{KMSDelegationToken}} class from an 
> older jar and {{KMSLegacyDelegationTokenIdentifier}} from the new jar, and it 
> fails when {{KMSLegacyDelegationTokenIdentifier}} tries to read 
> {{TOKEN_LEGACY_KIND}} from {{KMSDelegationToken}}, which doesn't exist in the older class.
>  Cc [~xiaochen]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15421) Stabilise/formalise the JSON _SUCCESS format used in the S3A committers

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458869#comment-16458869
 ] 

Steve Loughran edited comment on HADOOP-15421 at 4/30/18 6:48 PM:
--

Currently
{code}
{
  "name" : "org.apache.hadoop.fs.s3a.commit.files.SuccessData/1",
  "timestamp" : 1525099686641,
  "date" : "Mon Apr 30 14:48:06 UTC 2018",
  "hostname" : "stevel",
  "committer" : "directory",
  "description" : "Task committer attempt_1525098749694_0003_m_00_0",
  "metrics" : {
"stream_write_block_uploads" : 0,
"files_created" : 0,
"S3guard_metadatastore_put_path_latencyNumOps" : 0,
"stream_write_block_uploads_aborted" : 0,
"committer_commits_reverted" : 0,
"op_open" : 0,
"stream_closed" : 0,
"committer_magic_files_created" : 0,
"object_copy_requests" : 0,
"s3guard_metadatastore_initialization" : 1,
"S3guard_metadatastore_put_path_latency90thPercentileLatency" : 0,
"stream_write_block_uploads_committed" : 0,
"S3guard_metadatastore_throttle_rate75thPercentileFrequency (Hz)" : 0,
"S3guard_metadatastore_throttle_rate90thPercentileFrequency (Hz)" : 0,
"committer_bytes_committed" : 5017,
"op_create" : 0,
"stream_read_fully_operations" : 0,
"committer_commits_completed" : 1,
"object_put_requests_active" : 0,
"s3guard_metadatastore_retry" : 0,
"stream_write_block_uploads_active" : 0,
"stream_opened" : 0,
"S3guard_metadatastore_throttle_rate95thPercentileFrequency (Hz)" : 0,
"op_create_non_recursive" : 0,
"object_continue_list_requests" : 0,
"committer_jobs_completed" : 1,
"S3guard_metadatastore_put_path_latency50thPercentileLatency" : 0,
"stream_close_operations" : 0,
"stream_read_operations" : 0,
"object_delete_requests" : 1,
"fake_directories_deleted" : 4,
"stream_aborted" : 0,
"op_rename" : 0,
"object_multipart_aborted" : 0,
"committer_commits_created" : 0,
"op_get_file_status" : 2,
"s3guard_metadatastore_put_path_request" : 1,
"committer_commits_failed" : 0,
"stream_bytes_read_in_close" : 0,
"op_glob_status" : 0,
"stream_read_exceptions" : 0,
"op_exists" : 2,
"S3guard_metadatastore_throttle_rate50thPercentileFrequency (Hz)" : 0,
"S3guard_metadatastore_put_path_latency95thPercentileLatency" : 0,
"stream_write_block_uploads_pending" : 0,
"directories_created" : 0,
"S3guard_metadatastore_throttle_rateNumEvents" : 0,
"S3guard_metadatastore_put_path_latency99thPercentileLatency" : 0,
"stream_bytes_backwards_on_seek" : 0,
"stream_bytes_read" : 0,
"stream_write_total_data" : 0,
"committer_jobs_failed" : 0,
"stream_read_operations_incomplete" : 0,
"files_copied_bytes" : 0,
"op_delete" : 0,
"object_put_bytes_pending" : 0,
"stream_write_block_uploads_data_pending" : 0,
"op_list_located_status" : 0,
"object_list_requests" : 2,
"stream_forward_seek_operations" : 0,
"committer_tasks_completed" : 0,
"committer_commits_aborted" : 0,
"object_metadata_requests" : 4,
"object_put_requests_completed" : 0,
"stream_seek_operations" : 0,
"op_list_status" : 0,
"store_io_throttled" : 0,
"stream_write_failures" : 0,
"op_get_file_checksum" : 0,
"files_copied" : 0,
"ignored_errors" : 0,
"committer_bytes_uploaded" : 0,
"committer_tasks_failed" : 0,
"stream_bytes_skipped_on_seek" : 0,
"op_list_files" : 0,
"files_deleted" : 0,
"stream_bytes_discarded_in_abort" : 0,
"op_mkdirs" : 0,
"op_copy_from_local_file" : 0,
"op_is_directory" : 0,
"s3guard_metadatastore_throttled" : 0,
"S3guard_metadatastore_put_path_latency75thPercentileLatency" : 0,
"stream_write_total_time" : 0,
"stream_backward_seek_operations" : 0,
"object_put_requests" : 0,
"object_put_bytes" : 0,
"directories_deleted" : 0,
"op_is_file" : 0,
"S3guard_metadatastore_throttle_rate99thPercentileFrequency (Hz)" : 0
  },
  "diagnostics" : {
"fs.s3a.metadatastore.impl" : 
"org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore",
"fs.s3a.committer.magic.enabled" : "true",
"fs.s3a.metadatastore.authoritative" : "false"
  },
  "filenames" : [ "/hwdev-steve-ireland/mr_job_dir/output/part-r-0" ]
}
{code}


was (Author: ste...@apache.org):
Currently
{code}
{
  "name" : "org.apache.hadoop.fs.s3a.commit.files.SuccessData/1",
  "timestamp" : 1525099686641,
  "date" : "Mon Apr 30 14:48:06 UTC 2018",
  "hostname" : "stevel",
  "committer" : "directory",
  "description" : "Task committer attempt_1525098749694_0003_m_00_0",
  "metrics" : {
"stream_write_block_uploads" : 0,
"files_created" : 0,
"S3guard_metadatastore_put_path_latencyNumOps" : 0,
"stream_write_block_uploads_aborted" : 0,
"committer_commits_reverted" : 0,
"op_open" : 0,
"stream_closed" : 0,
"committer_magic_files_created" 

[jira] [Commented] (HADOOP-15421) Stabilise/formalise the JSON _SUCCESS format used in the S3A committers

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458869#comment-16458869
 ] 

Steve Loughran commented on HADOOP-15421:
-

Currently
{code}
{
  "name" : "org.apache.hadoop.fs.s3a.commit.files.SuccessData/1",
  "timestamp" : 1525099686641,
  "date" : "Mon Apr 30 14:48:06 UTC 2018",
  "hostname" : "stevel",
  "committer" : "directory",
  "description" : "Task committer attempt_1525098749694_0003_m_00_0",
  "metrics" : {
"stream_write_block_uploads" : 0,
"files_created" : 0,
"S3guard_metadatastore_put_path_latencyNumOps" : 0,
"stream_write_block_uploads_aborted" : 0,
"committer_commits_reverted" : 0,
"op_open" : 0,
"stream_closed" : 0,
"committer_magic_files_created" : 0,
"object_copy_requests" : 0,
"s3guard_metadatastore_initialization" : 1,
"S3guard_metadatastore_put_path_latency90thPercentileLatency" : 0,
"stream_write_block_uploads_committed" : 0,
"S3guard_metadatastore_throttle_rate75thPercentileFrequency (Hz)" : 0,
"S3guard_metadatastore_throttle_rate90thPercentileFrequency (Hz)" : 0,
"committer_bytes_committed" : 5017,
"op_create" : 0,
"stream_read_fully_operations" : 0,
"committer_commits_completed" : 1,
"object_put_requests_active" : 0,
"s3guard_metadatastore_retry" : 0,
"stream_write_block_uploads_active" : 0,
"stream_opened" : 0,
"S3guard_metadatastore_throttle_rate95thPercentileFrequency (Hz)" : 0,
"op_create_non_recursive" : 0,
"object_continue_list_requests" : 0,
"committer_jobs_completed" : 1,
"S3guard_metadatastore_put_path_latency50thPercentileLatency" : 0,
"stream_close_operations" : 0,
"stream_read_operations" : 0,
"object_delete_requests" : 1,
"fake_directories_deleted" : 4,
"stream_aborted" : 0,
"op_rename" : 0,
"object_multipart_aborted" : 0,
"committer_commits_created" : 0,
"op_get_file_status" : 2,
"s3guard_metadatastore_put_path_request" : 1,
"committer_commits_failed" : 0,
"stream_bytes_read_in_close" : 0,
"op_glob_status" : 0,
"stream_read_exceptions" : 0,
"op_exists" : 2,
"S3guard_metadatastore_throttle_rate50thPercentileFrequency (Hz)" : 0,
"S3guard_metadatastore_put_path_latency95thPercentileLatency" : 0,
"stream_write_block_uploads_pending" : 0,
"directories_created" : 0,
"S3guard_metadatastore_throttle_rateNumEvents" : 0,
"S3guard_metadatastore_put_path_latency99thPercentileLatency" : 0,
"stream_bytes_backwards_on_seek" : 0,
"stream_bytes_read" : 0,
"stream_write_total_data" : 0,
"committer_jobs_failed" : 0,
"stream_read_operations_incomplete" : 0,
"files_copied_bytes" : 0,
"op_delete" : 0,
"object_put_bytes_pending" : 0,
"stream_write_block_uploads_data_pending" : 0,
"op_list_located_status" : 0,
"object_list_requests" : 2,
"stream_forward_seek_operations" : 0,
"committer_tasks_completed" : 0,
"committer_commits_aborted" : 0,
"object_metadata_requests" : 4,
"object_put_requests_completed" : 0,
"stream_seek_operations" : 0,
"op_list_status" : 0,
"store_io_throttled" : 0,
"stream_write_failures" : 0,
"op_get_file_checksum" : 0,
"files_copied" : 0,
"ignored_errors" : 0,
"committer_bytes_uploaded" : 0,
"committer_tasks_failed" : 0,
"stream_bytes_skipped_on_seek" : 0,
"op_list_files" : 0,
"files_deleted" : 0,
"stream_bytes_discarded_in_abort" : 0,
"op_mkdirs" : 0,
"op_copy_from_local_file" : 0,
"op_is_directory" : 0,
"s3guard_metadatastore_throttled" : 0,
"S3guard_metadatastore_put_path_latency75thPercentileLatency" : 0,
"stream_write_total_time" : 0,
"stream_backward_seek_operations" : 0,
"object_put_requests" : 0,
"object_put_bytes" : 0,
"directories_deleted" : 0,
"op_is_file" : 0,
"S3guard_metadatastore_throttle_rate99thPercentileFrequency (Hz)" : 0
  },
  "diagnostics" : {
"fs.s3a.metadatastore.impl" : 
"org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore",
"fs.s3a.committer.magic.enabled" : "true",
"fs.s3a.metadatastore.authoritative" : "false"
  },
  "filenames" : [ "/hwdev-steve-ireland/mr_job_dir/output/part-r-0" ]
}
{code}
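
As a rough illustration of how a downstream test could consume the summary above 
(assuming the layout shown here; the field set is exactly what this JIRA proposes to 
stabilise), the file can be read generically with Jackson rather than binding to a 
specific Hadoop class:
{code:java}
// Sketch only: open the _SUCCESS file through the FileSystem API and pull a
// couple of fields out of the JSON tree.
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SuccessFileCheck {
  public static void main(String[] args) throws Exception {
    Path success = new Path(args[0]);              // e.g. s3a://bucket/output/_SUCCESS
    FileSystem fs = success.getFileSystem(new Configuration());
    try (InputStream in = fs.open(success)) {
      JsonNode root = new ObjectMapper().readTree(in);
      System.out.println("committer: " + root.path("committer").asText());
      root.path("filenames").forEach(f -> System.out.println("wrote: " + f.asText()));
    }
  }
}
{code}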

> Stabilise/formalise the JSON _SUCCESS format used in the S3A committers
> ---
>
> Key: HADOOP-15421
> URL: https://issues.apache.org/jira/browse/HADOOP-15421
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Priority: Major
>
> the S3A committers rely on an atomic PUT to save a JSON summary of the job to 
> the dest FS, containing files, statistics, etc. This is for internal testing, 
> but it turns out to be useful for spark integration testing, Hive, etc.
> IBM's 

[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458861#comment-16458861
 ] 

Steve Loughran commented on HADOOP-15430:
-

And with a patch to fs.shell to log stack traces, you get the stack
{code}
AttributeValue may not contain an empty string (Service: AmazonDynamoDBv2; 
Status Code: 400; Error Code: ValidationException; Request ID: 
0NGJDAHOTK2A51KRR5MVVVTHPFVV4KQNSO5AEMVJF66Q9ASUAAJG): One or more parameter 
values were invalid: An AttributeValue may not contain an empty string 
(Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; 
Request ID: 0NGJDAHOTK2A51KRR5MVVVTHPFVV4KQNSO5AEMVJF66Q9ASUAAJG)
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateDynamoDBException(S3AUtils.java:389)
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:181)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.get(DynamoDBMetadataStore.java:438)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2110)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:2037)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:2007)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2326)
at 
org.apache.hadoop.fs.shell.Mkdir.processNonexistentPath(Mkdir.java:77)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:288)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:270)
at 
org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:120)
at org.apache.hadoop.fs.shell.Command.run(Command.java:177)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
Caused by: com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: One 
or more parameter values were invalid: An AttributeValue may not contain an 
empty string (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: 
ValidationException; Request ID: 
0NGJDAHOTK2A51KRR5MVVVTHPFVV4KQNSO5AEMVJF66Q9ASUAAJG)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:2925)
at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:2901)
at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeGetItem(AmazonDynamoDBClient.java:1640)
at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.getItem(AmazonDynamoDBClient.java:1616)
at 
com.amazonaws.services.dynamodbv2.document.internal.GetItemImpl.doLoadItem(GetItemImpl.java:77)
at 
com.amazonaws.services.dynamodbv2.document.internal.GetItemImpl.getItem(GetItemImpl.java:66)
at 
com.amazonaws.services.dynamodbv2.document.Table.getItem(Table.java:608)
at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.getConsistentItem(DynamoDBMetadataStore.java:423)
at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.innerGet(DynamoDBMetadataStore.java:459)
at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.lambda$get$2(DynamoDBMetadataStore.java:439)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
... 15 more
mkdir: get on s3a://bucket/output/5/: 
com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: One or more 
parameter values were invalid: An AttributeValue may not contain an empty 
string (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: 
ValidationException; Request ID: 
0NGJDAHOTK2A51KRR5MVVVTHPFVV4KQNSO5AEMVJF66Q9ASUAAJG): One or more parameter 
values were invalid: An 

[jira] [Commented] (HADOOP-15429) unsynchronized index causes DataInputByteBuffer$Buffer.read hangs

2018-04-30 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458852#comment-16458852
 ] 

Chris Douglas commented on HADOOP-15429:


{{DataInputByteBuffer}} is not threadsafe. What is the use case?
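
If a caller really does share one buffer between a thread that refills it and a thread 
that reads it, the guarding has to happen in the caller. A minimal sketch under that 
assumed usage (which the class itself does not claim to support):
{code:java}
// Sketch only: external locking around reset() and read() when two threads
// share a single DataInputByteBuffer.
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.io.DataInputByteBuffer;

public class SynchronizedBufferUser {
  private final DataInputByteBuffer input = new DataInputByteBuffer();
  private final Object lock = new Object();

  public void refill(ByteBuffer[] buffers) {
    synchronized (lock) {          // writer thread
      input.reset(buffers);
    }
  }

  public int readChunk(byte[] dst) throws IOException {
    synchronized (lock) {          // reader thread
      return input.read(dst, 0, dst.length);
    }
  }
}
{code}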

> unsynchronized index causes DataInputByteBuffer$Buffer.read hangs
> -
>
> Key: HADOOP-15429
> URL: https://issues.apache.org/jira/browse/HADOOP-15429
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.23.0
>Reporter: John Doe
>Priority: Minor
>
> In the DataInputByteBuffer$Buffer class, the fields bidx, buffers, etc. are 
> not synchronized when used in the read() and reset() functions. In certain 
> circumstances, e.g., when reset() is invoked in a loop, the unsynchronized 
> bidx and buffers can trigger a concurrency bug.
> Here is the code snippet.
> {code:java}
> ByteBuffer[] buffers = new ByteBuffer[0];
> int bidx, pos, length;
> @Override
> public int read(byte[] b, int off, int len) {
>   if (bidx >= buffers.length) {
> return -1;
>   }
>   int cur = 0;
>   do {
> int rem = Math.min(len, buffers[bidx].remaining());
> buffers[bidx].get(b, off, rem);
> cur += rem;
> off += rem;
> len -= rem;
>   } while (len > 0 && ++bidx < buffers.length); //bidx is unsynchronized
>   pos += cur;
>   return cur;
> }
> public void reset(ByteBuffer[] buffers) {//if one thread keeps calling 
> reset() in a loop
>   bidx = pos = length = 0;
>   this.buffers = buffers;
>   for (ByteBuffer b : buffers) {
> length += b.remaining();
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15408) HADOOP-14445 broke Spark.

2018-04-30 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458842#comment-16458842
 ] 

Xiao Chen commented on HADOOP-15408:


Thanks for the explanation, that feels better. :)

But as patch 1 stands, the test shows exactly how kms-dt won't work with the 
token identifier, right?
In other words, even though the bytes are the same for both kinds, 
{{Token#decodeIdentifier()}} is broken by patch 1, because the service loader 
cannot find a class to decode {{kms-dt}}.

> HADOOP-14445 broke Spark.
> -
>
> Key: HADOOP-15408
> URL: https://issues.apache.org/jira/browse/HADOOP-15408
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Rushabh S Shah
>Priority: Blocker
> Attachments: HADOOP-15408-trunk.001.patch, split.patch, 
> split.prelim.patch
>
>
> Spark bundles hadoop related jars in their package.
>  Spark expects backwards compatibility between minor versions.
>  Their job failed after we deployed HADOOP-14445 in our test cluster.
> {noformat}
> 2018-04-20 21:09:53,245 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2018-04-20 21:09:53,273 ERROR [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.token.TokenIdentifier: Provider 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$
> KMSLegacyDelegationTokenIdentifier could not be instantiated
> at java.util.ServiceLoader.fail(ServiceLoader.java:232)
> at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at 
> org.apache.hadoop.security.token.Token.getClassForIdentifier(Token.java:117)
> at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:138)
> at org.apache.hadoop.security.token.Token.identifierToString(Token.java:393)
> at org.apache.hadoop.security.token.Token.toString(Token.java:413)
> at java.lang.String.valueOf(String.java:2994)
> at 
> org.apache.commons.logging.impl.SLF4JLocationAwareLog.info(SLF4JLocationAwareLog.java:155)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1634)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1583)
> Caused by: java.lang.NoSuchFieldError: TOKEN_LEGACY_KIND
> at 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$KMSLegacyDelegationTokenIdentifier.(KMSDelegationToken.java:64)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at java.lang.Class.newInstance(Class.java:442)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> ... 10 more
> 2018-04-20 21:09:53,278 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1
> {noformat}
> Their classpath looks like 
> {{\{...:hadoop-common-pre-HADOOP-14445.jar:.:hadoop-common-with-HADOOP-14445.jar:\}}}
> This is because the container loaded the {{KMSDelegationToken}} class from an 
> older jar and {{KMSLegacyDelegationTokenIdentifier}} from the new jar, and it 
> fails when {{KMSLegacyDelegationTokenIdentifier}} tries to read 
> {{TOKEN_LEGACY_KIND}} from {{KMSDelegationToken}}, which doesn't exist in the older class.
>  Cc [~xiaochen]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458834#comment-16458834
 ] 

Steve Loughran commented on HADOOP-15430:
-

Log (but not the full stack). We have some tests for this at the FS level; maybe 
the shell is doing something different.
{code}
18/04/30 18:16:56 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://bucket/terasort/output/3  (qe/terasort/output/3)
18/04/30 18:16:56 DEBUG s3guard.DynamoDBMetadataStore: Get from table 
hwdev-steve-new in region us-west-1: s3a://bucket/terasort/output/3
18/04/30 18:16:56 DEBUG s3guard.DynamoDBMetadataStore: Get from table 
hwdev-steve-new in region us-west-1 returning for 
s3a://bucket/terasort/output/3: null
18/04/30 18:16:56 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 1 
 ->  1
18/04/30 18:16:56 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 1 
 ->  2
18/04/30 18:16:56 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1  -> 
 1
18/04/30 18:16:56 DEBUG s3a.S3AFileSystem: Not Found: 
s3a://bucket/terasort/output/3
18/04/30 18:16:56 DEBUG s3a.S3AStorageStatistics: op_exists += 1  ->  1
18/04/30 18:16:56 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1  ->  2
18/04/30 18:16:56 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://bucket/terasort/output  (qe/terasort/output)
18/04/30 18:16:56 DEBUG s3guard.DynamoDBMetadataStore: Get from table 
hwdev-steve-new in region us-west-1: s3a://bucket/terasort/output
18/04/30 18:16:57 DEBUG s3guard.DynamoDBMetadataStore: Get from table 
hwdev-steve-new in region us-west-1 returning for s3a://bucket/terasort/output: 
PathMetadata{fileStatus=FileStatus{path=s3a://bucket/terasort/output; 
isDirectory=true; modification_time=1525112217001; access_time=0; owner=hrt_qa; 
group=hrt_qa; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; 
isEncrypted=false; isErasureCoded=false}; isEmptyDirectory=UNKNOWN; 
isDeleted=false}
18/04/30 18:16:57 DEBUG s3a.S3AFileSystem: Making directory: 
s3a://bucket/terasort/output/3/
18/04/30 18:16:57 DEBUG s3a.S3AStorageStatistics: op_mkdirs += 1  ->  1
18/04/30 18:16:57 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1  ->  3
18/04/30 18:16:57 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://bucket/terasort/output/3/  (qe/terasort/output/3/)
18/04/30 18:16:57 DEBUG s3guard.DynamoDBMetadataStore: Get from table 
hwdev-steve-new in region us-west-1: s3a://bucket/terasort/output/3/
mkdir: get on s3a://bucket/terasort/output/3/: 
com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: One or more 
parameter values were invalid: An AttributeValue may not contain an empty 
string (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: 
ValidationException; Request ID: 
6M9S1ALKKSJ5AJV5A9EKDPROFBVV4KQNSO5AEMVJF66Q9ASUAAJG): One or more parameter 
values were invalid: An AttributeValue may not contain an empty string 
(Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; 
Request ID: 6M9S1ALKKSJ5AJV5A9EKDPROFBVV4KQNSO5AEMVJF66Q9ASUAAJG)
18/04/30 18:16:57 DEBUG s3a.S3AFileSystem: Filesystem s3a://hwdev-steve-new is 
closed
{code}


> hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard
> --
>
> Key: HADOOP-15430
> URL: https://issues.apache.org/jira/browse/HADOOP-15430
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Blocker
>
> if you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
> ending in "/", you get a DDB error "An AttributeValue may not contain an 
> empty string"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15430) hadoop fs -mkdir -p path-ending-with-slash/ fails with s3guard

2018-04-30 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-15430:
---

 Summary: hadoop fs -mkdir -p path-ending-with-slash/ fails with 
s3guard
 Key: HADOOP-15430
 URL: https://issues.apache.org/jira/browse/HADOOP-15430
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.1.0
Reporter: Steve Loughran


if you call {{hadoop fs -mkdir -p path/}} on the command line with a path 
ending in "/", you get a DDB error "An AttributeValue may not contain an empty 
string"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15408) HADOOP-14445 broke Spark.

2018-04-30 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458803#comment-16458803
 ] 

Rushabh S Shah commented on HADOOP-15408:
-

bq. so I assumed that was the st you saw...
Yes.

bq. patch 1 fixed the spark issue
Yes.
Just for clarification, patch 1 =  HADOOP-15408-trunk.001.patch

bq. patch 1 does not work with kms-dt
It does work with kms-dt. Just for reference, internally we only use {{kms-dt}}, 
since we have had the URI in the token service since the beginning of EZ.

bq. split.patch does not fix the spark issue
I haven't tested internally with {{split.prelim.patch}}

> HADOOP-14445 broke Spark.
> -
>
> Key: HADOOP-15408
> URL: https://issues.apache.org/jira/browse/HADOOP-15408
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Rushabh S Shah
>Priority: Blocker
> Attachments: HADOOP-15408-trunk.001.patch, split.patch, 
> split.prelim.patch
>
>
> Spark bundles hadoop related jars in their package.
>  Spark expects backwards compatibility between minor versions.
>  Their job failed after we deployed HADOOP-14445 in our test cluster.
> {noformat}
> 2018-04-20 21:09:53,245 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2018-04-20 21:09:53,273 ERROR [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.token.TokenIdentifier: Provider 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$
> KMSLegacyDelegationTokenIdentifier could not be instantiated
> at java.util.ServiceLoader.fail(ServiceLoader.java:232)
> at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at 
> org.apache.hadoop.security.token.Token.getClassForIdentifier(Token.java:117)
> at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:138)
> at org.apache.hadoop.security.token.Token.identifierToString(Token.java:393)
> at org.apache.hadoop.security.token.Token.toString(Token.java:413)
> at java.lang.String.valueOf(String.java:2994)
> at 
> org.apache.commons.logging.impl.SLF4JLocationAwareLog.info(SLF4JLocationAwareLog.java:155)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1634)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1583)
> Caused by: java.lang.NoSuchFieldError: TOKEN_LEGACY_KIND
> at 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$KMSLegacyDelegationTokenIdentifier.(KMSDelegationToken.java:64)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at java.lang.Class.newInstance(Class.java:442)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> ... 10 more
> 2018-04-20 21:09:53,278 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1
> {noformat}
> Their classpath looks like 
> {{\{...:hadoop-common-pre-HADOOP-14445.jar:.:hadoop-common-with-HADOOP-14445.jar:\}}}
> This is because the container loaded the {{KMSDelegationToken}} class from an 
> older jar and {{KMSLegacyDelegationTokenIdentifier}} from the new jar, and it 
> fails when {{KMSLegacyDelegationTokenIdentifier}} tries to read 
> {{TOKEN_LEGACY_KIND}} from {{KMSDelegationToken}}, which doesn't exist in the older class.
>  Cc [~xiaochen]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15421) Stabilise/formalise the JSON _SUCCESS format used in the S3A committers

2018-04-30 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458794#comment-16458794
 ] 

Ryan Blue commented on HADOOP-15421:


I think this makes sense. As long as there is a _SUCCESS file, it may as well 
be used to pass the scope, i.e., which files were successful. What 
statistics/metrics are you adding to the file?

> Stabilise/formalise the JSON _SUCCESS format used in the S3A committers
> ---
>
> Key: HADOOP-15421
> URL: https://issues.apache.org/jira/browse/HADOOP-15421
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Priority: Major
>
> the S3A committers rely on an atomic PUT to save a JSON summary of the job to 
> the dest FS, containing files, statistics, etc. This is for internal testing, 
> but it turns out to be useful for spark integration testing, Hive, etc.
> IBM's Stocator also generated a manifest.
> Proposed: come up with an (extensible) design that we are happy with as a 
> long-lived format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15429) unsynchronized index causes DataInputByteBuffer$Buffer.read hangs

2018-04-30 Thread John Doe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Doe updated HADOOP-15429:
--
Description: 
In the DataInputByteBuffer$Buffer class, the fields bidx, buffers, etc. are 
not synchronized when used in the read() and reset() functions. In certain 
circumstances, e.g., when reset() is invoked in a loop, the unsynchronized bidx 
and buffers can trigger a concurrency bug.
Here is the code snippet.
{code:java}
ByteBuffer[] buffers = new ByteBuffer[0];
int bidx, pos, length;

@Override
public int read(byte[] b, int off, int len) {
  if (bidx >= buffers.length) {
return -1;
  }
  int cur = 0;
  do {
int rem = Math.min(len, buffers[bidx].remaining());
buffers[bidx].get(b, off, rem);
cur += rem;
off += rem;
len -= rem;
  } while (len > 0 && ++bidx < buffers.length); //bidx is unsynchronized
  pos += cur;
  return cur;
}

public void reset(ByteBuffer[] buffers) {//if one thread keeps calling 
reset() in a loop
  bidx = pos = length = 0;
  this.buffers = buffers;
  for (ByteBuffer b : buffers) {
length += b.remaining();
  }
}
{code}

  was:
In DataInputByteBuffer$Buffer class, the fields bidx and buffers, etc are 
unsynchronized when used in read() and reset() function. In certain 
circumstances, e.g., the reset() is invoked in a loop, the unsynchronized bidx 
and buffers triggers a concurrency bug.
Here is the code snippet.

{code:java}
ByteBuffer[] buffers = new ByteBuffer[0];
int bidx, pos, length;

@Override
public int read(byte[] b, int off, int len) {
  if (bidx >= buffers.length) {
return -1;
  }
  int cur = 0;
  do {
int rem = Math.min(len, buffers[bidx].remaining());
buffers[bidx].get(b, off, rem);
cur += rem;
off += rem;
len -= rem;
  } while (len > 0 && ++bidx < buffers.length); //bidx is unsynchronized
  pos += cur;
  return cur;
}

public void reset(ByteBuffer[] buffers) {//if one thread keeps calling 
reset() in a loop
  bidx = pos = length = 0;
  this.buffers = buffers;
  for (ByteBuffer b : buffers) {
length += b.remaining();
  }
}
{code}



> unsynchronized index causes DataInputByteBuffer$Buffer.read hangs
> -
>
> Key: HADOOP-15429
> URL: https://issues.apache.org/jira/browse/HADOOP-15429
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.23.0
>Reporter: John Doe
>Priority: Minor
>
> In the DataInputByteBuffer$Buffer class, the fields bidx, buffers, etc. are 
> not synchronized when used in the read() and reset() functions. In certain 
> circumstances, e.g., when reset() is invoked in a loop, the unsynchronized 
> bidx and buffers can trigger a concurrency bug.
> Here is the code snippet.
> {code:java}
> ByteBuffer[] buffers = new ByteBuffer[0];
> int bidx, pos, length;
> @Override
> public int read(byte[] b, int off, int len) {
>   if (bidx >= buffers.length) {
> return -1;
>   }
>   int cur = 0;
>   do {
> int rem = Math.min(len, buffers[bidx].remaining());
> buffers[bidx].get(b, off, rem);
> cur += rem;
> off += rem;
> len -= rem;
>   } while (len > 0 && ++bidx < buffers.length); //bidx is unsynchronized
>   pos += cur;
>   return cur;
> }
> public void reset(ByteBuffer[] buffers) {//if one thread keeps calling 
> reset() in a loop
>   bidx = pos = length = 0;
>   this.buffers = buffers;
>   for (ByteBuffer b : buffers) {
> length += b.remaining();
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15429) unsynchronized index causes DataInputByteBuffer$Buffer.read hangs

2018-04-30 Thread John Doe (JIRA)
John Doe created HADOOP-15429:
-

 Summary: unsynchronized index causes 
DataInputByteBuffer$Buffer.read hangs
 Key: HADOOP-15429
 URL: https://issues.apache.org/jira/browse/HADOOP-15429
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.23.0
Reporter: John Doe


In the DataInputByteBuffer$Buffer class, the fields bidx, buffers, etc. are 
not synchronized when used in the read() and reset() functions. In certain 
circumstances, e.g., when reset() is invoked in a loop, the unsynchronized bidx 
and buffers trigger a concurrency bug.
Here is the code snippet.

{code:java}
ByteBuffer[] buffers = new ByteBuffer[0];
int bidx, pos, length;

@Override
public int read(byte[] b, int off, int len) {
  if (bidx >= buffers.length) {
return -1;
  }
  int cur = 0;
  do {
int rem = Math.min(len, buffers[bidx].remaining());
buffers[bidx].get(b, off, rem);
cur += rem;
off += rem;
len -= rem;
  } while (len > 0 && ++bidx < buffers.length); //bidx is unsynchronized
  pos += cur;
  return cur;
}

public void reset(ByteBuffer[] buffers) {//if one thread keeps calling 
reset() in a loop
  bidx = pos = length = 0;
  this.buffers = buffers;
  for (ByteBuffer b : buffers) {
length += b.remaining();
  }
}
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15408) HADOOP-14445 broke Spark.

2018-04-30 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458737#comment-16458737
 ] 

Xiao Chen commented on HADOOP-15408:


bq. I was asking that where did you see this stack trace ?
That was copied from the JIRA description, so I assumed that was the stack trace 
you saw...

So to conclude, is my understanding correct? (listing below for easier pointing)
# patch 1 fixed the spark issue
# patch 1 does not work with kms-dt
# split.patch does not fix the spark issue

On a separate note, we've also had an HBase internal test failure last Friday. 
I'm looking into that...

> HADOOP-14445 broke Spark.
> -
>
> Key: HADOOP-15408
> URL: https://issues.apache.org/jira/browse/HADOOP-15408
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Rushabh S Shah
>Priority: Blocker
> Attachments: HADOOP-15408-trunk.001.patch, split.patch, 
> split.prelim.patch
>
>
> Spark bundles hadoop related jars in their package.
>  Spark expects backwards compatibility between minor versions.
>  Their job failed after we deployed HADOOP-14445 in our test cluster.
> {noformat}
> 2018-04-20 21:09:53,245 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2018-04-20 21:09:53,273 ERROR [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.token.TokenIdentifier: Provider 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$
> KMSLegacyDelegationTokenIdentifier could not be instantiated
> at java.util.ServiceLoader.fail(ServiceLoader.java:232)
> at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at 
> org.apache.hadoop.security.token.Token.getClassForIdentifier(Token.java:117)
> at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:138)
> at org.apache.hadoop.security.token.Token.identifierToString(Token.java:393)
> at org.apache.hadoop.security.token.Token.toString(Token.java:413)
> at java.lang.String.valueOf(String.java:2994)
> at 
> org.apache.commons.logging.impl.SLF4JLocationAwareLog.info(SLF4JLocationAwareLog.java:155)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1634)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1583)
> Caused by: java.lang.NoSuchFieldError: TOKEN_LEGACY_KIND
> at 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$KMSLegacyDelegationTokenIdentifier.(KMSDelegationToken.java:64)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at java.lang.Class.newInstance(Class.java:442)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> ... 10 more
> 2018-04-20 21:09:53,278 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1
> {noformat}
> Their classpath looks like 
> {{\{...:hadoop-common-pre-HADOOP-14445.jar:.:hadoop-common-with-HADOOP-14445.jar:\}}}
> This is because the container loaded the {{KMSDelegationToken}} class from an 
> older jar and {{KMSLegacyDelegationTokenIdentifier}} from the new jar, and it 
> fails when {{KMSLegacyDelegationTokenIdentifier}} tries to read 
> {{TOKEN_LEGACY_KIND}} from {{KMSDelegationToken}}, which doesn't exist in the older class.
>  Cc [~xiaochen]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458732#comment-16458732
 ] 

Steve Loughran commented on HADOOP-15356:
-

Oh, I see you did the one I was thinking of, HADOOP-15342. Anyway, the update is 
fine: this isn't going to break anything else.

FWIW, if you are trying to debug ADL, there is something here for 
you: https://github.com/steveloughran/cloudstore/releases

> Make HTTP timeout configurable in ADLS Connector
> 
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
>Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch
>
>
> Currently the HTTP timeout for connections to ADLS is not configurable 
> in Hadoop. This patch makes the timeout configurable via a 
> core-site config setting. It also bumps the ADLS SDK version to 2.2.8, which has a 
> default value of 60 seconds; any tuning of that setting can now be 
> done in Hadoop through core-site.
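
A hedged sketch of the pattern the description refers to (the property name below is a 
placeholder for illustration only; the actual key is defined in the patch): read a 
millisecond timeout from core-site and fall back to the SDK default when unset.
{code:java}
// Sketch only: pull an optional HTTP timeout override out of core-site.xml.
import org.apache.hadoop.conf.Configuration;

public class AdlTimeoutConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();                 // loads core-site.xml
    // "fs.adl.http.timeout.ms" is a hypothetical name, not necessarily the real key.
    int timeoutMs = conf.getInt("fs.adl.http.timeout.ms", -1);
    System.out.println(timeoutMs < 0
        ? "Using the ADLS SDK default timeout (60s in SDK 2.2.8)"
        : "Overriding the ADLS HTTP timeout to " + timeoutMs + " ms");
  }
}
{code}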



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15408) HADOOP-14445 broke Spark.

2018-04-30 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458731#comment-16458731
 ] 

Rushabh S Shah commented on HADOOP-15408:
-

{quote}What exception did you see with this case?
{quote}
I didn't see any exceptions with this case.
{quote}Just going back to old approach.
{quote}
Sorry, I wasn't clear in my previous comment. Let me be clearer this time.
 We internally fixed this issue with {{HADOOP-15408-trunk.001.patch}} and the 
Spark jobs ran successfully after this fix.

Going back to [this 
comment|https://issues.apache.org/jira/browse/HADOOP-15408?focusedCommentId=16452594=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16452594],
 I was asking where you saw this stack trace.
 I assume you applied the {{HADOOP-15408-trunk.001.patch}} patch and 
encountered some failure which showed that stack trace.
Let me know if I am still not clear.

> HADOOP-14445 broke Spark.
> -
>
> Key: HADOOP-15408
> URL: https://issues.apache.org/jira/browse/HADOOP-15408
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Rushabh S Shah
>Priority: Blocker
> Attachments: HADOOP-15408-trunk.001.patch, split.patch, 
> split.prelim.patch
>
>
> Spark bundles hadoop related jars in their package.
>  Spark expects backwards compatibility between minor versions.
>  Their job failed after we deployed HADOOP-14445 in our test cluster.
> {noformat}
> 2018-04-20 21:09:53,245 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
> 2018-04-20 21:09:53,273 ERROR [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.util.ServiceConfigurationError: 
> org.apache.hadoop.security.token.TokenIdentifier: Provider 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$
> KMSLegacyDelegationTokenIdentifier could not be instantiated
> at java.util.ServiceLoader.fail(ServiceLoader.java:232)
> at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at 
> org.apache.hadoop.security.token.Token.getClassForIdentifier(Token.java:117)
> at org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:138)
> at org.apache.hadoop.security.token.Token.identifierToString(Token.java:393)
> at org.apache.hadoop.security.token.Token.toString(Token.java:413)
> at java.lang.String.valueOf(String.java:2994)
> at 
> org.apache.commons.logging.impl.SLF4JLocationAwareLog.info(SLF4JLocationAwareLog.java:155)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1634)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1583)
> Caused by: java.lang.NoSuchFieldError: TOKEN_LEGACY_KIND
> at 
> org.apache.hadoop.crypto.key.kms.KMSDelegationToken$KMSLegacyDelegationTokenIdentifier.(KMSDelegationToken.java:64)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at java.lang.Class.newInstance(Class.java:442)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> ... 10 more
> 2018-04-20 21:09:53,278 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1
> {noformat}
> Their classpath looks like 
> {{\{...:hadoop-common-pre-HADOOP-14445.jar:.:hadoop-common-with-HADOOP-14445.jar:\}}}
> This is because the container loaded {{KMSDelegationToken}} class from an 
> older jar and {{KMSLegacyDelegationTokenIdentifier}} from new jar and it 
> fails when {{KMSLegacyDelegationTokenIdentifier}} wants to read 
> {{TOKEN_LEGACY_KIND}} from {{KMSDelegationToken}} which doesn't exist before.
>  Cc [~xiaochen]
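
As an editorial illustration of the failure mode described above, here is a 
minimal sketch with simplified stand-in names (not the actual Hadoop classes): 
a class compiled against a newer version of another class fails with 
{{NoSuchFieldError}} when an older copy of that class is loaded first at run time.

{code:java}
import org.apache.hadoop.io.Text;

class TokenConstants {            // plays the role of KMSDelegationToken
  // Present only in the "new" jar; the "old" jar's copy of this class lacks the field.
  // A non-constant field type matters: a compile-time String constant would be inlined
  // by javac and would not reproduce the error.
  static final Text TOKEN_LEGACY_KIND = new Text("legacy-kind");   // value is illustrative only
}

class LegacyIdentifier {          // plays the role of KMSLegacyDelegationTokenIdentifier
  // Compiled against the new TokenConstants. If an older TokenConstants (without the
  // field) is loaded first at run time, initializing this class throws
  // java.lang.NoSuchFieldError: TOKEN_LEGACY_KIND, which java.util.ServiceLoader
  // reports as "Provider ... could not be instantiated".
  final Text kind = TokenConstants.TOKEN_LEGACY_KIND;
}
{code}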



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15356) Make HTTP timeout configurable in ADLS Connector

2018-04-30 Thread Sean Mackrory (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458683#comment-16458683
 ] 

Sean Mackrory commented on HADOOP-15356:


{quote}I think there are some existing update ADL JIRAs{quote}

I don't see any open JIRAs to upgrade the SDK, but regardless: this requires 
features not in any currently released ADLS SDK, whereas the original patch 
would have worked on any SDK as far back as 2.2.4. So eventually, yes, we'll 
have to test and merge in a new SDK and then apply this patch.

> Make HTTP timeout configurable in ADLS Connector
> 
>
> Key: HADOOP-15356
> URL: https://issues.apache.org/jira/browse/HADOOP-15356
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/adl
>Reporter: Atul Sikaria
>Assignee: Atul Sikaria
>Priority: Major
> Attachments: HADOOP-15356.001.patch, HADOOP-15356.002.patch
>
>
> Currently the HTTP timeout for the connections to ADLS are not configurable 
> in Hadoop. This patch enables the timeouts to be configurable based on a 
> core-site config setting. Also, up the ADLS SDK version to 2.2.8, that has 
> default value of 60 seconds - any optimizations to that setting can now be 
> done in Hadoop through core-site.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-04-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HADOOP-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458679#comment-16458679
 ] 

Íñigo Goiri commented on HADOOP-1:
--

I took a quick look at [^HADOOP-1.14.patch] and I have a couple of comments 
about the contract tests:
* It doesn't seem like Yetus was able to run the contract tests: 
[here|https://builds.apache.org/job/PreCommit-HADOOP-Build/14538/testReport/].
* I think the structure of the contracts departs from the common practice of 
the other contract tests (take FTPContract as an example):
** Instead of using FTPContractTestMixin, I would just follow the FTPContract 
approach.
** Naming the tests {{ITestContractDelete}} is technically correct, but I 
would go with the way others define them and call them something like 
{{TestFTPContractDelete}}.
** Maybe it's a little long, but it might be clearer to use 
{{FTPExtendedContract}}.
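
For reference, a hedged sketch of the naming and structure suggested above, 
modelled on the existing contract tests in hadoop-common; {{FTPContract}} is 
assumed to be the contract class provided by the patch, not an existing class.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.contract.AbstractContractDeleteTest;
import org.apache.hadoop.fs.contract.AbstractFSContract;

// One small subclass per operation, named Test<FS>Contract<Operation>, as in the
// other filesystem contract tests; the contract class wires in the FTP filesystem.
public class TestFTPContractDelete extends AbstractContractDeleteTest {
  @Override
  protected AbstractFSContract createContract(Configuration conf) {
    return new FTPContract(conf);   // assumed contract class from the patch
  }
}
{code}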

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-1
> URL: https://issues.apache.org/jira/browse/HADOOP-1
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-1.10.patch, HADOOP-1.11.patch, 
> HADOOP-1.12.patch, HADOOP-1.13.patch, HADOOP-1.14.patch, 
> HADOOP-1.2.patch, HADOOP-1.3.patch, HADOOP-1.4.patch, 
> HADOOP-1.5.patch, HADOOP-1.6.patch, HADOOP-1.7.patch, 
> HADOOP-1.8.patch, HADOOP-1.9.patch, HADOOP-1.patch
>
>
> Current implementation of FTP and SFTP filesystems have severe limitations 
> and performance issues when dealing with high number of files. Mine patch 
> solve those issues and integrate both filesystems such a way that most of the 
> core functionality is common for both and therefore simplifying the 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support of connection pooling - new connection is not created for every 
> single command but reused from the pool.
>  For huge number of files it shows order of magnitude performance improvement 
> over not pooled connections.
>  * Caching of directory trees. For ftp you always need to list whole 
> directory whenever you ask information about particular file.
>  Again for huge number of files it shows order of magnitude performance 
> improvement over not cached connections.
>  * Support of keep alive (NOOP) messages to avoid connection drops
>  * Support for Unix style or regexp wildcard glob - useful for listing a 
> particular files across whole directory tree
>  * Support for reestablishing broken ftp data transfers - can happen 
> surprisingly often
>  * Support for sftp private keys (including pass phrase)
>  * Support for keeping passwords, private keys and pass phrase in the jceks 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15428) s3guard bucket-info -unguarded will guard bucket if FS is set to do this automatically

2018-04-30 Thread lqjack (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458671#comment-16458671
 ] 

lqjack commented on HADOOP-15428:
-

https://github.com/apache/hadoop/pull/373

> s3guard bucket-info -unguarded will guard bucket if FS is set to do this 
> automatically
> --
>
> Key: HADOOP-15428
> URL: https://issues.apache.org/jira/browse/HADOOP-15428
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Major
>
> If you call hadoop s3guard bucket-info on a bucket where the fs is set to 
> create a s3guard table on demand, then the DDB table is automatically 
> created. As a result
> the {{bucket-info -unguarded}} option cannot be used, and the call has 
> significant side effects (i.e. it can run up bills)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15428) s3guard bucket-info -unguarded will guard bucket if FS is set to do this automatically

2018-04-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458670#comment-16458670
 ] 

ASF GitHub Bot commented on HADOOP-15428:
-

GitHub user lqjack opened a pull request:

https://github.com/apache/hadoop/pull/373

HADOOP-15428

fix error prompt

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lqjack/hadoop HADOOP-15428

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/373.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #373


commit 0eff92a927b6cdcf137dd467a5eaa52f23eab287
Author: lqjaclee 
Date:   2018-04-30T15:43:41Z

HADOOP-15428

fix error prompt




> s3guard bucket-info -unguarded will guard bucket if FS is set to do this 
> automatically
> --
>
> Key: HADOOP-15428
> URL: https://issues.apache.org/jira/browse/HADOOP-15428
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Major
>
> If you call hadoop s3guard bucket-info on a bucket where the fs is set to 
> create a s3guard table on demand, then the DDB table is automatically 
> created. As a result
> the {{bucket-info -unguarded}} option cannot be used, and the call has 
> significant side effects (i.e. it can run up bills)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-04-30 Thread Lukas Waldmann (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458635#comment-16458635
 ] 

Lukas Waldmann edited comment on HADOOP-1 at 4/30/18 2:55 PM:
--

3.1+: I understand, but we are still on the 2.7 baseline and I can't envisage 
us moving to 3.1+ any time soon :) So I am kinda forced to keep backwards 
compatibility.


was (Author: luky):
3.1+: I understand but we are still on 2.7 baseline and I can envisage we move 
to 3.1+ any time soon :) So I am kinda forced to keep backwards compatability

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-1
> URL: https://issues.apache.org/jira/browse/HADOOP-1
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-1.10.patch, HADOOP-1.11.patch, 
> HADOOP-1.12.patch, HADOOP-1.13.patch, HADOOP-1.14.patch, 
> HADOOP-1.2.patch, HADOOP-1.3.patch, HADOOP-1.4.patch, 
> HADOOP-1.5.patch, HADOOP-1.6.patch, HADOOP-1.7.patch, 
> HADOOP-1.8.patch, HADOOP-1.9.patch, HADOOP-1.patch
>
>
> Current implementation of FTP and SFTP filesystems have severe limitations 
> and performance issues when dealing with high number of files. Mine patch 
> solve those issues and integrate both filesystems such a way that most of the 
> core functionality is common for both and therefore simplifying the 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support of connection pooling - new connection is not created for every 
> single command but reused from the pool.
>  For huge number of files it shows order of magnitude performance improvement 
> over not pooled connections.
>  * Caching of directory trees. For ftp you always need to list whole 
> directory whenever you ask information about particular file.
>  Again for huge number of files it shows order of magnitude performance 
> improvement over not cached connections.
>  * Support of keep alive (NOOP) messages to avoid connection drops
>  * Support for Unix style or regexp wildcard glob - useful for listing a 
> particular files across whole directory tree
>  * Support for reestablishing broken ftp data transfers - can happen 
> surprisingly often
>  * Support for sftp private keys (including pass phrase)
>  * Support for keeping passwords, private keys and pass phrase in the jceks 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-04-30 Thread Lukas Waldmann (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458635#comment-16458635
 ] 

Lukas Waldmann commented on HADOOP-1:
-

3.1+: I understand, but we are still on the 2.7 baseline and I can't envisage 
us moving to 3.1+ any time soon :) So I am kinda forced to keep backwards 
compatibility.

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-1
> URL: https://issues.apache.org/jira/browse/HADOOP-1
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-1.10.patch, HADOOP-1.11.patch, 
> HADOOP-1.12.patch, HADOOP-1.13.patch, HADOOP-1.14.patch, 
> HADOOP-1.2.patch, HADOOP-1.3.patch, HADOOP-1.4.patch, 
> HADOOP-1.5.patch, HADOOP-1.6.patch, HADOOP-1.7.patch, 
> HADOOP-1.8.patch, HADOOP-1.9.patch, HADOOP-1.patch
>
>
> Current implementation of FTP and SFTP filesystems have severe limitations 
> and performance issues when dealing with high number of files. Mine patch 
> solve those issues and integrate both filesystems such a way that most of the 
> core functionality is common for both and therefore simplifying the 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support of connection pooling - new connection is not created for every 
> single command but reused from the pool.
>  For huge number of files it shows order of magnitude performance improvement 
> over not pooled connections.
>  * Caching of directory trees. For ftp you always need to list whole 
> directory whenever you ask information about particular file.
>  Again for huge number of files it shows order of magnitude performance 
> improvement over not cached connections.
>  * Support of keep alive (NOOP) messages to avoid connection drops
>  * Support for Unix style or regexp wildcard glob - useful for listing a 
> particular files across whole directory tree
>  * Support for reestablishing broken ftp data transfers - can happen 
> surprisingly often
>  * Support for sftp private keys (including pass phrase)
>  * Support for keeping passwords, private keys and pass phrase in the jceks 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458634#comment-16458634
 ] 

Steve Loughran commented on HADOOP-1:
-

FWIW, I'd focus on Hadoop 3.1+; it's unlikely we'd be backporting to the 2.x 
line. As you note: dependencies.

Tests: I write more tests than production code these days. I certainly spend 
more of my life waiting for tests to finish than any other part of development, 
sad to say.

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-1
> URL: https://issues.apache.org/jira/browse/HADOOP-1
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-1.10.patch, HADOOP-1.11.patch, 
> HADOOP-1.12.patch, HADOOP-1.13.patch, HADOOP-1.14.patch, 
> HADOOP-1.2.patch, HADOOP-1.3.patch, HADOOP-1.4.patch, 
> HADOOP-1.5.patch, HADOOP-1.6.patch, HADOOP-1.7.patch, 
> HADOOP-1.8.patch, HADOOP-1.9.patch, HADOOP-1.patch
>
>
> Current implementation of FTP and SFTP filesystems have severe limitations 
> and performance issues when dealing with high number of files. Mine patch 
> solve those issues and integrate both filesystems such a way that most of the 
> core functionality is common for both and therefore simplifying the 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support of connection pooling - new connection is not created for every 
> single command but reused from the pool.
>  For huge number of files it shows order of magnitude performance improvement 
> over not pooled connections.
>  * Caching of directory trees. For ftp you always need to list whole 
> directory whenever you ask information about particular file.
>  Again for huge number of files it shows order of magnitude performance 
> improvement over not cached connections.
>  * Support of keep alive (NOOP) messages to avoid connection drops
>  * Support for Unix style or regexp wildcard glob - useful for listing a 
> particular files across whole directory tree
>  * Support for reestablishing broken ftp data transfers - can happen 
> surprisingly often
>  * Support for sftp private keys (including pass phrase)
>  * Support for keeping passwords, private keys and pass phrase in the jceks 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-04-30 Thread Lukas Waldmann (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458557#comment-16458557
 ] 

Lukas Waldmann commented on HADOOP-1:
-

Thank you, Steve,

I certainly try to test it as much as possible - writing tests usually takes 
longer than writing code :)

I will try to prepare separate builds of the filesystem for different Hadoop 
versions on GitLab, so users are not dependent only on the latest trunk.

Unfortunately, differences between dependencies sometimes significantly 
influence functionality.

 

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-1
> URL: https://issues.apache.org/jira/browse/HADOOP-1
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-1.10.patch, HADOOP-1.11.patch, 
> HADOOP-1.12.patch, HADOOP-1.13.patch, HADOOP-1.14.patch, 
> HADOOP-1.2.patch, HADOOP-1.3.patch, HADOOP-1.4.patch, 
> HADOOP-1.5.patch, HADOOP-1.6.patch, HADOOP-1.7.patch, 
> HADOOP-1.8.patch, HADOOP-1.9.patch, HADOOP-1.patch
>
>
> Current implementation of FTP and SFTP filesystems have severe limitations 
> and performance issues when dealing with high number of files. Mine patch 
> solve those issues and integrate both filesystems such a way that most of the 
> core functionality is common for both and therefore simplifying the 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support of connection pooling - new connection is not created for every 
> single command but reused from the pool.
>  For huge number of files it shows order of magnitude performance improvement 
> over not pooled connections.
>  * Caching of directory trees. For ftp you always need to list whole 
> directory whenever you ask information about particular file.
>  Again for huge number of files it shows order of magnitude performance 
> improvement over not cached connections.
>  * Support of keep alive (NOOP) messages to avoid connection drops
>  * Support for Unix style or regexp wildcard glob - useful for listing a 
> particular files across whole directory tree
>  * Support for reestablishing broken ftp data transfers - can happen 
> surprisingly often
>  * Support for sftp private keys (including pass phrase)
>  * Support for keeping passwords, private keys and pass phrase in the jceks 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14444) New implementation of ftp and sftp filesystems

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458536#comment-16458536
 ] 

Steve Loughran commented on HADOOP-1:
-

[~luky] I haven't forgotten about this; it is one of the big reviews I need to 
sit down and go through, the other being the new WASB connector. Both take time 
and testing, and with the FTP connector I'm going to have to bring up a couple 
of FTP servers (Linux, Windows) to do some more testing against stuff.

*if anyone else can help test this, please join in!*

> New implementation of ftp and sftp filesystems
> --
>
> Key: HADOOP-1
> URL: https://issues.apache.org/jira/browse/HADOOP-1
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Lukas Waldmann
>Assignee: Lukas Waldmann
>Priority: Major
> Attachments: HADOOP-1.10.patch, HADOOP-1.11.patch, 
> HADOOP-1.12.patch, HADOOP-1.13.patch, HADOOP-1.14.patch, 
> HADOOP-1.2.patch, HADOOP-1.3.patch, HADOOP-1.4.patch, 
> HADOOP-1.5.patch, HADOOP-1.6.patch, HADOOP-1.7.patch, 
> HADOOP-1.8.patch, HADOOP-1.9.patch, HADOOP-1.patch
>
>
> Current implementation of FTP and SFTP filesystems have severe limitations 
> and performance issues when dealing with high number of files. Mine patch 
> solve those issues and integrate both filesystems such a way that most of the 
> core functionality is common for both and therefore simplifying the 
> maintainability.
> The core features:
>  * Support for HTTP/SOCKS proxies
>  * Support for passive FTP
>  * Support for explicit FTPS (SSL/TLS)
>  * Support of connection pooling - new connection is not created for every 
> single command but reused from the pool.
>  For huge number of files it shows order of magnitude performance improvement 
> over not pooled connections.
>  * Caching of directory trees. For ftp you always need to list whole 
> directory whenever you ask information about particular file.
>  Again for huge number of files it shows order of magnitude performance 
> improvement over not cached connections.
>  * Support of keep alive (NOOP) messages to avoid connection drops
>  * Support for Unix style or regexp wildcard glob - useful for listing a 
> particular files across whole directory tree
>  * Support for reestablishing broken ftp data transfers - can happen 
> surprisingly often
>  * Support for sftp private keys (including pass phrase)
>  * Support for keeping passwords, private keys and pass phrase in the jceks 
> key stores



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15422) s3guard doesn't init when the secrets are in the s3a URI

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458535#comment-16458535
 ] 

Steve Loughran commented on HADOOP-15422:
-

bq. This is your punishment for putting secrets in your URI.

Don't disagree.
This patch isn't correct BTW; the stack traces I was seeing came from valid 
credentials in the conf, but secrets also in the URI. DDB inited, but then the 
URI comparisons failed.

What's really needed is for DDB to get the credential list off the FileSystem. 
Simple solution:
* If the FS is an S3AFileSystem: cast and call a (new? existing) method to get 
the credential list. Use that.
* If it isn't, or DDB is coming up standalone, don't do that.

This will allow s3guard to pick up login details from delegation tokens passed 
through the FS or any similar mechanism.
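
As an editorial sketch of that cast-and-call idea (not an existing API): the 
{{CredentialSource}} interface and its accessor below are hypothetical stand-ins 
used only to show the shape of the approach; the real change would presumably 
expose the credential chain on S3AFileSystem itself.

{code:java}
import com.amazonaws.auth.AWSCredentialsProvider;
import org.apache.hadoop.fs.FileSystem;

// Hypothetical interface standing in for whatever S3AFileSystem would implement
// or expose; not an existing Hadoop or AWS SDK type.
interface CredentialSource {
  AWSCredentialsProvider getCredentialProviders();
}

final class S3GuardCredentialSketch {
  static AWSCredentialsProvider credentialsFor(FileSystem fs,
                                               AWSCredentialsProvider standaloneChain) {
    if (fs instanceof CredentialSource) {
      // Reuse the filesystem's own credential chain, so delegation tokens or URI
      // secrets already resolved by the FS are also used by S3Guard's DynamoDB client.
      return ((CredentialSource) fs).getCredentialProviders();
    }
    // Coming up standalone (no owning S3AFileSystem): keep the current behaviour.
    return standaloneChain;
  }
}
{code}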




> s3guard doesn't init when the secrets are in the s3a URI
> 
>
> Key: HADOOP-15422
> URL: https://issues.apache.org/jira/browse/HADOOP-15422
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-15422-001.patch
>
>
> If the AWS secrets are in the login, S3guard doesn't list the root dir



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15428) s3guard bucket-info -unguarded will guard bucket if FS is set to do this automatically

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458532#comment-16458532
 ] 

Steve Loughran commented on HADOOP-15428:
-

Here's why: I would expect the command to return "0" when the "condition is 
met". Instead it raises an exception and so returns an error code, which means 
you can't use it in scripts or tests.

> s3guard bucket-info -unguarded will guard bucket if FS is set to do this 
> automatically
> --
>
> Key: HADOOP-15428
> URL: https://issues.apache.org/jira/browse/HADOOP-15428
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Major
>
> If you call hadoop s3guard bucket-info on a bucket where the fs is set to 
> create a s3guard table on demand, then the DDB table is automatically 
> created. As a result
> the {{bucket-info -unguarded}} option cannot be used, and the call has 
> significant side effects (i.e. it can run up bills)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15427) hadoop shell complains needlessly about "ERROR: Tools helper .. not found"

2018-04-30 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458529#comment-16458529
 ] 

Steve Loughran commented on HADOOP-15427:
-

We see it when you type the 'hadoop s3guard' command, presumably because it's 
a hadoop-aws script but lacks the same name...

> hadoop shell complains needlessly about "ERROR: Tools helper .. not found"
> --
>
> Key: HADOOP-15427
> URL: https://issues.apache.org/jira/browse/HADOOP-15427
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 3.1.0
>Reporter: Steve Loughran
>Priority: Minor
>
> toolshelper.sh prints error messages like
> {code}
> ERROR: Tools helper...hadoop/libexec/tools/hadoop-aws.sh was not found.
> {code}
> even when they aren't needed, here in the case of hadoop s3guard shell 
> commands.
> Can I downgrade these to hadoop_debug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org