[jira] [Commented] (HADOOP-15321) Reduce the RPC Client max retries on timeouts

2018-03-16 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403284#comment-16403284
 ] 

Xiao Chen commented on HADOOP-15321:


Git blame shows this default has been unchanged since 0.23. [~kihwal], do you 
have insight into why the default is set this way?

> Reduce the RPC Client max retries on timeouts
> -
>
> Key: HADOOP-15321
> URL: https://issues.apache.org/jira/browse/HADOOP-15321
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
>
> Currently, the 
> [default|https://github.com/apache/hadoop/blob/branch-3.0.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java#L379]
>  number of retries when the IPC client catches a {{ConnectTimeoutException}} is 45. 
> This seems unreasonably high.
> Given that the IPC client timeout is 60 seconds by default, if a DN host is 
> shut down, the client will retry for 45 minutes before aborting. (If the host 
> is up but the process is down, it throws a connection refused immediately, 
> which is fine.)
> Creating this Jira to discuss whether we can reduce that to a reasonable 
> number.






[jira] [Created] (HADOOP-15321) Reduce the RPC Client max retries on timeouts

2018-03-16 Thread Xiao Chen (JIRA)
Xiao Chen created HADOOP-15321:
--

 Summary: Reduce the RPC Client max retries on timeouts
 Key: HADOOP-15321
 URL: https://issues.apache.org/jira/browse/HADOOP-15321
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Reporter: Xiao Chen
Assignee: Xiao Chen


Currently, the 
[default|https://github.com/apache/hadoop/blob/branch-3.0.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java#L379]
 number of retries when the IPC client catches a {{ConnectTimeoutException}} is 45. 
This seems unreasonably high.

Given that the IPC client timeout is 60 seconds by default, if a DN host is shut 
down, the client will retry for 45 minutes before aborting. (If the host is up but 
the process is down, it throws a connection refused immediately, which is fine.)

Creating this Jira to discuss whether we can reduce that to a reasonable number.
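
For a quick sanity check on that number, the worst-case arithmetic spelled out 
as a sketch (values are the defaults stated above; the class name is illustrative):
{code:java}
// Illustrative arithmetic only; values are taken from the description above.
public class RetryMath {
  public static void main(String[] args) {
    int maxRetriesOnTimeouts = 45; // current default retry count
    int connectTimeoutSecs = 60;   // default IPC client timeout, per the description
    int worstCaseMinutes = maxRetriesOnTimeouts * connectTimeoutSecs / 60;
    System.out.println("worst case: " + worstCaseMinutes + " minutes"); // prints 45
  }
}
{code}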






[jira] [Commented] (HADOOP-15317) Improve NetworkTopology chooseRandom's loop

2018-03-16 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403275#comment-16403275
 ] 

Xiao Chen commented on HADOOP-15317:


Discussed this a bit with [~eddyxu]; uploaded 
[^HADOOP-15317.02.patch]

We have several options:
 # Similar to patch 1, use a data structure to track the choose results so we 
can terminate
 # Use a ridiculously big number like (10 * numOfDatanodes) to guarantee 
termination
 # Instead of choosing randomly in every loop iteration, randomize only the 
initial index, and then loop through every node exactly once
 # Randomize once to get a number N, and only choose the Nth valid node

#1 potentially has the largest memory consumption; #2 is the minimal change, 
but could be the most CPU-heavy since it just adds a termination condition to 
the current while loop.

#3 and #4 should be the least resource-heavy, but #3 has the problem that the 
probability of each node being chosen isn't evenly distributed. Specifically, 
the node immediately after excluded node(s) will have a higher probability of 
being chosen. The patch went with #4, whose only drawback is that it makes the 
best case slower.

Perhaps we can have a threshold (e.g. when availableNodes > numInScopeNodes / 
2, we go with the simple random loop in the existing code), so #4's best-case 
slowness is covered.

Also, thanks for the review [~ajayydv]; comments addressed.
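
To make #4 concrete, a minimal sketch (illustrative names only, not the actual 
patch code; it assumes the excluded set is a subset of the scope nodes):
{code:java}
import java.util.List;
import java.util.Random;
import java.util.Set;

// Illustrative sketch of option #4: draw one random index N among the valid
// nodes, then scan once and return the N-th node that is not excluded.
// Single pass, guaranteed termination.
class ChooseRandomSketch {
  static <T> T chooseNthValid(List<T> scopeNodes, Set<T> excluded, Random rand) {
    int available = scopeNodes.size() - excluded.size(); // assumes excluded is a subset of scopeNodes
    if (available <= 0) {
      return null; // nothing left to choose
    }
    int n = rand.nextInt(available); // randomize exactly once
    for (T node : scopeNodes) {
      if (!excluded.contains(node) && n-- == 0) {
        return node; // the N-th valid node
      }
    }
    return null; // unreachable when the counts are consistent
  }
}
{code}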

> Improve NetworkTopology chooseRandom's loop
> ---
>
> Key: HADOOP-15317
> URL: https://issues.apache.org/jira/browse/HADOOP-15317
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HADOOP-15317.01.patch, HADOOP-15317.02.patch
>
>
> Recently we found a postmortem case where the ANN seemed to be in an infinite 
> loop. From the logs it seems it had just gone through a rolling restart, and 
> DNs were getting registered.
> Later the NN became unresponsive, and from the stacktrace it's inside the 
> do-while loop in {{NetworkTopology#chooseRandom}} - part of what was done 
> in HDFS-10320.
> Going through the code and logs I'm not able to come up with any theory as to 
> why this is happening (I thought about incorrect locking, or the Node object 
> being modified outside of NetworkTopology; both seem impossible), but we 
> should eliminate this loop.
> stacktrace:
> {noformat}
>  Stack:
> java.util.HashMap.hash(HashMap.java:338)
> java.util.HashMap.containsKey(HashMap.java:595)
> java.util.HashSet.contains(HashSet.java:203)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:786)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:732)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseDataNode(BlockPlacementPolicyDefault.java:757)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:692)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:666)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:573)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:461)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:368)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:243)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:115)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4AdditionalDatanode(BlockManager.java:1596)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:3599)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:717)
> {noformat}






[jira] [Updated] (HADOOP-15317) Improve NetworkTopology chooseRandom's loop

2018-03-16 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HADOOP-15317:
---
Attachment: HADOOP-15317.02.patch

> Improve NetworkTopology chooseRandom's loop
> ---
>
> Key: HADOOP-15317
> URL: https://issues.apache.org/jira/browse/HADOOP-15317
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HADOOP-15317.01.patch, HADOOP-15317.02.patch
>
>
> Recently we found a postmortem case where the ANN seemed to be in an infinite 
> loop. From the logs it seems it had just gone through a rolling restart, and 
> DNs were getting registered.
> Later the NN became unresponsive, and from the stacktrace it's inside the 
> do-while loop in {{NetworkTopology#chooseRandom}} - part of what was done 
> in HDFS-10320.
> Going through the code and logs I'm not able to come up with any theory as to 
> why this is happening (I thought about incorrect locking, or the Node object 
> being modified outside of NetworkTopology; both seem impossible), but we 
> should eliminate this loop.
> stacktrace:
> {noformat}
>  Stack:
> java.util.HashMap.hash(HashMap.java:338)
> java.util.HashMap.containsKey(HashMap.java:595)
> java.util.HashSet.contains(HashSet.java:203)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:786)
> org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:732)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseDataNode(BlockPlacementPolicyDefault.java:757)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:692)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:666)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:573)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:461)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:368)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:243)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:115)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4AdditionalDatanode(BlockManager.java:1596)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:3599)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:717)
> {noformat}






[jira] [Commented] (HADOOP-12760) sun.misc.Cleaner has moved to a new location in OpenJDK 9

2018-03-16 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403178#comment-16403178
 ] 

Akira Ajisaka commented on HADOOP-12760:


Unfortunately, we cannot take the existing approach, since sun.misc.Cleaner no 
longer exists in Java 9. Compiling the above code fails with NoClassDefFoundError.
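
One way around the compile-time reference is reflection; a hedged sketch 
(illustrative only, not necessarily what the attached patches do; on Java 9+ it 
may additionally need {{--add-opens java.base/java.nio=ALL-UNNAMED}}):
{code:java}
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Illustrative sketch: look up cleaner() reflectively so the source never
// references sun.misc.Cleaner and therefore compiles on both Java 8 and 9.
class ReflectiveCleanerSketch {
  static void clean(ByteBuffer buffer) throws Exception {
    Method cleanerMethod = buffer.getClass().getMethod("cleaner");
    cleanerMethod.setAccessible(true); // DirectByteBuffer itself is package-private
    Object cleaner = cleanerMethod.invoke(buffer);
    Method cleanMethod = cleaner.getClass().getMethod("clean");
    cleanMethod.setAccessible(true);
    cleanMethod.invoke(cleaner); // frees the direct buffer's native memory
  }
}
{code}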

> sun.misc.Cleaner has moved to a new location in OpenJDK 9
> -
>
> Key: HADOOP-12760
> URL: https://issues.apache.org/jira/browse/HADOOP-12760
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Chris Hegarty
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-12760.00.patch, HADOOP-12760.01.patch, 
> HADOOP-12760.02.patch, HADOOP-12760.03.patch, HADOOP-12760.04.patch, 
> HADOOP-12760.05.patch, HADOOP-12760.06.patch
>
>
> This is a heads-up: there are upcoming changes in JDK 9 that will require, at 
> least, a small update to org.apache.hadoop.crypto.CryptoStreamUtils & 
> org.apache.hadoop.io.nativeio.NativeIO.
> OpenJDK issue no. 8148117: "Move sun.misc.Cleaner to jdk.internal.ref" [1], 
> will move the Cleaner class from sun.misc to jdk.internal.ref. There is 
> ongoing discussion about the possibility of providing a public supported API, 
> maybe in the JDK 9 timeframe, for releasing NIO direct buffer native memory, 
> see the core-libs-dev mail thread [2]. At the very least CryptoStreamUtils & 
> NativeIO [3] should be updated to have knowledge of the new location of the 
> JDK Cleaner.
> [1] https://bugs.openjdk.java.net/browse/JDK-8148117
> [2] 
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038243.html
> [3] https://github.com/apache/hadoop/search?utf8=✓&q=sun.misc.Cleaner






[jira] [Commented] (HADOOP-15314) Scheme assertion in S3Guard DynamoDBMetadataStore::checkPath is unnecessarily restrictive

2018-03-16 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403083#comment-16403083
 ] 

Aaron Fabbri commented on HADOOP-15314:
---

The DynamoDB implementation (DynamoDBMetadataStore) does not persist the scheme 
(just /bucket/path/etc), so this could be feasible.  We'd need to spell out in 
the MetadataStore API semantics that the scheme must be ignored, if that is what 
we wanted, or make that behavior a property of the MetadataStore implementation 
that we could query at runtime and enforce with some new contract test cases.

Mixing Hadoop FS clients on a single bucket could be bad, though, e.g. if they 
have different ways of emulating directories (such as how empty directories are 
represented and the invariants around that).

One idea is to inspect fs.s3.impl at runtime and, if it is set to S3AFileSystem, 
add s3 to the list of allowed schemes.  I'm not sure concurrent access to the 
same bucket with proprietary s3:// client code and the Apache Hadoop s3a:// 
client is safe, though.  That would need an explanation. Do you guys know?
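
For illustration, a rough sketch of that runtime check (the class and method 
names are hypothetical, not the actual checkPath code):
{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Hypothetical sketch of the fs.s3.impl idea described above.
class SchemeCheckSketch {
  static void checkScheme(Configuration conf, Path path) {
    Set<String> allowed = new HashSet<>(Arrays.asList("s3a"));
    String impl = conf.get("fs.s3.impl", "");
    if ("org.apache.hadoop.fs.s3a.S3AFileSystem".equals(impl)) {
      allowed.add("s3"); // s3:// URIs are actually served by S3AFileSystem here
    }
    String scheme = path.toUri().getScheme();
    if (!allowed.contains(scheme)) {
      throw new IllegalArgumentException("Unsupported scheme: " + scheme);
    }
  }
}
{code}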

> Scheme assertion in S3Guard DynamoDBMetadataStore::checkPath is unnecessarily 
> restrictive
> -
>
> Key: HADOOP-15314
> URL: https://issues.apache.org/jira/browse/HADOOP-15314
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: DJ Hoffman
>Priority: Major
>
> In version 3.0.0, the checkPath method for dealing with paths prevents us 
> from using the s3:// scheme when utilizing S3Guard. However, in our 
> core-site.xml we have included 
> {noformat}
>   
>     fs.s3.impl
>     org.apache.hadoop.fs.s3a.S3AFileSystem
>   {noformat}
> which should enforce that s3-prefixed paths go through s3a and are properly 
> compatible with S3Guard. We removed the assertion that paths use the s3a 
> scheme (some of our paths use the s3 scheme), and our testing thus far with 
> S3Guard enabled has been positive. We believe the assertion in checkPath is 
> unnecessary and could be expanded to include the s3 and s3n schemes if not 
> dropped altogether or altered in some other way. We're happy to develop and 
> test a patch if the community is amenable to the change.






[jira] [Commented] (HADOOP-12862) LDAP Group Mapping over SSL can not specify trust store

2018-03-16 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403075#comment-16403075
 ] 

Konstantin Shvachko commented on HADOOP-12862:
--

Sounds like testing is a longer-term issue. BTW, if I look into the Hadoop 
dependencies I see apacheds and ldapsdk. Maybe they would be useful for testing; 
I wouldn't know.
The testing went well in our environment.
What do you think about removing {{.ssl.truststore.password}}? I really think 
people should not use configs for passwords. Typically configs are checked in 
to git repositories, so having passwords there is even worse than printing them 
on a command line, which [as you 
suggested|https://issues.apache.org/jira/browse/HADOOP-15315?focusedCommentId=16399166&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16399166]
 is a bad practice.
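
As one illustration of the alternative, a sketch using 
{{Configuration.getPassword()}}, which can resolve the password from a 
credential provider instead of a plaintext config entry (the key names here are 
assumptions, not verified against the patch):
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

// Illustrative sketch only (key names assumed): resolve the truststore
// password through Configuration.getPassword(), which consults any configured
// credential provider before falling back to the config file.
class TruststoreSetupSketch {
  static void apply(Configuration conf) throws IOException {
    char[] pass = conf.getPassword(
        "hadoop.security.group.mapping.ldap.ssl.truststore.password");
    String store = conf.get("hadoop.security.group.mapping.ldap.ssl.truststore");
    if (store != null && pass != null) {
      System.setProperty("javax.net.ssl.trustStore", store);
      System.setProperty("javax.net.ssl.trustStorePassword", new String(pass));
    }
  }
}
{code}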

> LDAP Group Mapping over SSL can not specify trust store
> ---
>
> Key: HADOOP-12862
> URL: https://issues.apache.org/jira/browse/HADOOP-12862
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HADOOP-12862.001.patch, HADOOP-12862.002.patch, 
> HADOOP-12862.003.patch, HADOOP-12862.004.patch, HADOOP-12862.005.patch, 
> HADOOP-12862.006.patch, HADOOP-12862.007.patch, HADOOP-12862.008.patch
>
>
> In a secure environment, SSL is used to encrypt LDAP requests for group 
> mapping resolution.
> We (+[~yoderme], +[~tgrayson]) have found that its implementation is strange.
> For background, the Hadoop name node, as an LDAP client, talks to an LDAP 
> server to resolve the group mapping of a user. In the case of LDAP over SSL, a 
> typical scenario is to establish one-way authentication (the client verifies 
> the server's certificate is real) by storing the server's certificate in the 
> client's truststore.
> A rarer scenario is to establish two-way authentication: in addition to storing 
> a truststore for the client to verify the server, the server also verifies the 
> client's certificate is real, and the client stores its own certificate in 
> its keystore.
> However, the current implementation for LDAP over SSL does not seem to be 
> correct, in that it only configures a keystore but no truststore (so the LDAP 
> server can verify Hadoop's certificate, but Hadoop may not be able to verify 
> the LDAP server's certificate).
> I think there should be an extra pair of properties to specify the 
> truststore/password for the LDAP server, and use those to configure the system 
> properties {{javax.net.ssl.trustStore}}/{{javax.net.ssl.trustStorePassword}}.
> I am a security layman, so my words may be imprecise, but I hope this makes 
> sense.
> Oracle's SSL LDAP documentation: 
> http://docs.oracle.com/javase/jndi/tutorial/ldap/security/ssl.html
> JSSE reference guide: 
> http://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html






[jira] [Assigned] (HADOOP-14759) S3GuardTool prune to prune specific bucket entries

2018-03-16 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HADOOP-14759:
---

Assignee: Gabor Bota

> S3GuardTool prune to prune specific bucket entries
> --
>
> Key: HADOOP-14759
> URL: https://issues.apache.org/jira/browse/HADOOP-14759
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0-beta1
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> Users may think that when you provide a URI to a bucket, you are pruning all 
> entries in the table *for that bucket*. In fact you are purging all entries 
> across all buckets in the table:
> {code}
> hadoop s3guard prune -days 7 s3a://ireland-1
> {code}
> It should be restricted to that bucket, unless you specify otherwise
> +maybe also add a hard date rather than a relative one






[jira] [Updated] (HADOOP-15320) Remove customized getFileBlockLocations for hadoop-azure and hadoop-azure-datalake

2018-03-16 Thread shanyu zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shanyu zhao updated HADOOP-15320:
-
Attachment: HADOOP-15320.patch

> Remove customized getFileBlockLocations for hadoop-azure and 
> hadoop-azure-datalake
> --
>
> Key: HADOOP-15320
> URL: https://issues.apache.org/jira/browse/HADOOP-15320
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/adl, fs/azure
>Affects Versions: 2.7.3, 2.9.0, 3.0.0
>Reporter: shanyu zhao
>Assignee: shanyu zhao
>Priority: Major
> Attachments: HADOOP-15320.patch
>
>
> hadoop-azure and hadoop-azure-datalake have their own implementations of 
> getFileBlockLocations(), which fake a list of artificial blocks based on a 
> hard-coded block size, with each block having one host named "localhost". 
> Take a look at this code:
> [https://github.com/apache/hadoop/blob/release-2.9.0-RC3/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L3485]
> This is an unnecessary mock-up for a "remote" file system to mimic HDFS. The 
> problem with this mock is that for large (~TB) files we generate lots of 
> artificial blocks, and FileInputFormat.getSplits() is slow in calculating 
> splits based on these blocks.
> We can safely remove this customized getFileBlockLocations() implementation and 
> fall back to the default FileSystem.getFileBlockLocations() implementation, 
> which returns 1 block for any file, with 1 host "localhost". Note that this 
> doesn't mean we will create far fewer splits, because the number of splits is 
> still limited by the blockSize in 
> FileInputFormat.computeSplitSize():
> {code:java}
> return Math.max(minSize, Math.min(goalSize, blockSize));{code}






[jira] [Created] (HADOOP-15320) Remove customized getFileBlockLocations for hadoop-azure and hadoop-azure-datalake

2018-03-16 Thread shanyu zhao (JIRA)
shanyu zhao created HADOOP-15320:


 Summary: Remove customized getFileBlockLocations for hadoop-azure 
and hadoop-azure-datalake
 Key: HADOOP-15320
 URL: https://issues.apache.org/jira/browse/HADOOP-15320
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/adl, fs/azure
Affects Versions: 3.0.0, 2.9.0, 2.7.3
Reporter: shanyu zhao
Assignee: shanyu zhao


hadoop-azure and hadoop-azure-datalake have their own implementations of 
getFileBlockLocations(), which fake a list of artificial blocks based on a 
hard-coded block size, with each block having one host named "localhost". Take 
a look at this code:

[https://github.com/apache/hadoop/blob/release-2.9.0-RC3/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java#L3485]

This is an unnecessary mock-up for a "remote" file system to mimic HDFS. The 
problem with this mock is that for large (~TB) files we generate lots of 
artificial blocks, and FileInputFormat.getSplits() is slow in calculating 
splits based on these blocks.

We can safely remove this customized getFileBlockLocations() implementation and 
fall back to the default FileSystem.getFileBlockLocations() implementation, 
which returns 1 block for any file, with 1 host "localhost". Note that this 
doesn't mean we will create far fewer splits, because the number of splits is 
still limited by the blockSize in FileInputFormat.computeSplitSize():
{code:java}
return Math.max(minSize, Math.min(goalSize, blockSize));{code}
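
To make the split arithmetic concrete, a small worked example (all numbers are 
illustrative; goalSize stands for total size divided by the requested split count):
{code:java}
// Illustrative numbers only: the split count is driven by computeSplitSize(),
// not by how many artificial block locations the FileSystem reports.
public class SplitMath {
  public static void main(String[] args) {
    long fileSize = 1L << 40;          // a 1 TB file
    long blockSize = 128L << 20;       // 128 MB block size
    long minSize = 1L;
    long goalSize = fileSize / 10_000; // caller requested ~10,000 splits
    long splitSize = Math.max(minSize, Math.min(goalSize, blockSize));
    System.out.println(fileSize / splitSize + " splits of " + splitSize + " bytes");
  }
}
{code}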






[jira] [Comment Edited] (HADOOP-12760) sun.misc.Cleaner has moved to a new location in OpenJDK 9

2018-03-16 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402439#comment-16402439
 ] 

Ajay Kumar edited comment on HADOOP-12760 at 3/16/18 8:15 PM:
--

[~ajisakaa], thanks for taking this up. Shall we keep the existing approach for 
cases where the Java version is lower than 9?
{code:java}
// CleanerUtil.java
public static void clean(ByteBuffer buffer) throws IOException {
  if (IS_JAVA_VERSION_9) {
    getCleaner().freeBuffer(buffer);
  } else {
    final sun.misc.Cleaner bufferCleaner =
        ((sun.nio.ch.DirectBuffer) buffer).cleaner();
    bufferCleaner.clean();
  }
}
// IS_JAVA_VERSION_9 is a static variable referring to the Java version
{code}



was (Author: ajayydv):
Akira Ajisaka, thanks for taking this up. Shall we keep the existing approach for 
cases where the Java version is lower than 9?
{code:java}
// CleanerUtil.java
public static void clean(ByteBuffer buffer) throws IOException {
  if (IS_JAVA_VERSION_9) {
    getCleaner().freeBuffer(buffer);
  } else {
    final sun.misc.Cleaner bufferCleaner =
        ((sun.nio.ch.DirectBuffer) buffer).cleaner();
    bufferCleaner.clean();
  }
}
// IS_JAVA_VERSION_9 is a static variable referring to the Java version
{code}


> sun.misc.Cleaner has moved to a new location in OpenJDK 9
> -
>
> Key: HADOOP-12760
> URL: https://issues.apache.org/jira/browse/HADOOP-12760
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Chris Hegarty
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-12760.00.patch, HADOOP-12760.01.patch, 
> HADOOP-12760.02.patch, HADOOP-12760.03.patch, HADOOP-12760.04.patch, 
> HADOOP-12760.05.patch, HADOOP-12760.06.patch
>
>
> This is a heads-up: there are upcoming changes in JDK 9 that will require, at 
> least, a small update to org.apache.hadoop.crypto.CryptoStreamUtils & 
> org.apache.hadoop.io.nativeio.NativeIO.
> OpenJDK issue no. 8148117: "Move sun.misc.Cleaner to jdk.internal.ref" [1], 
> will move the Cleaner class from sun.misc to jdk.internal.ref. There is 
> ongoing discussion about the possibility of providing a public supported API, 
> maybe in the JDK 9 timeframe, for releasing NIO direct buffer native memory, 
> see the core-libs-dev mail thread [2]. At the very least CryptoStreamUtils & 
> NativeIO [3] should be updated to have knowledge of the new location of the 
> JDK Cleaner.
> [1] https://bugs.openjdk.java.net/browse/JDK-8148117
> [2] 
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038243.html
> [3] https://github.com/apache/hadoop/search?utf8=✓&q=sun.misc.Cleaner






[jira] [Commented] (HADOOP-12760) sun.misc.Cleaner has moved to a new location in OpenJDK 9

2018-03-16 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402439#comment-16402439
 ] 

Ajay Kumar commented on HADOOP-12760:
-

Akira Ajisaka, thanks for taking this up. Shall we keep the existing approach for 
cases where the Java version is lower than 9?
{code:java}
// CleanerUtil.java
public static void clean(ByteBuffer buffer) throws IOException {
  if (IS_JAVA_VERSION_9) {
    getCleaner().freeBuffer(buffer);
  } else {
    final sun.misc.Cleaner bufferCleaner =
        ((sun.nio.ch.DirectBuffer) buffer).cleaner();
    bufferCleaner.clean();
  }
}
// IS_JAVA_VERSION_9 is a static variable referring to the Java version
{code}


> sun.misc.Cleaner has moved to a new location in OpenJDK 9
> -
>
> Key: HADOOP-12760
> URL: https://issues.apache.org/jira/browse/HADOOP-12760
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Chris Hegarty
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-12760.00.patch, HADOOP-12760.01.patch, 
> HADOOP-12760.02.patch, HADOOP-12760.03.patch, HADOOP-12760.04.patch, 
> HADOOP-12760.05.patch, HADOOP-12760.06.patch
>
>
> This is a heads-up: there are upcoming changes in JDK 9 that will require, at 
> least, a small update to org.apache.hadoop.crypto.CryptoStreamUtils & 
> org.apache.hadoop.io.nativeio.NativeIO.
> OpenJDK issue no. 8148117: "Move sun.misc.Cleaner to jdk.internal.ref" [1], 
> will move the Cleaner class from sun.misc to jdk.internal.ref. There is 
> ongoing discussion about the possibility of providing a public supported API, 
> maybe in the JDK 9 timeframe, for releasing NIO direct buffer native memory, 
> see the core-libs-dev mail thread [2]. At the very least CryptoStreamUtils & 
> NativeIO [3] should be updated to have knowledge of the new location of the 
> JDK Cleaner.
> [1] https://bugs.openjdk.java.net/browse/JDK-8148117
> [2] 
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2016-January/038243.html
> [3] https://github.com/apache/hadoop/search?utf8=✓&q=sun.misc.Cleaner






[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x

2018-03-16 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402314#comment-16402314
 ] 

genericqa commented on HADOOP-14178:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 285 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 12m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
27m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-hdfs-project/hadoop-hdfs-native-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-mapreduce-project/hadoop-mapreduce-client hadoop-mapreduce-project 
hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 27m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 45m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 12m 56s{color} 
| {color:red} root generated 189 new + 1272 unchanged - 0 fixed = 1461 total 
(was 1272) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
4m 53s{color} | {color:orange} root: The patch generated 2 new + 7285 unchanged 
- 87 fixed = 7287 total (was 7372) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 11m 
16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m 
47s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-hdfs-project/hadoop-hdfs-native-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-mapreduce-project/hadoop-mapreduce-client hadoop-mapreduce-project 
hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 32m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}127m 12s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate 

[jira] [Resolved] (HADOOP-14699) Impersonation errors with UGI after second principal relogin

2018-03-16 Thread Jeff Storck (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Storck resolved HADOOP-14699.
--
Resolution: Resolved

This issue will be resolved by HADOOP-9747.

> Impersonation errors with UGI after second principal relogin
> 
>
> Key: HADOOP-14699
> URL: https://issues.apache.org/jira/browse/HADOOP-14699
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.2, 2.7.3, 2.8.1
>Reporter: Jeff Storck
>Priority: Major
>
> Multiple principals that are logged in using UGI instances that are 
> instantiated from a UGI class loaded by the same classloader will encounter 
> problems when the second principal attempts to relogin and perform an action 
> using a UGI.doAs().  An impersonation will occur and the operation attempted 
> by the second principal after relogging in will fail.  There should not be an 
> implicit attempt to impersonate the second principal through the first 
> principal that logged in.
> I have created a GitHub project that exhibits the impersonation error, with 
> brief instructions on how to set up for the test and run it: 
> https://github.com/jtstorck/ugi-test
> {noformat}18:44:55.687 [pool-2-thread-2] WARN  
> h.u.u.ugirunnable.ugite...@example.com - Unexpected exception while 
> performing task for [ugite...@example.com (auth:KERBEROS)]
> org.apache.hadoop.ipc.RemoteException: User: ugite...@example.com is not 
> allowed to impersonate ugite...@example.com
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1481)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1427)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1337)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:787)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1700)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1436)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1448)
>   at 
> hadoop.ugitest.UgiTestMain$UgiRunnable.lambda$run$2(UgiTestMain.java:194)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at hadoop.ugitest.UgiTestMain$UgiRunnable.run(UgiTestMain.java:194)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745){noformat}
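
For anyone trying to reproduce this, a rough sketch of the scenario above 
(principal names, keytab paths, and the class name are illustrative only, not 
taken from the linked project):
{code:java}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

// Sketch only: two keytab logins in one JVM, then a doAs() after the second
// principal re-logs in, which is where the report says the failure occurs.
class UgiReloginSketch {
  public static void main(String[] args) throws Exception {
    UserGroupInformation first = UserGroupInformation
        .loginUserFromKeytabAndReturnUGI("usera@EXAMPLE.COM", "/etc/security/usera.keytab");
    UserGroupInformation second = UserGroupInformation
        .loginUserFromKeytabAndReturnUGI("userb@EXAMPLE.COM", "/etc/security/userb.keytab");
    second.checkTGTAndReloginFromKeytab(); // second principal relogin
    second.doAs((PrivilegedExceptionAction<Void>) () -> {
      // per the report, an HDFS call here fails with the impersonation error above
      return null;
    });
  }
}
{code}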




[jira] [Commented] (HADOOP-14699) Impersonation errors with UGI after second principal relogin

2018-03-16 Thread Jeff Storck (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402207#comment-16402207
 ] 

Jeff Storck commented on HADOOP-14699:
--

[~xiaochen] Yes, this can be closed with the fixes for HADOOP-9747 in 3.x.

> Impersonation errors with UGI after second principal relogin
> 
>
> Key: HADOOP-14699
> URL: https://issues.apache.org/jira/browse/HADOOP-14699
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 2.6.2, 2.7.3, 2.8.1
>Reporter: Jeff Storck
>Priority: Major
>
> Multiple principals that are logged in using UGI instances that are 
> instantiated from a UGI class loaded by the same classloader will encounter 
> problems when the second principal attempts to relogin and perform an action 
> using a UGI.doAs().  An impersonation will occur and the operation attempted 
> by the second principal after relogging in will fail.  There should not be an 
> implicit attempt to impersonate the second principal through the first 
> principal that logged in.
> I have created a GitHub project that exhibits the impersonation error, with 
> brief instructions on how to set up for the test and run it: 
> https://github.com/jtstorck/ugi-test
> {noformat}18:44:55.687 [pool-2-thread-2] WARN  
> h.u.u.ugirunnable.ugite...@example.com - Unexpected exception while 
> performing task for [ugite...@example.com (auth:KERBEROS)]
> org.apache.hadoop.ipc.RemoteException: User: ugite...@example.com is not 
> allowed to impersonate ugite...@example.com
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1481)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1427)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1337)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:787)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1700)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1436)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1448)
>   at 
> hadoop.ugitest.UgiTestMain$UgiRunnable.lambda$run$2(UgiTestMain.java:194)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>   at hadoop.ugitest.UgiTestMain$UgiRunnable.run(UgiTestMain.java:194)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745){noformat}




[jira] [Commented] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Saurabh Padhy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402174#comment-16402174
 ] 

Saurabh Padhy commented on HADOOP-15319:


[~shahrs87]

We executed the command
{noformat}
hadoop fs -rm "/a/b/c/*"
{noformat}
and still face the same issue on the 2.8.2 version. But with "/a/b/c/" it is 
unable to remove the files; that's the reason we use "/*", which removes 
everything inside the directory.

> hadoop fs -rm command misbehaves on recent hadoop version 2.5.0
> ---
>
> Key: HADOOP-15319
> URL: https://issues.apache.org/jira/browse/HADOOP-15319
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 2.8.2
>Reporter: Saurabh Padhy
>Priority: Major
>
> This issue is regarding the hadoop fs -rm command. 
> In Hadoop 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",
> it removes only the files inside the c directory.
> But in 2.8.2 and higher versions,
> when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",
> it removes the inside files and the directory as well.
> Please look into the issue.






[jira] [Commented] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402138#comment-16402138
 ] 

Rushabh S Shah commented on HADOOP-15319:
-

We have seen several cases where our customers don't quote the path and use a 
glob in it.
We always ask our customers to put quotes around the path.
Can you please see what happens if you run the following command?
{noformat}
hadoop fs -rm "/a/b/c/*"
{noformat}
Please note that the above line is in a {{noformat}} tag. 

> hadoop fs -rm command misbehaves on recent hadoop version 2.5.0
> ---
>
> Key: HADOOP-15319
> URL: https://issues.apache.org/jira/browse/HADOOP-15319
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 2.8.2
>Reporter: Saurabh Padhy
>Priority: Major
>
> This issue is regarding the hadoop fs -rm command. 
> In Hadoop 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",
> it removes only the files inside the c directory.
> But in 2.8.2 and higher versions,
> when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",
> it removes the inside files and the directory as well.
> Please look into the issue.






[jira] [Commented] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Saurabh Padhy (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402000#comment-16402000
 ] 

Saurabh Padhy commented on HADOOP-15319:


[~shahrs87]

Sorry, my bad - it is related to the 2.8.2 version.

When we use "hadoop fs -rm /a/b/c/" or "hdfs dfs -rm /a/b/c/", it gives the 
error "c is a directory".

But when we tried to execute

"hadoop fs -rm /a/b/c/*" or "hdfs dfs -rm /a/b/c/*", the 'c' directory also 
gets deleted,

whereas in the 2.4.x version it was not deleting 'c'; it was only deleting 
the files inside c.

> hadoop fs -rm command misbehaves on recent hadoop version 2.5.0
> ---
>
> Key: HADOOP-15319
> URL: https://issues.apache.org/jira/browse/HADOOP-15319
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 2.8.2
>Reporter: Saurabh Padhy
>Priority: Major
>
> This issue is regarding the hadoop fs -rm command. 
> In Hadoop 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",
> it removes only the files inside the c directory.
> But in 2.8.2 and higher versions,
> when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",
> it removes the inside files and the directory as well.
> Please look into the issue.






[jira] [Updated] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Saurabh Padhy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saurabh Padhy updated HADOOP-15319:
---
Description: 
This issue is regarding the hadoop fs -rm command. 

In Hadoop 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",

it removes only the files inside the c directory.

But in 2.8.2 and higher versions,

when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",

it removes the inside files and the directory as well.

Please look into the issue.

  was:
This issue is regarding the hadoop fs -rm command. 

In Hadoop 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",

it removes only the files inside the c directory.

But in 2.5.0 and higher versions,

when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",

it removes the inside files and the directory as well.

Please look into the issue.


> hadoop fs -rm command misbehaves on recent hadoop version 2.5.0
> ---
>
> Key: HADOOP-15319
> URL: https://issues.apache.org/jira/browse/HADOOP-15319
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 2.8.2
>Reporter: Saurabh Padhy
>Priority: Major
>
> This issue is regarding the hadoop fs -rm command. 
> In Hadoop 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",
> it removes only the files inside the c directory.
> But in 2.8.2 and higher versions,
> when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",
> it removes the inside files and the directory as well.
> Please look into the issue.






[jira] [Updated] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Saurabh Padhy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saurabh Padhy updated HADOOP-15319:
---
Affects Version/s: (was: 2.5.0)
   2.8.2

> hadoop fs -rm command misbehaves on recent hadoop version 2.5.0
> ---
>
> Key: HADOOP-15319
> URL: https://issues.apache.org/jira/browse/HADOOP-15319
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 2.8.2
>Reporter: Saurabh Padhy
>Priority: Major
>
> This issue is regarding the hadoop fs -rm command. 
> In Hadoop 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",
> it removes only the files inside the c directory.
> But in 2.5.0 and higher versions,
> when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",
> it removes the inside files and the directory as well.
> Please look into the issue.






[jira] [Commented] (HADOOP-15314) Scheme assertion in S3Guard DynamoDBMetadataStore::checkPath is unnecessarily restrictive

2018-03-16 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401949#comment-16401949
 ] 

Steve Loughran commented on HADOOP-15314:
-

I see your goal here: have code which works with EMR switch to ASF Hadoop 
without changes.

A concern I have, however, is that the DB is shared across all clients, and if 
at some point things were ever mixed to really support >1 scheme, then it's a 
mess. Removing the scheme checks altogether would be dangerous, but s3 & s3n 
could be viable-ish. Or somehow switch to s3a for the S3Guard interaction, so 
the fact that you've brought s3a up under a different original scheme isn't 
visible. That would be the best for stability across versions.

Thoughts, [~fabbri]?

> Scheme assertion in S3Guard DynamoDBMetadataStore::checkPath is unnecessarily 
> restrictive
> -
>
> Key: HADOOP-15314
> URL: https://issues.apache.org/jira/browse/HADOOP-15314
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: DJ Hoffman
>Priority: Major
>
> In version 3.0.0, the checkPath method for dealing with paths prevents us 
> from using the s3:// scheme when utilizing S3Guard. However, in our 
> core-site.xml we have included 
> {noformat}
>   
>     fs.s3.impl
>     org.apache.hadoop.fs.s3a.S3AFileSystem
>   {noformat}
> which should enforce that s3-prefixed paths go through s3a and are properly 
> compatible with S3Guard. We removed the assertion that paths use the s3a 
> scheme (some of our paths use the s3 scheme), and our testing thus far with 
> S3Guard enabled has been positive. We believe the assertion in checkPath is 
> unnecessary and could be expanded to include the s3 and s3n schemes if not 
> dropped altogether or altered in some other way. We're happy to develop and 
> test a patch if the community is amenable to the change.






[jira] [Updated] (HADOOP-15314) Scheme assertion in S3Guard DynamoDBMetadataStore::checkPath is unnecessarily restrictive

2018-03-16 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15314:

Summary: Scheme assertion in S3Guard DynamoDBMetadataStore::checkPath is 
unnecessarily restrictive  (was: Scheme assertion in 
DynamoDBMetadataStore::checkPath is unnecessarily restrictive)

> Scheme assertion in S3Guard DynamoDBMetadataStore::checkPath is unnecessarily 
> restrictive
> -
>
> Key: HADOOP-15314
> URL: https://issues.apache.org/jira/browse/HADOOP-15314
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: DJ Hoffman
>Priority: Major
>
> In version 3.0.0, the checkPath method for dealing with paths prevents us 
> from using the s3:// scheme when utilizing S3Guard. However, in our 
> core-site.xml we have included 
> {noformat}
>   
>     fs.s3.impl
>     org.apache.hadoop.fs.s3a.S3AFileSystem
>   {noformat}
> which should enforce that s3-prefixed paths go through s3a and are properly 
> compatible with S3Guard. We removed the assertion that paths use the s3a 
> scheme (some of our paths use the s3 scheme), and our testing thus far with 
> S3Guard enabled has been positive. We believe the assertion in checkPath is 
> unnecessary and could be expanded to include the s3 and s3n schemes if not 
> dropped altogether or altered in some other way. We're happy to develop and 
> test a patch if the community is amenable to the change.






[jira] [Updated] (HADOOP-15314) Scheme assertion in DynamoDBMetadataStore::checkPath is unnecessarily restrictive

2018-03-16 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15314:

Issue Type: Sub-task  (was: Bug)
Parent: HADOOP-15226

> Scheme assertion in DynamoDBMetadataStore::checkPath is unnecessarily 
> restrictive
> -
>
> Key: HADOOP-15314
> URL: https://issues.apache.org/jira/browse/HADOOP-15314
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.0.0
>Reporter: DJ Hoffman
>Priority: Major
>
> In version 3.0.0, the checkPath method for dealing with paths prevents us 
> from using the s3:// scheme when utilizing S3Guard. However, in our 
> core-site.xml we have included 
> {noformat}
>   
>     fs.s3.impl
>     org.apache.hadoop.fs.s3a.S3AFileSystem
>   {noformat}
> which should enforce that s3-prefixed paths go through s3a and are properly 
> compatible with S3Guard. We removed the assertion that paths use the s3a 
> scheme (some of our paths use the s3 scheme), and our testing thus far with 
> S3Guard enabled has been positive. We believe the assertion in checkPath is 
> unnecessary and could be expanded to include the s3 and s3n schemes if not 
> dropped altogether or altered in some other way. We're happy to develop and 
> test a patch if the community is amenable to the change.






[jira] [Commented] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401920#comment-16401920
 ] 

Rushabh S Shah commented on HADOOP-15319:
-

First of all, 2.5 _is not_ a recent Hadoop version.
It was released approximately 4 years ago.

The results below are from a cluster running a near-current 2.8 release.
bq. When we execute "hadoop fs -rm /a/b/c/*" or "hdfs dfs -rm /a/b/c/*"
It deleted only the files directly under the {{c}} directory and didn't delete 
any directories underneath it.
 

> hadoop fs -rm command misbehaves on recent hadoop version 2.5.0
> ---
>
> Key: HADOOP-15319
> URL: https://issues.apache.org/jira/browse/HADOOP-15319
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 2.5.0
>Reporter: Saurabh Padhy
>Priority: Major
>
> This issue is regarding the hadoop fs -rm command.
> In hadoop version 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",
> it removes only the files inside the c directory.
> But for versions 2.5.0 and higher,
> when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",
> it removes not only the files inside but the directories as well.
> Please look into the issue.
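
For anyone trying to reproduce this, a short sketch along the following lines 
shows what a glob such as /a/b/c/* expands to on a given cluster and which of 
the matches are directories (the paths are placeholders):

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobDeleteCheck {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    // The shell expands the glob itself before deleting; list what it
    // matches. globStatus returns null when nothing matches.
    FileStatus[] matches = fs.globStatus(new Path("/a/b/c/*"));
    if (matches == null) {
      return;
    }
    for (FileStatus st : matches) {
      if (st.isDirectory()) {
        // A plain "fs -rm" should refuse these; only a recursive
        // delete ("fs -rm -r") removes a directory and its contents.
        System.out.println("dir  (needs -rm -r): " + st.getPath());
      } else {
        System.out.println("file (plain -rm ok): " + st.getPath());
      }
    }
  }
}
{code}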



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Saurabh Padhy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saurabh Padhy updated HADOOP-15319:
---
Description: 
This issue is regarding the hadoop fs -rm command.

In hadoop version 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",

it removes only the files inside the c directory.

But for versions 2.5.0 and higher,

when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",

it removes not only the files inside but the directories as well.

Please look into the issue.

  was:
This issue is regarding the hadoop fs -rm command.

In hadoop version 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",

it removes only the files inside the c directory.

But for versions 2.5.0 and higher,

when we execute "hadoop fs -rm /a/b/c/*" or "hdfs dfs -rm /a/b/c/*",

it removes not only the files inside but the directories as well.

Please look into the issue.


> hadoop fs -rm command misbehaves on recent hadoop version 2.5.0
> ---
>
> Key: HADOOP-15319
> URL: https://issues.apache.org/jira/browse/HADOOP-15319
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 2.5.0
>Reporter: Saurabh Padhy
>Priority: Major
>
> This issue is regarding the hadoop fs -rm command.
> In hadoop version 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",
> it removes only the files inside the c directory.
> But for versions 2.5.0 and higher,
> when we execute "hadoop fs -rm /a/b/c/**" or "hdfs dfs -rm /a/b/c/**",
> it removes not only the files inside but the directories as well.
> Please look into the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15319) hadoop fs -rm command misbehaves on recent hadoop version 2.5.0

2018-03-16 Thread Saurabh Padhy (JIRA)
Saurabh Padhy created HADOOP-15319:
--

 Summary: hadoop fs -rm command misbehaves on recent hadoop version 
2.5.0
 Key: HADOOP-15319
 URL: https://issues.apache.org/jira/browse/HADOOP-15319
 Project: Hadoop Common
  Issue Type: Bug
  Components: bin
Affects Versions: 2.5.0
Reporter: Saurabh Padhy


This issue is regarding the hadoop fs -rm command.

In hadoop version 2.4.0, when we execute "hadoop fs -rm /a/b/c/*",

it removes only the files inside the c directory.

But for versions 2.5.0 and higher,

when we execute "hadoop fs -rm /a/b/c/*" or "hdfs dfs -rm /a/b/c/*",

it removes not only the files inside but the directories as well.

Please look into the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-12862) LDAP Group Mapping over SSL can not specify trust store

2018-03-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401815#comment-16401815
 ] 

Wei-Chiu Chuang edited comment on HADOOP-12862 at 3/16/18 12:31 PM:


Thank you for your testing effort [~shv]. I've previously manually tested the 
code against a CDH5.13 cluster. It won't hurt to have more people test it in 
more environments.

I proposed adding the Apache Directory Service libs a while back so we could 
test against a mini LDAP server, but that didn't seem to get much traction. 
The other, lighter-weight way is to include Kerby, but I've not studied it 
further. [~drankye] or [~jiajia], any idea if we could include Kerby for unit 
testing LDAP-related code?


was (Author: jojochuang):
Thank you for your testing effort [~shv]. I've previously manually tested the 
code against a CDH5.13 cluster. It won't hurt to have more people test it in 
more environments.

I proposed adding the Apache Directory Service libs a while back so we could 
test against a mini LDAP server, but that didn't seem to get much traction. 
The other, lighter-weight way is to include Kerby, but I've not studied it 
further. [~drankye] or [~jiajia], any idea if we could include Kerby unit test 
purposes for LDAP related code?

> LDAP Group Mapping over SSL can not specify trust store
> ---
>
> Key: HADOOP-12862
> URL: https://issues.apache.org/jira/browse/HADOOP-12862
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HADOOP-12862.001.patch, HADOOP-12862.002.patch, 
> HADOOP-12862.003.patch, HADOOP-12862.004.patch, HADOOP-12862.005.patch, 
> HADOOP-12862.006.patch, HADOOP-12862.007.patch, HADOOP-12862.008.patch
>
>
> In a secure environment, SSL is used to encrypt LDAP requests for group 
> mapping resolution.
> We (+[~yoderme], +[~tgrayson]) have found that its implementation is strange.
> For background: the Hadoop name node, as an LDAP client, talks to an LDAP 
> server to resolve the group mapping of a user. In the case of LDAP over SSL, 
> a typical scenario is to establish one-way authentication (the client 
> verifies the server's certificate is real) by storing the server's 
> certificate in the client's truststore.
> A rarer scenario is to establish two-way authentication: in addition to the 
> client storing a truststore to verify the server, the server also verifies 
> that the client's certificate is real, and the client stores its own 
> certificate in its keystore.
> However, the current implementation for LDAP over SSL does not seem to be 
> correct, in that it only configures a keystore but no truststore (so the 
> LDAP server can verify Hadoop's certificate, but Hadoop may not be able to 
> verify the LDAP server's certificate).
> I think there should be an extra pair of properties to specify the 
> truststore/password for the LDAP server, and those should be used to 
> configure the system properties 
> {{javax.net.ssl.trustStore}}/{{javax.net.ssl.trustStorePassword}}.
> I am a security layman so my words may be imprecise, but I hope this makes 
> sense.
> Oracle's SSL LDAP documentation: 
> http://docs.oracle.com/javase/jndi/tutorial/ldap/security/ssl.html
> JSSE reference guide: 
> http://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html
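
As a sketch of what the proposal amounts to on the client side, the 
truststore pair would be set before binding to the server (the host name, 
file path and password below are illustrative only):

{code:java}
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

public class LdapsTrustStoreSketch {
  public static DirContext connect() throws NamingException {
    // One-way SSL: point JSSE at a truststore holding the LDAP server's
    // certificate. These are JVM-wide properties, which is why a dedicated
    // pair of Hadoop configuration keys is being proposed here.
    System.setProperty("javax.net.ssl.trustStore",
        "/etc/pki/ldap-truststore.jks");
    System.setProperty("javax.net.ssl.trustStorePassword", "changeit");

    Hashtable<String, String> env = new Hashtable<>();
    env.put(Context.INITIAL_CONTEXT_FACTORY,
        "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldaps://ldap.example.com:636");
    env.put(Context.SECURITY_PROTOCOL, "ssl");
    return new InitialDirContext(env);
  }
}
{code}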



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12862) LDAP Group Mapping over SSL can not specify trust store

2018-03-16 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401815#comment-16401815
 ] 

Wei-Chiu Chuang commented on HADOOP-12862:
--

Thank you for your testing effort [~shv]. I've previously manually tested the 
code against a CDH5.13 cluster. It won't hurt to have more people test it in 
more environments.

I proposed adding the Apache Directory Service libs a while back so we could 
test against a mini LDAP server, but that didn't seem to get much traction. 
The other, lighter-weight way is to include Kerby, but I've not studied it 
further. [~drankye] or [~jiajia], any idea if we could include Kerby for unit 
testing LDAP-related code?

> LDAP Group Mapping over SSL can not specify trust store
> ---
>
> Key: HADOOP-12862
> URL: https://issues.apache.org/jira/browse/HADOOP-12862
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HADOOP-12862.001.patch, HADOOP-12862.002.patch, 
> HADOOP-12862.003.patch, HADOOP-12862.004.patch, HADOOP-12862.005.patch, 
> HADOOP-12862.006.patch, HADOOP-12862.007.patch, HADOOP-12862.008.patch
>
>
> In a secure environment, SSL is used to encrypt LDAP requests for group 
> mapping resolution.
> We (+[~yoderme], +[~tgrayson]) have found that its implementation is strange.
> For background: the Hadoop name node, as an LDAP client, talks to an LDAP 
> server to resolve the group mapping of a user. In the case of LDAP over SSL, 
> a typical scenario is to establish one-way authentication (the client 
> verifies the server's certificate is real) by storing the server's 
> certificate in the client's truststore.
> A rarer scenario is to establish two-way authentication: in addition to the 
> client storing a truststore to verify the server, the server also verifies 
> that the client's certificate is real, and the client stores its own 
> certificate in its keystore.
> However, the current implementation for LDAP over SSL does not seem to be 
> correct, in that it only configures a keystore but no truststore (so the 
> LDAP server can verify Hadoop's certificate, but Hadoop may not be able to 
> verify the LDAP server's certificate).
> I think there should be an extra pair of properties to specify the 
> truststore/password for the LDAP server, and those should be used to 
> configure the system properties 
> {{javax.net.ssl.trustStore}}/{{javax.net.ssl.trustStorePassword}}.
> I am a security layman so my words may be imprecise, but I hope this makes 
> sense.
> Oracle's SSL LDAP documentation: 
> http://docs.oracle.com/javase/jndi/tutorial/ldap/security/ssl.html
> JSSE reference guide: 
> http://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x

2018-03-16 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401807#comment-16401807
 ] 

Akira Ajisaka commented on HADOOP-14178:


The test failure is not related to the patch.

012 patch:
* Fixed checkstyle warnings.
* Fixed javac warnings except Whitebox.
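
({{Whitebox}} here refers to Mockito 1.x's internal reflection helper, which 
is gone in 2.x. Where tests relied on it, a plain-reflection stand-in along 
these lines covers the common case of setting a private field; a sketch only, 
and unlike the original it does not walk superclass fields:)

{code:java}
import java.lang.reflect.Field;

final class WhiteboxSketch {
  // Rough stand-in for Mockito 1.x's Whitebox.setInternalState().
  static void setInternalState(Object target, String field, Object value) {
    try {
      Field f = target.getClass().getDeclaredField(field);
      f.setAccessible(true);
      f.set(target, value);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Failed to set " + field, e);
    }
  }
}
{code}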

> Move Mockito up to version 2.x
> --
>
> Key: HADOOP-14178
> URL: https://issues.apache.org/jira/browse/HADOOP-14178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-14178.001.patch, HADOOP-14178.002.patch, 
> HADOOP-14178.003.patch, HADOOP-14178.004.patch, HADOOP-14178.005-wip.patch, 
> HADOOP-14178.005-wip2.patch, HADOOP-14178.005-wip3.patch, 
> HADOOP-14178.005-wip4.patch, HADOOP-14178.005-wip5.patch, 
> HADOOP-14178.005-wip6.patch, HADOOP-14178.005.patch, HADOOP-14178.006.patch, 
> HADOOP-14178.007.patch, HADOOP-14178.008.patch, HADOOP-14178.009.patch, 
> HADOOP-14178.010.patch, HADOOP-14178.011.patch, HADOOP-14178.012.patch
>
>
> I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 
> since the switch to Maven in 2011. 
> Mockito is now at version 2.1, [with lots of Java 8 
> support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. 
> That's not just defining actions as closures, but also support for Optional 
> types, mocking methods in interfaces, etc. 
> It's only used for testing and, *provided there aren't regressions*, the 
> cost of upgrade is low. The good news: test tools usually come with good 
> test coverage. The bad: Mockito does go deep into Java bytecodes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14178) Move Mockito up to version 2.x

2018-03-16 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HADOOP-14178:
---
Attachment: HADOOP-14178.012.patch

> Move Mockito up to version 2.x
> --
>
> Key: HADOOP-14178
> URL: https://issues.apache.org/jira/browse/HADOOP-14178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-14178.001.patch, HADOOP-14178.002.patch, 
> HADOOP-14178.003.patch, HADOOP-14178.004.patch, HADOOP-14178.005-wip.patch, 
> HADOOP-14178.005-wip2.patch, HADOOP-14178.005-wip3.patch, 
> HADOOP-14178.005-wip4.patch, HADOOP-14178.005-wip5.patch, 
> HADOOP-14178.005-wip6.patch, HADOOP-14178.005.patch, HADOOP-14178.006.patch, 
> HADOOP-14178.007.patch, HADOOP-14178.008.patch, HADOOP-14178.009.patch, 
> HADOOP-14178.010.patch, HADOOP-14178.011.patch, HADOOP-14178.012.patch
>
>
> I don't know when Hadoop picked up Mockito, but it has been frozen at 1.8.5 
> since the switch to Maven in 2011. 
> Mockito is now at version 2.1, [with lots of Java 8 
> support|https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2]. 
> That's not just defining actions as closures, but also support for Optional 
> types, mocking methods in interfaces, etc. 
> It's only used for testing and, *provided there aren't regressions*, the 
> cost of upgrade is low. The good news: test tools usually come with good 
> test coverage. The bad: Mockito does go deep into Java bytecodes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer -xtrack option to save src/dest filesets as alternative to delete()

2018-03-16 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

Summary: DistCp to offer -xtrack <path> option to save src/dest filesets as 
alternative to delete()  (was: DistCp to offer option to save src/dest filesets 
as alternative to delete())

> DistCp to offer -xtrack <path> option to save src/dest filesets as 
> alternative to delete()
> --
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch, HADOOP-15208-003.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of the source and dest 
> listings, people (myself included) can experiment with different strategies 
> before trying to commit one which doesn't scale.
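
The saved listings are plain SequenceFiles, so scratch tooling can replay or 
diff them offline. As a sketch (assuming the DistCp listing layout of a Text 
relative path keyed to a CopyListingFileStatus value), a minimal dump tool 
could look like:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.tools.CopyListingFileStatus;

public class ListingDump {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // args[0]: path to a saved source or dest listing.
    try (SequenceFile.Reader reader = new SequenceFile.Reader(conf,
        SequenceFile.Reader.file(new Path(args[0])))) {
      Text relPath = new Text();
      CopyListingFileStatus status = new CopyListingFileStatus();
      while (reader.next(relPath, status)) {
        System.out.println(relPath + "\t" + status.getLen());
      }
    }
  }
}
{code}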



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15208) DistCp to offer option to save src/dest filesets as alternative to delete()

2018-03-16 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-15208:

   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

This is fixed in HADOOP-15209

> DistCp to offer option to save src/dest filesets as alternative to delete()
> ---
>
> Key: HADOOP-15208
> URL: https://issues.apache.org/jira/browse/HADOOP-15208
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: tools/distcp
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HADOOP-15208-001.patch, HADOOP-15208-002.patch, 
> HADOOP-15208-002.patch, HADOOP-15208-003.patch
>
>
> There are opportunities to improve distcp delete performance and scalability 
> with object stores, but you need to test with production datasets to 
> determine if the optimizations work, don't run out of memory, etc.
> By adding the option to save the sequence files of the source and dest 
> listings, people (myself included) can experiment with different strategies 
> before trying to commit one which doesn't scale.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14178) Move Mockito up to version 2.x

2018-03-16 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401545#comment-16401545
 ] 

genericqa commented on HADOOP-14178:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 268 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 12m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
27m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-hdfs-project/hadoop-hdfs-native-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-mapreduce-project/hadoop-mapreduce-client hadoop-mapreduce-project 
hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 31m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 55m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 22m 11s{color} 
| {color:red} root generated 220 new + 1269 unchanged - 3 fixed = 1489 total 
(was 1272) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
5m 43s{color} | {color:orange} root: The patch generated 31 new + 7137 
unchanged - 82 fixed = 7168 total (was 7219) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 12m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m 
47s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 29s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-hdfs-project/hadoop-hdfs-native-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-mapreduce-project/hadoop-mapreduce-client hadoop-mapreduce-project 
hadoop-client-modules/hadoop-client-minicluster . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 35m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}172m 17s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | 

[jira] [Assigned] (HADOOP-15074) SequenceFile#Writer flush does not update the length of the written file.

2018-03-16 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh reassigned HADOOP-15074:
--

Assignee: Shashikant Banerjee  (was: Mukul Kumar Singh)

> SequenceFile#Writer flush does not update the length of the written file.
> -
>
> Key: HADOOP-15074
> URL: https://issues.apache.org/jira/browse/HADOOP-15074
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>
> SequenceFile#Writer flush does not update the length of the file. This 
> happens because, as part of the flush, the {{UPDATE_LENGTH}} flag is not 
> passed to {{DFSOutputStream#hsync}}.
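
As a sketch of the distinction (assuming the underlying stream is an HDFS 
one), a flush that should also advance the NameNode-visible length would need 
to call hsync with the flag, along these lines:

{code:java}
import java.io.IOException;
import java.util.EnumSet;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream;
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream.SyncFlag;

public class SyncWithLength {
  static void hsyncUpdatingLength(FSDataOutputStream out) throws IOException {
    if (out instanceof HdfsDataOutputStream) {
      // UPDATE_LENGTH asks the NameNode to record the flushed length,
      // so readers immediately see the new file size.
      ((HdfsDataOutputStream) out).hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH));
    } else {
      out.hsync();  // non-HDFS streams: plain hsync is the best we can do
    }
  }
}
{code}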



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15262) AliyunOSS: rename() to move files in a directory in parallel

2018-03-16 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401497#comment-16401497
 ] 

genericqa commented on HADOOP-15262:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-aliyun in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 52m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | HADOOP-15262 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914825/HADOOP-15262.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3f0c1191aaca 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 21c6661 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14322/testReport/ |
| Max. process+thread count | 303 (vs. ulimit of 1) |
| modules | C: hadoop-tools/hadoop-aliyun U: hadoop-tools/hadoop-aliyun |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/14322/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> AliyunOSS: rename() to move files in a directory in parallel
> 
>
> Key: HADOOP-15262
>