[jira] [Commented] (HADOOP-11335) KMS ACL in meta data or database

2015-01-17 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281666#comment-14281666
 ] 

Dian Fu commented on HADOOP-11335:
--

The test failure is unrelated to this patch. I have run the failed test case 
locally and it passed.

> KMS ACL in meta data or database
> 
>
> Key: HADOOP-11335
> URL: https://issues.apache.org/jira/browse/HADOOP-11335
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Jerry Chen
>Assignee: Dian Fu
>  Labels: Security
> Attachments: HADOOP-11335.001.patch, HADOOP-11335.002.patch, 
> HADOOP-11335.003.patch, KMS ACL in metadata or database.pdf
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> Currently Hadoop KMS implements ACLs for keys, and the per-key ACLs are 
> stored in the configuration file kms-acls.xml.
> Managing ACLs in a configuration file is not easy in enterprise usage, and 
> it creates difficulties for backup and recovery.
> It would be ideal to store the ACLs for keys in the key metadata, similar to 
> what file system ACLs do.  In this way, the backup and recovery that works 
> on keys should work for the key ACLs too.
> On the other hand, with the ACL in the metadata, the ACL of each key can be 
> easily manipulated with an API or command-line tool and take effect 
> instantly.  This is very important for enterprise-level access control 
> management.  This feature can be addressed by a separate JIRA.  With the 
> configuration file, these capabilities would be hard to provide.
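
As a purely illustrative aside (this code is not from the attached patches; every class and member name here is a hypothetical stand-in), key metadata carrying its own per-key ACL entries might look like the sketch below, which is what lets key backup and restore cover the ACLs for free:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: key metadata that carries per-key ACL entries, so
// any tool that backs up and restores key metadata also covers the ACLs.
public class MetadataWithAcls {
  private final String cipher;    // e.g. "AES/CTR/NoPadding"
  private final int bitLength;    // e.g. 128
  // Maps a key operation name (e.g. "DECRYPT_EEK") to the allowed principals.
  private final Map<String, String> keyAcls = new HashMap<String, String>();

  public MetadataWithAcls(String cipher, int bitLength) {
    this.cipher = cipher;
    this.bitLength = bitLength;
  }

  public String getCipher() { return cipher; }
  public int getBitLength() { return bitLength; }

  public void setAcl(String opType, String principals) {
    keyAcls.put(opType, principals);
  }

  public String getAcl(String opType) {
    return keyAcls.get(opType);
  }
}
{code}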



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11463) Replace method-local TransferManager object with S3AFileSystem#transfers

2015-01-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281506#comment-14281506
 ] 

Hadoop QA commented on HADOOP-11463:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12692927/hadoop-11463-003.patch
  against trunk revision 2908fe4.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-aws.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5423//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5423//console

This message is automatically generated.

> Replace method-local TransferManager object with S3AFileSystem#transfers
> 
>
> Key: HADOOP-11463
> URL: https://issues.apache.org/jira/browse/HADOOP-11463
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: hadoop-11463-001.patch, hadoop-11463-002.patch, 
> hadoop-11463-003.patch
>
>
> This is a continuation of HADOOP-11446.
> The following changes are made according to Thomas Demoor's comments:
> 1. Replace the method-local TransferManager object with S3AFileSystem#transfers
> 2. Do not shut down the TransferManager after purging existing multipart 
> files - otherwise the current transfer is unable to proceed
> 3. Shut down the TransferManager instance in the close method of S3AFileSystem
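
As a rough sketch of points 1 and 2 above (all names here are simplified stand-ins, not the actual S3AFileSystem code): the transfer manager becomes a long-lived field that survives the multipart purge, instead of a method-local object created and shut down per operation:

{code:java}
// Sketch only: TransferManagerStub plays the role of the AWS SDK
// TransferManager held in the S3AFileSystem#transfers field.
public class TransfersFieldSketch {
  private final TransferManagerStub transfers = new TransferManagerStub();

  public void purgeExistingMultipartUploads() {
    transfers.abortStaleMultipartUploads();
    // Deliberately no transfers.shutdownNow() here: the same instance must
    // stay alive for the transfers that follow (point 2 above).
  }

  public void close() {
    transfers.shutdownNow();  // shut down once, when the filesystem closes
  }

  static class TransferManagerStub {
    void abortStaleMultipartUploads() { /* delete stale multipart uploads */ }
    void shutdownNow() { /* stop transfer worker threads */ }
  }
}
{code}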



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11463) Replace method-local TransferManager object with S3AFileSystem#transfers

2015-01-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-11463:

Attachment: hadoop-11463-003.patch

> Replace method-local TransferManager object with S3AFileSystem#transfers
> 
>
> Key: HADOOP-11463
> URL: https://issues.apache.org/jira/browse/HADOOP-11463
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: hadoop-11463-001.patch, hadoop-11463-002.patch, 
> hadoop-11463-003.patch
>
>
> This is a continuation of HADOOP-11446.
> The following changes are made according to Thomas Demoor's comments:
> 1. Replace the method-local TransferManager object with S3AFileSystem#transfers
> 2. Do not shut down the TransferManager after purging existing multipart 
> files - otherwise the current transfer is unable to proceed
> 3. Shut down the TransferManager instance in the close method of S3AFileSystem



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11171) Enable using a proxy server to connect to S3a.

2015-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281481#comment-14281481
 ] 

Hudson commented on HADOOP-11171:
-

FAILURE: Integrated in Hadoop-trunk-Commit #6883 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6883/])
HADOOP-11171 Enable using a proxy server to connect to S3a. (Thomas Demoor via 
stevel) (stevel: rev 2908fe4ec52f78d74e4207274a34d88d54cd468f)
* hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
* 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AConfiguration.java
* hadoop-common-project/hadoop-common/CHANGES.txt


> Enable using a proxy server to connect to S3a.
> --
>
> Key: HADOOP-11171
> URL: https://issues.apache.org/jira/browse/HADOOP-11171
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.4.0
>Reporter: Thomas Demoor
>Assignee: Thomas Demoor
>  Labels: amazon, s3
> Fix For: 2.7.0
>
> Attachments: HADOOP-11171-10.patch, HADOOP-11171-2.patch, 
> HADOOP-11171-3.patch, HADOOP-11171-4.patch, HADOOP-11171-5.patch, 
> HADOOP-11171-6.patch, HADOOP-11171-7.patch, HADOOP-11171-8.patch, 
> HADOOP-11171-9.patch, HADOOP-11171.patch
>
>
> This exposes the AWS SDK config for a proxy (host and port) to s3a through 
> config settings.  
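
For reference, a hedged usage sketch of the new settings; the fs.s3a.proxy.host and fs.s3a.proxy.port key names are believed to match the patch's Constants, but verify them against your Hadoop version:

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hedged sketch: pointing an s3a client at an HTTP proxy via configuration.
public class S3AProxyExample {
  public static Configuration withProxy(String host, int port) {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.proxy.host", host);     // proxy hostname
    conf.setInt("fs.s3a.proxy.port", port);  // proxy port
    return conf;
  }
}
{code}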



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11463) Replace method-local TransferManager object with S3AFileSystem#transfers

2015-01-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281479#comment-14281479
 ] 

Steve Loughran commented on HADOOP-11463:
-

I had a look at the other {{FileSystem}} classes; I'm afraid you must also 
call {{super.close()}} for its cleanup work
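
A minimal sketch of the ordering being asked for (ParentFs and TransferManagerStub are stand-ins, not the real FileSystem hierarchy):

{code:java}
import java.io.Closeable;
import java.io.IOException;

// Sketch only: close() must invoke super.close() so the parent class can do
// its own cleanup, in addition to releasing the shared transfer manager.
class ParentFs implements Closeable {
  @Override
  public void close() throws IOException { /* parent cleanup work */ }
}

public class S3ACloseSketch extends ParentFs {
  private TransferManagerStub transfers = new TransferManagerStub();

  @Override
  public void close() throws IOException {
    super.close();              // the cleanup work referred to above
    if (transfers != null) {
      transfers.shutdownNow();  // then release the shared transfer manager
      transfers = null;
    }
  }

  static class TransferManagerStub {
    void shutdownNow() { /* stop transfer worker threads */ }
  }
}
{code}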

> Replace method-local TransferManager object with S3AFileSystem#transfers
> 
>
> Key: HADOOP-11463
> URL: https://issues.apache.org/jira/browse/HADOOP-11463
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: hadoop-11463-001.patch, hadoop-11463-002.patch
>
>
> This is a continuation of HADOOP-11446.
> The following changes are made according to Thomas Demoor's comments:
> 1. Replace the method-local TransferManager object with S3AFileSystem#transfers
> 2. Do not shut down the TransferManager after purging existing multipart 
> files - otherwise the current transfer is unable to proceed
> 3. Shut down the TransferManager instance in the close method of S3AFileSystem



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HADOOP-11171) Enable using a proxy server to connect to S3a.

2015-01-17 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-11171.
-
   Resolution: Fixed
Fix Version/s: 2.7.0

Committed. You are building up more things to document, though.

> Enable using a proxy server to connect to S3a.
> --
>
> Key: HADOOP-11171
> URL: https://issues.apache.org/jira/browse/HADOOP-11171
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.4.0
>Reporter: Thomas Demoor
>Assignee: Thomas Demoor
>  Labels: amazon, s3
> Fix For: 2.7.0
>
> Attachments: HADOOP-11171-10.patch, HADOOP-11171-2.patch, 
> HADOOP-11171-3.patch, HADOOP-11171-4.patch, HADOOP-11171-5.patch, 
> HADOOP-11171-6.patch, HADOOP-11171-7.patch, HADOOP-11171-8.patch, 
> HADOOP-11171-9.patch, HADOOP-11171.patch
>
>
> This exposes the AWS SDK config for a proxy (host and port) to s3a through 
> config settings.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11171) Enable using a proxy server to connect to S3a.

2015-01-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281477#comment-14281477
 ] 

Steve Loughran commented on HADOOP-11171:
-

now takes ~3s for me, dropping to ~2.7s on Java 8.

+1, committing.

> Enable using a proxy server to connect to S3a.
> --
>
> Key: HADOOP-11171
> URL: https://issues.apache.org/jira/browse/HADOOP-11171
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.4.0
>Reporter: Thomas Demoor
>Assignee: Thomas Demoor
>  Labels: amazon, s3
> Attachments: HADOOP-11171-10.patch, HADOOP-11171-2.patch, 
> HADOOP-11171-3.patch, HADOOP-11171-4.patch, HADOOP-11171-5.patch, 
> HADOOP-11171-6.patch, HADOOP-11171-7.patch, HADOOP-11171-8.patch, 
> HADOOP-11171-9.patch, HADOOP-11171.patch
>
>
> This exposes the AWS SDK config for a proxy (host and port) to s3a through 
> config settings.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()

2015-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281476#comment-14281476
 ] 

Hudson commented on HADOOP-10542:
-

FAILURE: Integrated in Hadoop-trunk-Commit #6882 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6882/])
HADOOP-10542 Potential null pointer dereference in Jets3tFileSystemStore 
retrieveBlock(). (Ted Yu via stevel) (stevel: rev 
c6c0f4eb25e511944915bc869e741197f7a277e0)
* 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3/Jets3tFileSystemStore.java
* hadoop-common-project/hadoop-common/CHANGES.txt


> Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
> ---
>
> Key: HADOOP-10542
> URL: https://issues.apache.org/jira/browse/HADOOP-10542
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: hadoop-10542-001.patch
>
>
> {code}
>   in = get(blockToKey(block), byteRangeStart);
>   out = new BufferedOutputStream(new FileOutputStream(fileBlock));
>   byte[] buf = new byte[bufferSize];
>   int numRead;
>   while ((numRead = in.read(buf)) >= 0) {
> {code}
> get() may return null.
> The while loop dereferences {{in}} without a null check.
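
A hedged sketch of the fix direction (not the committed patch): convert the null into an IOException so callers never dereference a null stream:

{code:java}
import java.io.IOException;
import java.io.InputStream;

// Sketch only: guard the value returned by get() before the read loop runs.
public class RetrieveBlockGuard {
  static InputStream requireStream(InputStream in, long blockId)
      throws IOException {
    if (in == null) {
      // Fail fast with an IOE instead of letting the caller NPE.
      throw new IOException("Null stream returned for block " + blockId);
    }
    return in;
  }
}
{code}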



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()

2015-01-17 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-10542:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, committing

> Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
> ---
>
> Key: HADOOP-10542
> URL: https://issues.apache.org/jira/browse/HADOOP-10542
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: hadoop-10542-001.patch
>
>
> {code}
>   in = get(blockToKey(block), byteRangeStart);
>   out = new BufferedOutputStream(new FileOutputStream(fileBlock));
>   byte[] buf = new byte[bufferSize];
>   int numRead;
>   while ((numRead = in.read(buf)) >= 0) {
> {code}
> get() may return null.
> The while loop dereferences {{in}} without a null check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11487) NativeS3FileSystem.getStatus must retry on FileNotFoundException

2015-01-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281462#comment-14281462
 ] 

Steve Loughran commented on HADOOP-11487:
-

# Which version of Hadoop?
# Which S3 zone? Only US-east lacks create consistency.

Blobstores are the bane of our lives. They aren't real filesystems; really, 
code that works with them needs to recognise this and act on it, though as 
applications all have standard expectations of files and their metadata, 
that's not easy.

It's not enough to retry on FS status as there are other inconsistencies: 
directory renames and deletes, blob updates, etc. 

There's a new FS client, s3a, in Hadoop 2.6, which is where all future fs/s3 
work is going on. Try it to see if it is any better, though I doubt it.

If we were to fix it, the route would be to go with something derived from 
Netflix's S3mper. Retrying on a 404 is not sufficient.

> NativeS3FileSystem.getStatus must retry on FileNotFoundException
> 
>
> Key: HADOOP-11487
> URL: https://issues.apache.org/jira/browse/HADOOP-11487
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs, fs/s3
>Reporter: Paulo Motta
>
> I'm trying to copy a large amount of files from HDFS to S3 via distcp and I'm 
> getting the following exception:
> {code:java}
> 2015-01-16 20:53:18,187 ERROR [main] 
> org.apache.hadoop.tools.mapred.CopyMapper: Failure in copying 
> hdfs://10.165.35.216/hdfsFolder/file.gz to s3n://s3-bucket/file.gz
> java.io.FileNotFoundException: No such file or directory 
> 's3n://s3-bucket/file.gz'
>   at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
>   at 
> org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> 2015-01-16 20:53:18,276 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.io.FileNotFoundException: No such file or 
> directory 's3n://s3-bucket/file.gz'
>   at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
>   at 
> org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
>   at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> {code}
> However, when I try hadoop fs -ls s3n://s3-bucket/file.gz the file is there, 
> so the job failure is probably due to Amazon's S3 eventual consistency.
> In my opinion, in order to fix this problem NativeS3FileSystem.getFileStatus 
> must use the fs.s3.maxRetries property to avoid failures like this.
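
A hedged sketch of the retry the reporter proposes; the fetcher interface and the back-off are assumptions, not NativeS3FileSystem code, and the comments above note that a plain retry only papers over one of several consistency issues:

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;

// Sketch only: retry a getFileStatus-style lookup up to maxRetries times
// (e.g. the value of fs.s3.maxRetries) on FileNotFoundException.
public class GetStatusRetrySketch {
  interface StatusFetcher<T> {
    T fetch() throws IOException;
  }

  static <T> T withRetries(StatusFetcher<T> fetcher, int maxRetries)
      throws IOException {
    FileNotFoundException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return fetcher.fetch();
      } catch (FileNotFoundException e) {
        last = e;  // possibly eventual consistency; wait and try again
        try {
          Thread.sleep(1000L * (attempt + 1));  // simple linear back-off
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted while retrying", ie);
        }
      }
    }
    throw last;  // every attempt saw FileNotFoundException
  }
}
{code}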



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11335) KMS ACL in meta data or database

2015-01-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281452#comment-14281452
 ] 

Hadoop QA commented on HADOOP-11335:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12692914/HADOOP-11335.003.patch
  against trunk revision 43302f6.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1205 javac 
compiler warnings (more than the trunk's current 1204 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5420//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5420//artifact/patchprocess/diffJavacWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5420//console

This message is automatically generated.

> KMS ACL in meta data or database
> 
>
> Key: HADOOP-11335
> URL: https://issues.apache.org/jira/browse/HADOOP-11335
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Jerry Chen
>Assignee: Dian Fu
>  Labels: Security
> Attachments: HADOOP-11335.001.patch, HADOOP-11335.002.patch, 
> HADOOP-11335.003.patch, KMS ACL in metadata or database.pdf
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> Currently Hadoop KMS implements ACLs for keys, and the per-key ACLs are 
> stored in the configuration file kms-acls.xml.
> Managing ACLs in a configuration file is not easy in enterprise usage, and 
> it creates difficulties for backup and recovery.
> It would be ideal to store the ACLs for keys in the key metadata, similar to 
> what file system ACLs do.  In this way, the backup and recovery that works 
> on keys should work for the key ACLs too.
> On the other hand, with the ACL in the metadata, the ACL of each key can be 
> easily manipulated with an API or command-line tool and take effect 
> instantly.  This is very important for enterprise-level access control 
> management.  This feature can be addressed by a separate JIRA.  With the 
> configuration file, these capabilities would be hard to provide.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11209) Configuration is not thread-safe

2015-01-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281400#comment-14281400
 ] 

Hadoop QA commented on HADOOP-11209:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12692920/HADOOP-11209.003.patch
  against trunk revision 43302f6.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.ha.TestZKFailoverControllerStress

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5421//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5421//console

This message is automatically generated.

> Configuration is not thread-safe
> 
>
> Key: HADOOP-11209
> URL: https://issues.apache.org/jira/browse/HADOOP-11209
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Josh Rosen
>Assignee: Varun Saxena
> Attachments: HADOOP-11209.001.patch, HADOOP-11209.002.patch, 
> HADOOP-11209.003.patch
>
>
> {{Configuration}} objects are not fully thread-safe, which causes problems in 
> multi-threaded frameworks like Spark that use these configurations to 
> interact with existing Hadoop APIs (such as InputFormats).
> SPARK-2546 is an example of a problem caused by this lack of thread-safety.  
> In that bug, multiple concurrent modifications of the same Configuration (in 
> third-party code) caused an infinite loop because Configuration's internal 
> {{java.util.HashMap}} is not thread-safe.
> One workaround is for our code to clone Configuration objects; unfortunately, 
> this also suffers from thread-safety issues on older Hadoop versions because 
> Configuration's constructor wasn't thread-safe (HADOOP-10456).
> [Looking at a recent version of 
> Configuration.java|https://github.com/apache/hadoop/blob/d989ac04449dc33da5e2c32a7f24d59cc92de536/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L666],
>  it seems that the private {{updatingResource}} HashMap and 
> {{finalParameters}} HashSet fields are the only non-thread-safe collections in 
> Configuration (Java's {{Properties}} class is thread-safe), so I don't think 
> that it would be hard to make Configuration fully thread-safe.
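
A hedged sketch of the fix direction the description implies (the field names follow the description, but this is not the attached patch): rebuild the two collections on thread-safe foundations:

{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: the two collections named in the description, made
// thread-safe; Java's Properties is already synchronized.
public class ThreadSafeConfCollections {
  private final Map<String, String[]> updatingResource =
      new ConcurrentHashMap<String, String[]>();
  private final Set<String> finalParameters =
      Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
}
{code}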



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()

2015-01-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281392#comment-14281392
 ] 

Hadoop QA commented on HADOOP-10542:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12692922/hadoop-10542-001.patch
  against trunk revision 43302f6.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-aws.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5422//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/5422//console

This message is automatically generated.

> Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
> ---
>
> Key: HADOOP-10542
> URL: https://issues.apache.org/jira/browse/HADOOP-10542
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: hadoop-10542-001.patch
>
>
> {code}
>   in = get(blockToKey(block), byteRangeStart);
>   out = new BufferedOutputStream(new FileOutputStream(fileBlock));
>   byte[] buf = new byte[bufferSize];
>   int numRead;
>   while ((numRead = in.read(buf)) >= 0) {
> {code}
> get() may return null.
> The while loop dereferences {{in}} without a null check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()

2015-01-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-10542:

Attachment: hadoop-10542-001.patch

> Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
> ---
>
> Key: HADOOP-10542
> URL: https://issues.apache.org/jira/browse/HADOOP-10542
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: hadoop-10542-001.patch
>
>
> {code}
>   in = get(blockToKey(block), byteRangeStart);
>   out = new BufferedOutputStream(new FileOutputStream(fileBlock));
>   byte[] buf = new byte[bufferSize];
>   int numRead;
>   while ((numRead = in.read(buf)) >= 0) {
> {code}
> get() may return null.
> The while loop dereferences {{in}} without a null check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()

2015-01-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HADOOP-10542:

Status: Patch Available  (was: Open)

> Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
> ---
>
> Key: HADOOP-10542
> URL: https://issues.apache.org/jira/browse/HADOOP-10542
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Attachments: hadoop-10542-001.patch
>
>
> {code}
>   in = get(blockToKey(block), byteRangeStart);
>   out = new BufferedOutputStream(new FileOutputStream(fileBlock));
>   byte[] buf = new byte[bufferSize];
>   int numRead;
>   while ((numRead = in.read(buf)) >= 0) {
> {code}
> get() may return null.
> The while loop dereferences {{in}} without a null check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()

2015-01-17 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HADOOP-10542:
---

Assignee: Ted Yu

> Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
> ---
>
> Key: HADOOP-10542
> URL: https://issues.apache.org/jira/browse/HADOOP-10542
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> {code}
>   in = get(blockToKey(block), byteRangeStart);
>   out = new BufferedOutputStream(new FileOutputStream(fileBlock));
>   byte[] buf = new byte[bufferSize];
>   int numRead;
>   while ((numRead = in.read(buf)) >= 0) {
> {code}
> get() may return null.
> The while loop dereferences {{in}} without a null check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11209) Configuration is not thread-safe

2015-01-17 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated HADOOP-11209:
--
Status: Patch Available  (was: Open)

> Configuration is not thread-safe
> 
>
> Key: HADOOP-11209
> URL: https://issues.apache.org/jira/browse/HADOOP-11209
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Josh Rosen
>Assignee: Varun Saxena
> Attachments: HADOOP-11209.001.patch, HADOOP-11209.002.patch, 
> HADOOP-11209.003.patch
>
>
> {{Configuration}} objects are not fully thread-safe, which causes problems in 
> multi-threaded frameworks like Spark that use these configurations to 
> interact with existing Hadoop APIs (such as InputFormats).
> SPARK-2546 is an example of a problem caused by this lack of thread-safety.  
> In that bug, multiple concurrent modifications of the same Configuration (in 
> third-party code) caused an infinite loop because Configuration's internal 
> {{java.util.HashMap}} is not thread-safe.
> One workaround is for our code to clone Configuration objects; unfortunately, 
> this also suffers from thread-safety issues on older Hadoop versions because 
> Configuration's constructor wasn't thread-safe (HADOOP-10456).
> [Looking at a recent version of 
> Configuration.java|https://github.com/apache/hadoop/blob/d989ac04449dc33da5e2c32a7f24d59cc92de536/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L666],
>  it seems that the private {{updatingResource}} HashMap and 
> {{finalParameters}} HashSet fields are the only non-thread-safe collections in 
> Configuration (Java's {{Properties}} class is thread-safe), so I don't think 
> that it would be hard to make Configuration fully thread-safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11209) Configuration is not thread-safe

2015-01-17 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated HADOOP-11209:
--
Status: Open  (was: Patch Available)

> Configuration is not thread-safe
> 
>
> Key: HADOOP-11209
> URL: https://issues.apache.org/jira/browse/HADOOP-11209
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Josh Rosen
>Assignee: Varun Saxena
> Attachments: HADOOP-11209.001.patch, HADOOP-11209.002.patch, 
> HADOOP-11209.003.patch
>
>
> {{Configuration}} objects are not fully thread-safe, which causes problems in 
> multi-threaded frameworks like Spark that use these configurations to 
> interact with existing Hadoop APIs (such as InputFormats).
> SPARK-2546 is an example of a problem caused by this lack of thread-safety.  
> In that bug, multiple concurrent modifications of the same Configuration (in 
> third-party code) caused an infinite loop because Configuration's internal 
> {{java.util.HashMap}} is not thread-safe.
> One workaround is for our code to clone Configuration objects; unfortunately, 
> this also suffers from thread-safety issues on older Hadoop versions because 
> Configuration's constructor wasn't thread-safe (HADOOP-10456).
> [Looking at a recent version of 
> Configuration.java|https://github.com/apache/hadoop/blob/d989ac04449dc33da5e2c32a7f24d59cc92de536/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L666],
>  it seems that the private {{updatingResource}} HashMap and 
> {{finalParameters}} HashSet fields are the only non-thread-safe collections in 
> Configuration (Java's {{Properties}} class is thread-safe), so I don't think 
> that it would be hard to make Configuration fully thread-safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11209) Configuration is not thread-safe

2015-01-17 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated HADOOP-11209:
--
Attachment: HADOOP-11209.003.patch

> Configuration is not thread-safe
> 
>
> Key: HADOOP-11209
> URL: https://issues.apache.org/jira/browse/HADOOP-11209
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Josh Rosen
>Assignee: Varun Saxena
> Attachments: HADOOP-11209.001.patch, HADOOP-11209.002.patch, 
> HADOOP-11209.003.patch
>
>
> {{Configuration}} objects are not fully thread-safe, which causes problems in 
> multi-threaded frameworks like Spark that use these configurations to 
> interact with existing Hadoop APIs (such as InputFormats).
> SPARK-2546 is an example of a problem caused by this lack of thread-safety.  
> In that bug, multiple concurrent modifications of the same Configuration (in 
> third-party code) caused an infinite loop because Configuration's internal 
> {{java.util.HashMap}} is not thread-safe.
> One workaround is for our code to clone Configuration objects; unfortunately, 
> this also suffers from thread-safety issues on older Hadoop versions because 
> Configuration's constructor wasn't thread-safe (HADOOP-10456).
> [Looking at a recent version of 
> Configuration.java|https://github.com/apache/hadoop/blob/d989ac04449dc33da5e2c32a7f24d59cc92de536/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L666],
>  it seems that the private {{updatingResource}} HashMap and 
> {{finalParameters}} HashSet fields are the only non-thread-safe collections in 
> Configuration (Java's {{Properties}} class is thread-safe), so I don't think 
> that it would be hard to make Configuration fully thread-safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11209) Configuration is not thread-safe

2015-01-17 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281355#comment-14281355
 ] 

Varun Saxena commented on HADOOP-11209:
---

Thanks [~ozawa] for the review. It's difficult to simulate thread-safety 
issues, so I will just update a test case which accesses/modifies the 
{{Configuration}} object from multiple threads.
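
A hedged sketch of what such a test could look like; ConfStub stands in for {{Configuration}}, and the thread and iteration counts are arbitrary:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: hammer one shared configuration-like object from several
// threads; a non-thread-safe implementation tends to corrupt state or hang.
public class ConcurrentConfTestSketch {
  static class ConfStub {
    private final Map<String, String> props =
        new ConcurrentHashMap<String, String>();
    void set(String k, String v) { props.put(k, v); }
    String get(String k) { return props.get(k); }
  }

  public static void main(String[] args) throws InterruptedException {
    final ConfStub conf = new ConfStub();
    ExecutorService pool = Executors.newFixedThreadPool(8);
    for (int t = 0; t < 8; t++) {
      final int id = t;
      pool.execute(new Runnable() {
        @Override
        public void run() {
          for (int i = 0; i < 10000; i++) {
            conf.set("key-" + id, "value-" + i);  // concurrent writes
            conf.get("key-" + id);                // concurrent reads
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
  }
}
{code}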

> Configuration is not thread-safe
> 
>
> Key: HADOOP-11209
> URL: https://issues.apache.org/jira/browse/HADOOP-11209
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Josh Rosen
>Assignee: Varun Saxena
> Attachments: HADOOP-11209.001.patch, HADOOP-11209.002.patch, 
> HADOOP-11209.003.patch
>
>
> {{Configuration}} objects are not fully thread-safe, which causes problems in 
> multi-threaded frameworks like Spark that use these configurations to 
> interact with existing Hadoop APIs (such as InputFormats).
> SPARK-2546 is an example of a problem caused by this lack of thread-safety.  
> In that bug, multiple concurrent modifications of the same Configuration (in 
> third-party code) caused an infinite loop because Configuration's internal 
> {{java.util.HashMap}} is not thread-safe.
> One workaround is for our code to clone Configuration objects; unfortunately, 
> this also suffers from thread-safety issues on older Hadoop versions because 
> Configuration's constructor wasn't thread-safe (HADOOP-10456).
> [Looking at a recent version of 
> Configuration.java|https://github.com/apache/hadoop/blob/d989ac04449dc33da5e2c32a7f24d59cc92de536/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L666],
>  it seems that the private {{updatingResource}} HashMap and 
> {{finalParameters}} HashSet fields are the only non-thread-safe collections in 
> Configuration (Java's {{Properties}} class is thread-safe), so I don't think 
> that it would be hard to make Configuration fully thread-safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11335) KMS ACL in meta data or database

2015-01-17 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281352#comment-14281352
 ] 

Dian Fu commented on HADOOP-11335:
--

Hi [~asuresh], thanks a lot for your review and comments.
{quote}
1) JavaKeyStoreProvider:
   * createKey() : I think it might be a bit odd that we have metadata with 
version == 0 
{quote}
OK, I have updated the patch and the version will no longer be equal to 0.
{quote}
* deleteKey() : I am sorry, I might be missing something, but I did not quite 
understand why we can't just delete the metadata from the cache when the key is 
deleted
{quote}
Because we may store the ACL in the metadata, and the ACL shouldn't be deleted 
when the key is deleted.
{quote}
2) KeyProvider:
   * The Metadata Class now has a dependency on {{KeyOpType}} which is defined 
in {{KeyProviderAuthorizationExtension}}. The Extension classes were meant to 
add functionality to a KeyProvider. It seems a bit weird that the KeyProvider 
class should have a dependency on an Extension.
{quote}
Agree. Have moved {{KeyOpType}} to {{KeyProvider}}.
{quote}
* Not very comfortable adding setters to Metadata. I guess the original 
implementation consciously made a choice to not allow modification of metadata 
once it is created (except for the version)
{quote}
Agree. Have removed the setters of Metadata.
{quote}
3) KeyProviderExtension:
   * Do we really need the read and write locks? The underlying KeyProvider 
should take care of the synchronization (for e.g. {{JavaKeyStoreProvider}} 
does in fact use write and read locks for createKey etc.); this would 
probably lead to unnecessary double locking. 
4) KeyProviderAuthorizationExtension:
   * Same as above: do we really need the read and write locks? I feel the 
Extension class should handle its own concurrency semantics
{quote}
The lock in {{JavaKeyStoreProvider}} is only used for key-related operations 
such as {{createKey}}, {{deleteKey}}, etc. The newly added methods such as 
{{createKeyAcl}} and {{deleteKeyAcl}} also need to read/write the keystore, 
but they only exist in {{KeyProviderAuthorizationExtension}}, not in 
{{KeyProvider}}. So we add the lock in {{KeyProviderExtension}}, and the lock 
in {{KeyProviderAuthorizationExtension}} is inherited from 
{{KeyProviderExtension}}.
{quote}
5) MetadataKeyAuthorizer
   * Remove commented code
{quote}
Removed the commented code in latest patch.
{quote}
Looking at the commented code in {{MetadataKeyAuthorizer}}, I see that you had 
initially toyed with having an extended {{MetadataWithACL}} class. Any reason 
why you did not pursue that design? It seems to me like that could have been a 
way to avoid having to modify the {{JavaKeyStoreProvider}} and 
{{KeyProvider}}. One suggestion would have been to templatize {{KeyProvider}} 
like so:
{noformat}
  public class KeyProvider<M extends Metadata>
  ...
{noformat}
and have different implementations of a {{KeyProvider}} like:
{noformat}
  public class KeyProviderWithACLs extends KeyProvider<MetadataWithACLs>
  ...
{noformat}
{quote}
The approach you suggested is a good one. I tried it, but I found some 
problems. For example, currently we create 
{{KeyProviderAuthorizationExtension}} by wrapping a {{KeyProvider}} in it, and 
this {{KeyProvider}} must be created first. Then, when we create the 
{{KeyProvider}}, we would already have to know the type of the {{Metadata}}, 
for example whether it's {{Metadata}} or {{MetadataWithAcl}}. This is a little 
weird, as whether the ACL is stored in the Metadata should be controlled by 
{{KeyProviderAuthorizationExtension}}.
I chose to solve this issue by modifying {{Metadata}} so that it now has two 
elements, {{MetadataForKey}} and {{MetadataForAcl}}. You can refer to the 
latest patch (revision 003) for the detailed implementation.
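
A hedged reading of that revision-003 shape (the {{MetadataForKey}} and {{MetadataForAcl}} names come from the comment above; the fields are illustrative guesses, not the patch):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Sketch only: Metadata as a composite of key metadata and ACL metadata, so
// ACL storage stays an internal detail of Metadata rather than a separate
// Metadata subtype that must be chosen at KeyProvider construction time.
public class CompositeMetadataSketch {
  static class MetadataForKey {
    String cipher;
    int bitLength;
    int versions;
  }

  static class MetadataForAcl {
    // Maps a key operation name to the principals allowed to perform it.
    final Map<String, String> aclsByOpType = new HashMap<String, String>();
  }

  private final MetadataForKey keyPart = new MetadataForKey();
  private final MetadataForAcl aclPart = new MetadataForAcl();

  MetadataForKey getKeyPart() { return keyPart; }
  MetadataForAcl getAclPart() { return aclPart; }
}
{code}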

> KMS ACL in meta data or database
> 
>
> Key: HADOOP-11335
> URL: https://issues.apache.org/jira/browse/HADOOP-11335
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Jerry Chen
>Assignee: Dian Fu
>  Labels: Security
> Attachments: HADOOP-11335.001.patch, HADOOP-11335.002.patch, 
> HADOOP-11335.003.patch, KMS ACL in metadata or database.pdf
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> Currently Hadoop KMS implements ACLs for keys, and the per-key ACLs are 
> stored in the configuration file kms-acls.xml.
> Managing ACLs in a configuration file is not easy in enterprise usage, and 
> it creates difficulties for backup and recovery.
> It would be ideal to store the ACLs for keys in the key metadata, similar to 
> what file system ACLs do.  In this way, the backup and recovery that works 
> on keys should work for the key ACLs too.
> On the other hand, with the ACL in the metadata, the ACL of each key can be 
> easily manipulated with an API or command-line tool and take effect 
> instantly.

[jira] [Created] (HADOOP-11487) NativeS3FileSystem.getStatus must retry on FileNotFoundException

2015-01-17 Thread Paulo Motta (JIRA)
Paulo Motta created HADOOP-11487:


 Summary: NativeS3FileSystem.getStatus must retry on 
FileNotFoundException
 Key: HADOOP-11487
 URL: https://issues.apache.org/jira/browse/HADOOP-11487
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs, fs/s3
Reporter: Paulo Motta


I'm trying to copy a large amount of files from HDFS to S3 via distcp and I'm 
getting the following exception:

{code:java}
2015-01-16 20:53:18,187 ERROR [main] org.apache.hadoop.tools.mapred.CopyMapper: 
Failure in copying hdfs://10.165.35.216/hdfsFolder/file.gz to 
s3n://s3-bucket/file.gz
java.io.FileNotFoundException: No such file or directory 
's3n://s3-bucket/file.gz'
at 
org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
at 
org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
2015-01-16 20:53:18,276 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.io.FileNotFoundException: No such file or 
directory 's3n://s3-bucket/file.gz'
at 
org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:445)
at 
org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:187)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:233)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
{code}

However, when I try hadoop fs -ls s3n://s3-bucket/file.gz the file is there, so 
the job failure is probably due to Amazon's S3 eventual consistency.

In my opinion, in order to fix this problem NativeS3FileSystem.getFileStatus 
must use the fs.s3.maxRetries property to avoid failures like this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11335) KMS ACL in meta data or database

2015-01-17 Thread Dian Fu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated HADOOP-11335:
-
Attachment: HADOOP-11335.003.patch

> KMS ACL in meta data or database
> 
>
> Key: HADOOP-11335
> URL: https://issues.apache.org/jira/browse/HADOOP-11335
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 2.6.0
>Reporter: Jerry Chen
>Assignee: Dian Fu
>  Labels: Security
> Attachments: HADOOP-11335.001.patch, HADOOP-11335.002.patch, 
> HADOOP-11335.003.patch, KMS ACL in metadata or database.pdf
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> Currently Hadoop KMS implements ACLs for keys, and the per-key ACLs are 
> stored in the configuration file kms-acls.xml.
> Managing ACLs in a configuration file is not easy in enterprise usage, and 
> it creates difficulties for backup and recovery.
> It would be ideal to store the ACLs for keys in the key metadata, similar to 
> what file system ACLs do.  In this way, the backup and recovery that works 
> on keys should work for the key ACLs too.
> On the other hand, with the ACL in the metadata, the ACL of each key can be 
> easily manipulated with an API or command-line tool and take effect 
> instantly.  This is very important for enterprise-level access control 
> management.  This feature can be addressed by a separate JIRA.  With the 
> configuration file, these capabilities would be hard to provide.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10542) Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()

2015-01-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281318#comment-14281318
 ] 

Steve Loughran commented on HADOOP-10542:
-

IOE: it's the only thing that ensures callers won't themselves NPE

> Potential null pointer dereference in Jets3tFileSystemStore#retrieveBlock()
> ---
>
> Key: HADOOP-10542
> URL: https://issues.apache.org/jira/browse/HADOOP-10542
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
>   in = get(blockToKey(block), byteRangeStart);
>   out = new BufferedOutputStream(new FileOutputStream(fileBlock));
>   byte[] buf = new byte[bufferSize];
>   int numRead;
>   while ((numRead = in.read(buf)) >= 0) {
> {code}
> get() may return null.
> The while loop dereferences {{in}} without a null check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HADOOP-9248) Allow configuration of Amazon S3 Endpoint

2015-01-17 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-9248.

   Resolution: Won't Fix
Fix Version/s: 2.7.0

OK, closing as a wontfix then. 

Timur, when the next beta release of Hadoop comes out (or even better, grab 
the branch-2 branch and build it), please test the s3a support and make sure 
it works for you.
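
For readers arriving later: s3a grew a configurable endpoint of its own. A hedged usage sketch, assuming the fs.s3a.endpoint key (verify against the Hadoop version you run):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hedged sketch: pointing s3a at a non-AWS, S3-compatible endpoint.
public class S3AEndpointExample {
  public static Configuration withEndpoint(String endpoint) {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.endpoint", endpoint);  // e.g. an S3-compatible gateway
    return conf;
  }
}
{code}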

> Allow configuration of Amazon S3 Endpoint
> -
>
> Key: HADOOP-9248
> URL: https://issues.apache.org/jira/browse/HADOOP-9248
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
> Environment: All environments connecting to S3
>Reporter: Timur Perelmutov
> Fix For: 2.7.0
>
>
> The http://wiki.apache.org/hadoop/AmazonS3 page describes the configuration of 
> Hadoop with S3 as storage. Other systems like EMC Atmos now implement the S3 
> interface, but in order to connect to them, the endpoint needs to be 
> configurable. Please add a configuration parameter that would be propagated 
> to the underlying jets3t library as the s3service.s3-endpoint param.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)