[jira] [Updated] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7235: Attachment: HDFS-7235.007.patch

DataNode#transferBlock should report blocks that don't exist using reportBadBlock
Key: HDFS-7235
URL: https://issues.apache.org/jira/browse/HDFS-7235
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode, namenode
Affects Versions: 2.6.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch

When decommissioning a DN, the process hangs. What happens is: when the NN chooses a replica as a source to replicate data from the to-be-decommissioned DN to other DNs, it favors choosing the to-be-decommissioned DN itself as the source of the transfer (see BlockManager.java). However, because of a bad disk, the DN detects the source block to be transferred as an invalid block, via the following logic in FsDatasetImpl.java:
{code}
/** Does the block exist and have the given state? */
private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
  final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(),
      b.getLocalBlock());
  return replicaInfo != null
      && replicaInfo.getState() == state
      && replicaInfo.getBlockFile().exists();
}
{code}
The reason this method returns false (detecting an invalid block) is that the block file doesn't exist, due to the bad disk in this case. The key issue we found here is: after the DN detects an invalid block for the above reason, it doesn't report the invalid block back to the NN, so the NN doesn't know the block is corrupted and keeps sending the data-transfer request to the same to-be-decommissioned DN, again and again. This causes an infinite loop, so the decommission process hangs. Thanks [~qwertymaniac] for reporting the issue and the initial analysis.
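The three-clause check above can be illustrated with a self-contained simulation. The stub types below are illustrative stand-ins for this sketch, not Hadoop's actual ReplicaInfo/volumeMap classes; the point is that any one failing clause (here, a missing block file) makes the replica invalid:

```java
// Hypothetical, self-contained simulation of the isValid() logic from
// FsDatasetImpl. ReplicaStub is an illustrative stand-in, not Hadoop's type.
public class ReplicaValidityDemo {
    enum ReplicaState { FINALIZED, RBW }

    static class ReplicaStub {
        final ReplicaState state;
        final boolean blockFileExists;
        ReplicaStub(ReplicaState state, boolean blockFileExists) {
            this.state = state;
            this.blockFileExists = blockFileExists;
        }
    }

    /** Mirrors the three clauses joined by && in the original. */
    static boolean isValid(ReplicaStub replica, ReplicaState expected) {
        return replica != null
                && replica.state == expected
                && replica.blockFileExists;
    }

    public static void main(String[] args) {
        // A finalized replica whose file was lost to a bad disk: invalid.
        ReplicaStub badDisk = new ReplicaStub(ReplicaState.FINALIZED, false);
        System.out.println(isValid(badDisk, ReplicaState.FINALIZED)); // false
        // A healthy finalized replica: valid.
        ReplicaStub healthy = new ReplicaStub(ReplicaState.FINALIZED, true);
        System.out.println(isValid(healthy, ReplicaState.FINALIZED)); // true
    }
}
```

The bug described in the issue is precisely that when the first case occurs, the negative result is swallowed instead of being surfaced to the NN via reportBadBlocks.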
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186433#comment-14186433 ] Hadoop QA commented on HDFS-7235: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677534/HDFS-7235.007.patch against trunk revision 971e91c. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8562//console This message is automatically generated.
[jira] [Commented] (HDFS-7291) Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows
[ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186434#comment-14186434 ] Hadoop QA commented on HDFS-7291: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677527/HDFS-7291.3.patch against trunk revision 971e91c.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 2.0.3) to fail.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8561//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8561//console
This message is automatically generated.

Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows
Key: HDFS-7291
URL: https://issues.apache.org/jira/browse/HDFS-7291
Project: Hadoop HDFS
Issue Type: Sub-task
Components: datanode
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
Attachments: HDFS-7291.0.patch, HDFS-7291.1.patch, HDFS-7291.2.patch, HDFS-7291.3.patch

HDFS-7090 changed to persist in-memory replicas using unbuffered IO on Linux and Windows. On Linux, it relies on the sendfile() API between two file descriptors to achieve an unbuffered IO copy. According to the Linux man page at http://man7.org/linux/man-pages/man2/sendfile.2.html, this is only supported on Linux kernel 2.6.33+. As pointed out by Haowei in the discussion below, FileChannel#transferTo already has support for native unbuffered IO on POSIX platforms. On Windows, JDK 6/7/8 has not implemented native unbuffered IO yet. We change to use FileChannel#transferTo for POSIX and our own native wrapper of CopyFileEx on Windows for the unbuffered copy.
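The POSIX side of the approach described above can be sketched with standard JDK APIs. This is a minimal, self-contained illustration of copying a file through FileChannel#transferTo, which the JDK can back with an OS-level copy path (such as sendfile on Linux) rather than buffered user-space IO; the file names are illustrative, and this is not the patch's actual code:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TransferToCopyDemo {
    // Copy src to dst via FileChannel#transferTo.
    static void copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long size = in.size();
            // transferTo may copy fewer bytes than requested, so loop.
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("replica", ".src");
        Path dst = Files.createTempFile("replica", ".dst");
        Files.write(src, "block data".getBytes(StandardCharsets.UTF_8));
        copy(src, dst);
        System.out.println(
            new String(Files.readAllBytes(dst), StandardCharsets.UTF_8)); // block data
    }
}
```

The loop matters: transferTo makes no guarantee of transferring the full requested count in one call, so callers must advance the position until the whole file has been copied.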
[jira] [Commented] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186437#comment-14186437 ] Yongjun Zhang commented on HDFS-7235: - There was a TestDistributedShell failure reported as YARN-2607; the symptom there looks a bit different from the one reported above. I ran the same test locally before (trunk tip 5b1dfe78b8b06335bed0bcb83f12bb936d4c021b) and after the patch; they failed the same way, but the symptom when running locally is different from both YARN-2607 and the report above. This test seems to need some more study. Just uploaded the same patch here again to see how it runs.
[jira] [Updated] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7235: Attachment: (was: HDFS-7235.007.patch)
[jira] [Updated] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7235: Attachment: HDFS-7235.007.patch
[jira] [Updated] (HDFS-7291) Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows
[ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7291: - Attachment: HDFS-7291.4.patch
NativeIO.c has mixed usage of IOException and UnsupportedOperationException in similar cases, but I agree with [~wheat9] that UnsupportedOperationException is more appropriate; patch updated. The previous findbugs error is caused by a test-patch.sh issue which is unrelated to the patch:
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/dev-support/test-patch.sh: line 628: /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/../patchprocess/patchFindBugsOutputhadoop-hdfs.txt: No such file or directory
[jira] [Commented] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186453#comment-14186453 ] Hadoop QA commented on HDFS-7235: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677536/HDFS-7235.007.patch against trunk revision 971e91c. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8564//console This message is automatically generated.
[jira] [Commented] (HDFS-7291) Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows
[ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186459#comment-14186459 ] Hadoop QA commented on HDFS-7291: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677537/HDFS-7291.4.patch against trunk revision 971e91c. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8563//console This message is automatically generated.
[jira] [Commented] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails
[ https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186474#comment-14186474 ] Harsh J commented on HDFS-6741: --- The failed test appears unrelated. A manual run with the patch passes, so the build problem was likely intermittent:
{code}
Running org.apache.hadoop.hdfs.tools.TestDFSAdminWithHA
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 58.649 sec - in org.apache.hadoop.hdfs.tools.TestDFSAdminWithHA
Results :
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0
{code}
The build console appears truncated somehow for this test:
{code}
Running org.apache.hadoop.hdfs.tools.TestDFSAdminWithHA
estBlockMissingException
{code}
Thank you for the new patch Stephen, and sorry again for having duplicated the effort. I'll commit this momentarily.

Improve permission denied message when FSPermissionChecker#checkOwner fails
Key: HDFS-6741
URL: https://issues.apache.org/jira/browse/HDFS-6741
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 3.0.0, 2.5.0
Reporter: Stephen Chu
Assignee: Harsh J
Priority: Trivial
Labels: supportability
Attachments: HDFS-6741.1.patch, HDFS-6741.2.patch, HDFS-6741.2.patch

Currently, FSPermissionChecker#checkOwner throws an AccessControlException with a simple "Permission denied" message. When users try to set an ACL without ownership permissions, they'll see something like:
{code}
[schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp
setfacl: Permission denied
{code}
It'd be helpful if the message explained why the permission was denied, to avoid confusion for users who aren't familiar with permissions.
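The kind of improvement discussed above can be sketched as follows. This is a hypothetical message builder, not the wording the committed patch actually uses; the idea is simply that an ownership-check failure should say who the caller is, which path is involved, and who actually owns it:

```java
// Illustrative sketch only: the method name and message format below are
// hypothetical, not the exact text committed in HDFS-6741.
public class OwnerCheckMessageDemo {
    static String ownerDeniedMessage(String user, String path, String owner) {
        return String.format(
            "Permission denied. user=%s is not the owner of inode=%s (owner=%s)",
            user, path, owner);
    }

    public static void main(String[] args) {
        // For the setfacl scenario from the issue description:
        System.out.println(ownerDeniedMessage("schu", "/tmp", "hdfs"));
    }
}
```

A message in this shape tells the user immediately that the operation requires ownership, rather than leaving them to guess which of several permission checks failed.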
[jira] [Updated] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails
[ https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-6741: -- Target Version/s: (was: 3.0.0, 2.7.0) Hadoop Flags: Reviewed
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186481#comment-14186481 ] bc Wong commented on HDFS-7295: ---
bq. Given the fact that in Hadoop there is no way to revoke a DT, expiration time serves as the last defense of stolen tokens.
Not quite true. The [mechanism|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java#L514] is there, and even exposed in [WebHDFS|http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Cancel_Delegation_Token]. But I'll concede that users can't get a list of all outstanding DTs (short of using the OIV), which makes revocation difficult.
Let's separate the right security model from current feature limitations in HDFS. It's straightforward to build a revocation mechanism, along with some stats reporting on DT usage, plus auditing. So the lack of revocation today shouldn't affect the direction we choose.
The alternative, which is to put real users' keytabs on the cluster, is far worse. (Again, the use-case example is a long-running Spark Streaming app, which runs as a real user, not a service account.) First, a compromise of the keytab affects the user's corporate AD account. Second, normal users usually can't get keytabs. I think it's hard for most enterprise users to accept this alternative.

Support arbitrary max expiration times for delegation token
Key: HDFS-7295
URL: https://issues.apache.org/jira/browse/HDFS-7295
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot

Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. This is a problem for different users of HDFS, such as long-running YARN apps. Users should be allowed to optionally specify a max lifetime for their tokens.
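The two timers being debated in this thread can be modeled in a few lines. This is an illustrative simulation under assumed constants, not Hadoop's actual token classes: renewals extend a token's expiry, but never beyond a hard max lifetime fixed at issue time, which is the 7-day cap the issue proposes making configurable:

```java
// Hypothetical model of DT renewal vs. max lifetime. Names and constants
// are illustrative; Hadoop's real renew interval and max lifetime are
// configuration values, with the max lifetime defaulting to 7 days.
public class TokenLifetimeDemo {
    static final long RENEW_INTERVAL_MS = 24L * 60 * 60 * 1000;     // 1 day
    static final long MAX_LIFETIME_MS   = 7L * 24 * 60 * 60 * 1000; // 7 days

    /** Returns the new expiry time, or -1 if renewal is refused. */
    static long renew(long issueTimeMs, long nowMs) {
        long maxDate = issueTimeMs + MAX_LIFETIME_MS;
        if (nowMs > maxDate) {
            // Past the max lifetime: even a stolen token cannot be
            // renewed forever, which is the "implicit revocation" point.
            return -1;
        }
        return Math.min(nowMs + RENEW_INTERVAL_MS, maxDate);
    }

    public static void main(String[] args) {
        long issued = 0;
        long day6 = 6L * 24 * 60 * 60 * 1000;
        long day8 = 8L * 24 * 60 * 60 * 1000;
        System.out.println(renew(issued, day6) > 0);   // true: still renewable
        System.out.println(renew(issued, day8) == -1); // true: refused
    }
}
```

The disagreement in the thread is about MAX_LIFETIME_MS: whether a configurable (possibly very large) cap preserves enough of the implicit-revocation property, or whether it effectively reduces to never-expiring credentials.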
[jira] [Updated] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails
[ https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-6741: -- Resolution: Fixed Fix Version/s: 2.7.0 Status: Resolved (was: Patch Available) Committed to branch-2 and trunk, thank you again Stephen!
[jira] [Commented] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails
[ https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186486#comment-14186486 ] Stephen Chu commented on HDFS-6741: --- Thanks a lot, Harsh!
[jira] [Commented] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails
[ https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186488#comment-14186488 ] Hudson commented on HDFS-6741: -- FAILURE: Integrated in Hadoop-trunk-Commit #6365 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6365/])
HDFS-6741. Improve permission denied message when FSPermissionChecker#checkOwner fails. Contributed by Stephen Chu and Harsh J. (harsh) (harsh: rev 0398db19b2c4558a9f08ac2700a27752748896fa)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPermission.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSPermissionChecker.java
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186510#comment-14186510 ] Haohui Mai commented on HDFS-7295: --
bq. It's straightforward to build a revocation mechanism.
This is a common misconception. I should have explained the threat model upfront, and what revocation exactly means. The threat model is that (1) an attacker can steal the user's DT, (2) the system has no knowledge of which token is stolen, and (3) the system should not allow the attacker to reauthenticate indefinitely using the stolen token.
The explicit revocation mechanism (canceling the DT) you pointed out only works if the NN knows exactly which token is stolen, which is unfortunately not the case in real-world environments. That's the exact reason why most capability systems also have an implicit revocation mechanism: capabilities always have expiration dates, and the system asks the user to reauthenticate periodically to renew their capabilities.
bq. The alternative, which is to put real users' keytabs on the cluster, is far worse. (Again, the use case example is a long running Spark Streaming app, which runs as a real user, not a service account.)
I'm not sure this is a fair comparison. If the requirement is to run long-lasting jobs securely in the cluster, I'm unconvinced that the proposed approach actually runs the jobs securely w.r.t. the threat model, as it contains security flaws pointed out by [~ste...@apache.org] and [~aw]. I understand there is a usability concern, but this is an important correctness issue from a security point of view.
[jira] [Created] (HDFS-7299) Hadoop Namenode failing because of negative value in fsimage
Vishnu Ganth created HDFS-7299: -- Summary: Hadoop Namenode failing because of negative value in fsimage Key: HDFS-7299 URL: https://issues.apache.org/jira/browse/HDFS-7299 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Vishnu Ganth Hadoop Namenode is failing because of an unexpected value of block size in fsimage. Stack trace: {code} 2014-10-27 16:22:12,107 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = mastermachine-hostname/ip STARTUP_MSG: args = [] STARTUP_MSG: version = 2.0.0-cdh4.4.0 STARTUP_MSG: classpath =
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186545#comment-14186545 ] bc Wong commented on HDFS-7295: --- Yup. The threat model is a good place to start. * Re (2) the system has no knowledge which token is stolen --- I think auditing can definitely help. The NN audit logger does tell you who from which host is accessing what file using which DT. * Re (3) the system should not allow the attacker to reauthenticate indefinitely using the stolen token --- This is where the configurable lifetime cap comes in. For example, some IT admins make people change passwords every 3 months, some every year. If the config says that user hbase's DT never expires, then that's the same as having hbase's keytab on all nodes on the cluster. I don't think that (3) is reasonable, since not even Kerberos can satisfy (3) if the attacker can steal the keytab in this case. bq. I'm not sure this is a fair comparison Could you elaborate more on that? The options brought up so far are (A) arbitrary DT lifetime and (B) deploying users' keytabs. I'd think that (B) is less secure due to its consequences. In addition, in all the situations (that I can think of) where an attacker can steal the DT, the same attack can be used to steal the keytab. Do we have other proposals for the long running user app use case (e.g. Spark Streaming)? Support arbitrary max expiration times for delegation token --- Key: HDFS-7295 URL: https://issues.apache.org/jira/browse/HDFS-7295 Project: Hadoop HDFS Issue Type: Improvement Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. This is a problem for different users of HDFS such as long running YARN apps. Users should be allowed to optionally specify max lifetime for their tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5894) Refactor a private internal class DataTransferEncryptor.SaslParticipant
[ https://issues.apache.org/jira/browse/HDFS-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-5894: -- Status: Patch Available (was: Open) Refactor a private internal class DataTransferEncryptor.SaslParticipant --- Key: HDFS-5894 URL: https://issues.apache.org/jira/browse/HDFS-5894 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Hiroshi Ikeda Assignee: Harsh J Priority: Trivial Attachments: HDFS-5894.patch, HDFS-5894.patch It is appropriate to use polymorphism for SaslParticipant instead of scattering if-else statements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5894) Refactor a private internal class DataTransferEncryptor.SaslParticipant
[ https://issues.apache.org/jira/browse/HDFS-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-5894: -- Attachment: HDFS-5894.patch Thank you for getting back. I've attached a new patch that applies to current trunk state. Refactor a private internal class DataTransferEncryptor.SaslParticipant --- Key: HDFS-5894 URL: https://issues.apache.org/jira/browse/HDFS-5894 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Hiroshi Ikeda Assignee: Harsh J Priority: Trivial Attachments: HDFS-5894.patch, HDFS-5894.patch It is appropriate to use polymorphism for SaslParticipant instead of scattering if-else statements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
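The refactor being proposed for HDFS-5894 -- replacing scattered if-else branching on client/server state with polymorphism -- can be sketched roughly like this. The names below are hypothetical illustrations, not the actual patch; the real SaslParticipant wraps javax.security.sasl's SaslClient/SaslServer.

```java
// Hypothetical sketch of the polymorphism refactor: instead of one class that
// branches on "am I a client or a server?" at every call site, a factory
// returns a subclass that already knows which side it is on.
abstract class Participant {
    abstract String role();
    abstract byte[] evaluate(byte[] challengeOrResponse);

    static Participant createClient() { return new ClientSide(); }
    static Participant createServer() { return new ServerSide(); }
}

class ClientSide extends Participant {
    String role() { return "client"; }
    byte[] evaluate(byte[] challenge) {
        // A real client would call SaslClient.evaluateChallenge(challenge) here.
        return challenge;
    }
}

class ServerSide extends Participant {
    String role() { return "server"; }
    byte[] evaluate(byte[] response) {
        // A real server would call SaslServer.evaluateResponse(response) here.
        return response;
    }
}
```

Each call site then simply invokes evaluate() without caring which side it is on, which removes the if-else scattering the JIRA describes.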
[jira] [Commented] (HDFS-5894) Refactor a private internal class DataTransferEncryptor.SaslParticipant
[ https://issues.apache.org/jira/browse/HDFS-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186559#comment-14186559 ] Hadoop QA commented on HDFS-5894: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677557/HDFS-5894.patch against trunk revision 0398db1. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8565//console This message is automatically generated. Refactor a private internal class DataTransferEncryptor.SaslParticipant --- Key: HDFS-5894 URL: https://issues.apache.org/jira/browse/HDFS-5894 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Hiroshi Ikeda Assignee: Harsh J Priority: Trivial Attachments: HDFS-5894.patch, HDFS-5894.patch It is appropriate to use polymorphism for SaslParticipant instead of scattering if-else statements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Attachment: HDFS-6606.009.patch Chris, you are right; I've updated the patch to address it. Thank you, ATM and tucu, for the review, and thanks for volunteering to commit :) Thanks Suresh, Andy, Mike and Srikanth for the comments. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, HDFS-6606.006.patch, HDFS-6606.007.patch, HDFS-6606.008.patch, HDFS-6606.009.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol; it was great work. It utilizes the SASL {{Digest-MD5}} mechanism (using Qop: auth-conf) and supports three security strengths: * high 3des or rc4 (128bits) * medium des or rc4 (56bits) * low rc4 (40bits) 3des and rc4 are slow, only *tens of MB/s*: http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in the future. It's absolutely a bottleneck and will vastly affect the end-to-end performance. AES (Advanced Encryption Standard) is recommended as a replacement for DES and is more secure; with AES-NI support, the throughput can reach nearly *2GB/s*, so it won't be the bottleneck any more. AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as the encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
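For reference, the AES/CTR transform being contemplated is available through the stock JCE. The standalone round-trip sketch below only shows the cipher transform in isolation; the actual patch wires AES through Hadoop's CryptoCodec (HADOOP-10150/10603/10693), not through raw Cipher calls like this.

```java
import java.nio.charset.StandardCharsets;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Standalone AES/CTR round trip via the stock JCE, to show the transform
// string involved. Demo only: never use an all-zero key or a reused IV in
// real code.
public class AesCtrSketch {
    static byte[] run(int mode, byte[] key16, byte[] iv16, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(mode, new SecretKeySpec(key16, "AES"), new IvParameterSpec(iv16));
        return c.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];  // 128-bit key; all-zero for the demo only
        byte[] iv  = new byte[16];  // CTR counter block; must never repeat per key
        byte[] pt  = "block data".getBytes(StandardCharsets.UTF_8);
        byte[] ct  = run(Cipher.ENCRYPT_MODE, key, iv, pt);
        byte[] rt  = run(Cipher.DECRYPT_MODE, key, iv, ct);
        System.out.println(new String(rt, StandardCharsets.UTF_8));
    }
}
```

Whether this path actually reaches the quoted near-2GB/s figures depends on AES-NI support in the JVM's crypto provider, which is the point of the benchmark data promised above.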
[jira] [Commented] (HDFS-7299) Hadoop Namenode failing because of negative value in fsimage
[ https://issues.apache.org/jira/browse/HDFS-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186582#comment-14186582 ] Hu Liu, commented on HDFS-7299: --- It seems that the fsimage is broken. You can use the offline image viewer to confirm. Hadoop Namenode failing because of negative value in fsimage Key: HDFS-7299 URL: https://issues.apache.org/jira/browse/HDFS-7299 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Vishnu Ganth Hadoop Namenode is failing because of an unexpected value of block size in fsimage. Stack trace: {code} 2014-10-27 16:22:12,107 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = mastermachine-hostname/ip STARTUP_MSG: args = [] STARTUP_MSG: version = 2.0.0-cdh4.4.0 STARTUP_MSG: classpath =
[jira] [Commented] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186625#comment-14186625 ] Tony Reix commented on HDFS-6515: - Hello. Reading the console output, I see nothing related to the patch that could be the root cause of the failure. The error seems to be around: HDFS-6515 patch is being downloaded at Mon Oct 27 07:51:14 UTC 2014 from http://issues.apache.org/jira/secure/attachment/12654094/HDFS-6515-1.patch cp: cannot stat '/home/jenkins/buildSupport/lib/*': No such file or directory The patch does not appear to apply with p0 to p2 PATCH APPLICATION FAILED where the error deals with some lib stored in the Jenkins directory. On my side, I'm going to check that the patch works fine on the branch 'trunk'. testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: test Attachments: HDFS-6515-1.patch I have an issue with test : testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, test runs fine. On Linux/PowerPC, I have: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds Looking at details, I see that some Failed to cache messages appear in the traces. Only 10 on Intel, but 186 on PPC64. On PPC64, it looks like some thread is waiting for something that never happens, generating a TimeOut. I'm now using IBM JVM, however I've just checked that the issue also appears with OpenJDK. 
I'm now using Hadoop latest; however, the issue appeared within Hadoop 2.4.0. I need help understanding what the test is doing and what traces are expected, in order to understand what/where the root cause is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186642#comment-14186642 ] Steve Loughran commented on HDFS-7295: -- [~aw] bq. What Steve Loughran said. I don't know whether to be pleased or scared by the fact you are agreeing with me. Maybe both. [~adhoot] bq. My concern is the damage with a stolen keytab is far greater than the HDFS token. Its universal kerberos identity versus something that works only with HDFS. In a more complex application you end up needing to authenticate IPC/REST between different services anyway. Example: pool of tomcat instances talking to HBase in YARN running against HDFS. Keytabs avoid having different solutions for different parts of the stack. For the example cited, I'd just have one single app account for the HBase and tomcat instances; {{sudo}} launch them all as that user. bq. Ops team might consider a longer delegation token to be lower risk than having a more valuable asset - users's keytab - be exposed on a wide surface area (we need all nodes to have access to the keytabs) push it out during localization; rely on the NM to set up the paths securely and to clean up afterwards. The weaknesses become # packet sniffing. Better encrypt your wires. # NM process fails, container then terminates: no cleanup # malicious processes able to gain root access to the system. But do that and you get enough other things away... bq. Using keytabs for headless accounts will work for services that do not use the user account. Spark streaming, for example, runs as the user just like Map Reduce. This would mean asking user to create and deploy keytabs for those scenarios, correct? Depends on the duration of the instance. Short-lived: no. Medium lived: no. Long-lived, you need a keytab —but it does not have to be that of the user submitting the job, merely one with access to the (persistent) data. [~bcwalrus] bq. 
perhaps we can add a whitelist/blacklist for who can set arbitrary lifetime on their DT, and whether there is a cap to the lifetime. This adds even more complexity to a security system that is already hard for some people (myself, for example) to understand. bq. It's straightforward to build a revocation mechanism, along with some stats reporting on DT usages, plus auditing. Yes —but does it scale? Is every request going to have to trigger a token revocation check, or simply a fraction? Even with that fraction, what load ends up being placed on the infrastructure - including potentially the enterprise-wide Kerberos/AD systems? We also need to think about the availability of this token revocation check infrastructure, and whether to hide it in the NN and add more overhead there (as well as more data to keep in sync), or deploy and manage some other token revocation infrastructure. I am not, personally, enthused by the idea. I don't think anyone pretends that keytabs are an ideal solution, and I know some cluster ops teams will be unhappy about this, but I also think that near-indefinite Kerberos tokens aren't going to make those people happy either. There's another option which we looked at for Slider: pushing out new tokens from the client, just as the RM does token renewal today. You've got to remember to refresh them regularly, and be able to get those tokens to the processes in the YARN containers, processes that may then want to switch over to them. I could imagine this though, with Oozie jobs scheduled to do the renewal, and something in YARN to help with token propagation. Support arbitrary max expiration times for delegation token --- Key: HDFS-7295 URL: https://issues.apache.org/jira/browse/HDFS-7295 Project: Hadoop HDFS Issue Type: Improvement Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. This is a problem for different users of HDFS such as long running YARN apps. 
Users should be allowed to optionally specify max lifetime for their tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186650#comment-14186650 ] Tony Reix commented on HDFS-6515: - Patching the trunk of Hadoop Common from the official GitHub with the patch provided here works perfectly: $ patch -p0 < ../HDFS-6515-1.patch patching file hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java Hunk #1 succeeded at 166 (offset 1 line). patching file hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestFsDatasetCache.java I've checked the 2 files and they are OK. testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: test Attachments: HDFS-6515-1.patch I have an issue with test : testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, test runs fine. On Linux/PowerPC, I have: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds Looking at details, I see that some Failed to cache messages appear in the traces. Only 10 on Intel, but 186 on PPC64. On PPC64, it looks like some thread is waiting for something that never happens, generating a TimeOut. I'm now using IBM JVM, however I've just checked that the issue also appears with OpenJDK. I'm now using Hadoop latest, however, the issue appeared within Hadoop 2.4.0. 
I need help understanding what the test is doing and what traces are expected, in order to understand what/where the root cause is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186652#comment-14186652 ] Steve Loughran commented on HDFS-7295: -- I should add that in TWILL-101, twill is doing the push out new token strategy Support arbitrary max expiration times for delegation token --- Key: HDFS-7295 URL: https://issues.apache.org/jira/browse/HDFS-7295 Project: Hadoop HDFS Issue Type: Improvement Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. This is a problem for different users of HDFS such as long running YARN apps. Users should be allowed to optionally specify max lifetime for their tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7299) Hadoop Namenode failing because of negative value in fsimage
[ https://issues.apache.org/jira/browse/HDFS-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186678#comment-14186678 ] Vishnu Ganth commented on HDFS-7299: [~huLiu] Thanks for the response, Liu. I tried the command hdfs oiv -i fsimage_file -o output. I am getting the directory structure of HDFS. But how do I confirm whether it is broken or working fine? Hadoop Namenode failing because of negative value in fsimage Key: HDFS-7299 URL: https://issues.apache.org/jira/browse/HDFS-7299 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Vishnu Ganth Hadoop Namenode is failing because of an unexpected value of block size in fsimage. Stack trace: {code} 2014-10-27 16:22:12,107 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = mastermachine-hostname/ip STARTUP_MSG: args = [] STARTUP_MSG: version = 2.0.0-cdh4.4.0 STARTUP_MSG: classpath =
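One way to answer the question above: dump the fsimage with the offline image viewer's Delimited processor (hdfs oiv -i fsimage_file -o out.txt -p Delimited) and scan the output for negative size values, since the reported failure is a negative block size. A hedged sketch of such a scan follows; the column index is a parameter because the Delimited column layout varies across oiv versions, so verify it against your own dump before trusting the results.

```java
import java.util.ArrayList;
import java.util.List;

// Scan Delimited oiv output for rows whose size column holds a negative
// number. Column layout is version-dependent, hence the sizeColumn parameter;
// non-numeric or short rows (headers, directories) are skipped.
public class NegativeSizeScan {
    static List<String> findNegative(List<String> lines, int sizeColumn) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            String[] cols = line.split("\t");
            if (cols.length <= sizeColumn) continue;  // malformed or header row
            try {
                if (Long.parseLong(cols[sizeColumn].trim()) < 0) hits.add(line);
            } catch (NumberFormatException ignored) {
                // non-numeric column (e.g. header or path-only row): skip
            }
        }
        return hits;
    }
}
```

Any hit pinpoints the corrupt inode, which confirms the fsimage is broken rather than the NameNode itself misbehaving.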
[jira] [Commented] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186699#comment-14186699 ] Tony Reix commented on HDFS-6515: - There is an issue on the Hadoop Common GitHub Web page: downloading a ZIP file from the Download ZIP button generates a set of source code that is old, probably dated August 23rd, since the latest commit displayed is said to be: Arpit Agarwal arp7 authored on 23 Aug latest commit 42a61a4fbc When getting Hadoop Common source code with git clone, I see that the patch fails. I've changed the patch, tested it on a git clone, and that works. I'm going to push a new patch now. testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: test Attachments: HDFS-6515-1.patch I have an issue with test : testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, test runs fine. On Linux/PowerPC, I have: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds Looking at details, I see that some Failed to cache messages appear in the traces. Only 10 on Intel, but 186 on PPC64. On PPC64, it looks like some thread is waiting for something that never happens, generating a TimeOut. I'm now using IBM JVM, however I've just checked that the issue also appears with OpenJDK. I'm now using Hadoop latest, however, the issue appeared within Hadoop 2.4.0. 
I need help understanding what the test is doing and what traces are expected, in order to understand what/where the root cause is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Reix updated HDFS-6515: Status: Open (was: Patch Available) Source code of Hadoop has changed since the patch was produced. I'm going to provide a new version of the patch that works with today's code. testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.4.0, 3.0.0 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: test Attachments: HDFS-6515-1.patch I have an issue with test : testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, test runs fine. On Linux/PowerPC, I have: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds Looking at details, I see that some Failed to cache messages appear in the traces. Only 10 on Intel, but 186 on PPC64. On PPC64, it looks like some thread is waiting for something that never happens, generating a TimeOut. I'm now using IBM JVM, however I've just checked that the issue also appears with OpenJDK. I'm now using Hadoop latest, however, the issue appeared within Hadoop 2.4.0. I need help understanding what the test is doing and what traces are expected, in order to understand what/where the root cause is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Reix updated HDFS-6515: Labels: hadoop test (was: test) Affects Version/s: 2.4.1 Status: Patch Available (was: Open) The previous patch has been updated to work with current Hadoop code. $ patch -p0 < /tmp/HDFS-6515-2.patch patching file hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java Hunk #1 succeeded at 171 (offset 6 lines). patching file hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestFsDatasetCache.java git status # On branch trunk # Changes not staged for commit: # (use git add <file>... to update what will be committed) # (use git checkout -- <file>... to discard changes in working directory) # # modified: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java # modified: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestFsDatasetCache.java testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.4.1, 2.4.0, 3.0.0 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: test, hadoop Attachments: HDFS-6515-1.patch I have an issue with test : testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, test runs fine. On Linux/PowerPC, I have: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds Looking at details, I see that some Failed to cache messages appear in the traces. Only 10 on Intel, but 186 on PPC64. 
On PPC64, it looks like some thread is waiting for something that never happens, generating a TimeOut. I'm now using IBM JVM, however I've just checked that the issue also appears with OpenJDK. I'm now using Hadoop latest, however, the issue appeared within Hadoop 2.4.0 . I need help for understanding what the test is doing, what traces are expected, in order to understand what/where is the root cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Reix updated HDFS-6515: Attachment: HDFS-6515-2.patch testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0, 2.4.1 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: hadoop, test Attachments: HDFS-6515-1.patch, HDFS-6515-2.patch I have an issue with test : testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, test runs fine. On Linux/PowerPC, I have: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds Looking at details, I see that some Failed to cache messages appear in the traces. Only 10 on Intel, but 186 on PPC64. On PPC64, it looks like some thread is waiting for something that never happens, generating a TimeOut. I'm now using IBM JVM, however I've just checked that the issue also appears with OpenJDK. I'm now using Hadoop latest, however, the issue appeared within Hadoop 2.4.0 . I need help for understanding what the test is doing, what traces are expected, in order to understand what/where is the root cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6934) Move checksum computation off the hot path when writing to RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186727#comment-14186727 ] Hudson commented on HDFS-6934: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/726/]) HDFS-6934. Move checksum computation off the hot path when writing to RAM disk. Contributed by Chris Nauroth. (cnauroth: rev 463aec11718e47d4aabb86a7a539cb973460aae6) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestShell.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestScrLazyPersistFiles.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Options.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyPersistTestCase.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaOutputStreams.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskReplicaLruTracker.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSOutputSummer.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInPipeline.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java HDFS-6934. Revert files accidentally committed. 
(cnauroth: rev 5b1dfe78b8b06335bed0bcb83f12bb936d4c021b) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestShell.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java Move checksum computation off the hot path when writing to RAM disk --- Key: HDFS-6934 URL: https://issues.apache.org/jira/browse/HDFS-6934 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, hdfs-client Reporter: Arpit Agarwal Assignee: Chris Nauroth Fix For: 2.6.0 Attachments: HDFS-6934-branch-2.6.5.patch, HDFS-6934.3.patch, HDFS-6934.4.patch, HDFS-6934.5.patch, h6934_20141003b.patch, h6934_20141005.patch Since local RAM is considered reliable we can avoid writing checksums on the hot path when replicas are being written to a local RAM disk. The checksum can be computed by the lazy writer when moving replicas to disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5928) show namespace and namenode ID on NN dfshealth page
[ https://issues.apache.org/jira/browse/HDFS-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186729#comment-14186729 ] Hudson commented on HDFS-5928: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/726/]) HDFS-5928. Show namespace and namenode ID on NN dfshealth page. Contributed by Siqi Li. (wheat9: rev 00b4e44a2eba871b4ab47e51c52de95b12dca82e) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js show namespace and namenode ID on NN dfshealth page --- Key: HDFS-5928 URL: https://issues.apache.org/jira/browse/HDFS-5928 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Siqi Li Assignee: Siqi Li Fix For: 2.7.0 Attachments: HDFS-5928.007.patch, HDFS-5928.v2.patch, HDFS-5928.v3.patch, HDFS-5928.v4.patch, HDFS-5928.v5.patch, HDFS-5928.v6.patch, HDFS-5928.v1.patch, screenshot-1.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6538) Comment format error in ShortCircuitRegistry javadoc
[ https://issues.apache.org/jira/browse/HDFS-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186731#comment-14186731 ] Hudson commented on HDFS-6538: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/726/]) HDFS-6538. Comment format error in ShortCircuitRegistry javadoc. Contributed by David Luo. (harsh) (harsh: rev 0058eadbd3149a5dee1ffc69c2d9f21caa916fb5) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java Comment format error in ShortCircuitRegistry javadoc Key: HDFS-6538 URL: https://issues.apache.org/jira/browse/HDFS-6538 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.4.0 Reporter: debugging Assignee: David Luo Priority: Trivial Labels: documentation Fix For: 2.7.0 Attachments: HDFS-6538.patch Original Estimate: 1h Remaining Estimate: 1h A Javadoc comment should start with {noformat}/**{noformat}, but for class ShortCircuitRegistry it starts with only {noformat}/*{noformat}, so an {noformat}*{noformat} appears to have been omitted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
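The one-character difference is easy to see in a minimal class (hypothetical, not ShortCircuitRegistry itself):

```java
/**
 * Correct: the comment opens with two asterisks, so the javadoc tool
 * picks it up as documentation for this class.
 */
class JavadocMarkers {
    /*
     * Looks similar, but the single asterisk makes this an ordinary
     * block comment that javadoc silently ignores.
     */
    static String describe() {
        return "only the /** comment above is Javadoc";
    }
}
```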
[jira] [Commented] (HDFS-7282) Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC
[ https://issues.apache.org/jira/browse/HDFS-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186734#comment-14186734 ] Hudson commented on HDFS-7282: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/726/]) HDFS-7282. Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC (Jinghui Wang via Colin P. McCabe) (cmccabe: rev 518a7f4af3d8deeecabfa0629b16521ce09de459) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC - Key: HDFS-7282 URL: https://issues.apache.org/jira/browse/HDFS-7282 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.1 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 2.7.0 Attachments: HDFS-7282.patch TemporarySocketDirectory has a finalize method that deletes its directory. In TestShortCircuitCache and TestBlockReaderFactory, the TemporarySocketDirectory instances created are not referenced later in the tests, so they can be garbage collected (deleting the directory) before the DataNode starts up and accesses a path under it, causing a FileNotFoundException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
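The failure mode generalizes beyond Hadoop: an object whose finalizer tears down a resource can be collected as soon as it becomes unreachable, even if the resource path is still in use. A minimal sketch with hypothetical names (`TemporarySocketDirectory` behaves like `SelfDeletingDir` here):

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical stand-in for TemporarySocketDirectory: its finalizer
// deletes the directory, so the object must stay strongly reachable for
// as long as the path is in use.
class SelfDeletingDir {
    private final File dir;

    SelfDeletingDir() {
        try {
            dir = Files.createTempDirectory("sockets").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String getPath() {
        return dir.getAbsolutePath();
    }

    @Override
    protected void finalize() {
        dir.delete(); // runs whenever the GC collects this object
    }
}

// The fix has this shape: hold the directory in a field so it cannot be
// collected (and its finalizer run) while the code still uses the path.
class UsesDir {
    private final SelfDeletingDir sockDir; // strong reference prevents GC

    UsesDir() {
        sockDir = new SelfDeletingDir();
    }

    String socketPath() {
        return sockDir.getPath();
    }
}
```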
[jira] [Commented] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails
[ https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186722#comment-14186722 ] Hudson commented on HDFS-6741: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/726/]) HDFS-6741. Improve permission denied message when FSPermissionChecker#checkOwner fails. Contributed by Stephen Chu and Harsh J. (harsh) (harsh: rev 0398db19b2c4558a9f08ac2700a27752748896fa) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSPermissionChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPermission.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Improve permission denied message when FSPermissionChecker#checkOwner fails --- Key: HDFS-6741 URL: https://issues.apache.org/jira/browse/HDFS-6741 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0, 2.5.0 Reporter: Stephen Chu Assignee: Harsh J Priority: Trivial Labels: supportability Fix For: 2.7.0 Attachments: HDFS-6741.1.patch, HDFS-6741.2.patch, HDFS-6741.2.patch Currently, FSPermissionChecker#checkOwner throws an AccessControlException with a simple Permission denied message. When users try to set an ACL without ownership permissions, they'll see something like: {code} [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp setfacl: Permission denied {code} It'd be helpful if the message had an explanation why the permission was denied to avoid confusion for users who aren't familiar with permissions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
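As a hedged sketch of the improvement (class name and message format here are hypothetical, not the actual FSPermissionChecker change), the denial can name the user and the inode instead of printing a bare "Permission denied":

```java
// Hypothetical owner check: the thrown message carries the user and the
// path, which is the kind of detail the bare message lacked.
class OwnerCheckSketch {
    static void checkOwner(String user, String owner, String path) {
        if (!user.equals(owner)) {
            // Stands in for org.apache.hadoop.security.AccessControlException.
            throw new SecurityException("Permission denied. user=" + user
                + " is not the owner of inode=" + path);
        }
    }
}
```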
[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186719#comment-14186719 ] Hadoop QA commented on HDFS-6606: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677559/HDFS-6606.009.patch against trunk revision 0398db1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.viewfs.TestViewFsHdfs org.apache.hadoop.fs.viewfs.TestViewFileSystemHdfs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8566//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8566//console This message is automatically generated. 
Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, HDFS-6606.006.patch, HDFS-6606.007.patch, HDFS-6606.008.patch, HDFS-6606.009.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol; it was great work. It uses the SASL {{Digest-MD5}} mechanism (with Qop: auth-conf) and supports three security strengths: * high 3des or rc4 (128 bits) * medium des or rc4 (56 bits) * low rc4 (40 bits) 3des and rc4 are slow, only *tens of MB/s*: http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in the future. This is clearly a bottleneck and will vastly affect end-to-end performance. AES (Advanced Encryption Standard) is recommended as a replacement for DES and is more secure; with AES-NI support, throughput can reach nearly *2GB/s*, so it will no longer be the bottleneck. AES and CryptoCodec work is covered in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add a new mode for AES). This JIRA will use AES with AES-NI support as the encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
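The cipher the JIRA proposes can be illustrated with the JDK's own crypto API. The sketch below is a plain AES/CTR round trip only; the real patch negotiates a cipher option over the SASL handshake, which is not shown here:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;

// AES in CTR mode: a streaming-friendly mode that benefits from AES-NI
// on modern CPUs, which is where the ~2GB/s figure comes from.
class AesCtrRoundTrip {
    static byte[] crypt(int mode, byte[] key16, byte[] iv16, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key16, "AES"),
                new IvParameterSpec(iv16));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because CTR turns AES into a stream cipher, the same method serves both directions: decryption is encryption of the ciphertext with the same key and IV.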
[jira] [Commented] (HDFS-7278) Add a command that allows sysadmins to manually trigger full block reports from a DN
[ https://issues.apache.org/jira/browse/HDFS-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186724#comment-14186724 ] Hudson commented on HDFS-7278: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/726/]) HDFS-7278. Add a command that allows sysadmins to manually trigger full block reports from a DN (cmccabe) (cmccabe: rev baf794dc404ac54f4e8332654eadfac1bebacb8f) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestTriggerBlockReport.java * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSCommands.apt.vm * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/proto/ClientDatanodeProtocol.proto * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java Add a command that allows sysadmins to manually trigger full block reports from a DN Key: HDFS-7278 URL: https://issues.apache.org/jira/browse/HDFS-7278 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.7.0 Attachments: HDFS-7278.002.patch, HDFS-7278.003.patch, HDFS-7278.004.patch, HDFS-7278.005.patch We should add a command that allows sysadmins 
to manually trigger full block reports from a DN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186737#comment-14186737 ] Hadoop QA commented on HDFS-6515: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677586/HDFS-6515-2.patch against trunk revision c9bec46. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 2.0.3) to fail. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8567//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8567//console This message is automatically generated. 
testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0, 2.4.1 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: hadoop, test Attachments: HDFS-6515-1.patch, HDFS-6515-2.patch I have an issue with the test testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, the test runs fine. On Linux/PowerPC, I get: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds Looking at the details, I see that some "Failed to cache" messages appear in the traces: only 10 on Intel, but 186 on PPC64. On PPC64, it looks like some thread is waiting for something that never happens, generating a timeout. I'm using the IBM JVM, but I've just checked that the issue also appears with OpenJDK. I'm now using the latest Hadoop; however, the issue appeared with Hadoop 2.4.0. I need help understanding what the test is doing and what traces are expected, in order to find the root cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7299) Hadoop Namenode failing because of negative value in fsimage
[ https://issues.apache.org/jira/browse/HDFS-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186756#comment-14186756 ] Hu Liu, commented on HDFS-7299: --- If you can get the correct directory structure without any error, the fsimage should be ok. Hadoop Namenode failing because of negative value in fsimage Key: HDFS-7299 URL: https://issues.apache.org/jira/browse/HDFS-7299 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Vishnu Ganth The Hadoop NameNode is failing because of an unexpected block size value in the fsimage. Stack trace: {code} 2014-10-27 16:22:12,107 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = mastermachine-hostname/ip STARTUP_MSG: args = [] STARTUP_MSG: version = 2.0.0-cdh4.4.0 STARTUP_MSG: classpath =
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186760#comment-14186760 ] Steve Loughran commented on HDFS-7295: -- Linking to YARN-2704 and log aggregation. There's also the need of the NM's to be able to get the localised resources of the AM submission in the event of an AM restart event after the original tokens have expired Support arbitrary max expiration times for delegation token --- Key: HDFS-7295 URL: https://issues.apache.org/jira/browse/HDFS-7295 Project: Hadoop HDFS Issue Type: Improvement Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. This is a problem for different users of HDFS such as long running YARN apps. Users should be allowed to optionally specify max lifetime for their tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
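For context, the 7-day cap is the default of a cluster-wide NameNode setting (the property name below is the standard Hadoop key; treat the exact name as an assumption in this sketch). Because it applies to every token on the cluster, it cannot be relaxed for one long-running app, which is the gap this JIRA targets. A sketch of the relevant hdfs-site.xml fragment:

```xml
<!-- hdfs-site.xml: cluster-wide maximum delegation token lifetime. -->
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>604800000</value> <!-- 7 days, in milliseconds -->
</property>
```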
[jira] [Commented] (HDFS-7299) Hadoop Namenode failing because of negative value in fsimage
[ https://issues.apache.org/jira/browse/HDFS-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186781#comment-14186781 ] Vishnu Ganth commented on HDFS-7299: Thanks, Liu. I am getting the correct directory structure using the offline image viewer. Are there any further ways to debug this issue? Hadoop Namenode failing because of negative value in fsimage Key: HDFS-7299 URL: https://issues.apache.org/jira/browse/HDFS-7299 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Vishnu Ganth The Hadoop NameNode is failing because of an unexpected block size value in the fsimage. Stack trace: {code} 2014-10-27 16:22:12,107 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = mastermachine-hostname/ip STARTUP_MSG: args = [] STARTUP_MSG: version = 2.0.0-cdh4.4.0 STARTUP_MSG: classpath =
[jira] [Updated] (HDFS-5894) Refactor a private internal class DataTransferEncryptor.SaslParticipant
[ https://issues.apache.org/jira/browse/HDFS-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-5894: -- Attachment: HDFS-5894.patch Re-uploading patch to retry after the build-bot was fixed to properly apply patches. Refactor a private internal class DataTransferEncryptor.SaslParticipant --- Key: HDFS-5894 URL: https://issues.apache.org/jira/browse/HDFS-5894 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Hiroshi Ikeda Assignee: Harsh J Priority: Trivial Attachments: HDFS-5894.patch, HDFS-5894.patch, HDFS-5894.patch It is appropriate to use polymorphism for SaslParticipant instead of scattering if-else statements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
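The suggested refactoring can be sketched generically (names are illustrative, not the actual SaslParticipant code): fix the role at construction time behind one interface, so callers never branch on client-vs-server:

```java
// Before the refactoring, each operation would branch on the role:
//   if (saslServer != null) { ... } else { ... }
// After it, the role is chosen once and dispatch is polymorphic.
interface Participant {
    String evaluate(String challenge);
}

class ServerSide implements Participant {
    @Override
    public String evaluate(String challenge) {
        return "server:" + challenge;
    }
}

class ClientSide implements Participant {
    @Override
    public String evaluate(String challenge) {
        return "client:" + challenge;
    }
}
```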
[jira] [Updated] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
[ https://issues.apache.org/jira/browse/HDFS-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-4836: -- Resolution: Duplicate Status: Resolved (was: Patch Available) Update Tomcat version for httpfs to 6.0.37 -- Key: HDFS-4836 URL: https://issues.apache.org/jira/browse/HDFS-4836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jonathan Eagles Assignee: Jonathan Eagles Priority: Trivial Attachments: HDFS-4836.patch Tomcat has released a new version with security fixes: http://tomcat.apache.org/security-6.html#Fixed_in_Apache_Tomcat_6.0.37 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4836) Update Tomcat version for httpfs to 6.0.37
[ https://issues.apache.org/jira/browse/HDFS-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186794#comment-14186794 ] Harsh J commented on HDFS-4836: --- Similar JIRA HADOOP-10814 has bumped it up to 6.0.41 on trunk. Update Tomcat version for httpfs to 6.0.37 -- Key: HDFS-4836 URL: https://issues.apache.org/jira/browse/HDFS-4836 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jonathan Eagles Assignee: Jonathan Eagles Priority: Trivial Attachments: HDFS-4836.patch Tomcat has released a new version with security fixes: http://tomcat.apache.org/security-6.html#Fixed_in_Apache_Tomcat_6.0.37 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7299) Hadoop Namenode failing because of negative value in fsimage
[ https://issues.apache.org/jira/browse/HDFS-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186800#comment-14186800 ] Vishnu Ganth commented on HDFS-7299: I found one of the files in HDFS with a negative NUM_BYTES set in the fsimage. INODE INODE_PATH = /user/root/dir/out/part-m-05990 REPLICATION = 3 MODIFICATION_TIME = 2014-09-05 04:09 ACCESS_TIME = 2014-09-05 07:42 BLOCK_SIZE = 134217728 BLOCKS [NUM_BLOCKS = 1] BLOCK BLOCK_ID = 8582078737 *NUM_BYTES = -1945969516689645797* GENERATION_STAMP = 5 NS_QUOTA = -1 DS_QUOTA = -1 PERMISSIONS USER_NAME = root GROUP_NAME = supergroup PERMISSION_STRING = rw-r--r-- Is there any way to edit the fsimage file? Hadoop Namenode failing because of negative value in fsimage Key: HDFS-7299 URL: https://issues.apache.org/jira/browse/HDFS-7299 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Vishnu Ganth The Hadoop NameNode is failing because of an unexpected block size value in the fsimage. Stack trace: {code} 2014-10-27 16:22:12,107 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: / STARTUP_MSG: Starting NameNode STARTUP_MSG: host = mastermachine-hostname/ip STARTUP_MSG: args = [] STARTUP_MSG: version = 2.0.0-cdh4.4.0 STARTUP_MSG: classpath =
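A value like the NUM_BYTES above can only come from corruption; a load-time sanity check of the kind sketched below (hypothetical, not the NameNode's actual fsimage loader) would surface it with context instead of failing later:

```java
// Hypothetical sanity check on block lengths read from an fsimage:
// negative lengths are impossible in a valid image, so fail fast and
// name the offending block.
class BlockLengthCheck {
    static long validate(long blockId, long numBytes) {
        if (numBytes < 0) {
            throw new IllegalStateException(
                "Corrupt NUM_BYTES " + numBytes + " for block " + blockId);
        }
        return numBytes;
    }
}
```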
[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186802#comment-14186802 ] Hudson commented on HDFS-6606: -- FAILURE: Integrated in Hadoop-trunk-Commit #6367 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6367/]) HDFS-6606. Optimize HDFS Encrypted Transport performance. (yliu) (yliu: rev 58c0bb9ed9f4a2491395b63c68046562a73526c9) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/DataTransferSaslUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslParticipant.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslDataTransferServer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslDataTransferClient.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CipherOption.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslResponseWithNegotiatedCipherOption.java * 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptedTransfer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto * hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, HDFS-6606.006.patch, HDFS-6606.007.patch, HDFS-6606.008.patch, HDFS-6606.009.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol; it was great work. It uses the SASL {{Digest-MD5}} mechanism (with Qop: auth-conf) and supports three security strengths: * high 3des or rc4 (128 bits) * medium des or rc4 (56 bits) * low rc4 (40 bits) 3des and rc4 are slow, only *tens of MB/s*: http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in the future. This is clearly a bottleneck and will vastly affect end-to-end performance. AES (Advanced Encryption Standard) is recommended as a replacement for DES and is more secure; with AES-NI support, throughput can reach nearly *2GB/s*, so it will no longer be the bottleneck. AES and CryptoCodec work is covered in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add a new mode for AES). This JIRA will use AES with AES-NI support as the encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails
[ https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186821#comment-14186821 ] Hudson commented on HDFS-6741: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/]) HDFS-6741. Improve permission denied message when FSPermissionChecker#checkOwner fails. Contributed by Stephen Chu and Harsh J. (harsh) (harsh: rev 0398db19b2c4558a9f08ac2700a27752748896fa) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSPermissionChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPermission.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java Improve permission denied message when FSPermissionChecker#checkOwner fails --- Key: HDFS-6741 URL: https://issues.apache.org/jira/browse/HDFS-6741 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0, 2.5.0 Reporter: Stephen Chu Assignee: Harsh J Priority: Trivial Labels: supportability Fix For: 2.7.0 Attachments: HDFS-6741.1.patch, HDFS-6741.2.patch, HDFS-6741.2.patch Currently, FSPermissionChecker#checkOwner throws an AccessControlException with a simple Permission denied message. When users try to set an ACL without ownership permissions, they'll see something like: {code} [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp setfacl: Permission denied {code} It'd be helpful if the message had an explanation why the permission was denied to avoid confusion for users who aren't familiar with permissions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6934) Move checksum computation off the hot path when writing to RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186826#comment-14186826 ] Hudson commented on HDFS-6934: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/]) HDFS-6934. Move checksum computation off the hot path when writing to RAM disk. Contributed by Chris Nauroth. (cnauroth: rev 463aec11718e47d4aabb86a7a539cb973460aae6) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestScrLazyPersistFiles.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaOutputStreams.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestShell.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInPipeline.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSOutputSummer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyPersistTestCase.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskReplicaLruTracker.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Options.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java HDFS-6934. Revert files accidentally committed. 
(cnauroth: rev 5b1dfe78b8b06335bed0bcb83f12bb936d4c021b) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestShell.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java Move checksum computation off the hot path when writing to RAM disk --- Key: HDFS-6934 URL: https://issues.apache.org/jira/browse/HDFS-6934 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, hdfs-client Reporter: Arpit Agarwal Assignee: Chris Nauroth Fix For: 2.6.0 Attachments: HDFS-6934-branch-2.6.5.patch, HDFS-6934.3.patch, HDFS-6934.4.patch, HDFS-6934.5.patch, h6934_20141003b.patch, h6934_20141005.patch Since local RAM is considered reliable we can avoid writing checksums on the hot path when replicas are being written to a local RAM disk. The checksum can be computed by the lazy writer when moving replicas to disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7278) Add a command that allows sysadmins to manually trigger full block reports from a DN
[ https://issues.apache.org/jira/browse/HDFS-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186823#comment-14186823 ] Hudson commented on HDFS-7278: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/]) HDFS-7278. Add a command that allows sysadmins to manually trigger full block reports from a DN (cmccabe) (cmccabe: rev baf794dc404ac54f4e8332654eadfac1bebacb8f) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSCommands.apt.vm * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/proto/ClientDatanodeProtocol.proto * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestTriggerBlockReport.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Add a command that allows sysadmins to manually trigger full block reports from a DN Key: HDFS-7278 URL: https://issues.apache.org/jira/browse/HDFS-7278 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.7.0 Attachments: HDFS-7278.002.patch, HDFS-7278.003.patch, HDFS-7278.004.patch, HDFS-7278.005.patch We should add a command that allows 
sysadmins to manually trigger full block reports from a DN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5928) show namespace and namenode ID on NN dfshealth page
[ https://issues.apache.org/jira/browse/HDFS-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186828#comment-14186828 ] Hudson commented on HDFS-5928: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/]) HDFS-5928. Show namespace and namenode ID on NN dfshealth page. Contributed by Siqi Li. (wheat9: rev 00b4e44a2eba871b4ab47e51c52de95b12dca82e) * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js show namespace and namenode ID on NN dfshealth page --- Key: HDFS-5928 URL: https://issues.apache.org/jira/browse/HDFS-5928 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Siqi Li Assignee: Siqi Li Fix For: 2.7.0 Attachments: HDFS-5928.007.patch, HDFS-5928.v2.patch, HDFS-5928.v3.patch, HDFS-5928.v4.patch, HDFS-5928.v5.patch, HDFS-5928.v6.patch, HDFS-5928.v1.patch, screenshot-1.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6538) Comment format error in ShortCircuitRegistry javadoc
[ https://issues.apache.org/jira/browse/HDFS-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186830#comment-14186830 ] Hudson commented on HDFS-6538: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/]) HDFS-6538. Comment format error in ShortCircuitRegistry javadoc. Contributed by David Luo. (harsh) (harsh: rev 0058eadbd3149a5dee1ffc69c2d9f21caa916fb5) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java Comment format error in ShortCircuitRegistry javadoc Key: HDFS-6538 URL: https://issues.apache.org/jira/browse/HDFS-6538 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.4.0 Reporter: debugging Assignee: David Luo Priority: Trivial Labels: documentation Fix For: 2.7.0 Attachments: HDFS-6538.patch Original Estimate: 1h Remaining Estimate: 1h A javadoc comment should start with {noformat}/**{noformat}, but the comment for class ShortCircuitRegistry starts with only {noformat}/*{noformat}, so a {noformat}*{noformat} appears to have been omitted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186838#comment-14186838 ] Yi Liu commented on HDFS-6606: -- Chris, I committed it to avoid a rebase, since I see another JIRA doing a small refactor of _SaslParticipant_. Thanks again for your review. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, HDFS-6606.006.patch, HDFS-6606.007.patch, HDFS-6606.008.patch, HDFS-6606.009.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol; it was great work. It utilizes the SASL {{Digest-MD5}} mechanism (with Qop: auth-conf) and supports three security strengths: * high 3des or rc4 (128 bits) * medium des or rc4 (56 bits) * low rc4 (40 bits) 3des and rc4 are slow, only *tens of MB/s*: http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in the future. This is clearly a bottleneck and will vastly affect end-to-end performance. AES (Advanced Encryption Standard) is recommended as a replacement for DES and is more secure; with AES-NI support, throughput can reach nearly *2GB/s*, so it won't be the bottleneck any more. AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as the encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
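The AES counter-mode encryption the JIRA proposes can be sketched with the standard JCE API. This is an illustrative demo, not the HDFS-6606 patch code; the all-zero key and IV are demo values only. On AES-NI hardware this cipher runs far faster than the 3DES/RC4 ciphers that DIGEST-MD5's auth-conf QOP offers.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCtrSketch {
    // CTR mode turns AES into a stream cipher, suitable for wrapping a
    // byte stream like DataTransferProtocol traffic. Encrypt and decrypt
    // are the same keystream XOR, so a round trip restores the plaintext.
    static byte[] crypt(int mode, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(mode,
               new SecretKeySpec(new byte[16], "AES"),   // 128-bit demo key
               new IvParameterSpec(new byte[16]));       // demo IV
        return c.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] ct = crypt(Cipher.ENCRYPT_MODE, "block data".getBytes("UTF-8"));
        byte[] pt = crypt(Cipher.DECRYPT_MODE, ct);
        System.out.println(new String(pt, "UTF-8")); // block data
    }
}
```

A real deployment would negotiate a fresh key and IV per connection (the patch does this inside the SASL handshake) rather than hard-coding them.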
[jira] [Updated] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6606: - Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, branch-2, branch-2.6. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.6.0 Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, HDFS-6606.006.patch, HDFS-6606.007.patch, HDFS-6606.008.patch, HDFS-6606.009.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol; it was great work. It utilizes the SASL {{Digest-MD5}} mechanism (with Qop: auth-conf) and supports three security strengths: * high 3des or rc4 (128 bits) * medium des or rc4 (56 bits) * low rc4 (40 bits) 3des and rc4 are slow, only *tens of MB/s*: http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in the future. This is clearly a bottleneck and will vastly affect end-to-end performance. AES (Advanced Encryption Standard) is recommended as a replacement for DES and is more secure; with AES-NI support, throughput can reach nearly *2GB/s*, so it won't be the bottleneck any more. AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as the encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7282) Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC
[ https://issues.apache.org/jira/browse/HDFS-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186833#comment-14186833 ] Hudson commented on HDFS-7282: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/]) HDFS-7282. Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC (Jinghui Wang via Colin P. McCabe) (cmccabe: rev 518a7f4af3d8deeecabfa0629b16521ce09de459) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC - Key: HDFS-7282 URL: https://issues.apache.org/jira/browse/HDFS-7282 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.1 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 2.7.0 Attachments: HDFS-7282.patch TemporarySocketDirectory has a finalize method that deletes the directory. In TestShortCircuitCache and TestBlockReaderFactory, the TemporarySocketDirectory instances created are not referenced later in the tests, so they can be garbage collected (deleting the directory) before the DataNode starts up and accesses the directory, causing a FileNotFoundException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
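The failure mode above can be modeled in a few lines. This is an illustrative sketch, not the HDFS test code: a helper whose finalize() deletes its directory can be collected, and the directory removed, as soon as the last reference is dropped, even while the path is still in use. The fix in HDFS-7282 is to keep the TemporarySocketDirectory referenced for the test's lifetime; the explicit close() below models that deterministic cleanup.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical stand-in for TemporarySocketDirectory: cleanup is tied to
// an explicit close() held by the caller, not to garbage collection.
public class SocketDirSketch implements AutoCloseable {
    private File dir;

    public SocketDirSketch() throws IOException {
        dir = Files.createTempDirectory("sock").toFile();
    }

    public File getDir() { return dir; }

    @Override
    public void close() {
        // Deletion happens only when the caller says it is done,
        // never at the whim of the garbage collector.
        if (dir != null) { dir.delete(); dir = null; }
    }
}
```

A test would hold the SocketDirSketch (or the real TemporarySocketDirectory) in a local variable or field until the DataNode is shut down, guaranteeing the directory outlives every consumer of the path.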
[jira] [Commented] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186878#comment-14186878 ] Yongjun Zhang commented on HDFS-7235: - I was able to apply the patch locally even at the latest tip of trunk {quote} commit 58c0bb9ed9f4a2491395b63c68046562a73526c9 Author: yliu y...@apache.org Date: Tue Oct 28 21:11:31 2014 +0800 {quote} DataNode#transferBlock should report blocks that don't exist using reportBadBlock - Key: HDFS-7235 URL: https://issues.apache.org/jira/browse/HDFS-7235 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch When decommissioning a DN, the process hangs. What happens is, when the NN chooses a replica as a source to replicate data on the to-be-decommissioned DN to other DNs, it favors choosing the to-be-decommissioned DN itself as the source of the transfer (see BlockManager.java). However, because of the bad disk, the DN would detect the source block to be transferred as an invalid block with the following logic in FsDatasetImpl.java: {code} /** Does the block exist and have the given state? */ private boolean isValid(final ExtendedBlock b, final ReplicaState state) { final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), b.getLocalBlock()); return replicaInfo != null && replicaInfo.getState() == state && replicaInfo.getBlockFile().exists(); } {code} This method returns false (detecting an invalid block) because the block file doesn't exist, due to the bad disk in this case. The key issue we found here is that after the DN detects an invalid block for the above reason, it doesn't report the invalid block back to the NN, so the NN doesn't know the block is corrupted and keeps sending the data transfer request to the same to-be-decommissioned DN, again and again. This causes an infinite loop, so the decommission process hangs. Thanks [~qwertymaniac] for reporting the issue and the initial analysis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
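The proposed fix can be sketched as follows. This is a simplified, self-contained model of the DN-side check and the behavior the JIRA title asks for, not the actual HDFS classes; the class, enum, and method names below are illustrative stubs. The point is that when the replica fails the validity check, the DN should tell the NN instead of silently dropping the transfer request.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplicaCheckSketch {
    enum ReplicaState { FINALIZED, RBW }

    // Toy stand-in for ReplicaInfo: just a state plus whether the
    // on-disk block file still exists.
    static class Replica {
        ReplicaState state;
        boolean fileExists;
        Replica(ReplicaState s, boolean exists) { state = s; fileExists = exists; }
    }

    static final List<String> reportedBadBlocks = new ArrayList<>();

    // Mirrors FsDatasetImpl#isValid: all three conditions must hold.
    static boolean isValid(Replica r, ReplicaState expected) {
        return r != null && r.state == expected && r.fileExists;
    }

    static void transferBlock(String blockId, Replica r) {
        if (!isValid(r, ReplicaState.FINALIZED)) {
            // Proposed behavior: report the bad block so the NN stops
            // picking this DN as the replication source, breaking the
            // infinite retry loop described above.
            reportedBadBlocks.add(blockId);
            return;
        }
        // ... proceed with the actual transfer ...
    }

    public static void main(String[] args) {
        // Replica whose block file was lost to a bad disk.
        transferBlock("blk_1", new Replica(ReplicaState.FINALIZED, false));
        System.out.println(reportedBadBlocks); // [blk_1]
    }
}
```

Once the NN learns the replica is bad, its normal corrupt-replica handling can pick a different source DN, so decommissioning makes progress instead of hanging.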
[jira] [Updated] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7235: Attachment: (was: HDFS-7235.007.patch) DataNode#transferBlock should report blocks that don't exist using reportBadBlock - Key: HDFS-7235 URL: https://issues.apache.org/jira/browse/HDFS-7235 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch When decommissioning a DN, the process hangs. What happens is, when the NN chooses a replica as a source to replicate data on the to-be-decommissioned DN to other DNs, it favors choosing the to-be-decommissioned DN itself as the source of the transfer (see BlockManager.java). However, because of the bad disk, the DN would detect the source block to be transferred as an invalid block with the following logic in FsDatasetImpl.java: {code} /** Does the block exist and have the given state? */ private boolean isValid(final ExtendedBlock b, final ReplicaState state) { final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), b.getLocalBlock()); return replicaInfo != null && replicaInfo.getState() == state && replicaInfo.getBlockFile().exists(); } {code} This method returns false (detecting an invalid block) because the block file doesn't exist, due to the bad disk in this case. The key issue we found here is that after the DN detects an invalid block for the above reason, it doesn't report the invalid block back to the NN, so the NN doesn't know the block is corrupted and keeps sending the data transfer request to the same to-be-decommissioned DN, again and again. This causes an infinite loop, so the decommission process hangs. Thanks [~qwertymaniac] for reporting the issue and the initial analysis. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7235: Attachment: HDFS-7235.007.patch DataNode#transferBlock should report blocks that don't exist using reportBadBlock - Key: HDFS-7235 URL: https://issues.apache.org/jira/browse/HDFS-7235 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch When decommissioning a DN, the process hangs. What happens is, when the NN chooses a replica as a source to replicate data on the to-be-decommissioned DN to other DNs, it favors choosing the to-be-decommissioned DN itself as the source of the transfer (see BlockManager.java). However, because of the bad disk, the DN would detect the source block to be transferred as an invalid block with the following logic in FsDatasetImpl.java: {code} /** Does the block exist and have the given state? */ private boolean isValid(final ExtendedBlock b, final ReplicaState state) { final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), b.getLocalBlock()); return replicaInfo != null && replicaInfo.getState() == state && replicaInfo.getBlockFile().exists(); } {code} This method returns false (detecting an invalid block) because the block file doesn't exist, due to the bad disk in this case. The key issue we found here is that after the DN detects an invalid block for the above reason, it doesn't report the invalid block back to the NN, so the NN doesn't know the block is corrupted and keeps sending the data transfer request to the same to-be-decommissioned DN, again and again. This causes an infinite loop, so the decommission process hangs. Thanks [~qwertymaniac] for reporting the issue and the initial analysis. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6741) Improve permission denied message when FSPermissionChecker#checkOwner fails
[ https://issues.apache.org/jira/browse/HDFS-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186883#comment-14186883 ] Hudson commented on HDFS-6741: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/]) HDFS-6741. Improve permission denied message when FSPermissionChecker#checkOwner fails. Contributed by Stephen Chu and Harsh J. (harsh) (harsh: rev 0398db19b2c4558a9f08ac2700a27752748896fa) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSPermissionChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSPermission.java Improve permission denied message when FSPermissionChecker#checkOwner fails --- Key: HDFS-6741 URL: https://issues.apache.org/jira/browse/HDFS-6741 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0, 2.5.0 Reporter: Stephen Chu Assignee: Harsh J Priority: Trivial Labels: supportability Fix For: 2.7.0 Attachments: HDFS-6741.1.patch, HDFS-6741.2.patch, HDFS-6741.2.patch Currently, FSPermissionChecker#checkOwner throws an AccessControlException with a simple Permission denied message. When users try to set an ACL without ownership permissions, they'll see something like: {code} [schu@hdfs-vanilla-1 hadoop]$ hdfs dfs -setfacl -m user:schu:--- /tmp setfacl: Permission denied {code} It'd be helpful if the message had an explanation why the permission was denied to avoid confusion for users who aren't familiar with permissions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
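The kind of message improvement described above can be sketched simply: instead of a bare "Permission denied", include who attempted the operation, which inode it targeted, and who actually owns it. The method and parameter names below are illustrative, not the actual FSPermissionChecker API.

```java
public class DeniedMessageSketch {
    // Builds a diagnostic message for an ownership check failure.
    // Hypothetical helper; the real fix lives in FSPermissionChecker#checkOwner.
    static String deniedMessage(String user, String op, String path, String owner) {
        return String.format(
            "Permission denied: user=%s is not the owner of inode=%s (owner=%s); "
            + "operation %s requires ownership", user, path, owner, op);
    }

    public static void main(String[] args) {
        // Corresponds to the setfacl example in the issue description.
        System.out.println(deniedMessage("schu", "setfacl", "/tmp", "hdfs"));
    }
}
```

With a message like this, the user in the example immediately sees that /tmp belongs to another user, rather than having to guess why setfacl was refused.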
[jira] [Commented] (HDFS-6934) Move checksum computation off the hot path when writing to RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186888#comment-14186888 ] Hudson commented on HDFS-6934: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/]) HDFS-6934. Move checksum computation off the hot path when writing to RAM disk. Contributed by Chris Nauroth. (cnauroth: rev 463aec11718e47d4aabb86a7a539cb973460aae6) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/ReplicaOutputStreams.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Options.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyPersistTestCase.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockMetadataHeader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/DataChecksum.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInPipeline.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * 
hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestShell.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSOutputSummer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestScrLazyPersistFiles.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskReplicaLruTracker.java HDFS-6934. Revert files accidentally committed. 
(cnauroth: rev 5b1dfe78b8b06335bed0bcb83f12bb936d4c021b) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/nativeio/NativeIO.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestShell.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java Move checksum computation off the hot path when writing to RAM disk --- Key: HDFS-6934 URL: https://issues.apache.org/jira/browse/HDFS-6934 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, hdfs-client Reporter: Arpit Agarwal Assignee: Chris Nauroth Fix For: 2.6.0 Attachments: HDFS-6934-branch-2.6.5.patch, HDFS-6934.3.patch, HDFS-6934.4.patch, HDFS-6934.5.patch, h6934_20141003b.patch, h6934_20141005.patch Since local RAM is considered reliable we can avoid writing checksums on the hot path when replicas are being written to a local RAM disk. The checksum can be computed by the lazy writer when moving replicas to disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
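The core idea of the optimization above can be sketched in a few lines. This is illustrative, with hypothetical method names, not the HDFS API: when a replica is written to RAM disk, the hot write path skips checksum work entirely, and the lazy writer computes the checksum once when it persists the replica to durable storage.

```java
import java.util.zip.CRC32;

public class LazyChecksumSketch {
    // Hot path: just store the bytes in the RAM-disk replica,
    // with no per-packet checksum computation.
    static byte[] writeToRamDisk(byte[] data) {
        return data.clone();
    }

    // Cold path, run later by the lazy writer while moving the
    // replica to disk: compute the checksum exactly once.
    static long checksumOnPersist(byte[] ramReplica) {
        CRC32 crc = new CRC32();
        crc.update(ramReplica);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] replica = writeToRamDisk(new byte[]{1, 2, 3});
        long crc = checksumOnPersist(replica);
        System.out.println(crc == checksumOnPersist(replica)); // true
    }
}
```

The trade-off this relies on is stated in the issue: local RAM is considered reliable, so deferring the checksum to persist time does not weaken the durability guarantees of the on-disk copy.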
[jira] [Commented] (HDFS-7282) Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC
[ https://issues.apache.org/jira/browse/HDFS-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186896#comment-14186896 ] Hudson commented on HDFS-7282: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/]) HDFS-7282. Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC (Jinghui Wang via Colin P. McCabe) (cmccabe: rev 518a7f4af3d8deeecabfa0629b16521ce09de459) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java Fix intermittent TestShortCircuitCache and TestBlockReaderFactory failures resulting from TemporarySocketDirectory GC - Key: HDFS-7282 URL: https://issues.apache.org/jira/browse/HDFS-7282 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.1 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 2.7.0 Attachments: HDFS-7282.patch TemporarySocketDirectory has a finalize method that deletes the directory. In TestShortCircuitCache and TestBlockReaderFactory, the TemporarySocketDirectory instances created are not referenced later in the tests, so they can be garbage collected (deleting the directory) before the DataNode starts up and accesses the directory, causing a FileNotFoundException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7278) Add a command that allows sysadmins to manually trigger full block reports from a DN
[ https://issues.apache.org/jira/browse/HDFS-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186885#comment-14186885 ] Hudson commented on HDFS-7278: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/]) HDFS-7278. Add a command that allows sysadmins to manually trigger full block reports from a DN (cmccabe) (cmccabe: rev baf794dc404ac54f4e8332654eadfac1bebacb8f) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSCommands.apt.vm * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/proto/ClientDatanodeProtocol.proto * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestTriggerBlockReport.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java Add a command that allows sysadmins to manually trigger full block reports from a DN Key: HDFS-7278 URL: https://issues.apache.org/jira/browse/HDFS-7278 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.7.0 Attachments: HDFS-7278.002.patch, HDFS-7278.003.patch, HDFS-7278.004.patch, HDFS-7278.005.patch We should add a command that allows sysadmins 
to manually trigger full block reports from a DN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5928) show namespace and namenode ID on NN dfshealth page
[ https://issues.apache.org/jira/browse/HDFS-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186890#comment-14186890 ] Hudson commented on HDFS-5928: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/]) HDFS-5928. Show namespace and namenode ID on NN dfshealth page. Contributed by Siqi Li. (wheat9: rev 00b4e44a2eba871b4ab47e51c52de95b12dca82e) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js show namespace and namenode ID on NN dfshealth page --- Key: HDFS-5928 URL: https://issues.apache.org/jira/browse/HDFS-5928 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Siqi Li Assignee: Siqi Li Fix For: 2.7.0 Attachments: HDFS-5928.007.patch, HDFS-5928.v2.patch, HDFS-5928.v3.patch, HDFS-5928.v4.patch, HDFS-5928.v5.patch, HDFS-5928.v6.patch, HDFS-5928.v1.patch, screenshot-1.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6538) Comment format error in ShortCircuitRegistry javadoc
[ https://issues.apache.org/jira/browse/HDFS-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186893#comment-14186893 ] Hudson commented on HDFS-6538: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/]) HDFS-6538. Comment format error in ShortCircuitRegistry javadoc. Contributed by David Luo. (harsh) (harsh: rev 0058eadbd3149a5dee1ffc69c2d9f21caa916fb5) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ShortCircuitRegistry.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Comment format error in ShortCircuitRegistry javadoc Key: HDFS-6538 URL: https://issues.apache.org/jira/browse/HDFS-6538 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.4.0 Reporter: debugging Assignee: David Luo Priority: Trivial Labels: documentation Fix For: 2.7.0 Attachments: HDFS-6538.patch Original Estimate: 1h Remaining Estimate: 1h A javadoc comment should start with {noformat}/**{noformat}, but the comment for class ShortCircuitRegistry starts with only {noformat}/*{noformat}, so a {noformat}*{noformat} appears to have been omitted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186918#comment-14186918 ] Tony Reix commented on HDFS-6515: - Test report says: [INFO] BUILD SUCCESS but there are errors after: - Determining number of patched Findbugs warnings : /home/jenkins/j/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build@2/dev-support/test-patch.sh: line 622: 2899 Killed enkins-slave/workspace/PreCommit-HDFS-Build@2/dev-support/test-patch.sh: line 622: 2899 Killed - Running tests: /bin/grep: /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build@2/../patchprocess/patch: No such file or directory {color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 2.0.3) to fail. - Checking the integrity of system test framework code.: mv: cannot stat '/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build@2/../patchprocess': No such file or directory I'm now running: mvn clean test findbugs:findbugs -DskipTests -DHadoopPatchProcess in my environment, with trunk patched with 6515, in order to understand what's wrong. Result: [INFO] Apache Hadoop Project POM . 
FAILURE [1:20.245s] [ERROR] Failed to execute goal org.codehaus.mojo:findbugs-maven-plugin:2.3.2:findbugs (default-cli) on project hadoop-project: Execution default-cli of goal org.codehaus.mojo:findbugs-maven-plugin:2.3.2:findbugs failed: Plugin org.codehaus.mojo:findbugs-maven-plugin:2.3.2 or one of its dependencies could not be resolved: Could not transfer artifact asm:asm-xml:jar:3.1 from/to central (http://repo.maven.apache.org/maven2): Read timed out - [Help 1] Retesting with -X and Oracle 1.7 JVM instead of IBM JVM: Result of: mvn -X test findbugs:findbugs -DskipTests -DHadoopPatchProcess -l mvn.findbugs.OpenJDK.res in my environment (Ubuntu 14.04/Intel, Maven 3.0.4) is : BUILD SUCCESS testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0, 2.4.1 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: hadoop, test Attachments: HDFS-6515-1.patch, HDFS-6515-2.patch I have an issue with test : testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, test runs fine. On Linux/PowerPC, I have: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 6 milliseconds Looking at details, I see that some Failed to cache messages appear in the traces. Only 10 on Intel, but 186 on PPC64. On PPC64, it looks like some thread is waiting for something that never happens, generating a TimeOut. I'm now using IBM JVM, however I've just checked that the issue also appears with OpenJDK. I'm now using Hadoop latest, however, the issue appeared within Hadoop 2.4.0 . 
I need help understanding what the test is doing and what traces are expected, in order to pinpoint the root cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6606) Optimize HDFS Encrypted Transport performance
[ https://issues.apache.org/jira/browse/HDFS-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186986#comment-14186986 ] Chris Nauroth commented on HDFS-6606: - I had forgotten that you can do your own commits now, Yi. :-) Thank you for the patch, and thank you to all code reviewers. Optimize HDFS Encrypted Transport performance - Key: HDFS-6606 URL: https://issues.apache.org/jira/browse/HDFS-6606 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client, security Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.6.0 Attachments: HDFS-6606.001.patch, HDFS-6606.002.patch, HDFS-6606.003.patch, HDFS-6606.004.patch, HDFS-6606.005.patch, HDFS-6606.006.patch, HDFS-6606.007.patch, HDFS-6606.008.patch, HDFS-6606.009.patch, OptimizeHdfsEncryptedTransportperformance.pdf In HDFS-3637, [~atm] added support for encrypting the DataTransferProtocol; it was great work. It utilizes the SASL {{Digest-MD5}} mechanism (with Qop: auth-conf) and supports three security strengths: * high 3des or rc4 (128 bits) * medium des or rc4 (56 bits) * low rc4 (40 bits) 3des and rc4 are slow, only *tens of MB/s*: http://www.javamex.com/tutorials/cryptography/ciphers.shtml http://www.cs.wustl.edu/~jain/cse567-06/ftp/encryption_perf/ I will give more detailed performance data in the future. This is clearly a bottleneck and will vastly affect end-to-end performance. AES (Advanced Encryption Standard) is recommended as a replacement for DES and is more secure; with AES-NI support, throughput can reach nearly *2GB/s*, so it won't be the bottleneck any more. AES and CryptoCodec work is supported in HADOOP-10150, HADOOP-10603 and HADOOP-10693 (we may need to add a new mode support for AES). This JIRA will use AES with AES-NI support as the encryption algorithm for DataTransferProtocol. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7291) Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows
[ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187013#comment-14187013 ] Hadoop QA commented on HDFS-7291: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677537/HDFS-7291.4.patch against trunk revision 58c0bb9. {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8570//console This message is automatically generated. Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows Key: HDFS-7291 URL: https://issues.apache.org/jira/browse/HDFS-7291 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7291.0.patch, HDFS-7291.1.patch, HDFS-7291.2.patch, HDFS-7291.3.patch, HDFS-7291.4.patch HDFS-7090 changed to persist in-memory replicas using unbuffered IO on Linux and Windows. On Linux, it relies on the sendfile() API between two file descriptors to achieve an unbuffered IO copy. According to the Linux man page at http://man7.org/linux/man-pages/man2/sendfile.2.html, this is only supported on Linux kernel 2.6.33+. As pointed out by Haowei in the discussion below, FileChannel#transferTo already has support for native unbuffered IO on POSIX platforms. On Windows, JDK 6/7/8 has not implemented native unbuffered IO yet. We changed to use FileChannel#transferTo for POSIX and our own native wrapper of CopyFileEx on Windows for the unbuffered copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
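As a rough sketch of the POSIX-side approach the description settles on, the copy can be driven through {{FileChannel#transferTo}}, which lets the JDK use a native zero-copy path (such as sendfile) where the platform supports it. This is a standalone illustration under assumed names, not the HDFS-7291 patch code.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: file-to-file copy via FileChannel#transferTo.
class UnbufferedCopy {
    static void copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                 StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long pos = 0;
            // transferTo may move fewer bytes than requested; loop until done.
            while (pos < size) {
                pos += in.transferTo(pos, size - pos, out);
            }
        }
    }
}
```

The loop matters: `transferTo` makes no guarantee of transferring the full requested count in one call, so a single unchecked call is a latent bug.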
[jira] [Updated] (HDFS-7291) Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows
[ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7291: Hadoop Flags: Reviewed +1 for the patch. The Jenkins failure looks spurious. I triggered another run. I'll wait for that and then commit. https://builds.apache.org/job/PreCommit-HDFS-Build/8570/ Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows Key: HDFS-7291 URL: https://issues.apache.org/jira/browse/HDFS-7291 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7291.0.patch, HDFS-7291.1.patch, HDFS-7291.2.patch, HDFS-7291.3.patch, HDFS-7291.4.patch HDFS-7090 changed to persist in-memory replicas using unbuffered IO on Linux and Windows. On Linux, it relies on the sendfile() API between two file descriptors to achieve an unbuffered IO copy. According to the Linux man page at http://man7.org/linux/man-pages/man2/sendfile.2.html, this is only supported on Linux kernel 2.6.33+. As pointed out by Haowei in the discussion below, FileChannel#transferTo already has support for native unbuffered IO on POSIX platforms. On Windows, JDK 6/7/8 has not implemented native unbuffered IO yet. We changed to use FileChannel#transferTo for POSIX and our own native wrapper of CopyFileEx on Windows for the unbuffered copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5894) Refactor a private internal class DataTransferEncryptor.SaslParticipant
[ https://issues.apache.org/jira/browse/HDFS-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187012#comment-14187012 ] Hadoop QA commented on HDFS-5894: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677596/HDFS-5894.patch against trunk revision c9bec46. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8568//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8568//console This message is automatically generated. 
Refactor a private internal class DataTransferEncryptor.SaslParticipant --- Key: HDFS-5894 URL: https://issues.apache.org/jira/browse/HDFS-5894 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.7.0 Reporter: Hiroshi Ikeda Assignee: Harsh J Priority: Trivial Attachments: HDFS-5894.patch, HDFS-5894.patch, HDFS-5894.patch It is appropriate to use polymorphism for SaslParticipant instead of scattering if-else statements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
Kihwal Lee created HDFS-7300: Summary: The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed Key: HDFS-7300 URL: https://issues.apache.org/jira/browse/HDFS-7300 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Priority: Critical The {{getMaxNodesPerRack()}} method can produce an undesirable result in some cases. - Three replicas on two racks: the max is 3, so everything can go to one rack. - Two replicas on two or more racks: the max is 2, so both replicas can end up in the same rack. {{BlockManager#isNeededReplication()}} fixes this after the block/file is closed because {{blockHasEnoughRacks()}} will return false. This is not only extra work; it can also break the favored nodes feature. When there are two racks and two favored nodes are specified in the same rack, the NN may allocate the third replica on a node in the same rack, because {{maxNodesPerRack}} is 3. When closing the file, the NN moves a block to the other rack. There is a 66% chance that a favored node is moved. If {{maxNodesPerRack}} were 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7300: - Target Version/s: 2.7.0 The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed Key: HDFS-7300 URL: https://issues.apache.org/jira/browse/HDFS-7300 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Priority: Critical The {{getMaxNodesPerRack()}} method can produce an undesirable result in some cases. - Three replicas on two racks: the max is 3, so everything can go to one rack. - Two replicas on two or more racks: the max is 2, so both replicas can end up in the same rack. {{BlockManager#isNeededReplication()}} fixes this after the block/file is closed because {{blockHasEnoughRacks()}} will return false. This is not only extra work; it can also break the favored nodes feature. When there are two racks and two favored nodes are specified in the same rack, the NN may allocate the third replica on a node in the same rack, because {{maxNodesPerRack}} is 3. When closing the file, the NN moves a block to the other rack. There is a 66% chance that a favored node is moved. If {{maxNodesPerRack}} were 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7291) Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows
[ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187057#comment-14187057 ] Xiaoyu Yao commented on HDFS-7291: -- The Jenkins failure is related to the change from HADOOP-10926 on test-patch.sh. An earlier break was found by HADOOP-11240 and resolved. I attached the new failures to HADOOP-10926. Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows Key: HDFS-7291 URL: https://issues.apache.org/jira/browse/HDFS-7291 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7291.0.patch, HDFS-7291.1.patch, HDFS-7291.2.patch, HDFS-7291.3.patch, HDFS-7291.4.patch HDFS-7090 changed to persist in-memory replicas using unbuffered IO on Linux and Windows. On Linux, it relies on the sendfile() API between two file descriptors to achieve an unbuffered IO copy. According to the Linux man page at http://man7.org/linux/man-pages/man2/sendfile.2.html, this is only supported on Linux kernel 2.6.33+. As pointed out by Haowei in the discussion below, FileChannel#transferTo already has support for native unbuffered IO on POSIX platforms. On Windows, JDK 6/7/8 has not implemented native unbuffered IO yet. We changed to use FileChannel#transferTo for POSIX and our own native wrapper of CopyFileEx on Windows for the unbuffered copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock
[ https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187153#comment-14187153 ] Hadoop QA commented on HDFS-7235: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677607/HDFS-7235.007.patch against trunk revision 58c0bb9. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8569//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8569//console This message is automatically generated. DataNode#transferBlock should report blocks that don't exist using reportBadBlock - Key: HDFS-7235 URL: https://issues.apache.org/jira/browse/HDFS-7235 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch When decommissioning a DN, the process hangs.
What happens is: when the NN chooses a replica as a source to replicate data on the to-be-decommissioned DN to other DNs, it favors choosing the to-be-decommissioned DN itself as the source of the transfer (see BlockManager.java). However, because of the bad disk, the DN detects the source block to be transferred as an invalid block with the following logic in FsDatasetImpl.java: {code} /** Does the block exist and have the given state? */ private boolean isValid(final ExtendedBlock b, final ReplicaState state) { final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), b.getLocalBlock()); return replicaInfo != null && replicaInfo.getState() == state && replicaInfo.getBlockFile().exists(); } {code} This method returns false (detecting an invalid block) because the block file doesn't exist, due to the bad disk in this case. The key issue we found here is that after the DN detects an invalid block for the above reason, it doesn't report the invalid block back to the NN, so the NN doesn't know that the block is corrupted and keeps sending the data transfer request to the same to-be-decommissioned DN, again and again. This causes an infinite loop, so the decommission process hangs. Thanks [~qwertymaniac] for reporting the issue and the initial analysis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
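To make the failure mode above concrete, here is a toy model (not Hadoop code) of the {{isValid()}} check and the proposed fix direction: when the replica is invalid, report it rather than silently dropping the transfer. All types and names below are simplified stand-ins for the real DataNode classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the DataNode-side check; every name is a simplified stand-in.
class TransferBlockSketch {
    enum ReplicaState { FINALIZED, RBW }

    static class ReplicaInfo {
        final ReplicaState state;
        final boolean blockFileExists; // false models the bad-disk case
        ReplicaInfo(ReplicaState state, boolean blockFileExists) {
            this.state = state;
            this.blockFileExists = blockFileExists;
        }
    }

    final Map<Long, ReplicaInfo> volumeMap = new HashMap<>();
    final List<Long> badBlocksReportedToNN = new ArrayList<>();

    /** Mirrors the isValid() logic: non-null, expected state, block file on disk. */
    boolean isValid(long blockId, ReplicaState state) {
        ReplicaInfo replicaInfo = volumeMap.get(blockId);
        return replicaInfo != null
            && replicaInfo.state == state
            && replicaInfo.blockFileExists;
    }

    /** The fix direction: report the invalid replica to the NN instead of
     *  silently skipping the transfer, so the NN stops retrying this DN. */
    void transferBlock(long blockId) {
        if (!isValid(blockId, ReplicaState.FINALIZED)) {
            badBlocksReportedToNN.add(blockId); // stands in for reportBadBlocks()
            return;
        }
        // ... perform the actual block transfer ...
    }
}
```

With the report in place, the NN learns the replica is bad and can pick a different source, breaking the retry loop described above.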
[jira] [Commented] (HDFS-6252) Phase out the old web UI in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187163#comment-14187163 ] Zhe Zhang commented on HDFS-6252: - [~wheat9] I'm working on HDFS-7165, which requires a change to {{TestMissingBlocksAlert}}. It seems to me your changes to {{TestMissingBlocksAlert}} are compatible with and should go into branch-2. Let me know if you agree. Thanks! Phase out the old web UI in HDFS Key: HDFS-6252 URL: https://issues.apache.org/jira/browse/HDFS-6252 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.5.0 Reporter: Fengdong Yu Assignee: Haohui Mai Priority: Minor Fix For: 2.7.0 Attachments: HDFS-6252-branch-2.000.patch, HDFS-6252.000.patch, HDFS-6252.001.patch, HDFS-6252.002.patch, HDFS-6252.003.patch, HDFS-6252.004.patch, HDFS-6252.005.patch, HDFS-6252.006.patch We've deprecated hftp and hsftp in HDFS-5570, so if we download a file via the 'download this file' link on browseDirectory.jsp, it will throw an error: Problem accessing /streamFile/***, because the streamFile servlet was deleted in HDFS-5570. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7300: - Assignee: Kihwal Lee Status: Patch Available (was: Open) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed Key: HDFS-7300 URL: https://issues.apache.org/jira/browse/HDFS-7300 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7300.patch The {{getMaxNodesPerRack()}} method can produce an undesirable result in some cases. - Three replicas on two racks: the max is 3, so everything can go to one rack. - Two replicas on two or more racks: the max is 2, so both replicas can end up in the same rack. {{BlockManager#isNeededReplication()}} fixes this after the block/file is closed because {{blockHasEnoughRacks()}} will return false. This is not only extra work; it can also break the favored nodes feature. When there are two racks and two favored nodes are specified in the same rack, the NN may allocate the third replica on a node in the same rack, because {{maxNodesPerRack}} is 3. When closing the file, the NN moves a block to the other rack. There is a 66% chance that a favored node is moved. If {{maxNodesPerRack}} were 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7300: - Attachment: HDFS-7300.patch Submitting the patch without a test case so precommit can run the tests. There is also a bug in {{chooseTarget()}} for the favored nodes case. The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed Key: HDFS-7300 URL: https://issues.apache.org/jira/browse/HDFS-7300 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Priority: Critical Attachments: HDFS-7300.patch The {{getMaxNodesPerRack()}} method can produce an undesirable result in some cases. - Three replicas on two racks: the max is 3, so everything can go to one rack. - Two replicas on two or more racks: the max is 2, so both replicas can end up in the same rack. {{BlockManager#isNeededReplication()}} fixes this after the block/file is closed because {{blockHasEnoughRacks()}} will return false. This is not only extra work; it can also break the favored nodes feature. When there are two racks and two favored nodes are specified in the same rack, the NN may allocate the third replica on a node in the same rack, because {{maxNodesPerRack}} is 3. When closing the file, the NN moves a block to the other rack. There is a 66% chance that a favored node is moved. If {{maxNodesPerRack}} were 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187195#comment-14187195 ] Kihwal Lee commented on HDFS-7300: -- The base formula for calculating the value is {noformat} maxNodesPerRack = (totalNumOfReplicas-1)/numOfRacks + 2 {noformat} In the patch, the single rack case and the single replica case are handled without applying this formula. It is then guaranteed that the number of racks is greater than 1 when calculating the max value. The formula is also guaranteed to give a sufficiently big max value: {noformat} maxNodesPerRack * numOfRacks >= totalNumOfReplicas <=> totalNumOfReplicas-1 + 2 * numOfRacks >= totalNumOfReplicas <=> numOfRacks >= 0.5 {noformat} Since numOfRacks is greater than 1, maxNodesPerRack is guaranteed to be large enough. In order to take care of the case of {{maxNodesPerRack == totalNumOfReplicas}}, which happens in the cases described in the description, maxNodesPerRack is decremented if necessary. This still results in a sufficiently large value: {noformat} (maxNodesPerRack - 1) * numOfRacks >= totalNumOfReplicas <=> totalNumOfReplicas-1 + numOfRacks >= totalNumOfReplicas <=> numOfRacks >= 1 {noformat} This shows the resulting max value is not only large enough, but also allows a bit of slack for unbalanced racks, as the original formula does. The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed Key: HDFS-7300 URL: https://issues.apache.org/jira/browse/HDFS-7300 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7300.patch The {{getMaxNodesPerRack()}} method can produce an undesirable result in some cases. - Three replicas on two racks: the max is 3, so everything can go to one rack. - Two replicas on two or more racks: the max is 2, so both replicas can end up in the same rack. {{BlockManager#isNeededReplication()}} fixes this after the block/file is closed because {{blockHasEnoughRacks()}} will return false.
This is not only extra work; it can also break the favored nodes feature. When there are two racks and two favored nodes are specified in the same rack, the NN may allocate the third replica on a node in the same rack, because {{maxNodesPerRack}} is 3. When closing the file, the NN moves a block to the other rack. There is a 66% chance that a favored node is moved. If {{maxNodesPerRack}} were 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
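The calculation walked through in the comment above can be sketched as a standalone method. This is an illustration of the math only, not the actual HDFS-7300 patch code; the class and method names are invented for the demo.

```java
// Standalone sketch of the corrected getMaxNodesPerRack() calculation:
// base formula, special cases, and the decrement for the degenerate case.
class MaxNodesPerRack {
    static int compute(int totalNumOfReplicas, int numOfRacks) {
        // Single rack or single replica: the formula is not applied.
        if (numOfRacks == 1 || totalNumOfReplicas <= 1) {
            return totalNumOfReplicas;
        }
        // Base formula; guarantees maxNodesPerRack * numOfRacks >= totalNumOfReplicas.
        int maxNodesPerRack = (totalNumOfReplicas - 1) / numOfRacks + 2;
        // Avoid the degenerate case where a single rack could hold every replica.
        if (maxNodesPerRack == totalNumOfReplicas) {
            maxNodesPerRack--;
        }
        return maxNodesPerRack;
    }

    public static void main(String[] args) {
        // Three replicas on two racks: the flawed code allowed 3 per rack; now 2.
        System.out.println(compute(3, 2)); // 2
        // Two replicas on two racks: forces the replicas onto different racks.
        System.out.println(compute(2, 2)); // 1
    }
}
```

Both problem cases from the issue description now yield a cap strictly below the total replica count, so no single rack can absorb every replica.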
[jira] [Commented] (HDFS-7291) Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows
[ https://issues.apache.org/jira/browse/HDFS-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187224#comment-14187224 ] Chris Nauroth commented on HDFS-7291: - This build seems to be doing better: https://builds.apache.org/job/PreCommit-HDFS-Build/8571/ Persist in-memory replicas with appropriate unbuffered copy API on POSIX and Windows Key: HDFS-7291 URL: https://issues.apache.org/jira/browse/HDFS-7291 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7291.0.patch, HDFS-7291.1.patch, HDFS-7291.2.patch, HDFS-7291.3.patch, HDFS-7291.4.patch HDFS-7090 changed to persist in-memory replicas using unbuffered IO on Linux and Windows. On Linux, it relies on the sendfile() API between two file descriptors to achieve an unbuffered IO copy. According to the Linux man page at http://man7.org/linux/man-pages/man2/sendfile.2.html, this is only supported on Linux kernel 2.6.33+. As pointed out by Haowei in the discussion below, FileChannel#transferTo already has support for native unbuffered IO on POSIX platforms. On Windows, JDK 6/7/8 has not implemented native unbuffered IO yet. We changed to use FileChannel#transferTo for POSIX and our own native wrapper of CopyFileEx on Windows for the unbuffered copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6252) Phase out the old web UI in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187231#comment-14187231 ] Haohui Mai commented on HDFS-6252: -- Yes. Please feel free to file a jira and merge the changes to branch-2. Phase out the old web UI in HDFS Key: HDFS-6252 URL: https://issues.apache.org/jira/browse/HDFS-6252 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.5.0 Reporter: Fengdong Yu Assignee: Haohui Mai Priority: Minor Fix For: 2.7.0 Attachments: HDFS-6252-branch-2.000.patch, HDFS-6252.000.patch, HDFS-6252.001.patch, HDFS-6252.002.patch, HDFS-6252.003.patch, HDFS-6252.004.patch, HDFS-6252.005.patch, HDFS-6252.006.patch We've deprecated hftp and hsftp in HDFS-5570, so if we download a file via the 'download this file' link on browseDirectory.jsp, it will throw an error: Problem accessing /streamFile/***, because the streamFile servlet was deleted in HDFS-5570. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7301) TestMissingBlocksAlert should use MXBeans instead of old web UI
Zhe Zhang created HDFS-7301: --- Summary: TestMissingBlocksAlert should use MXBeans instead of old web UI Key: HDFS-7301 URL: https://issues.apache.org/jira/browse/HDFS-7301 Project: Hadoop HDFS Issue Type: Bug Reporter: Zhe Zhang Assignee: Zhe Zhang HDFS-6252 has phased out the old web UI in trunk. {{TestMissingBlocksAlert}} was excluded in its branch-2 patch. After revisiting the problem [~wheat9] and I agreed that it should go into branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187195#comment-14187195 ] Kihwal Lee edited comment on HDFS-7300 at 10/28/14 7:08 PM: The base formula for calculating the value is {noformat} maxNodesPerRack = (totalNumOfReplicas-1)/numOfRacks + 2 {noformat} In the patch, the single rack case and the single replica case are handled without applying this formula. It is then guaranteed that the number of racks is greater than 1 when calculating the max value. The formula is also guaranteed to give a sufficiently big max value: {noformat} maxNodesPerRack * numOfRacks >= totalNumOfReplicas <=> totalNumOfReplicas-1 + 2 * numOfRacks >= totalNumOfReplicas <=> numOfRacks >= 0.5 {noformat} Since numOfRacks is greater than 1, maxNodesPerRack is guaranteed to be large enough. In order to take care of the case of {{maxNodesPerRack == totalNumOfReplicas}}, which happens in the cases listed in the description, maxNodesPerRack is decremented if necessary. This still results in a sufficiently large value: {noformat} (maxNodesPerRack - 1) * numOfRacks >= totalNumOfReplicas <=> totalNumOfReplicas-1 + numOfRacks >= totalNumOfReplicas <=> numOfRacks >= 1 {noformat} This shows the resulting max value is not only large enough, but also allows a bit of slack for unbalanced racks, as the original formula does. was (Author: kihwal): The base formula for calculating the value is {noformat} maxNodesPerRack = (totalNumOfReplicas-1)/numOfRacks + 2 {noformat} In the patch, the single rack case and the single replica case are handled without applying this formula. It is then guaranteed that the number of racks is greater than 1 when calculating the max value. The formula is also guaranteed to give a sufficiently big max value: {noformat} maxNodesPerRack * numOfRacks >= totalNumOfReplicas <=> totalNumOfReplicas-1 + 2 * numOfRacks >= totalNumOfReplicas <=> numOfRacks >= 0.5 {noformat} Since numOfRacks is greater than 1, maxNodesPerRack is guaranteed to be large enough. In order to take care of the case of {{maxNodesPerRack == totalNumOfReplicas}}, which happens in the cases described in the description, maxNodesPerRack is decremented if necessary. This still results in a sufficiently large value: {noformat} (maxNodesPerRack - 1) * numOfRacks >= totalNumOfReplicas <=> totalNumOfReplicas-1 + numOfRacks >= totalNumOfReplicas <=> numOfRacks >= 1 {noformat} This shows the resulting max value is not only large enough, but also allows a bit of slack for unbalanced racks, as the original formula does. The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed Key: HDFS-7300 URL: https://issues.apache.org/jira/browse/HDFS-7300 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7300.patch The {{getMaxNodesPerRack()}} method can produce an undesirable result in some cases. - Three replicas on two racks: the max is 3, so everything can go to one rack. - Two replicas on two or more racks: the max is 2, so both replicas can end up in the same rack. {{BlockManager#isNeededReplication()}} fixes this after the block/file is closed because {{blockHasEnoughRacks()}} will return false. This is not only extra work; it can also break the favored nodes feature. When there are two racks and two favored nodes are specified in the same rack, the NN may allocate the third replica on a node in the same rack, because {{maxNodesPerRack}} is 3. When closing the file, the NN moves a block to the other rack. There is a 66% chance that a favored node is moved. If {{maxNodesPerRack}} were 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7276: -- Attachment: h7276_20141028.patch h7276_20141028.patch: using timed wait. Limit the number of byte arrays used by DFSOutputStream --- Key: HDFS-7276 URL: https://issues.apache.org/jira/browse/HDFS-7276 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h7276_20141021.patch, h7276_20141022.patch, h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, h7276_20141027b.patch, h7276_20141028.patch When there are a lot of DFSOutputStream's writing concurrently, the number of outstanding packets could be large. The byte arrays created by those packets could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187317#comment-14187317 ] Tsz Wo Nicholas Sze commented on HDFS-7276: --- The percentage differences are shown in the performance test {noformat} arrayLength=65536, nThreads=512, nAllocations=32768, maxArrays=1024 NewByteArrayWithoutLimit: 3439, 3394, 3497, 3459, 3442, avg= 3.446s NewByteArrayWithLimit: 3448, 3563, 3552, 3492, 3509, avg= 3.513s ( 1.93%) UsingByteArrayManager: 3357, 3369, 3327, 3345, 3324, avg= 3.344s ( -2.95%) ( -4.79%) {noformat} The time elapsed for UsingByteArrayManager is 2.95% and 4.79% less than NewByteArrayWithoutLimit and NewByteArrayWithLimit, respectively. Limit the number of byte arrays used by DFSOutputStream --- Key: HDFS-7276 URL: https://issues.apache.org/jira/browse/HDFS-7276 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h7276_20141021.patch, h7276_20141022.patch, h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, h7276_20141027b.patch, h7276_20141028.patch When there are a lot of DFSOutputStream's writing concurrently, the number of outstanding packets could be large. The byte arrays created by those packets could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
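The "timed wait" mentioned in the latest patch suggests the general shape of such a limiter. Below is a hypothetical standalone sketch of the idea (cap outstanding arrays of a given length, recycle released ones, time out the wait); it is not the actual ByteArrayManager implementation, and every name in it is invented.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a bounded, recycling byte-array pool.
class BoundedArrayPool {
    private final int arrayLength;
    private final ArrayBlockingQueue<byte[]> free;
    private final Semaphore outstanding;

    BoundedArrayPool(int arrayLength, int maxArrays) {
        this.arrayLength = arrayLength;
        this.free = new ArrayBlockingQueue<>(maxArrays);
        this.outstanding = new Semaphore(maxArrays);
    }

    /** Timed wait for capacity, so a slow consumer cannot block writers forever. */
    byte[] allocate(long timeoutMs) throws InterruptedException {
        if (!outstanding.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException("timed out waiting for array capacity");
        }
        byte[] recycled = free.poll();        // reuse a released array if available
        return recycled != null ? recycled : new byte[arrayLength];
    }

    void release(byte[] array) {
        free.offer(array);                    // keep for reuse if there is room
        outstanding.release();
    }
}
```

The semaphore bounds how many arrays are live at once, which is the memory cap the issue asks for, while the free queue avoids re-allocating (and re-zeroing) arrays on the hot path.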
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187324#comment-14187324 ] bc Wong commented on HDFS-7295: --- [~ste...@apache.org], I don't understand the scaling concern with revocation. I was thinking that we can just cancel the DT with the NN, and that's already supported. If the NN no longer knows about a DT, then requests using that DT will automatically get rejected. There is no need to keep a separate revocation list, unlike the X509 stuff. This requires showing the HDFS admin all the outstanding DTs, and logging the DT (a SHA hash) in the audit log. The latter facility is already in place today, and the SHA hash of the DT is cached (HDFS-4680). This works at a pretty large scale for us, so I'm not that concerned about perf here. bq. pushing out new tokens from the client When the Spark Streaming app is running, it's all in the cluster. It doesn't have any Kerberos credential at that point, so I don't think it can get new tokens. Right? Support arbitrary max expiration times for delegation token --- Key: HDFS-7295 URL: https://issues.apache.org/jira/browse/HDFS-7295 Project: Hadoop HDFS Issue Type: Improvement Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. This is a problem for different users of HDFS, such as long-running YARN apps. Users should be allowed to optionally specify a max lifetime for their tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6663) Admin command to track file and locations from block id
[ https://issues.apache.org/jira/browse/HDFS-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187328#comment-14187328 ] Kihwal Lee commented on HDFS-6663: -- +1 the latest patch looks good. Admin command to track file and locations from block id --- Key: HDFS-6663 URL: https://issues.apache.org/jira/browse/HDFS-6663 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Attachments: HDFS-6663-2.patch, HDFS-6663-3.patch, HDFS-6663-3.patch, HDFS-6663-4.patch, HDFS-6663-5.patch, HDFS-6663-5.patch, HDFS-6663-WIP.patch, HDFS-6663.patch A dfsadmin command that allows finding out the file and the locations given a block number will be very useful in debugging production issues. It may be possible to add this feature to Fsck, instead of creating a new command. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6663) Admin command to track file and locations from block id
[ https://issues.apache.org/jira/browse/HDFS-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187338#comment-14187338 ] Hudson commented on HDFS-6663: -- FAILURE: Integrated in Hadoop-trunk-Commit #6370 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6370/]) HDFS-6663. Admin command to track file and locations from block id. (kihwal: rev 371a3b87ed346732ed58a4faab0c6c1db57c86ed) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CorruptReplicasMap.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java Admin command to track file and locations from block id --- Key: HDFS-6663 URL: https://issues.apache.org/jira/browse/HDFS-6663 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Attachments: HDFS-6663-2.patch, HDFS-6663-3.patch, HDFS-6663-3.patch, HDFS-6663-4.patch, HDFS-6663-5.patch, HDFS-6663-5.patch, HDFS-6663-WIP.patch, HDFS-6663.patch A dfsadmin command that allows finding out the file and the locations given a block number will be very useful in debugging production issues. It may be possible to add this feature to Fsck, instead of creating a new command.
[jira] [Commented] (HDFS-6663) Admin command to track file and locations from block id
[ https://issues.apache.org/jira/browse/HDFS-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187339#comment-14187339 ] Kihwal Lee commented on HDFS-6663: -- Committed to trunk and cherry-picked to branch-2. There was a merge conflict due to a context difference in the help message in branch-2. Admin command to track file and locations from block id --- Key: HDFS-6663 URL: https://issues.apache.org/jira/browse/HDFS-6663 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Attachments: HDFS-6663-2.patch, HDFS-6663-3.patch, HDFS-6663-3.patch, HDFS-6663-4.patch, HDFS-6663-5.patch, HDFS-6663-5.patch, HDFS-6663-WIP.patch, HDFS-6663.patch A dfsadmin command that allows finding out the file and the locations given a block number will be very useful in debugging production issues. It may be possible to add this feature to Fsck, instead of creating a new command.
[jira] [Updated] (HDFS-6663) Admin command to track file and locations from block id
[ https://issues.apache.org/jira/browse/HDFS-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6663: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Admin command to track file and locations from block id --- Key: HDFS-6663 URL: https://issues.apache.org/jira/browse/HDFS-6663 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Fix For: 2.7.0 Attachments: HDFS-6663-2.patch, HDFS-6663-3.patch, HDFS-6663-3.patch, HDFS-6663-4.patch, HDFS-6663-5.patch, HDFS-6663-5.patch, HDFS-6663-WIP.patch, HDFS-6663.patch A dfsadmin command that allows finding out the file and the locations given a block number will be very useful in debugging production issues. It may be possible to add this feature to Fsck, instead of creating a new command.
[jira] [Updated] (HDFS-6663) Admin command to track file and locations from block id
[ https://issues.apache.org/jira/browse/HDFS-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6663: - Target Version/s: 2.7.0 (was: 2.6.0) Fix Version/s: 2.7.0 Admin command to track file and locations from block id --- Key: HDFS-6663 URL: https://issues.apache.org/jira/browse/HDFS-6663 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Chen He Fix For: 2.7.0 Attachments: HDFS-6663-2.patch, HDFS-6663-3.patch, HDFS-6663-3.patch, HDFS-6663-4.patch, HDFS-6663-5.patch, HDFS-6663-5.patch, HDFS-6663-WIP.patch, HDFS-6663.patch A dfsadmin command that allows finding out the file and the locations given a block number will be very useful in debugging production issues. It may be possible to add this feature to Fsck, instead of creating a new command.
[jira] [Commented] (HDFS-7213) processIncrementalBlockReport performance degradation
[ https://issues.apache.org/jira/browse/HDFS-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187357#comment-14187357 ] Kihwal Lee commented on HDFS-7213: -- +1 processIncrementalBlockReport performance degradation - Key: HDFS-7213 URL: https://issues.apache.org/jira/browse/HDFS-7213 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Daryn Sharp Assignee: Eric Payne Priority: Critical Attachments: HDFS-7213.1412804753, HDFS-7213.1412806496.txt {{BlockManager#processIncrementalBlockReport}} has a debug line that is missing an {{isDebugEnabled}} check. The write lock is being held. Coupled with the increase in incremental block reports from receiving blocks, under heavy load this log line noticeably degrades performance.
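The cost described above comes from building the log message on every call while the write lock is held, even when debug logging is off. A minimal, self-contained sketch of the guard pattern the fix applies (the `Log` interface and counter here are illustrative stand-ins, not the patch's actual code):

```java
// Hypothetical sketch of the logging pattern HDFS-7213 fixes: guarding the
// debug call with isDebugEnabled() so the message string is never built
// (and the write lock never held longer) when debug logging is disabled.
public class DebugGuardSketch {
    // Stand-in for a commons-logging Log; only the two methods used here.
    public interface Log {
        boolean isDebugEnabled();
        void debug(Object msg);
    }

    // Counts how often the (expensive) message formatting actually runs.
    public static int formatCalls = 0;

    public static String expensiveFormat(String block) {
        formatCalls++;
        return "processIncrementalBlockReport for " + block;
    }

    // Before: the message is built on every call, even with debug off.
    public static void unguarded(Log log, String block) {
        log.debug(expensiveFormat(block));
    }

    // After: the message is built only when debug logging is enabled.
    public static void guarded(Log log, String block) {
        if (log.isDebugEnabled()) {
            log.debug(expensiveFormat(block));
        }
    }

    public static void main(String[] args) {
        Log debugOff = new Log() {
            public boolean isDebugEnabled() { return false; }
            public void debug(Object msg) { }
        };
        formatCalls = 0;
        unguarded(debugOff, "blk_1");
        System.out.println("unguarded formats: " + formatCalls); // pays the cost
        formatCalls = 0;
        guarded(debugOff, "blk_1");
        System.out.println("guarded formats: " + formatCalls);   // skips it
    }
}
```

Under a lock, skipping the string concatenation and argument boxing entirely is what recovers the lost throughput.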
[jira] [Updated] (HDFS-7213) processIncrementalBlockReport performance degradation
[ https://issues.apache.org/jira/browse/HDFS-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7213: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk and cherry-picked to branch-2. Thanks for fixing it, Eric. processIncrementalBlockReport performance degradation - Key: HDFS-7213 URL: https://issues.apache.org/jira/browse/HDFS-7213 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Daryn Sharp Assignee: Eric Payne Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7213.1412804753, HDFS-7213.1412806496.txt {{BlockManager#processIncrementalBlockReport}} has a debug line that is missing an {{isDebugEnabled}} check. The write lock is being held. Coupled with the increase in incremental block reports from receiving blocks, under heavy load this log line noticeably degrades performance.
[jira] [Commented] (HDFS-7213) processIncrementalBlockReport performance degradation
[ https://issues.apache.org/jira/browse/HDFS-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187362#comment-14187362 ] Hudson commented on HDFS-7213: -- FAILURE: Integrated in Hadoop-trunk-Commit #6371 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6371/]) HDFS-7213. processIncrementalBlockReport performance degradation. (kihwal: rev e226b5b40d716b6d363c43a8783766b72734e347) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt processIncrementalBlockReport performance degradation - Key: HDFS-7213 URL: https://issues.apache.org/jira/browse/HDFS-7213 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Daryn Sharp Assignee: Eric Payne Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7213.1412804753, HDFS-7213.1412806496.txt {{BlockManager#processIncrementalBlockReport}} has a debug line that is missing an {{isDebugEnabled}} check. The write lock is being held. Coupled with the increase in incremental block reports from receiving blocks, under heavy load this log line noticeably degrades performance.
[jira] [Created] (HDFS-7302) namenode -rollingUpgrade downgrade may finalize a rolling upgrade
Tsz Wo Nicholas Sze created HDFS-7302: - Summary: namenode -rollingUpgrade downgrade may finalize a rolling upgrade Key: HDFS-7302 URL: https://issues.apache.org/jira/browse/HDFS-7302 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze The namenode startup option -rollingUpgrade downgrade was originally designed for downgrading a cluster. However, running namenode -rollingUpgrade downgrade with the new software could result in finalizing the ongoing rolling upgrade.
[jira] [Commented] (HDFS-7302) namenode -rollingUpgrade downgrade may finalize a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187373#comment-14187373 ] Tsz Wo Nicholas Sze commented on HDFS-7302: --- Downgrade can actually be done in a rolling fashion as shown in HDFS-7230. So the -rollingUpgrade downgrade startup option is indeed not very useful. I suggest removing it. namenode -rollingUpgrade downgrade may finalize a rolling upgrade - Key: HDFS-7302 URL: https://issues.apache.org/jira/browse/HDFS-7302 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze The namenode startup option -rollingUpgrade downgrade was originally designed for downgrading a cluster. However, running namenode -rollingUpgrade downgrade with the new software could result in finalizing the ongoing rolling upgrade.
[jira] [Created] (HDFS-7303) If there are multiple datanodes on the same host, then only one datanode is listed on the NN UI’s datanode tab
Benoy Antony created HDFS-7303: -- Summary: If there are multiple datanodes on the same host, then only one datanode is listed on the NN UI’s datanode tab Key: HDFS-7303 URL: https://issues.apache.org/jira/browse/HDFS-7303 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.1 Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor If you start multiple datanodes on different ports on the same host, only one of them appears in the NN UI’s datanode tab. While this is not a common scenario, there are still scenarios where you need to start multiple datanodes on the same host.
[jira] [Updated] (HDFS-7303) If there are multiple datanodes on the same host, then only one datanode is listed on the NN UI’s datanode tab
[ https://issues.apache.org/jira/browse/HDFS-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-7303: --- Status: Patch Available (was: Open) If there are multiple datanodes on the same host, then only one datanode is listed on the NN UI’s datanode tab --- Key: HDFS-7303 URL: https://issues.apache.org/jira/browse/HDFS-7303 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.1 Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Attachments: HDFS-7303.patch If you start multiple datanodes on different ports on the same host, only one of them appears in the NN UI’s datanode tab. While this is not a common scenario, there are still scenarios where you need to start multiple datanodes on the same host.
[jira] [Updated] (HDFS-7303) If there are multiple datanodes on the same host, then only one datanode is listed on the NN UI’s datanode tab
[ https://issues.apache.org/jira/browse/HDFS-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-7303: --- Attachment: HDFS-7303.patch Attaching a patch which does the following: 1. If there are multiple datanodes on the same host, host:port is displayed. 2. If there is a single datanode on the host, only the host is displayed. This is done for live nodes, dead nodes and decommissioned nodes. If there are multiple datanodes on the same host, then only one datanode is listed on the NN UI’s datanode tab --- Key: HDFS-7303 URL: https://issues.apache.org/jira/browse/HDFS-7303 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.1 Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Attachments: HDFS-7303.patch If you start multiple datanodes on different ports on the same host, only one of them appears in the NN UI’s datanode tab. While this is not a common scenario, there are still scenarios where you need to start multiple datanodes on the same host.
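The display rule described in the comment above can be sketched in a few lines. This is a hypothetical illustration of the rule, not the patch's actual code; the method and parameter names are assumptions:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the HDFS-7303 display rule: show "host:port" only
// when more than one datanode reports from the same host, otherwise just
// "host". The names below are illustrative, not taken from the patch.
public class NodeLabelSketch {
    static String label(String host, int port, Map<String, Integer> nodesPerHost) {
        // Multiple datanodes on this host: disambiguate with the port.
        return nodesPerHost.getOrDefault(host, 0) > 1 ? host + ":" + port : host;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("hostA", 2);   // two datanodes on hostA, different ports
        counts.put("hostB", 1);   // a single datanode on hostB
        System.out.println(label("hostA", 50010, counts)); // hostA:50010
        System.out.println(label("hostB", 50010, counts)); // hostB
    }
}
```

Applying the same rule to the live, dead, and decommissioned lists keeps the common single-datanode-per-host display unchanged while making multi-datanode hosts distinguishable.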
[jira] [Updated] (HDFS-7301) TestMissingBlocksAlert should use MXBeans instead of old web UI
[ https://issues.apache.org/jira/browse/HDFS-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7301: Status: Patch Available (was: Open) TestMissingBlocksAlert should use MXBeans instead of old web UI --- Key: HDFS-7301 URL: https://issues.apache.org/jira/browse/HDFS-7301 Project: Hadoop HDFS Issue Type: Bug Reporter: Zhe Zhang Assignee: Zhe Zhang HDFS-6252 has phased out the old web UI in trunk. {{TestMissingBlocksAlert}} was excluded in its branch-2 patch. After revisiting the problem [~wheat9] and I agreed that it should go into branch-2.
[jira] [Updated] (HDFS-7301) TestMissingBlocksAlert should use MXBeans instead of old web UI
[ https://issues.apache.org/jira/browse/HDFS-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7301: Attachment: HDFS-7301.patch This should be identical to HDFS-6252 with regard to {{TestMissingBlocksAlert}}. TestMissingBlocksAlert should use MXBeans instead of old web UI --- Key: HDFS-7301 URL: https://issues.apache.org/jira/browse/HDFS-7301 Project: Hadoop HDFS Issue Type: Bug Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7301.patch HDFS-6252 has phased out the old web UI in trunk. {{TestMissingBlocksAlert}} was excluded in its branch-2 patch. After revisiting the problem, [~wheat9] and I agreed that it should go into branch-2.
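The test style HDFS-7301 moves to, reading a metric through the JMX MBean server rather than scraping the old web UI, can be sketched with a self-contained example. The bean name and attribute below are stand-ins registered locally for illustration, not the NameNode's actual MXBean layout:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch of an MXBean-based check: register a bean exposing a
// metric, then read it back via getAttribute, as a test would query a
// NameNode metric instead of parsing web UI HTML. Names are illustrative.
public class MXBeanProbeSketch {
    public interface StateMXBean {
        long getNumberOfMissingBlocks();
    }

    public static class State implements StateMXBean {
        public long getNumberOfMissingBlocks() { return 3; }
    }

    public static long readMissingBlocks() throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        // Stand-in ObjectName; a real NameNode publishes under "Hadoop:...".
        ObjectName name = new ObjectName("Example:service=FakeNameNode,name=State");
        if (!mbs.isRegistered(name)) {
            mbs.registerMBean(new State(), name);
        }
        // The MXBean getter getNumberOfMissingBlocks() is exposed as the
        // attribute "NumberOfMissingBlocks".
        return (Long) mbs.getAttribute(name, "NumberOfMissingBlocks");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readMissingBlocks());
    }
}
```

Querying the MBean server gives the test a stable, typed interface, so it no longer breaks when the web UI's HTML changes.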