[jira] [Updated] (HDFS-6982) nntop: top-like tool for name node users

2014-11-12 Thread Maysam Yabandeh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated HDFS-6982:
--
Status: Patch Available  (was: Open)

> nntop: top-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> HDFS-6982.v7.patch, nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similar to what 
> top does in Linux, lists the top users of the HDFS name node and gives 
> insight into which users are sending the majority of each traffic type to 
> the name node. This information is most critical when the name node is under 
> pressure and the HDFS admin needs to know which user is hammering the name 
> node and with what kind of requests. Here we present the design of nntop, 
> which has been in production at Twitter for the past 10 months. nntop proved 
> to have low CPU overhead (< 2% in a cluster of 4K nodes), a low memory 
> footprint (less than a few MB), and an efficient write path (only two hash 
> lookups to update a metric).
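The "two hash lookups" write path claimed above can be pictured with a small, purely illustrative sketch (class and method names are hypothetical, not nntop's actual code): one lookup resolves the user, a second resolves the operation type, and an atomic counter is then bumped.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustrative sketch of a per-user, per-operation metric table where each
 * update costs two hash lookups plus an atomic increment. Names are
 * hypothetical stand-ins, not nntop's actual classes.
 */
public class TopMetricsSketch {

    // user -> (operation -> count)
    private final Map<String, Map<String, AtomicLong>> counts =
        new ConcurrentHashMap<>();

    /** Records one operation by one user: two map lookups, one increment. */
    public void report(String user, String op) {
        counts.computeIfAbsent(user, u -> new ConcurrentHashMap<>())  // lookup 1
              .computeIfAbsent(op, o -> new AtomicLong())             // lookup 2
              .incrementAndGet();
    }

    /** Reads back a count (0 if the user or operation was never seen). */
    public long count(String user, String op) {
        Map<String, AtomicLong> ops = counts.get(user);
        if (ops == null) return 0;
        AtomicLong c = ops.get(op);
        return c == null ? 0 : c.get();
    }

    public static void main(String[] args) {
        TopMetricsSketch m = new TopMetricsSketch();
        m.report("alice", "mkdir");
        m.report("alice", "mkdir");
        m.report("bob", "open");
        System.out.println(m.count("alice", "mkdir"));  // 2
        System.out.println(m.count("bob", "open"));     // 1
    }
}
```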



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6982) nntop: top-like tool for name node users

2014-11-12 Thread Maysam Yabandeh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated HDFS-6982:
--
Status: Open  (was: Patch Available)

> nntop: top-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> HDFS-6982.v7.patch, nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similar to what 
> top does in Linux, lists the top users of the HDFS name node and gives 
> insight into which users are sending the majority of each traffic type to 
> the name node. This information is most critical when the name node is under 
> pressure and the HDFS admin needs to know which user is hammering the name 
> node and with what kind of requests. Here we present the design of nntop, 
> which has been in production at Twitter for the past 10 months. nntop proved 
> to have low CPU overhead (< 2% in a cluster of 4K nodes), a low memory 
> footprint (less than a few MB), and an efficient write path (only two hash 
> lookups to update a metric).





[jira] [Commented] (HDFS-7385) ThreadLocal used in FSEditLog class leads to FSImage permission mess-up

2014-11-12 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209383#comment-14209383
 ] 

Vinayakumar B commented on HDFS-7385:
-

Hi [~jiangyu1211], Good find.

Instead of {{op.setAclEntries(null)}} and {{op.setXAttrs(null)}}, how about 
introducing a {{reset()}} method in both {{AddOp}} and {{MkdirOp}} that 
resets all values to null? This method can be called as soon as the op is 
obtained from the ThreadLocal cache, and the setters can later set whatever 
values they want.
This will avoid any such mistakes in the future.

Ex:
{code}
MkdirOp op = MkdirOp.getInstance(cache.get())
  .reset()
  .setInodeId(newNode.getId())
  .setPath(path)
  .setTimestamp(newNode.getModificationTime())
  .setPermissionStatus(permissions);
{code}
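To make the proposal concrete, here is a minimal, self-contained sketch of how such a {{reset()}} would prevent stale ACLs from leaking between calls. The classes below are simplified stand-ins mirroring the snippet above, not the actual Hadoop {{FSEditLogOp}} code.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch of the proposed reset() fix. MkdirOp here is a
 * simplified stand-in for the real builder-style edit-log op.
 */
public class ResetSketch {

    /** Minimal stand-in for MkdirOp with a builder-style API. */
    static class MkdirOp {
        String path;
        List<String> aclEntries;  // stays stale across reuse unless cleared

        static MkdirOp getInstance(MkdirOp cached) { return cached; }

        // Proposed fix: clear every optional field right after the op is
        // fetched from the ThreadLocal cache.
        MkdirOp reset() { path = null; aclEntries = null; return this; }

        MkdirOp setPath(String p) { path = p; return this; }
        MkdirOp setAclEntries(List<String> acls) { aclEntries = acls; return this; }
    }

    // Per-thread op cache, as in FSEditLog.
    static final ThreadLocal<MkdirOp> cache = ThreadLocal.withInitial(MkdirOp::new);

    /** Builds an op for a mkdir; acls == null means "no ACLs requested". */
    static MkdirOp logMkDir(String path, List<String> acls) {
        MkdirOp op = MkdirOp.getInstance(cache.get())
            .reset()              // without this line, stale ACLs leak
            .setPath(path);
        if (acls != null) {
            op.setAclEntries(acls);
        }
        return op;
    }

    public static void main(String[] args) {
        MkdirOp first = logMkDir("/a", Arrays.asList("user:alice:rwx"));
        System.out.println("first acls = " + first.aclEntries);
        // Second mkdir on the same thread, no ACLs: reset() guarantees the
        // cached op does not carry the previous call's entries.
        MkdirOp second = logMkDir("/b", null);
        System.out.println("second acls = " + second.aclEntries);
    }
}
```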

> ThreadLocal used in FSEditLog class leads to FSImage permission mess-up
> 
>
> Key: HDFS-7385
> URL: https://issues.apache.org/jira/browse/HDFS-7385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0, 2.5.0
>Reporter: jiangyu
>Assignee: jiangyu
>Priority: Critical
> Attachments: HDFS-7385.patch
>
>
>   We migrated our NameNodes from low-configuration to high-configuration 
> machines last week. First, we imported the current directory, including the 
> fsimage and editlog files, from the original active NameNode to the new 
> active NameNode and started the new NameNode; then we changed the 
> configuration of all datanodes and restarted them, so they sent block 
> reports to the new NameNodes at once and heartbeats after that.
>    Everything seemed perfect, but after we restarted the ResourceManager, 
> most of the users complained that their jobs couldn't be executed because of 
> permission problems.
>   We applied ACLs in our clusters, and after the migration we found that 
> most of the directories and files which had no ACLs set before now carried 
> ACL entries. That is why users could not execute their jobs, so we had to 
> change most file permissions to a+r and directory permissions to a+rx to 
> make sure the jobs could be executed.
> After investigating this problem for some days, I found there is a bug in 
> FSEditLog.java. The ThreadLocal variable cache in FSEditLog doesn't set the 
> proper values in the logMkdir and logOpenFile functions. Here is the code of 
> logMkdir:
>   public void logMkDir(String path, INode newNode) {
> PermissionStatus permissions = newNode.getPermissionStatus();
> MkdirOp op = MkdirOp.getInstance(cache.get())
>   .setInodeId(newNode.getId())
>   .setPath(path)
>   .setTimestamp(newNode.getModificationTime())
>   .setPermissionStatus(permissions);
> AclFeature f = newNode.getAclFeature();
> if (f != null) {
>   op.setAclEntries(AclStorage.readINodeLogicalAcl(newNode));
> }
> logEdit(op);
>   }
>   For example, if we mkdir with ACLs through one handler (a thread, in 
> fact), we set the AclEntries on the op from the cache. After that, if we 
> mkdir without any ACLs through the same handler, the AclEntries from the 
> cache are the same as in the last call that set ACLs, and because the 
> newNode has no AclFeature, we have no chance to change them. Then the 
> editlog is wrong and records the wrong ACLs. After the Standby loads the 
> editlogs from the journalnodes and applies them to memory, the SNN saves 
> the namespace and transfers the wrong fsimage to the ANN, so all the 
> fsimages become wrong. The only solution is to save the namespace from the 
> ANN; then you can get the right fsimage.





[jira] [Commented] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209378#comment-14209378
 ] 

Yongjun Zhang commented on HDFS-7386:
-

Many thanks, Chris! I just uploaded rev 002 to address both of your comments. 
I really appreciate your help reviewing the patch.

I searched the code base for 1024 but not for 1023 :-) Nice that you pointed 
out the place I missed. 


> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Attachments: HDFS-7386.001.patch, HDFS-7386.002.patch
>
>
> Per the discussion in HDFS-7382, I'm filing this jira as a follow-up, to 
> replace the check "port number < 1024" with a shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and the suggestion there.





[jira] [Updated] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-7386:

Attachment: HDFS-7386.002.patch

> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Attachments: HDFS-7386.001.patch, HDFS-7386.002.patch
>
>
> Per the discussion in HDFS-7382, I'm filing this jira as a follow-up, to 
> replace the check "port number < 1024" with a shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and the suggestion there.





[jira] [Commented] (HDFS-7392) org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever

2014-11-12 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209299#comment-14209299
 ] 

Yi Liu commented on HDFS-7392:
--

[~vacekf], I can't work out the exact issue from your description. Did you hit 
this issue in a real environment? If so, please write the repro steps in the 
comments. 

> org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever
> -
>
> Key: HDFS-7392
> URL: https://issues.apache.org/jira/browse/HDFS-7392
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Frantisek Vacek
>Priority: Critical
> Attachments: 1.png, 2.png
>
>
> In some specific circumstances, 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times 
> out and lasts forever.
> The specific circumstances are:
> 1) The HDFS URI (hdfs://share.merck.com:8020/someDir/someFile.txt) should 
> point to a valid IP address, but with no name node service running on it.
> 2) There should be at least 2 IP addresses for such a URI. See the output 
> below:
> {quote}
> [~/proj/quickbox]$ nslookup share.merck.com
> Server: 127.0.1.1
> Address:127.0.1.1#53
> share.merck.com canonical name = 
> internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com.
> Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
> Address: 54.40.29.223
> Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
> Address: 54.40.29.65
> {quote}
> In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
> sometimes returns true (even though the address didn't actually change; see 
> img. 1) and the timeoutFailures counter is reset to 0 (see img. 2). 
> maxRetriesOnSocketTimeouts (45) is therefore never reached and the 
> connection attempt is repeated forever.
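The failure mode described above can be modeled with a small, purely illustrative simulation (names are hypothetical; the real logic lives in org.apache.hadoop.ipc.Client.Connection): whenever updateAddress() returns true and resets timeoutFailures, the retry cap can never be reached.

```java
/**
 * Simplified model of the retry loop described in the report. The constant
 * and field names echo the real ones but this is not the actual IPC code.
 */
public class RetryLoopSketch {

    static final int MAX_RETRIES_ON_SOCKET_TIMEOUTS = 45;

    /**
     * Simulates the retry loop: addressChanges[i] plays the role of
     * updateAddress() returning true on attempt i. Returns the number of
     * attempts before giving up, or -1 if the loop never terminates within
     * the given schedule.
     */
    static int attemptsUntilGiveUp(boolean[] addressChanges) {
        int timeoutFailures = 0;
        for (int attempt = 0; attempt < addressChanges.length; attempt++) {
            if (addressChanges[attempt]) {
                timeoutFailures = 0;   // the reset that keeps the loop alive
            } else {
                timeoutFailures++;
            }
            if (timeoutFailures >= MAX_RETRIES_ON_SOCKET_TIMEOUTS) {
                return attempt + 1;    // finally gives up
            }
        }
        return -1;                     // never reached the retry cap
    }

    public static void main(String[] args) {
        // DNS round-robin over 2 addresses: updateAddress() returns true
        // often enough that timeoutFailures never reaches 45.
        boolean[] flapping = new boolean[1000];
        for (int i = 0; i < flapping.length; i++) flapping[i] = (i % 2 == 0);
        System.out.println(attemptsUntilGiveUp(flapping));   // -1: retries forever

        // Stable address: gives up after 45 consecutive timeouts.
        boolean[] stable = new boolean[1000];
        System.out.println(attemptsUntilGiveUp(stable));     // 45
    }
}
```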





[jira] [Updated] (HDFS-7056) Snapshot support for truncate

2014-11-12 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-7056:
---
Attachment: HDFS-3107-HDFS-7056-combined.patch

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFSSnapshotWithTruncateDesign.docx
>
>
> The implementation of truncate in HDFS-3107 does not allow truncating files 
> which are in a snapshot. It is desirable to be able to truncate and still 
> keep the old state of the file in the snapshot.





[jira] [Updated] (HDFS-7056) Snapshot support for truncate

2014-11-12 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-7056:
---
Attachment: (was: HDFS-3107-HDFS-7056-combined.patch)

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFSSnapshotWithTruncateDesign.docx
>
>
> The implementation of truncate in HDFS-3107 does not allow truncating files 
> which are in a snapshot. It is desirable to be able to truncate and still 
> keep the old state of the file in the snapshot.





[jira] [Commented] (HDFS-7056) Snapshot support for truncate

2014-11-12 Thread Plamen Jeliazkov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209290#comment-14209290
 ] 

Plamen Jeliazkov commented on HDFS-7056:


It seems Jenkins grabbed the HDFS-7056 patch. I will re-attach the combined 
patch so the bot can run against it.

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFSSnapshotWithTruncateDesign.docx
>
>
> The implementation of truncate in HDFS-3107 does not allow truncating files 
> which are in a snapshot. It is desirable to be able to truncate and still 
> keep the old state of the file in the snapshot.





[jira] [Commented] (HDFS-6938) Cleanup javac warnings in FSNamesystem

2014-11-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209237#comment-14209237
 ] 

Haohui Mai commented on HDFS-6938:
--

It turned out that the patch was missing in 2.6. I've cherry-picked it into 
branch-2.

> Cleanup javac warnings in FSNamesystem
> --
>
> Key: HDFS-6938
> URL: https://issues.apache.org/jira/browse/HDFS-6938
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-6938.001.patch
>
>
> Clean up some unused code/compiler warnings post fs-encryption merge.





[jira] [Updated] (HDFS-6938) Cleanup javac warnings in FSNamesystem

2014-11-12 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6938:
-
Affects Version/s: (was: 3.0.0)

> Cleanup javac warnings in FSNamesystem
> --
>
> Key: HDFS-6938
> URL: https://issues.apache.org/jira/browse/HDFS-6938
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-6938.001.patch
>
>
> Clean up some unused code/compiler warnings post fs-encryption merge.





[jira] [Updated] (HDFS-6938) Cleanup javac warnings in FSNamesystem

2014-11-12 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6938:
-
Target Version/s:   (was: 3.0.0)

> Cleanup javac warnings in FSNamesystem
> --
>
> Key: HDFS-6938
> URL: https://issues.apache.org/jira/browse/HDFS-6938
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-6938.001.patch
>
>
> Clean up some unused code/compiler warnings post fs-encryption merge.





[jira] [Updated] (HDFS-6938) Cleanup javac warnings in FSNamesystem

2014-11-12 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6938:
-
Fix Version/s: (was: 2.6.0)
   2.7.0

> Cleanup javac warnings in FSNamesystem
> --
>
> Key: HDFS-6938
> URL: https://issues.apache.org/jira/browse/HDFS-6938
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-6938.001.patch
>
>
> Clean up some unused code/compiler warnings post fs-encryption merge.





[jira] [Commented] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209227#comment-14209227
 ] 

Chris Nauroth commented on HDFS-7386:
-

Thanks for the patch, Yongjun.  This looks good.  Here are just a few comments:
# Let's JavaDoc the new {{SecurityUtil#isPrivilegedPort}} method.
# There is one more place that we can use this new method, in 
{{SecureDataNodeStarter#getSecureResources}}.  In this case, you'll want to 
negate the return value of {{SecurityUtil#isPrivilegedPort}}.
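For reference, a minimal sketch of what such a shared helper might look like (the actual SecurityUtil placement and signature may differ; this just captures the centralized check):

```java
/**
 * Sketch of the shared helper discussed above. The class name is a
 * stand-in; in the patch the method would live in SecurityUtil.
 */
public class PrivilegedPortSketch {

    /** Ports below 1024 require elevated privileges to bind on Unix-like systems. */
    public static boolean isPrivilegedPort(final int port) {
        return port < 1024;
    }

    public static void main(String[] args) {
        System.out.println(isPrivilegedPort(80));    // true: classic privileged port
        System.out.println(isPrivilegedPort(1023));  // true: boundary case
        System.out.println(isPrivilegedPort(1024));  // false: first unprivileged port
    }
}
```

Per the comment above, the SecureDataNodeStarter#getSecureResources call site would negate the result, e.g. `if (!isPrivilegedPort(port)) { ... }`.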


> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Attachments: HDFS-7386.001.patch
>
>
> Per the discussion in HDFS-7382, I'm filing this jira as a follow-up, to 
> replace the check "port number < 1024" with a shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and the suggestion there.





[jira] [Commented] (HDFS-7385) ThreadLocal used in FSEditLog class leads to FSImage permission mess-up

2014-11-12 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209172#comment-14209172
 ] 

Yi Liu commented on HDFS-7385:
--

[~jiangyu1211], {{OP_ADD}} is for creating/appending a file, even though the 
method is named "logOpenFile".
Please add the test case as soon as possible; I will help review and try to 
get it into 2.6, since I think the issue is critical even though the fix is easy.

> ThreadLocal used in FSEditLog class leads to FSImage permission mess-up
> 
>
> Key: HDFS-7385
> URL: https://issues.apache.org/jira/browse/HDFS-7385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0, 2.5.0
>Reporter: jiangyu
>Assignee: jiangyu
> Attachments: HDFS-7385.patch
>
>
>   We migrated our NameNodes from low-configuration to high-configuration 
> machines last week. First, we imported the current directory, including the 
> fsimage and editlog files, from the original active NameNode to the new 
> active NameNode and started the new NameNode; then we changed the 
> configuration of all datanodes and restarted them, so they sent block 
> reports to the new NameNodes at once and heartbeats after that.
>    Everything seemed perfect, but after we restarted the ResourceManager, 
> most of the users complained that their jobs couldn't be executed because of 
> permission problems.
>   We applied ACLs in our clusters, and after the migration we found that 
> most of the directories and files which had no ACLs set before now carried 
> ACL entries. That is why users could not execute their jobs, so we had to 
> change most file permissions to a+r and directory permissions to a+rx to 
> make sure the jobs could be executed.
> After investigating this problem for some days, I found there is a bug in 
> FSEditLog.java. The ThreadLocal variable cache in FSEditLog doesn't set the 
> proper values in the logMkdir and logOpenFile functions. Here is the code of 
> logMkdir:
>   public void logMkDir(String path, INode newNode) {
> PermissionStatus permissions = newNode.getPermissionStatus();
> MkdirOp op = MkdirOp.getInstance(cache.get())
>   .setInodeId(newNode.getId())
>   .setPath(path)
>   .setTimestamp(newNode.getModificationTime())
>   .setPermissionStatus(permissions);
> AclFeature f = newNode.getAclFeature();
> if (f != null) {
>   op.setAclEntries(AclStorage.readINodeLogicalAcl(newNode));
> }
> logEdit(op);
>   }
>   For example, if we mkdir with ACLs through one handler (a thread, in 
> fact), we set the AclEntries on the op from the cache. After that, if we 
> mkdir without any ACLs through the same handler, the AclEntries from the 
> cache are the same as in the last call that set ACLs, and because the 
> newNode has no AclFeature, we have no chance to change them. Then the 
> editlog is wrong and records the wrong ACLs. After the Standby loads the 
> editlogs from the journalnodes and applies them to memory, the SNN saves 
> the namespace and transfers the wrong fsimage to the ANN, so all the 
> fsimages become wrong. The only solution is to save the namespace from the 
> ANN; then you can get the right fsimage.





[jira] [Updated] (HDFS-6982) nntop: top-like tool for name node users

2014-11-12 Thread Maysam Yabandeh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated HDFS-6982:
--
Attachment: HDFS-6982.v7.patch

[~andrew.wang], submitting the new patch, revised based on your last comments.

A couple of explanations:
bq. Seems like 30min would be a more human-friendly number than 25min also
The idea was to increase the periods exponentially: 5^0, 5^1, 5^2 minutes, 
i.e., 1, 5, and 25 minutes.

bq. It also seems like an unnecessary step to also have to specify the 
TopAuditLogger in the conf, if a user already specified 
dfs.namenode.top.periods.min. If there are periods set, let's just also create 
the TopAuditLogger.
I am inclined towards redundantly specifying the audit logger in the conf. I 
think spelling out the registered audit loggers would also avoid confusion 
for future readers.
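The exponential period schedule can be stated as a one-liner (illustrative only; not nntop's actual configuration code):

```java
/** Illustrative only: the exponential reporting periods mentioned above. */
public class PeriodsSketch {

    /** Returns the first n nntop period lengths in minutes: 5^0, 5^1, ... */
    static int[] periodsMinutes(int n) {
        int[] p = new int[n];
        int v = 1;
        for (int i = 0; i < n; i++) {
            p[i] = v;
            v *= 5;   // multiply by the base each step
        }
        return p;
    }

    public static void main(String[] args) {
        // 1, 5, 25 -- hence 25min rather than the "human-friendly" 30min.
        System.out.println(java.util.Arrays.toString(periodsMinutes(3)));
    }
}
```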

> nntop: top-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> HDFS-6982.v7.patch, nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similar to what 
> top does in Linux, lists the top users of the HDFS name node and gives 
> insight into which users are sending the majority of each traffic type to 
> the name node. This information is most critical when the name node is under 
> pressure and the HDFS admin needs to know which user is hammering the name 
> node and with what kind of requests. Here we present the design of nntop, 
> which has been in production at Twitter for the past 10 months. nntop proved 
> to have low CPU overhead (< 2% in a cluster of 4K nodes), a low memory 
> footprint (less than a few MB), and an efficient write path (only two hash 
> lookups to update a metric).





[jira] [Commented] (HDFS-6982) nntop: top-like tool for name node users

2014-11-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209205#comment-14209205
 ] 

Hadoop QA commented on HDFS-6982:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12681232/HDFS-6982.v7.patch
  against trunk revision 9f0319b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8727//console

This message is automatically generated.

> nntop: top-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> HDFS-6982.v7.patch, nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similar to what 
> top does in Linux, lists the top users of the HDFS name node and gives 
> insight into which users are sending the majority of each traffic type to 
> the name node. This information is most critical when the name node is under 
> pressure and the HDFS admin needs to know which user is hammering the name 
> node and with what kind of requests. Here we present the design of nntop, 
> which has been in production at Twitter for the past 10 months. nntop proved 
> to have low CPU overhead (< 2% in a cluster of 4K nodes), a low memory 
> footprint (less than a few MB), and an efficient write path (only two hash 
> lookups to update a metric).





[jira] [Commented] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209204#comment-14209204
 ] 

Hadoop QA commented on HDFS-7386:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12681184/HDFS-7386.001.patch
  against trunk revision d7150a1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDFSUpgradeFromImage

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8725//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8725//console

This message is automatically generated.

> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Attachments: HDFS-7386.001.patch
>
>
> Per the discussion in HDFS-7382, I'm filing this jira as a follow-up, to 
> replace the check "port number < 1024" with a shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and the suggestion there.





[jira] [Updated] (HDFS-7345) Local Reconstruction Codes (LRC)

2014-11-12 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-7345:

Description: HDFS-7285 proposes to support Erasure Coding inside HDFS, 
supports multiple Erasure Coding codecs via a pluggable framework, and 
implements Reed-Solomon code by default. This issue is to support a more 
advanced coding mechanism, Local Reconstruction Codes (LRC). As discussed in 
the paper 
(https://www.usenix.org/system/files/conference/atc12/atc12-final181_0.pdf), 
LRC reduces the number of erasure coding fragments that need to be read when 
reconstructing data fragments that are offline, while still keeping the 
storage overhead low. The important benefits of LRC are that it reduces the 
bandwidth and I/Os required for repair reads over prior codes, while still 
allowing a significant reduction in storage overhead. The implementation 
would also consider how to distribute the calculation of local and global 
parity blocks to other relevant DataNodes.  (was: HDFS-7285 proposes to 
support Erasure Coding inside HDFS, supports multiple Erasure Coding codecs 
via a pluggable framework, and implements Reed-Solomon code by default. This 
issue is to support a more advanced coding mechanism, Local Reconstruction 
Codes (LRC). As discussed in the paper 
(https://www.usenix.org/system/files/conference/atc12/atc12-final181_0.pdf), 
LRC reduces the number of erasure coding fragments that need to be read when 
reconstructing data fragments that are offline, while still keeping the 
storage overhead low. The important benefits of LRC are that it reduces the 
bandwidth and I/Os required for repair reads over prior codes, while still 
allowing a significant reduction in storage overhead. The Intel ISA library 
also supports LRC in its update and can also be leveraged. The implementation 
would also consider how to distribute the calculation of local and global 
parity blocks to other relevant DataNodes.)

> Local Reconstruction Codes (LRC)
> 
>
> Key: HDFS-7345
> URL: https://issues.apache.org/jira/browse/HDFS-7345
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>
> HDFS-7285 proposes to support Erasure Coding inside HDFS, supporting multiple 
> Erasure Coding codecs via a pluggable framework and implementing Reed-Solomon 
> code by default. This issue is to support a more advanced coding mechanism, 
> Local Reconstruction Codes (LRC). As discussed in the paper 
> (https://www.usenix.org/system/files/conference/atc12/atc12-final181_0.pdf), 
> LRC reduces the number of erasure coding fragments that need to be read when 
> reconstructing data fragments that are offline, while still keeping the 
> storage overhead low. The important benefit of LRC is that it reduces the 
> bandwidth and I/Os required for repair reads compared to prior codes, while 
> still allowing a significant reduction in storage overhead. The Intel ISA 
> library also supports LRC in its update and can be leveraged as well. The 
> implementation would also consider how to distribute the calculation of local 
> and global parity blocks to the relevant DataNodes.
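The repair-read saving is easy to see with a toy local-group sketch. The layout below is an assumption for illustration (XOR local parities over two groups of three data fragments); a real LRC code also computes Reed-Solomon-style global parities over a Galois field:

```java
import java.util.Arrays;

// Toy LRC-style layout: 6 data fragments in 2 local groups of 3, each group
// with an XOR local parity. Repairing one lost fragment reads only the 3
// surviving members of its group instead of 6 fragments as in a (6, k) RS code.
public class LrcSketch {
    // XOR two equal-length fragments (local parity here; real global parities
    // use Galois-field arithmetic, which this sketch deliberately omits).
    public static byte[] xor(byte[] a, byte[] b) {
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) out[i] = (byte) (a[i] ^ b[i]);
        return out;
    }

    public static void main(String[] args) {
        byte[][] data = new byte[6][4];
        for (int i = 0; i < 6; i++) Arrays.fill(data[i], (byte) (i + 1));

        // Local parity of group 0 = d0 ^ d1 ^ d2.
        byte[] localParity0 = xor(xor(data[0], data[1]), data[2]);

        // Lose data[0]; reconstruct it from its local group only.
        byte[] recovered = xor(xor(localParity0, data[1]), data[2]);
        System.out.println(Arrays.equals(recovered, data[0])); // true
    }
}
```

Distributing the local-parity computation to a DataNode in the same group would keep this repair traffic rack-local.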



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7385) ThreadLocal used in FSEditLog class lead FSImage permission mess up

2014-11-12 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7385:
-
Priority: Critical  (was: Major)

> ThreadLocal used in FSEditLog class  lead FSImage permission mess up
> 
>
> Key: HDFS-7385
> URL: https://issues.apache.org/jira/browse/HDFS-7385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0, 2.5.0
>Reporter: jiangyu
>Assignee: jiangyu
>Priority: Critical
> Attachments: HDFS-7385.patch
>
>
>   We migrated our NameNodes from low-configuration to high-configuration 
> machines last week. First, we imported the current directory, including the 
> fsimage and editlog files, from the original ActiveNameNode to the new 
> ActiveNameNode and started the new NameNode. We then changed the 
> configuration of all DataNodes and restarted them, so they block-reported 
> to the new NameNodes at once and sent heartbeats after that.
>   Everything seemed perfect, but after we restarted the ResourceManager, 
> most of the users complained that their jobs couldn't be executed because 
> of permission problems.
>   We use ACLs in our clusters, and after the migration we found that most 
> of the directories and files which had no ACLs set before now carried ACL 
> properties. That is why users could not execute their jobs, so we had to 
> change most file permissions to a+r and directory permissions to a+rx to 
> make sure the jobs could be executed.
>   After investigating this problem for some days, I found a bug in 
> FSEditLog.java. The ThreadLocal op cache in FSEditLog does not set the 
> proper values in the logMkDir and logOpenFile functions. Here is the code 
> of logMkDir:
>   public void logMkDir(String path, INode newNode) {
>     PermissionStatus permissions = newNode.getPermissionStatus();
>     MkdirOp op = MkdirOp.getInstance(cache.get())
>       .setInodeId(newNode.getId())
>       .setPath(path)
>       .setTimestamp(newNode.getModificationTime())
>       .setPermissionStatus(permissions);
>     AclFeature f = newNode.getAclFeature();
>     if (f != null) {
>       op.setAclEntries(AclStorage.readINodeLogicalAcl(newNode));
>     }
>     logEdit(op);
>   }
>   For example, if we mkdir with ACLs through one handler (a thread, in 
> fact), we set the AclEntries on the op taken from the cache. If we then 
> mkdir without any ACL settings through the same handler, the AclEntries on 
> the cached op are still those of the previous call that set the ACLs, and 
> because the new node has no AclFeature, we never get a chance to clear 
> them. The edit log is therefore wrong and records the wrong ACLs. After the 
> Standby loads the edit logs from the JournalNodes, applies them to memory, 
> saves the namespace, and transfers the wrong fsimage to the ANN, all the 
> fsimages are wrong. The only way to recover is to save the namespace from 
> the ANN, which produces a correct fsimage.
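The failure mode reduces to a reusable per-thread op object whose optional field is never cleared. A hedged, self-contained reduction (names mirror FSEditLog for readability, but the classes below are illustrative, not the actual Hadoop code; the fix is for the cached op to reset its ACL entries on every call):

```java
import java.util.Arrays;
import java.util.List;

// Minimal reduction of the stale-ThreadLocal-cache bug: the per-thread cached
// op keeps state from the previous operation unless every field is reset.
public class StaleCacheSketch {
    public static class MkdirOp {
        public List<String> aclEntries;
        // The missing step in the buggy path: clear optional fields on reuse.
        MkdirOp reset() { aclEntries = null; return this; }
    }

    static final ThreadLocal<MkdirOp> CACHE = ThreadLocal.withInitial(MkdirOp::new);

    // Buggy pattern: only sets aclEntries when the new inode has ACLs,
    // so a previous call's entries survive on the cached object.
    public static MkdirOp logMkdirBuggy(List<String> acls) {
        MkdirOp op = CACHE.get();            // reused object; op.reset() is missing
        if (acls != null) op.aclEntries = acls;
        return op;
    }

    public static void main(String[] args) {
        logMkdirBuggy(Arrays.asList("user:alice:rwx")); // mkdir with ACLs
        MkdirOp second = logMkdirBuggy(null);           // mkdir without ACLs...
        // ...but the cached op still carries the previous call's ACLs:
        System.out.println(second.aclEntries);          // prints [user:alice:rwx]
    }
}
```

Calling `op.reset()` before populating the op (or unconditionally clearing the ACL entries) makes the second call log no ACLs, as intended.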



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated HDFS-7391:

   Resolution: Fixed
Fix Version/s: 2.6.0
   Status: Resolved  (was: Patch Available)

I just committed this to branch-2.6. Thanks [~rkanter]!



[~kasha] - I wasn't sure if you wanted this in branch-2.5, so I haven't 
cherry-picked it. Please do so if you want to. Thanks.

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Fix For: 2.6.0
>
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which older clients (e.g. Java 6 with 
> OpenSSL 0.9.8x) require; they cannot connect without it. To be clear, this 
> does not mean enabling SSLv2 itself, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it has already 
> been closed.
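For context, the change amounts to putting SSLv2Hello back into the enabled-protocols list of the HttpFS Tomcat connector. A hedged sketch of what such a connector entry looks like (attribute names, the port, and the keystore placeholders are illustrative and vary by Tomcat version; the authoritative change is the ssl-server.xml in the attached patch):

```xml
<!-- Sketch only: SSLv2Hello affects the handshake framing for old clients,
     not the negotiated protocol, so pairing it with TLS stays secure. -->
<Connector port="14000" scheme="https" secure="true" SSLEnabled="true"
           sslEnabledProtocols="SSLv2Hello,TLSv1"
           clientAuth="false"
           keystoreFile="${httpfs.ssl.keystore.file}"
           keystorePass="${httpfs.ssl.keystore.pass}"/>
```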



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7385) ThreadLocal used in FSEditLog class lead FSImage permission mess up

2014-11-12 Thread jiangyu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209162#comment-14209162
 ] 

jiangyu commented on HDFS-7385:
---

[~hitliuyi], it also occurs when opening files, for the same reason: the 
ThreadLocal variable cache is used there as in mkdir. I will add a test case 
later on.

> ThreadLocal used in FSEditLog class  lead FSImage permission mess up
> 
>
> Key: HDFS-7385
> URL: https://issues.apache.org/jira/browse/HDFS-7385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0, 2.5.0
>Reporter: jiangyu
>Assignee: jiangyu
> Attachments: HDFS-7385.patch
>
>
>   We migrated our NameNodes from low-configuration to high-configuration 
> machines last week. First, we imported the current directory, including the 
> fsimage and editlog files, from the original ActiveNameNode to the new 
> ActiveNameNode and started the new NameNode. We then changed the 
> configuration of all DataNodes and restarted them, so they block-reported 
> to the new NameNodes at once and sent heartbeats after that.
>   Everything seemed perfect, but after we restarted the ResourceManager, 
> most of the users complained that their jobs couldn't be executed because 
> of permission problems.
>   We use ACLs in our clusters, and after the migration we found that most 
> of the directories and files which had no ACLs set before now carried ACL 
> properties. That is why users could not execute their jobs, so we had to 
> change most file permissions to a+r and directory permissions to a+rx to 
> make sure the jobs could be executed.
>   After investigating this problem for some days, I found a bug in 
> FSEditLog.java. The ThreadLocal op cache in FSEditLog does not set the 
> proper values in the logMkDir and logOpenFile functions. Here is the code 
> of logMkDir:
>   public void logMkDir(String path, INode newNode) {
>     PermissionStatus permissions = newNode.getPermissionStatus();
>     MkdirOp op = MkdirOp.getInstance(cache.get())
>       .setInodeId(newNode.getId())
>       .setPath(path)
>       .setTimestamp(newNode.getModificationTime())
>       .setPermissionStatus(permissions);
>     AclFeature f = newNode.getAclFeature();
>     if (f != null) {
>       op.setAclEntries(AclStorage.readINodeLogicalAcl(newNode));
>     }
>     logEdit(op);
>   }
>   For example, if we mkdir with ACLs through one handler (a thread, in 
> fact), we set the AclEntries on the op taken from the cache. If we then 
> mkdir without any ACL settings through the same handler, the AclEntries on 
> the cached op are still those of the previous call that set the ACLs, and 
> because the new node has no AclFeature, we never get a chance to clear 
> them. The edit log is therefore wrong and records the wrong ACLs. After the 
> Standby loads the edit logs from the JournalNodes, applies them to memory, 
> saves the namespace, and transfers the wrong fsimage to the ANN, all the 
> fsimages are wrong. The only way to recover is to save the namespace from 
> the ANN, which produces a correct fsimage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209161#comment-14209161
 ] 

Hudson commented on HDFS-7391:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6529 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6529/])
HDFS-7391. Renable SSLv2Hello in HttpFS. Contributed by Robert Kanter. 
(acmurthy: rev 9f0319bba1788e4c579ce533b14c0deab63f28ee)
* hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/tomcat/ssl-server.xml
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which older clients (e.g. Java 6 with 
> OpenSSL 0.9.8x) require; they cannot connect without it. To be clear, this 
> does not mean enabling SSLv2 itself, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it has already 
> been closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7056) Snapshot support for truncate

2014-11-12 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-7056:
---
Attachment: HDFS-7056.patch
HDFS-3107-HDFS-7056-combined.patch

Trunk has moved on since I generated my patch, so I refreshed both the 
combined and the regular patch.

Made some additional changes with help from Konstantin:
# The patch failed with a compilation error due to HDFS-7381; updated it to 
account for the new BlockIdManager in trunk.
# Renamed a parameter of commitBlockSynchronization() from "lastblock" to 
"oldBlock", because commitBlockSynchronization() now handles both regular 
recovery and copy-on-write truncate.
# Removed a call in commitBlockSynchronization() to getStoredBlock(), since we 
can get the block from iFile.getLastBlock().
# Fixed TestCommitBlockSynchronization again by making the mocked INodeFile 
return a BlockInfoUC when getLastBlock() is called on it.

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFSSnapshotWithTruncateDesign.docx
>
>
> Implementation of truncate in HDFS-3107 does not allow truncating files which 
> are in a snapshot. It is desirable to be able to truncate and still keep the 
> old file state of the file in the snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209148#comment-14209148
 ] 

Arun C Murthy commented on HDFS-7391:
-

[~kasha] you just saved me some typing chores, I'll take care of this. Thanks!

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which older clients (e.g. Java 6 with 
> OpenSSL 0.9.8x) require; they cannot connect without it. To be clear, this 
> does not mean enabling SSLv2 itself, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it has already 
> been closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7385) ThreadLocal used in FSEditLog class lead FSImage permission mess up

2014-11-12 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209118#comment-14209118
 ] 

Yi Liu commented on HDFS-7385:
--

[~jiangyu1211], good find; I think it's a critical issue. It should occur when 
multiple create-file (or mkdir) operations happen on the same thread.

Please add a test case to reproduce it; it's not hard.

> ThreadLocal used in FSEditLog class  lead FSImage permission mess up
> 
>
> Key: HDFS-7385
> URL: https://issues.apache.org/jira/browse/HDFS-7385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0, 2.5.0
>Reporter: jiangyu
>Assignee: jiangyu
> Attachments: HDFS-7385.patch
>
>
>   We migrated our NameNodes from low-configuration to high-configuration 
> machines last week. First, we imported the current directory, including the 
> fsimage and editlog files, from the original ActiveNameNode to the new 
> ActiveNameNode and started the new NameNode. We then changed the 
> configuration of all DataNodes and restarted them, so they block-reported 
> to the new NameNodes at once and sent heartbeats after that.
>   Everything seemed perfect, but after we restarted the ResourceManager, 
> most of the users complained that their jobs couldn't be executed because 
> of permission problems.
>   We use ACLs in our clusters, and after the migration we found that most 
> of the directories and files which had no ACLs set before now carried ACL 
> properties. That is why users could not execute their jobs, so we had to 
> change most file permissions to a+r and directory permissions to a+rx to 
> make sure the jobs could be executed.
>   After investigating this problem for some days, I found a bug in 
> FSEditLog.java. The ThreadLocal op cache in FSEditLog does not set the 
> proper values in the logMkDir and logOpenFile functions. Here is the code 
> of logMkDir:
>   public void logMkDir(String path, INode newNode) {
>     PermissionStatus permissions = newNode.getPermissionStatus();
>     MkdirOp op = MkdirOp.getInstance(cache.get())
>       .setInodeId(newNode.getId())
>       .setPath(path)
>       .setTimestamp(newNode.getModificationTime())
>       .setPermissionStatus(permissions);
>     AclFeature f = newNode.getAclFeature();
>     if (f != null) {
>       op.setAclEntries(AclStorage.readINodeLogicalAcl(newNode));
>     }
>     logEdit(op);
>   }
>   For example, if we mkdir with ACLs through one handler (a thread, in 
> fact), we set the AclEntries on the op taken from the cache. If we then 
> mkdir without any ACL settings through the same handler, the AclEntries on 
> the cached op are still those of the previous call that set the ACLs, and 
> because the new node has no AclFeature, we never get a chance to clear 
> them. The edit log is therefore wrong and records the wrong ACLs. After the 
> Standby loads the edit logs from the JournalNodes, applies them to memory, 
> saves the namespace, and transfers the wrong fsimage to the ANN, all the 
> fsimages are wrong. The only way to recover is to save the namespace from 
> the ANN, which produces a correct fsimage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7385) ThreadLocal used in FSEditLog class lead FSImage permission mess up

2014-11-12 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-7385:
-
Target Version/s: 2.6.0  (was: 2.4.0, 2.5.0)

> ThreadLocal used in FSEditLog class  lead FSImage permission mess up
> 
>
> Key: HDFS-7385
> URL: https://issues.apache.org/jira/browse/HDFS-7385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0, 2.5.0
>Reporter: jiangyu
>Assignee: jiangyu
> Attachments: HDFS-7385.patch
>
>
>   We migrated our NameNodes from low-configuration to high-configuration 
> machines last week. First, we imported the current directory, including the 
> fsimage and editlog files, from the original ActiveNameNode to the new 
> ActiveNameNode and started the new NameNode. We then changed the 
> configuration of all DataNodes and restarted them, so they block-reported 
> to the new NameNodes at once and sent heartbeats after that.
>   Everything seemed perfect, but after we restarted the ResourceManager, 
> most of the users complained that their jobs couldn't be executed because 
> of permission problems.
>   We use ACLs in our clusters, and after the migration we found that most 
> of the directories and files which had no ACLs set before now carried ACL 
> properties. That is why users could not execute their jobs, so we had to 
> change most file permissions to a+r and directory permissions to a+rx to 
> make sure the jobs could be executed.
>   After investigating this problem for some days, I found a bug in 
> FSEditLog.java. The ThreadLocal op cache in FSEditLog does not set the 
> proper values in the logMkDir and logOpenFile functions. Here is the code 
> of logMkDir:
>   public void logMkDir(String path, INode newNode) {
>     PermissionStatus permissions = newNode.getPermissionStatus();
>     MkdirOp op = MkdirOp.getInstance(cache.get())
>       .setInodeId(newNode.getId())
>       .setPath(path)
>       .setTimestamp(newNode.getModificationTime())
>       .setPermissionStatus(permissions);
>     AclFeature f = newNode.getAclFeature();
>     if (f != null) {
>       op.setAclEntries(AclStorage.readINodeLogicalAcl(newNode));
>     }
>     logEdit(op);
>   }
>   For example, if we mkdir with ACLs through one handler (a thread, in 
> fact), we set the AclEntries on the op taken from the cache. If we then 
> mkdir without any ACL settings through the same handler, the AclEntries on 
> the cached op are still those of the previous call that set the ACLs, and 
> because the new node has no AclFeature, we never get a chance to clear 
> them. The edit log is therefore wrong and records the wrong ACLs. After the 
> Standby loads the edit logs from the JournalNodes, applies them to memory, 
> saves the namespace, and transfers the wrong fsimage to the ANN, all the 
> fsimages are wrong. The only way to recover is to save the namespace from 
> the ANN, which produces a correct fsimage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7056) Snapshot support for truncate

2014-11-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209037#comment-14209037
 ] 

Hadoop QA commented on HDFS-7056:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12681191/HDFS-7056.patch
  against trunk revision b0a9cd3.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8726//console

This message is automatically generated.

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFSSnapshotWithTruncateDesign.docx
>
>
> Implementation of truncate in HDFS-3107 does not allow truncating files which 
> are in a snapshot. It is desirable to be able to truncate and still keep the 
> old file state of the file in the snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7056) Snapshot support for truncate

2014-11-12 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-7056:
---
Status: Open  (was: Patch Available)

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFSSnapshotWithTruncateDesign.docx
>
>
> Implementation of truncate in HDFS-3107 does not allow truncating files which 
> are in a snapshot. It is desirable to be able to truncate and still keep the 
> old file state of the file in the snapshot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7342) Lease Recovery doesn't happen some times

2014-11-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209022#comment-14209022
 ] 

Ravi Prakash commented on HDFS-7342:


Some details I've been able to gather from the logs on a cluster running 
Hadoop 2.2.0.
The client logs:
{noformat}
2014-10-27 19:46:54,952 INFO [Thread-60] 
org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS 
hdfs://:8020/
. nothing related to this file...
2014-10-28 01:18:26,018 INFO [main] org.apache.hadoop.hdfs.DFSClient: Could not 
complete  retrying...
2014-10-28 01:18:26,419 INFO [main] org.apache.hadoop.hdfs.DFSClient: Could not 
complete  retrying...
...goes on for 10 mins.
2014-10-28 01:28:24,481 INFO [main] org.apache.hadoop.hdfs.DFSClient: Could not 
complete  retrying...
2014-10-28 01:28:24,883 INFO [main] org.apache.hadoop.hdfs.DFSClient: Could not 
complete  retrying...
{noformat}

The NameNode logs, grepping for 
{noformat}
2014-10-27 19:46:58,041 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_A{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-27 20:13:26,607 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_B{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-27 20:47:52,422 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_C{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-27 21:23:13,844 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_D{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-27 22:02:33,405 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_E{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-27 22:42:49,227 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_F{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-27 23:25:58,555 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_G{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-28 00:07:36,093 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_H{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-28 01:13:50,298 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: .  
blk_A_I{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW], 
ReplicaUnderConstruction[:50010|RBW]]}
2014-10-28 01:18:20,868 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile:  is closed by DFSClient_attempt_X_Y_r_T_U_V_W
2014-10-28 01:18:21,272 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile:  is closed by DFSClient_attempt_X_Y_r_T_U_V_W
This keeps going interspersed with other logs until 
2014-10-28 01:28:24,483 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile:  is closed by DFSClient_attempt_X_Y_r_T_U_V_W
2014-10-28 01:28:25,615 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
completeFile:  is closed by DFSClient_attempt_X_Y_r_T_U_V_W
2014-10-28 02:28:17,569 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease.  
Holder: DFSClient_attempt_X_Y_r_T_U_V_W, pendingcreates: 1], src=
..BOOM NN IS IN INFINITE LOOP.. Only the following two messages 
keep getting repeated:
2014-10-28 02:28:17,568 INFO 
org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease.  Holder: 
DFSClient_attempt_X_Y_r_T_U_V_W, pendingcreates: 1] has expired hard limit
2014-10-28 02:28:17,569 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease.  
Holder: DFSClient_attempt_X_Y_r_T_U_V_W, pendingcreates: 1], src=
2014-10-28 02:28:17,569 INFO 
org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease.  Holder: 
DFSClient_attempt_X_Y_r_T_U_V_W, pendingcreates: 1] has expired hard limit
2014-10-28 02:28:17,569 INFO 
org.apache.hadoop.hd

[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209143#comment-14209143
 ] 

Karthik Kambatla commented on HDFS-7391:


[~acmurthy] - Missed your comment here, and committed the addendum for 
HADOOP-11217. Will let you commit this, thanks. 

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which older clients (e.g. Java 6 with 
> OpenSSL 0.9.8x) require; they cannot connect without it. To be clear, this 
> does not mean enabling SSLv2 itself, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it has already 
> been closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6982) nntop: top-like tool for name node users

2014-11-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209024#comment-14209024
 ] 

Andrew Wang commented on HDFS-6982:
---

How would this behave under a sudden, large spike in operations? This is the 
situation we're trying to detect, i.e. for something like:

{noformat}
0, 0, 0, 100, 0, 0, 0, ...
{noformat}

What I'd want to see is essentially a step function going 0 -> 100 -> 0, 
but an EWMA would necessarily tail off exponentially.

I'm also happy to take a look at any references you have. I've done some 
reading on calculating percentiles over rolling windows, and what we have now 
is pretty typical for that: a number of buckets, each representing a fixed 
time interval, aggregated together to calculate the metric, with old buckets 
discarded as time passes.
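The contrast between the two estimators can be sketched numerically. This is a toy illustration only (the alpha value and the series are assumptions, not nntop's actual metrics code):

```java
// Compare a fixed-bucket rolling window with an EWMA on the spike series
// 0, 0, 0, 100, 0, 0, 0. A one-bucket window reports the raw counts and jumps
// 0 -> 100 -> 0 (the step function); the EWMA decays exponentially instead.
public class SpikeSketch {
    /** One EWMA value per observation: m_i = alpha * m_{i-1} + (1 - alpha) * c_i. */
    public static double[] ewma(int[] counts, double alpha) {
        double[] out = new double[counts.length];
        double m = 0;
        for (int i = 0; i < counts.length; i++) {
            m = alpha * m + (1 - alpha) * counts[i];
            out[i] = m;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] counts = {0, 0, 0, 100, 0, 0, 0}; // the window reports these as-is
        for (double v : ewma(counts, 0.5)) System.out.printf("%.2f ", v);
        System.out.println();
        // EWMA output: 0.00 0.00 0.00 50.00 25.00 12.50 6.25 -- the tail keeps
        // reporting load that is no longer there, long after the spike ended.
    }
}
```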

> nntop: top-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similar to what 
> top does in Linux, lists the top users of the HDFS name node and gives 
> insight into which users are sending the majority of each traffic type to 
> the name node. This information turns out to be most critical when the name 
> node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop, which has been in production at Twitter for the past 10 
> months. nntop proved to have low CPU overhead (< 2% in a cluster of 4K 
> nodes), a low memory footprint (less than a few MB), and to be quite 
> efficient on the write path (only two hash lookups to update a metric).
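The "two hash lookups" write path mentioned in the description can be sketched as follows (class and method names here are hypothetical, chosen for illustration; see nntop-design-v1.pdf for the real design):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Per-(operation, user) counters: each reported call costs exactly two hash
// lookups (operation map, then user counter) plus a lock-free increment.
public class TopMetrics {
    private final ConcurrentHashMap<String, ConcurrentHashMap<String, LongAdder>>
            opToUserCounts = new ConcurrentHashMap<>();

    public void report(String op, String user) {
        // Lookup 1: the per-operation map; lookup 2: the per-user counter.
        opToUserCounts.computeIfAbsent(op, k -> new ConcurrentHashMap<>())
                      .computeIfAbsent(user, k -> new LongAdder())
                      .increment();
    }

    public long get(String op, String user) {
        return opToUserCounts.get(op).get(user).sum();
    }

    public static void main(String[] args) {
        TopMetrics m = new TopMetrics();
        m.report("mkdir", "alice");
        m.report("mkdir", "alice");
        System.out.println(m.get("mkdir", "alice")); // 2
    }
}
```

LongAdder keeps the hot increment path cheap under contention, which is consistent with the low write-path overhead the description claims.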



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7390) Provide JMX metrics per storage type

2014-11-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209026#comment-14209026
 ] 

Haohui Mai commented on HDFS-7390:
--

Thanks for the work. Can you make the JMX bean output JSON objects directly 
instead of JSON strings?

> Provide JMX metrics per storage type
> 
>
> Key: HDFS-7390
> URL: https://issues.apache.org/jira/browse/HDFS-7390
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.5.2
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HDFS-7390.patch
>
>
> HDFS-2832  added heterogeneous support. In a cluster with different storage 
> types, it is useful to have metrics per storage type. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209030#comment-14209030
 ] 

Arun C Murthy commented on HDFS-7391:
-

Sounds good, thanks [~rkanter] and [~ywskycn]!

I'll commit both shortly for RC1.

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which is required for older clients (e.g. 
> Java 6 with openssl 0.9.8x) so they can't connect without it. Just to be 
> clear, it does not mean SSLv2, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it's already been 
> closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7056) Snapshot support for truncate

2014-11-12 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-7056:
---
Status: Patch Available  (was: Open)

> Snapshot support for truncate
> -
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch, 
> HDFS-3107-HDFS-7056-combined.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, 
> HDFSSnapshotWithTruncateDesign.docx
>
>
> Implementation of truncate in HDFS-3107 does not allow truncating files which 
> are in a snapshot. It is desirable to be able to truncate and still keep the 
> old file state of the file in the snapshot.





[jira] [Commented] (HDFS-6982) nntop: top­-like tool for name node users

2014-11-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208995#comment-14208995
 ] 

Haohui Mai commented on HDFS-6982:
--

bq. However, my understanding is that there's no direct link between the alpha 
parameter and a time-based window, e.g. 1 min, 5 min, 30 min.

Let n equal the number of observations per window. Setting {{alpha = (n-1) / 
n}} would make the math right, assuming that the number of requests follows a 
Poisson distribution.
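A minimal sketch of the link between alpha and an effective window size (the numbers are illustrative, not from the patch): with alpha = (n-1)/n, the total weight an exponential moving average assigns across past observations is 1/(1-alpha) = n, so the EMA behaves roughly like an n-observation window.

```java
public class EmaWindow {
  public static void main(String[] args) {
    // Suppose one observation per second and a 1-minute window.
    int n = 60;
    double alpha = (n - 1.0) / n;
    // An EMA weights the k-th most recent observation by alpha^k (up to
    // normalization); the weights sum to 1 / (1 - alpha) = n.
    double totalWeight = 1.0 / (1.0 - alpha);
    System.out.println(Math.round(totalWeight)); // prints 60
  }
}
```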

bq. IIUC the situation you describe will lead to small errors, not big ones. If 
there are bigger correctness issues, I think we can fix them by adding more 
synchronization. Thanks.

Depending on the timing, the errors will lead to one of the following: (1) 
correct results, (2) consistently missing one measurement from some users, or 
(3) inconsistent measurements for the same users. These artificial errors make 
nntop less valuable.

I don't quite understand your concerns about fixing the issue. This is a 
variant of the online counting problem, which is relatively well studied. 
Applying the de facto solution would eliminate the errors and make the 
implementation simpler. I'm not sure why we need to reinvent the wheel here.

> nntop: top­-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, gives the list of top users of the HDFS name node and 
> gives insight about which users are sending majority of each traffic type to 
> the name node. This information turns out to be the most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop which has been in production at Twitter in the past 10 
> months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
> nodes), low memory footprint (less than a few MB), and quite efficient for 
> the write path (only two hash lookup for updating a metric).





[jira] [Commented] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208972#comment-14208972
 ] 

Yongjun Zhang commented on HDFS-7386:
-

Hi Chris, thanks again for your input; I just submitted a trivial patch.


> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Attachments: HDFS-7386.001.patch
>
>
> Per discussion in HDFS-7382, I'm filing this jira as a follow-up, to replace 
> check "port number < 1024" with shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and suggestion there.





[jira] [Updated] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-7386:

Status: Patch Available  (was: Open)

> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Attachments: HDFS-7386.001.patch
>
>
> Per discussion in HDFS-7382, I'm filing this jira as a follow-up, to replace 
> check "port number < 1024" with shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and suggestion there.





[jira] [Updated] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-7386:

Priority: Trivial  (was: Major)

> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Attachments: HDFS-7386.001.patch
>
>
> Per discussion in HDFS-7382, I'm filing this jira as a follow-up, to replace 
> check "port number < 1024" with shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and suggestion there.





[jira] [Commented] (HDFS-6982) nntop: top­-like tool for name node users

2014-11-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208968#comment-14208968
 ] 

Andrew Wang commented on HDFS-6982:
---

Haohui, IIUC your suggestion here is to use an exponential moving average. 
However, my understanding is that there's no direct link between the alpha 
parameter and a time-based window, e.g. 1 min, 5 min, 30 min. I think the 
time-based windows are more operator-friendly.

Also, considering that the purpose of this feature is to determine a ranked 
list of users and what operations they're doing, the exact values aren't 
actually that important. IIUC the situation you describe will lead to small 
errors, not big ones. If there are bigger correctness issues, I think we can 
fix them by adding more synchronization. Thanks.

> nntop: top­-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, gives the list of top users of the HDFS name node and 
> gives insight about which users are sending majority of each traffic type to 
> the name node. This information turns out to be the most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop which has been in production at Twitter in the past 10 
> months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
> nodes), low memory footprint (less than a few MB), and quite efficient for 
> the write path (only two hash lookup for updating a metric).





[jira] [Updated] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-7386:

Attachment: HDFS-7386.001.patch

> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>Priority: Trivial
> Attachments: HDFS-7386.001.patch
>
>
> Per discussion in HDFS-7382, I'm filing this jira as a follow-up, to replace 
> check "port number < 1024" with shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and suggestion there.





[jira] [Updated] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-7386:

Issue Type: Improvement  (was: Bug)

> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> Per discussion in HDFS-7382, I'm filing this jira as a follow-up, to replace 
> check "port number < 1024" with shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and suggestion there.





[jira] [Commented] (HDFS-6982) nntop: top­-like tool for name node users

2014-11-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208944#comment-14208944
 ] 

Haohui Mai commented on HDFS-6982:
--

For the UI part, I think it is fine to report only the smallest window and let 
the UI plot the graph; thus the patch can be further simplified.

> nntop: top­-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, gives the list of top users of the HDFS name node and 
> gives insight about which users are sending majority of each traffic type to 
> the name node. This information turns out to be the most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop which has been in production at Twitter in the past 10 
> months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
> nodes), low memory footprint (less than a few MB), and quite efficient for 
> the write path (only two hash lookup for updating a metric).





[jira] [Commented] (HDFS-6982) nntop: top­-like tool for name node users

2014-11-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208942#comment-14208942
 ] 

Haohui Mai commented on HDFS-6982:
--

Please correct me if I'm wrong. Let's say we have a rolling window of 3, and 
the current observation {{o}} is 

{noformat}
o = [o1, o2, o3];
{noformat}

Consider the following interleaving.

1. The user measures the observation. He gets {{(o1 + o2 + o3) / 3}}.
2. The observation {{o1}} is stale, thus it is reset to zero by {{safeReset}}.
3. Right before {{bucket.inc()}} is called, the user makes another measurement; 
now he gets {{(0 + o2 + o3) / 3}}.
4. {{o1}} is updated.

That way the user gets an incorrect measurement in step 3.

My feeling is that it is more robust to calculate the moving average instead of 
resetting the observation on every tick. Actually, the core functionality can 
be implemented with the following code:

{code}
private final Map<String, Long> observation = new ConcurrentHashMap<>();

synchronized void bulkUpdate(Map<String, Long> updates) {
  // Fold the new observations into the decayed totals.
  for (Map.Entry<String, Long> e : updates.entrySet()) {
    Long prev = observation.get(e.getKey());
    long v = prev != null ? prev : 0;
    observation.put(e.getKey(), (long) (ALPHA * v) + e.getValue());
  }
  // Decay entries that received no new observations; drop them once
  // they reach zero, so stale users are eventually removed.
  for (Map.Entry<String, Long> e : observation.entrySet()) {
    if (!updates.containsKey(e.getKey())) {
      long v = (long) (ALPHA * e.getValue());
      if (v == 0) {
        observation.remove(e.getKey());
      } else {
        observation.put(e.getKey(), v);
      }
    }
  }
}

synchronized Map<String, Long> observe() { return observation; }
{code}

Assuming that the size of {{updates}} is bounded (which should be the case in 
nntop), it should be fairly efficient. Thoughts?
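A runnable, self-contained sketch of the decayed-counter idea above (the class name, ALPHA value, and key type are illustrative assumptions, not the actual patch):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class DecayedCounters {
  private static final double ALPHA = 0.5; // decay factor; illustrative value
  private final Map<String, Long> observation = new ConcurrentHashMap<>();

  synchronized void bulkUpdate(Map<String, Long> updates) {
    // Fold new observations into the decayed totals.
    for (Map.Entry<String, Long> e : updates.entrySet()) {
      Long prev = observation.get(e.getKey());
      long v = prev != null ? prev : 0;
      observation.put(e.getKey(), (long) (ALPHA * v) + e.getValue());
    }
    // Decay users with no new observations; drop them once fully decayed.
    for (Map.Entry<String, Long> e : observation.entrySet()) {
      if (!updates.containsKey(e.getKey())) {
        long v = (long) (ALPHA * e.getValue());
        if (v == 0) {
          observation.remove(e.getKey());
        } else {
          observation.put(e.getKey(), v);
        }
      }
    }
  }

  synchronized Map<String, Long> observe() { return observation; }

  public static void main(String[] args) {
    DecayedCounters c = new DecayedCounters();
    c.bulkUpdate(java.util.Collections.singletonMap("alice", 100L));
    c.bulkUpdate(java.util.Collections.singletonMap("bob", 10L));
    // alice was absent from the second update, so her count decayed: 100 * 0.5
    System.out.println(c.observe().get("alice")); // prints 50
    System.out.println(c.observe().get("bob"));   // prints 10
  }
}
```

Note that ConcurrentHashMap's weakly consistent iterators make the remove-during-iteration in the second loop safe.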

> nntop: top­-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, gives the list of top users of the HDFS name node and 
> gives insight about which users are sending majority of each traffic type to 
> the name node. This information turns out to be the most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop which has been in production at Twitter in the past 10 
> months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
> nodes), low memory footprint (less than a few MB), and quite efficient for 
> the write path (only two hash lookup for updating a metric).





[jira] [Commented] (HDFS-7312) Update DistCp v1 to optionally not use tmp location

2014-11-12 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208914#comment-14208914
 ] 

Yongjun Zhang commented on HDFS-7312:
-

Hi [~jprosser],

Thanks for the nice work here. I looked through it and it looks good, apart 
from a couple of suggested improvements and nits:

Improvements:

* For the following new test, to achieve better code sharing:
{code}
 /** copy files from dfs file system to dfs file system with skiptmp */
  public void testCopyFromDfsToDfsWithSkiptmp() throws Exception {
{code}
I suggest creating a new private method with an additional boolean parameter 
skipTmp, {{private void testCopyFromDfsToDfs(final boolean skipTmp)}}. Then the 
pre-existing test testCopyFromDfsToDfs and your new test can call this private 
method with false and true respectively.

* The same suggestion applies to {{testCopySingleFileWithSkiptmp()}}.

* Line 1221, do we really need to create the tmpDir here? For example, if we 
copy a single file to a destination file, why is a tmpDir needed? I think we 
can avoid doing this (by removing that block of code) and instead add a 
file-existence check when doing the fullyDelete. The point is, the 
TMP_DIR_LABEL may not always be created, so we don't have to create it just for 
the purpose of deleting it. Right?
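A sketch of the suggested test refactoring (the class name and helper are illustrative; the real shared helper would invoke DistCp v1 and assert on the copied files rather than just building the argument list):

```java
import java.util.ArrayList;
import java.util.List;

public class DistCpTestSketch {
  // Stand-in for the shared private helper suggested above.
  static List<String> buildArgs(boolean skipTmp, String src, String dst) {
    List<String> args = new ArrayList<>();
    if (skipTmp) {
      args.add("-skiptmp"); // copy directly to the final destination
    }
    args.add(src);
    args.add(dst);
    return args;
  }

  public static void main(String[] args) {
    // The pre-existing test would pass false, the new test true.
    System.out.println(buildArgs(false, "/srcdat", "/destdat"));
    System.out.println(buildArgs(true, "/srcdat", "/destdat"));
  }
}
```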

Nits:

* About the usage description "Copy files directly to the final destination.", 
maybe we can change it to "Instead, copy files directly to the final 
destination."?
* Line 394, "// filename and the path becomes its parent directory.": replace 
"path" with "destPath".
* Line 1218, rename "tmpDirRoot" to "tmpDirPrefix"?
* Line 396, tab not replaced with spaces.
* Line 400, remove the newly added extra newline.
* Line 956, replace "The job to configure" with "The job configuration".
* Line 1218, the line is too long; it needs to be <= 80 columns per the Hadoop 
coding guidelines.

BTW, I guess you have tried it out on real clusters, right?

Thanks a lot.


> Update DistCp v1 to optionally not use tmp location
> ---
>
> Key: HDFS-7312
> URL: https://issues.apache.org/jira/browse/HDFS-7312
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 2.5.1
>Reporter: Joseph Prosser
>Assignee: Joseph Prosser
>Priority: Minor
> Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, 
> HDFS-7312.003.patch, HDFS-7312.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> DistCp v1 currently copies files to a tmp location and then renames that to 
> the specified destination.  This can cause performance issues on filesystems 
> such as S3.  A -skiptmp flag will be added to bypass this step and copy 
> directly to the destination.  This feature mirrors a similar one added to 
> HBase ExportSnapshot 
> [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9]





[jira] [Commented] (HDFS-6982) nntop: top­-like tool for name node users

2014-11-12 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208879#comment-14208879
 ] 

Andrew Wang commented on HDFS-6982:
---

bq. The configuration keys are already there. I have some constants in TopConf 
though. Did you mean to move those top-specific constants to DFSConfigKeys?

Right now some prefixes are concatenated to generate the full key string. Since 
there are only 3 config keys, I think it'd be nicer to just write them out, 
i.e. "dfs.namenode.top.window.buckets".
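For illustration, something along these lines (only "dfs.namenode.top.window.buckets" is quoted above; the constant names and the second key are hypothetical examples of the same style):

```java
public class TopConfigKeys {
  // Full key strings written out, as elsewhere in DFSConfigKeys.
  public static final String NNTOP_WINDOW_BUCKETS_KEY =
      "dfs.namenode.top.window.buckets";
  // Hypothetical second key, shown only to illustrate the style.
  public static final String NNTOP_ENABLED_KEY = "dfs.namenode.top.enabled";

  public static void main(String[] args) {
    System.out.println(NNTOP_WINDOW_BUCKETS_KEY);
  }
}
```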

bq. We actually use jmx for plotting the data exported by nntop, and that is 
why one reporting period is sufficient. For html view, the web component 
directly contacts the TopMetrics and retrieves the top users for all reporting 
periods.

I guess this is reasonable, though it does differ a bit from how the new webUI 
works. [~wheat9] do you have any feelings here?

bq. 

Could you file a follow-on JIRA for this issue, and post the patch? It's fine 
to do it before or after, but it seems like we definitely need it in the end 
state either way.

Thanks Maysam!

> nntop: top­-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, gives the list of top users of the HDFS name node and 
> gives insight about which users are sending majority of each traffic type to 
> the name node. This information turns out to be the most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop which has been in production at Twitter in the past 10 
> months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
> nodes), low memory footprint (less than a few MB), and quite efficient for 
> the write path (only two hash lookup for updating a metric).





[jira] [Commented] (HDFS-6982) nntop: top­-like tool for name node users

2014-11-12 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208866#comment-14208866
 ] 

Maysam Yabandeh commented on HDFS-6982:
---

Thanks for the well-detailed review. Some questions before I submit the new patch:
bq. Do you mind writing out the key strings in DFSConfigKeys? It's how we do it 
in the rest of the file, and more readable.
The configuration keys are already there. I have some constants in TopConf 
though. Did you mean to move those top-specific constants to DFSConfigKeys?
bq. The jmx page right now only exposes the shortest window, I see the 
smallestWindow hardcode. Since the goal is to use jmx to populate the HTML 
view, we need to somehow expose all of the configured windows.
We actually use jmx for plotting the data exported by nntop, and that is why 
one reporting period is sufficient. For html view, the web component directly 
contacts the TopMetrics and retrieves the top users for all reporting periods.

Also, about the counters never going back to 0 and reported users never being 
removed: the plan was to submit a separate patch for that, as the fix is kind 
of orthogonal to what nntop does; refer to #3 in this comment: 
https://issues.apache.org/jira/browse/HDFS-6982?focusedCommentId=14122097&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14122097
If you think that change should be made before this patch gets committed, I can 
open the jira for that change and have it committed first.

> nntop: top­-like tool for name node users
> -
>
> Key: HDFS-6982
> URL: https://issues.apache.org/jira/browse/HDFS-6982
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Maysam Yabandeh
>Assignee: Maysam Yabandeh
> Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, gives the list of top users of the HDFS name node and 
> gives insight about which users are sending majority of each traffic type to 
> the name node. This information turns out to be the most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop which has been in production at Twitter in the past 10 
> months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
> nodes), low memory footprint (less than a few MB), and quite efficient for 
> the write path (only two hash lookup for updating a metric).





[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208839#comment-14208839
 ] 

Wei Yan commented on HDFS-7391:
---

[~acmurthy], for HADOOP-11243, we tried different approaches (whitelist, 
blacklist) to add SSLv2Hello, but without success. The shuffle server still 
cannot accept the SSLv2Hello protocol. Given that the shuffle happens between 
NMs, I think we can keep the existing solution without SSLv2Hello.

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which is required for older clients (e.g. 
> Java 6 with openssl 0.9.8x) so they can't connect without it. Just to be 
> clear, it does not mean SSLv2, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it's already been 
> closed.





[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208833#comment-14208833
 ] 

Robert Kanter commented on HDFS-7391:
-

This is the addendum patch for HDFS-7274.  I had to create a new JIRA because 
HDFS-7274 was already closed.  (Also, the branch-2.5 version of the patch is 
different here because it also fixes a problem where older versions of Tomcat 
used a different property name).

I was able to put the addendum patch for HADOOP-11217 there because that JIRA 
wasn't closed yet.

The third JIRA was HADOOP-11243, which [~ywskycn] worked on.  I'm not sure of 
the details, but he wasn't able to re-enable SSLv2Hello there.

So, there's just HDFS-7391 and the HADOOP-11217 addendum.

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which is required for older clients (e.g. 
> Java 6 with openssl 0.9.8x) so they can't connect without it. Just to be 
> clear, it does not mean SSLv2, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it's already been 
> closed.





[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208818#comment-14208818
 ] 

Arun C Murthy commented on HDFS-7391:
-

[~rkanter] - Is this and the addendum patch for HADOOP-11217 the only ones for 
SSLv2Hello? I thought there were 3?

I'll commit both later today if there are no objections; others please commit 
to branch-2/branch-2.6/branch-2.6.0 if you get to it before me. Thanks.

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which is required for older clients (e.g. 
> Java 6 with openssl 0.9.8x) so they can't connect without it. Just to be 
> clear, it does not mean SSLv2, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it's already been 
> closed.





[jira] [Commented] (HDFS-2936) File close()-ing hangs indefinitely if the number of live blocks does not match the minimum replication

2014-11-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208753#comment-14208753
 ] 

Ravi Prakash commented on HDFS-2936:


Thanks Harsh for this JIRA! I would go a different route on this. The 
min-replication count, to me as a user, means "it will take that many failures 
to lose data". That is a simple concept to reason about. If we create a 
separate config that applies only to the write pipelines, (1) there is a window 
of opportunity during which my assumption is not valid (the time it takes for 
the NN to order that replication), and (2) it makes understanding the concept 
slightly more complex.

I would suggest that we fix the write pipeline to contain the minimum 
replication count, and that the client wait until that happens. I realize that 
might be a much bigger change.

> File close()-ing hangs indefinitely if the number of live blocks does not 
> match the minimum replication
> ---
>
> Key: HDFS-2936
> URL: https://issues.apache.org/jira/browse/HDFS-2936
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 0.23.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HDFS-2936.patch
>
>
> If an admin wishes to enforce replication today for all the users of their 
> cluster, he may set {{dfs.namenode.replication.min}}. This property prevents 
> users from creating files with < expected replication factor.
> However, the value of minimum replication set by the above property is also 
> checked at several other points, especially during completeFile (close) 
> operations. If a condition arises wherein a write's pipeline ends up with 
> fewer than the minimum number of nodes, the completeFile operation does not 
> successfully close the file and the client hangs waiting for the NN to 
> replicate the last bad block in the background. This form of hard guarantee 
> can, for example, bring down HBase clusters during high xceiver load on DNs, 
> disk fill-ups on many of them, etc.
> I propose we should split the property in two parts:
> * dfs.namenode.replication.min
> ** Stays the same name, but only checks file creation time replication factor 
> value and during adjustments made via setrep/etc.
> * dfs.namenode.replication.min.for.write
> ** New property that disconnects the rest of the checks from the above 
> property, such as the checks done during block commit, file complete/close, 
> safemode checks for block availability, etc..
> Alternatively, we may also choose to remove the client-side hang of 
> completeFile/close calls with a set number of retries. This would further 
> require discussion about how a file-closure handle ought to be handled.





[jira] [Commented] (HDFS-7008) xlator should be closed upon exit from DFSAdmin#genericRefresh()

2014-11-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208746#comment-14208746
 ] 

Hadoop QA commented on HDFS-7008:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12666954/HDFS-7008.1.patch
  against trunk revision 782abbb.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8724//console

This message is automatically generated.

> xlator should be closed upon exit from DFSAdmin#genericRefresh()
> 
>
> Key: HDFS-7008
> URL: https://issues.apache.org/jira/browse/HDFS-7008
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Tsuyoshi OZAWA
>Priority: Minor
> Attachments: HDFS-7008.1.patch
>
>
> {code}
> GenericRefreshProtocol xlator =
>   new GenericRefreshProtocolClientSideTranslatorPB(proxy);
> // Refresh
> Collection responses = xlator.refresh(identifier, args);
> {code}
> GenericRefreshProtocolClientSideTranslatorPB#close() should be called on 
> xlator before return.





[jira] [Commented] (HDFS-7008) xlator should be closed upon exit from DFSAdmin#genericRefresh()

2014-11-12 Thread Chris Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208739#comment-14208739
 ] 

Chris Li commented on HDFS-7008:


Linking issue

> xlator should be closed upon exit from DFSAdmin#genericRefresh()
> 
>
> Key: HDFS-7008
> URL: https://issues.apache.org/jira/browse/HDFS-7008
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Tsuyoshi OZAWA
>Priority: Minor
> Attachments: HDFS-7008.1.patch
>
>
> {code}
> GenericRefreshProtocol xlator =
>   new GenericRefreshProtocolClientSideTranslatorPB(proxy);
> // Refresh
> Collection responses = xlator.refresh(identifier, args);
> {code}
> GenericRefreshProtocolClientSideTranslatorPB#close() should be called on 
> xlator before return.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2014-11-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208657#comment-14208657
 ] 

Ravi Prakash commented on HDFS-4882:


These unit test failures are spurious and unrelated to the code changes in the 
patch.

> Namenode LeaseManager checkLeases() runs into infinite loop
> ---
>
> Key: HDFS-4882
> URL: https://issues.apache.org/jira/browse/HDFS-4882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.0.0-alpha, 2.5.1
>Reporter: Zesheng Wu
>Assignee: Ravi Prakash
>Priority: Critical
> Attachments: 4882.1.patch, 4882.patch, 4882.patch, HDFS-4882.1.patch, 
> HDFS-4882.2.patch, HDFS-4882.patch
>
>
> Scenario:
> 1. cluster with 4 DNs
> 2. the size of the file to be written is a little more than one block
> 3. write the first block to 3 DNs, DN1->DN2->DN3
> 4. all the data packets of the first block are successfully acked and the client 
> sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
> 5. DN2 and DN3 go down
> 6. the client recovers the pipeline, but no new DN is added to the pipeline 
> because the current pipeline stage is PIPELINE_CLOSE
> 7. the client keeps writing the last block, and tries to close the file after 
> writing all the data
> 8. NN finds that the penultimate block doesn't have enough replicas (our 
> dfs.namenode.replication.min=2), the client's close runs into an indefinite 
> loop (HDFS-2936), and at the same time NN sets the last block's state to 
> COMPLETE
> 9. shut down the client
> 10. the file's lease exceeds the hard limit
> 11. the LeaseManager notices this and begins lease recovery by calling 
> fsnamesystem.internalReleaseLease()
> 12. but the last block's state is COMPLETE, which triggers the lease manager's 
> infinite loop and prints massive logs like this:
> {noformat}
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
>  limit
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
>  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
> /user/h_wuzesheng/test.dat
> 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
> blk_-7028017402720175688_1202597,
> lastBLockState=COMPLETE
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
> for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
> APREDUCE_-1252656407_1, pendingcreates: 1]
> {noformat}
> (the 3rd log line is a debug log added by us)
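The failure mode in step 12 can be sketched in plain Java (this is an illustrative model, not the actual LeaseManager code): checkLeases repeatedly takes the oldest expired lease and tries to release it, and if release makes no progress the same lease stays at the head of the set, so the loop spins forever. A no-progress check breaks out and defers to the next monitor cycle:

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class CheckLeasesSketch {
    static int released;

    // Stand-in for FSNamesystem.internalReleaseLease(): pretend release
    // fails for this holder (the last block is already COMPLETE).
    static boolean internalReleaseLease(String holder) {
        return false;
    }

    static void checkLeases(SortedSet<String> expired) {
        while (!expired.isEmpty()) {
            String oldest = expired.first();
            if (internalReleaseLease(oldest)) {
                expired.remove(oldest);
                released++;
            } else {
                // no progress on the oldest lease: without this break
                // the loop would re-examine it forever
                break;
            }
        }
    }

    public static void main(String[] args) {
        SortedSet<String> expired = new TreeSet<>();
        expired.add("DFSClient_a");
        checkLeases(expired);          // terminates despite the failed release
        System.out.println(released);  // 0
    }
}
```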



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2014-11-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208622#comment-14208622
 ] 

Hadoop QA commented on HDFS-4882:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12681116/HDFS-4882.2.patch
  against trunk revision be7bf95.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerDynamicBehavior
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8723//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8723//console

This message is automatically generated.

> Namenode LeaseManager checkLeases() runs into infinite loop
> ---
>
> Key: HDFS-4882
> URL: https://issues.apache.org/jira/browse/HDFS-4882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.0.0-alpha, 2.5.1
>Reporter: Zesheng Wu
>Assignee: Ravi Prakash
>Priority: Critical
> Attachments: 4882.1.patch, 4882.patch, 4882.patch, HDFS-4882.1.patch, 
> HDFS-4882.2.patch, HDFS-4882.patch
>
>
> Scenario:
> 1. cluster with 4 DNs
> 2. the size of the file to be written is a little more than one block
> 3. write the first block to 3 DNs, DN1->DN2->DN3
> 4. all the data packets of the first block are successfully acked and the client 
> sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
> 5. DN2 and DN3 go down
> 6. the client recovers the pipeline, but no new DN is added to the pipeline 
> because the current pipeline stage is PIPELINE_CLOSE
> 7. the client keeps writing the last block, and tries to close the file after 
> writing all the data
> 8. NN finds that the penultimate block doesn't have enough replicas (our 
> dfs.namenode.replication.min=2), the client's close runs into an indefinite 
> loop (HDFS-2936), and at the same time NN sets the last block's state to 
> COMPLETE
> 9. shut down the client
> 10. the file's lease exceeds the hard limit
> 11. the LeaseManager notices this and begins lease recovery by calling 
> fsnamesystem.internalReleaseLease()
> 12. but the last block's state is COMPLETE, which triggers the lease manager's 
> infinite loop and prints massive logs like this:
> {noformat}
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
>  limit
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
>  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
> /user/h_wuzesheng/test.dat
> 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
> blk_-7028017402720175688_1202597,
> lastBLockState=COMPLETE
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
> for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
> APREDUCE_-1252656407_1, pendingcreates: 1]
> {noformat}
> (the 3rd log line is a debug log added by us)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write

2014-11-12 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-7387:

Fix Version/s: (was: 2.7.0)
   2.6.0

I've merged this to branch-2.6 and branch-2.6.0 for inclusion in the 2.6.0 
release candidate.

> NFS may only do partial commit due to a race between COMMIT and write
> -
>
> Key: HDFS-7387
> URL: https://issues.apache.org/jira/browse/HDFS-7387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Critical
> Fix For: 2.6.0
>
> Attachments: HDFS-7387.001.patch, HDFS-7387.002.patch
>
>
> The requested range may not be committed when the following happens:
> 1. the last pending write is removed from the queue to be written to HDFS
> 2. a commit request arrives; NFS sees no pending write, so it does a sync
> 3. this sync request may flush only part of the last write to HDFS
> 4. if a file read happens immediately after the above steps, the user may not 
> see all the data.
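The race in steps 1-3 can be modeled in a few lines of plain Java (an illustrative sketch, not the OpenFileCtx code): COMMIT only checks the pending queue, so a write the writer thread has already dequeued but not fully flushed is invisible to it. Tracking the in-flight write alongside the queue closes that window:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class CommitRaceSketch {
    private final Deque<byte[]> pending = new ArrayDeque<>();
    private byte[] inFlight;   // dequeued but not yet fully flushed
    private int flushedBytes;

    void enqueue(byte[] write) { pending.add(write); }

    void writerStep() {
        // writer thread: take the next write and flush only part of it
        inFlight = pending.poll();
        if (inFlight != null) {
            flushedBytes += inFlight.length / 2;   // partial flush
        }
    }

    boolean commitWouldBeComplete() {
        // the buggy check looks only at pending.isEmpty(); a correct
        // check must also require that nothing is in flight
        return pending.isEmpty() && inFlight == null;
    }

    public static void main(String[] args) {
        CommitRaceSketch ctx = new CommitRaceSketch();
        ctx.enqueue(new byte[100]);
        ctx.writerStep();  // write dequeued, half flushed
        System.out.println(ctx.pending.isEmpty());        // true: queue looks drained
        System.out.println(ctx.commitWouldBeComplete());  // false: in-flight data remains
    }
}
```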



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208528#comment-14208528
 ] 

Hudson commented on HDFS-7387:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6524 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6524/])
HDFS-7387. Merging to branch-2.6 for hadoop-2.6.0-rc1. (acmurthy: rev 
782abbb000ab1c9e2e033e347eea8827d6e866ef)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> NFS may only do partial commit due to a race between COMMIT and write
> -
>
> Key: HDFS-7387
> URL: https://issues.apache.org/jira/browse/HDFS-7387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Critical
> Fix For: 2.7.0
>
> Attachments: HDFS-7387.001.patch, HDFS-7387.002.patch
>
>
> The requested range may not be committed when the following happens:
> 1. the last pending write is removed from the queue to be written to HDFS
> 2. a commit request arrives; NFS sees no pending write, so it does a sync
> 3. this sync request may flush only part of the last write to HDFS
> 4. if a file read happens immediately after the above steps, the user may not 
> see all the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2014-11-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208478#comment-14208478
 ] 

Ravi Prakash commented on HDFS-4882:


s/ConcurrentinternalReleaseLease/ConcurrentSkipList/

> Namenode LeaseManager checkLeases() runs into infinite loop
> ---
>
> Key: HDFS-4882
> URL: https://issues.apache.org/jira/browse/HDFS-4882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.0.0-alpha, 2.5.1
>Reporter: Zesheng Wu
>Assignee: Ravi Prakash
>Priority: Critical
> Attachments: 4882.1.patch, 4882.patch, 4882.patch, HDFS-4882.1.patch, 
> HDFS-4882.2.patch, HDFS-4882.patch
>
>
> Scenario:
> 1. cluster with 4 DNs
> 2. the size of the file to be written is a little more than one block
> 3. write the first block to 3 DNs, DN1->DN2->DN3
> 4. all the data packets of the first block are successfully acked and the client 
> sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
> 5. DN2 and DN3 go down
> 6. the client recovers the pipeline, but no new DN is added to the pipeline 
> because the current pipeline stage is PIPELINE_CLOSE
> 7. the client keeps writing the last block, and tries to close the file after 
> writing all the data
> 8. NN finds that the penultimate block doesn't have enough replicas (our 
> dfs.namenode.replication.min=2), the client's close runs into an indefinite 
> loop (HDFS-2936), and at the same time NN sets the last block's state to 
> COMPLETE
> 9. shut down the client
> 10. the file's lease exceeds the hard limit
> 11. the LeaseManager notices this and begins lease recovery by calling 
> fsnamesystem.internalReleaseLease()
> 12. but the last block's state is COMPLETE, which triggers the lease manager's 
> infinite loop and prints massive logs like this:
> {noformat}
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
>  limit
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
>  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
> /user/h_wuzesheng/test.dat
> 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
> blk_-7028017402720175688_1202597,
> lastBLockState=COMPLETE
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
> for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
> APREDUCE_-1252656407_1, pendingcreates: 1]
> {noformat}
> (the 3rd log line is a debug log added by us)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop

2014-11-12 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4882:
---
Attachment: HDFS-4882.2.patch

I tried using iterators, but then realized that 
FSNamesystem.internalReleaseLease() calls into renewLease, and the modifications 
needed to handle that were too unsightly.
Here's a patch which uses SortedSet.tailSet. However, I still like the earlier 
patch more (because it's a genuine case of two threads accessing the same 
data structure). With tailSet we are just trying to build our own 
synchronization mechanism (which is likely less efficient than the 
ConcurrentinternalReleaseLease).
I'd vote for the earlier patch (HDFS-4882.1.patch). I'd also request that this 
make it into 2.6.0 because of this issue's severity.
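The SortedSet.tailSet pattern mentioned above can be sketched as follows (an illustrative model with plain string keys, not the LeaseManager code): instead of holding a live iterator, which a concurrent mutation from internalReleaseLease -> renewLease would invalidate, the scan re-derives its position from the set after processing each element:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class TailSetScan {
    public static List<String> scan(SortedSet<String> leases) {
        List<String> visited = new ArrayList<>();
        String from = "";  // sorts strictly before all non-empty keys
        while (true) {
            // re-enter the set each round instead of holding an iterator,
            // so removals during processing cannot invalidate our position
            SortedSet<String> rest = leases.tailSet(from);
            if (rest.isEmpty()) break;
            String current = rest.first();
            visited.add(current);
            leases.remove(current);  // safe: no live iterator exists
            from = current;          // tailSet(current) now starts after it
        }
        return visited;
    }

    public static void main(String[] args) {
        SortedSet<String> leases = new TreeSet<>(List.of("a", "b", "c"));
        System.out.println(scan(leases));  // [a, b, c]
    }
}
```

As the comment above notes, this hand-rolls what a concurrent collection (e.g. ConcurrentSkipListSet) would provide directly, which is the argument for the earlier patch.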

> Namenode LeaseManager checkLeases() runs into infinite loop
> ---
>
> Key: HDFS-4882
> URL: https://issues.apache.org/jira/browse/HDFS-4882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.0.0-alpha, 2.5.1
>Reporter: Zesheng Wu
>Assignee: Ravi Prakash
>Priority: Critical
> Attachments: 4882.1.patch, 4882.patch, 4882.patch, HDFS-4882.1.patch, 
> HDFS-4882.2.patch, HDFS-4882.patch
>
>
> Scenario:
> 1. cluster with 4 DNs
> 2. the size of the file to be written is a little more than one block
> 3. write the first block to 3 DNs, DN1->DN2->DN3
> 4. all the data packets of the first block are successfully acked and the client 
> sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out
> 5. DN2 and DN3 go down
> 6. the client recovers the pipeline, but no new DN is added to the pipeline 
> because the current pipeline stage is PIPELINE_CLOSE
> 7. the client keeps writing the last block, and tries to close the file after 
> writing all the data
> 8. NN finds that the penultimate block doesn't have enough replicas (our 
> dfs.namenode.replication.min=2), the client's close runs into an indefinite 
> loop (HDFS-2936), and at the same time NN sets the last block's state to 
> COMPLETE
> 9. shut down the client
> 10. the file's lease exceeds the hard limit
> 11. the LeaseManager notices this and begins lease recovery by calling 
> fsnamesystem.internalReleaseLease()
> 12. but the last block's state is COMPLETE, which triggers the lease manager's 
> infinite loop and prints massive logs like this:
> {noformat}
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease.  Holder: 
> DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard
>  limit
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease. 
>  Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src=
> /user/h_wuzesheng/test.dat
> 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block 
> blk_-7028017402720175688_1202597,
> lastBLockState=COMPLETE
> 2013-06-05,17:42:25,695 INFO 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery 
> for file /user/h_wuzesheng/test.dat lease [Lease.  Holder: DFSClient_NONM
> APREDUCE_-1252656407_1, pendingcreates: 1]
> {noformat}
> (the 3rd log line is a debug log added by us)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208371#comment-14208371
 ] 

Yongjun Zhang commented on HDFS-7386:
-

Hi Chris,

I just saw your comments here and in HADOOP-11293; they are very helpful, and 
thank you so much! I will work on patches for both a bit later today.



> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> Per discussion in HDFS-7382, I'm filing this jira as a follow-up, to replace 
> check "port number < 1024" with shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and suggestion there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7386) Replace check "port number < 1024" with shared isPrivilegedPort method

2014-11-12 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208354#comment-14208354
 ] 

Chris Nauroth commented on HDFS-7386:
-

Hi Yongjun.

On further reflection, I think we should not incorporate a Windows check here.  
Sometimes the check for < 1024 is used on the client side to detect the 
behavior of the server side.  If we consider the possibility of a Windows 
client connecting to a Linux server, then the client on Windows could assume 
incorrectly that there are no privileged ports, even though the server on Linux 
does have privileged ports.  As a practical matter, I think this means that 
when secure mode is fully implemented for Windows, there is going to be a 
limitation that the DataNode can't use a port < 1024.  Otherwise, it would 
throw off some of this detection logic.  It's not a bad limitation, just 
something we'll need to be aware of.
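The shared helper the jira proposes is essentially a one-liner; a minimal sketch (the class and method names are illustrative, not the committed API):

```java
public class PrivilegedPorts {
    // On POSIX systems only root can bind ports below 1024. As the
    // comment above argues, no OS check is added here, because a client
    // may use this predicate to reason about a remote (possibly Linux)
    // server rather than about its own host.
    static boolean isPrivilegedPort(int port) {
        return port < 1024;
    }

    public static void main(String[] args) {
        System.out.println(isPrivilegedPort(1004));   // true: a secure-DataNode-style port
        System.out.println(isPrivilegedPort(50010));  // false: default DataNode port
    }
}
```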

> Replace check "port number < 1024" with shared isPrivilegedPort method 
> ---
>
> Key: HDFS-7386
> URL: https://issues.apache.org/jira/browse/HDFS-7386
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> Per discussion in HDFS-7382, I'm filing this jira as a follow-up, to replace 
> check "port number < 1024" with shared isPrivilegedPort method.
> Thanks [~cnauroth] for the work on HDFS-7382 and suggestion there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208349#comment-14208349
 ] 

Robert Kanter commented on HDFS-7391:
-

We verified that an old OpenSSL client is able to do the SSLv2Hello handshake 
and then use TLSv1 (and that SSLv2 and SSLv3 are disabled).
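At the JSSE level, "enabling SSLv2Hello" looks like the sketch below: it is a pseudo-protocol that only permits the SSLv2-format ClientHello, and the negotiated session is still TLS. (HttpFS itself configures this through its Tomcat connector rather than raw JSSE; this is just to illustrate the semantics.)

```java
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class SslHelloSketch {
    public static void main(String[] args) throws Exception {
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        // "SSLv2Hello" cannot stand alone; it must be paired with a real
        // protocol, which is what the handshake actually negotiates.
        engine.setEnabledProtocols(new String[] {"SSLv2Hello", "TLSv1.2"});
        System.out.println(Arrays.toString(engine.getEnabledProtocols()));
    }
}
```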

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which is required for older clients (e.g. 
> Java 6 with openssl 0.9.8x) so they can't connect without it. Just to be 
> clear, it does not mean SSLv2, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it's already been 
> closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7394) Log at INFO level when InvalidToken is seen in ShortCircuitCache

2014-11-12 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-7394:


 Summary: Log at INFO level when InvalidToken is seen in 
ShortCircuitCache
 Key: HDFS-7394
 URL: https://issues.apache.org/jira/browse/HDFS-7394
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Priority: Minor


For long-running clients, getting an {{InvalidToken}} exception is expected, and 
the client refetches a block token when it happens.  The related events are 
logged at INFO except the ones in {{ShortCircuitCache}}.  It would be better if 
those were also logged at INFO.
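The desired behavior can be sketched in plain Java (names are stand-ins, and java.util.logging stands in for Hadoop's logging): an expired block token is a routine event for a long-running client, so it is logged at INFO and the token is refetched rather than treated as an error:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class TokenRefetchSketch {
    private static final Logger LOG = Logger.getLogger("ShortCircuitCacheSketch");

    // Stand-in for o.a.h.security.token.SecretManager.InvalidToken
    // (unchecked here only to keep the sketch short).
    static class InvalidTokenException extends RuntimeException {
        InvalidTokenException(String m) { super(m); }
    }

    static String readBlock(String token) {
        if ("expired".equals(token)) {
            throw new InvalidTokenException("block token expired");
        }
        return "data";
    }

    public static void main(String[] args) {
        String token = "expired";
        String data;
        try {
            data = readBlock(token);
        } catch (InvalidTokenException e) {
            // expected for long-running clients: log at INFO, not WARN/ERROR
            LOG.log(Level.INFO, "InvalidToken in short-circuit read, refetching: {0}",
                    e.getMessage());
            token = "fresh";        // refetch a new block token
            data = readBlock(token);
        }
        System.out.println(data);   // data
    }
}
```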



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Dave Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208188#comment-14208188
 ] 

Dave Thompson commented on HDFS-7391:
-

Looking at the patch, the config change appears benign.

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which older clients (e.g. Java 6 with 
> openssl 0.9.8x) require; without it they can't connect. Just to be clear, it 
> does not mean SSLv2, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it's already been 
> closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS

2014-11-12 Thread Dave Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208172#comment-14208172
 ] 

Dave Thompson commented on HDFS-7391:
-

For clarification: you are not suggesting turning on SSLv2, which has 
been deprecated for 18 years, for reasons discussed in RFC 6176.

Rather, you are suggesting turning on the backwards-compatible ClientHello,
introduced in 1996 as a transition mechanism for clients that didn't know 
whether they were connecting to an SSLv2 or SSLv3 server.

I'm a bit surprised that there exist Hadoop clients that find this necessary.
Java 6 with openssl 0.9.8x will, I believe, support up to SSLv3.1 (TLS 1.0),
which I've used as a server... I can't speak to client configurability.

My primary concern is that, in enabling acceptance of the SSLv2 ClientHello,
assurances/confirmation be made that a resulting SSLv2.0 session 
is not allowed.

> Renable SSLv2Hello in HttpFS
> 
>
> Key: HDFS-7391
> URL: https://issues.apache.org/jira/browse/HDFS-7391
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.6.0, 2.5.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch
>
>
> We should re-enable "SSLv2Hello", which older clients (e.g. Java 6 with 
> openssl 0.9.8x) require; without it they can't connect. Just to be clear, it 
> does not mean SSLv2, which is insecure.
> I couldn't simply do an addendum patch on HDFS-7274 because it's already been 
> closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7375) Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208131#comment-14208131
 ] 

Hudson commented on HDFS-7375:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/3/])
HDFS-7375. Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement. 
Contributed by Haohui Mai. (wheat9: rev 
46f6f9d60d0a2c1f441a0e81a071b08c24dbd6d6)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java


> Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement
> --
>
> Key: HDFS-7375
> URL: https://issues.apache.org/jira/browse/HDFS-7375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7375.000.patch, HDFS-7375.001.patch
>
>
> {{FSClusterStats}} is a private class that exports statistics for 
> {{BlockPlacementPolicy}}. This jira proposes moving it to 
> {{o.a.h.h.hdfs.server.blockmanagement}} to simplify the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7389) Named user ACL cannot stop the user from accessing the FS entity.

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208128#comment-14208128
 ] 

Hudson commented on HDFS-7389:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/3/])
HDFS-7389. Named user ACL cannot stop the user from accessing the FS entity. 
Contributed by Vinayakumar B. (cnauroth: rev 
163bb55067bde71246b4030a08256ba9a8182dc8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java


> Named user ACL cannot stop the user from accessing the FS entity.
> -
>
> Key: HDFS-7389
> URL: https://issues.apache.org/jira/browse/HDFS-7389
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Chunjun Xiao
>Assignee: Vinayakumar B
> Fix For: 2.7.0
>
> Attachments: HDFS-7389-001.patch, HDFS-7389-002.patch
>
>
> In 
> http://hortonworks.com/blog/hdfs-acls-fine-grained-permissions-hdfs-files-hadoop/:
> {quote}
> It’s important to keep in mind the order of evaluation for ACL entries when a 
> user attempts to access a file system object:
> 1. If the user is the file owner, then the owner permission bits are enforced.
> 2. Else if the user has a named user ACL entry, then those permissions are 
> enforced.
> 3. Else if the user is a member of the file’s group or any named group in an 
> ACL entry, then the union of permissions for all matching entries are 
> enforced.  (The user may be a member of multiple groups.)
> 4. If none of the above were applicable, then the other permission bits are 
> enforced.
> {quote}
> Assume we have a user UserA in group GroupA. If we configure a directory with 
> the following ACL entries:
> group:GroupA:rwx
> user:UserA:---
> According to the design spec above, UserA should have no access to the file 
> object, while in fact UserA still has rwx access to the directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208135#comment-14208135
 ] 

Hudson commented on HDFS-7387:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/3/])
HDFS-7387. NFS may only do partial commit due to a race between COMMIT and 
write. Contributed by Brandon Li (brandonli: rev 
99d9d0c2d19b9f161b765947f3fb64619ea58090)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java


> NFS may only do partial commit due to a race between COMMIT and write
> -
>
> Key: HDFS-7387
> URL: https://issues.apache.org/jira/browse/HDFS-7387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Critical
> Fix For: 2.7.0
>
> Attachments: HDFS-7387.001.patch, HDFS-7387.002.patch
>
>
> The requested range may not be committed when the following happens:
> 1. the last pending write is removed from the queue to be written to HDFS
> 2. a commit request arrives; NFS sees no pending write, so it does a sync
> 3. this sync request may flush only part of the last write to HDFS
> 4. if a file read happens immediately after the above steps, the user may not 
> see all the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208130#comment-14208130
 ] 

Hudson commented on HDFS-7381:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/3/])
HDFS-7381. Decouple the management of block id and gen stamps from 
FSNamesystem. Contributed by Haohui Mai. (wheat9: rev 
571e9c623241106dad5521a870fb8daef3f2b00a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SequentialBlockIdGenerator.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSequentialBlockId.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockIdManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SequentialBlockIdGenerator.java


> Decouple the management of block id and gen stamps from FSNamesystem
> 
>
> Key: HDFS-7381
> URL: https://issues.apache.org/jira/browse/HDFS-7381
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7381.000.patch
>
>
> The block layer should be responsible for managing block ids and generation 
> stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
> This jira proposes to decouple it from the {{FSNamesystem}} class.
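The state being decoupled is essentially two monotonically increasing counters owned by the block layer. A minimal sketch, with hypothetical names (this is not the actual BlockIdManager API):

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative block-layer object owning sequential block ids and
// generation stamps, independent of the namesystem.
public class BlockIdState {
    private final AtomicLong lastBlockId = new AtomicLong();
    private final AtomicLong lastGenStamp = new AtomicLong();

    long nextBlockId()         { return lastBlockId.incrementAndGet(); }
    long nextGenerationStamp() { return lastGenStamp.incrementAndGet(); }

    public static void main(String[] args) {
        BlockIdState ids = new BlockIdState();
        System.out.println(ids.nextBlockId());         // prints 1
        System.out.println(ids.nextBlockId());         // prints 2
        System.out.println(ids.nextGenerationStamp()); // prints 1
    }
}
```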





[jira] [Commented] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208114#comment-14208114
 ] 

Hudson commented on HDFS-7387:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1955 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1955/])
HDFS-7387. NFS may only do partial commit due to a race between COMMIT and 
write. Contributed by Brandon Li (brandonli: rev 
99d9d0c2d19b9f161b765947f3fb64619ea58090)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java


> NFS may only do partial commit due to a race between COMMIT and write
> -
>
> Key: HDFS-7387
> URL: https://issues.apache.org/jira/browse/HDFS-7387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Critical
> Fix For: 2.7.0
>
> Attachments: HDFS-7387.001.patch, HDFS-7387.002.patch
>
>
> The requested range may not be committed when the following happens:
> 1. The last pending write is removed from the queue to be written to HDFS.
> 2. A commit request arrives; NFS sees there is no pending write, so it 
> issues a sync.
> 3. This sync request may flush only part of the last write to HDFS.
> 4. If a file read happens immediately after the above steps, the user may 
> not see all the data.





[jira] [Commented] (HDFS-7375) Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208110#comment-14208110
 ] 

Hudson commented on HDFS-7375:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1955 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1955/])
HDFS-7375. Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement. 
Contributed by Haohui Mai. (wheat9: rev 
46f6f9d60d0a2c1f441a0e81a071b08c24dbd6d6)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/FSClusterStats.java


> Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement
> --
>
> Key: HDFS-7375
> URL: https://issues.apache.org/jira/browse/HDFS-7375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7375.000.patch, HDFS-7375.001.patch
>
>
> {{FSClusterStats}} is a private class that exports statistics for 
> {{BlockPlacementPolicy}}. This jira proposes moving it to 
> {{o.a.h.h.hdfs.server.blockmanagement}} to simplify the code.





[jira] [Commented] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208109#comment-14208109
 ] 

Hudson commented on HDFS-7381:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1955 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1955/])
HDFS-7381. Decouple the management of block id and gen stamps from 
FSNamesystem. Contributed by Haohui Mai. (wheat9: rev 
571e9c623241106dad5521a870fb8daef3f2b00a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockIdManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SequentialBlockIdGenerator.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SequentialBlockIdGenerator.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Decouple the management of block id and gen stamps from FSNamesystem
> 
>
> Key: HDFS-7381
> URL: https://issues.apache.org/jira/browse/HDFS-7381
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7381.000.patch
>
>
> The block layer should be responsible for managing block ids and generation 
> stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
> This jira proposes to decouple it from the {{FSNamesystem}} class.





[jira] [Commented] (HDFS-7389) Named user ACL cannot stop the user from accessing the FS entity.

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208107#comment-14208107
 ] 

Hudson commented on HDFS-7389:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1955 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1955/])
HDFS-7389. Named user ACL cannot stop the user from accessing the FS entity. 
Contributed by Vinayakumar B. (cnauroth: rev 
163bb55067bde71246b4030a08256ba9a8182dc8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Named user ACL cannot stop the user from accessing the FS entity.
> -
>
> Key: HDFS-7389
> URL: https://issues.apache.org/jira/browse/HDFS-7389
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Chunjun Xiao
>Assignee: Vinayakumar B
> Fix For: 2.7.0
>
> Attachments: HDFS-7389-001.patch, HDFS-7389-002.patch
>
>
> In 
> http://hortonworks.com/blog/hdfs-acls-fine-grained-permissions-hdfs-files-hadoop/:
> {quote}
> It’s important to keep in mind the order of evaluation for ACL entries when a 
> user attempts to access a file system object:
> 1. If the user is the file owner, then the owner permission bits are enforced.
> 2. Else if the user has a named user ACL entry, then those permissions are 
> enforced.
> 3. Else if the user is a member of the file’s group or any named group in an 
> ACL entry, then the union of permissions for all matching entries are 
> enforced.  (The user may be a member of multiple groups.)
> 4. If none of the above were applicable, then the other permission bits are 
> enforced.
> {quote}
> Assume we have a user UserA in group GroupA. If we configure a directory 
> with the following ACL entries:
> group:GroupA:rwx
> user:UserA:---
> According to the design spec above, UserA should have no access to the 
> directory, yet in practice UserA still has rwx access to it.





[jira] [Updated] (HDFS-7353) Common Erasure Coder API

2014-11-12 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-7353:

Summary: Common Erasure Coder API  (was: Common Erasure Codec API and 
plugin support)

> Common Erasure Coder API
> 
>
> Key: HDFS-7353
> URL: https://issues.apache.org/jira/browse/HDFS-7353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: HDFS-EC
>
>
> This is to abstract and define a common codec API across different codec 
> algorithms such as RS and XOR. The API can be implemented on top of various 
> libraries, such as the Intel ISA-L library and the Jerasure library. It 
> provides a default implementation and also allows plugging in 
> vendor-specific ones.





[jira] [Updated] (HDFS-7353) Common Erasure Coder API

2014-11-12 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-7353:

Description: This is to abstract and define common coder API across 
different codec algorithms like RS, XOR and etc. Such API can be implemented by 
utilizing various library support, such as Intel ISA library and Jerasure 
library. It provides default implementation and also allows to plugin vendor 
specific ones.  (was: This is to abstract and define common codec API across 
different codec algorithms like RS, XOR and etc. Such API can be implemented by 
utilizing various library support, such as Intel ISA library and Jerasure 
library. It provides default implementation and also allows to plugin vendor 
specific ones.)

> Common Erasure Coder API
> 
>
> Key: HDFS-7353
> URL: https://issues.apache.org/jira/browse/HDFS-7353
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: HDFS-EC
>
>
> This is to abstract and define a common coder API across different codec 
> algorithms such as RS and XOR. The API can be implemented on top of various 
> libraries, such as the Intel ISA-L library and the Jerasure library. It 
> provides a default implementation and also allows plugging in 
> vendor-specific ones.





[jira] [Created] (HDFS-7393) TestDFSUpgradeFromImage#testUpgradeFromCorruptRel22Image fails in trunk

2014-11-12 Thread Ted Yu (JIRA)
Ted Yu created HDFS-7393:


 Summary: TestDFSUpgradeFromImage#testUpgradeFromCorruptRel22Image 
fails in trunk
 Key: HDFS-7393
 URL: https://issues.apache.org/jira/browse/HDFS-7393
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor


The following is reproducible:
{code}
Running org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 12.017 sec <<< 
FAILURE! - in org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
testUpgradeFromCorruptRel22Image(org.apache.hadoop.hdfs.TestDFSUpgradeFromImage)
  Time elapsed: 1.005 sec  <<< ERROR!
java.lang.IllegalStateException: null
at 
com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockIdManager.setGenerationStampV1Limit(BlockIdManager.java:85)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockIdManager.clear(BlockIdManager.java:206)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.clear(FSNamesystem.java:622)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:667)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.doUpgrade(FSImage.java:376)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:268)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:991)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:537)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:596)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:763)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:747)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1443)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1104)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:975)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:804)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:465)
at 
org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:424)
at 
org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.upgradeAndVerify(TestDFSUpgradeFromImage.java:582)
at 
org.apache.hadoop.hdfs.TestDFSUpgradeFromImage.testUpgradeFromCorruptRel22Image(TestDFSUpgradeFromImage.java:318)
{code}
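The "IllegalStateException: null" at the top of the trace is what the no-message overload of Guava's Preconditions.checkState produces when its condition fails. A minimal stand-in (plain Java, no Guava dependency) illustrating that behavior:

```java
public class CheckStateDemo {
    // Mirrors Guava's Preconditions.checkState(boolean): throws an
    // IllegalStateException with no message when the condition is false.
    static void checkState(boolean expression) {
        if (!expression) throw new IllegalStateException();
    }

    public static void main(String[] args) {
        try {
            checkState(false);
        } catch (IllegalStateException e) {
            // getMessage() is null, which a logger renders as
            // "java.lang.IllegalStateException: null"
            System.out.println("message=" + e.getMessage());
        }
    }
}
```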





[jira] [Commented] (HDFS-7375) Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208054#comment-14208054
 ] 

Hudson commented on HDFS-7375:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1931 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1931/])
HDFS-7375. Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement. 
Contributed by Haohui Mai. (wheat9: rev 
46f6f9d60d0a2c1f441a0e81a071b08c24dbd6d6)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java


> Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement
> --
>
> Key: HDFS-7375
> URL: https://issues.apache.org/jira/browse/HDFS-7375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7375.000.patch, HDFS-7375.001.patch
>
>
> {{FSClusterStats}} is a private class that exports statistics for 
> {{BlockPlacementPolicy}}. This jira proposes moving it to 
> {{o.a.h.h.hdfs.server.blockmanagement}} to simplify the code.





[jira] [Commented] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208058#comment-14208058
 ] 

Hudson commented on HDFS-7387:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1931 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1931/])
HDFS-7387. NFS may only do partial commit due to a race between COMMIT and 
write. Contributed by Brandon Li (brandonli: rev 
99d9d0c2d19b9f161b765947f3fb64619ea58090)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> NFS may only do partial commit due to a race between COMMIT and write
> -
>
> Key: HDFS-7387
> URL: https://issues.apache.org/jira/browse/HDFS-7387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Critical
> Fix For: 2.7.0
>
> Attachments: HDFS-7387.001.patch, HDFS-7387.002.patch
>
>
> The requested range may not be committed when the following happens:
> 1. The last pending write is removed from the queue to be written to HDFS.
> 2. A commit request arrives; NFS sees there is no pending write, so it 
> issues a sync.
> 3. This sync request may flush only part of the last write to HDFS.
> 4. If a file read happens immediately after the above steps, the user may 
> not see all the data.





[jira] [Commented] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208053#comment-14208053
 ] 

Hudson commented on HDFS-7381:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1931 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1931/])
HDFS-7381. Decouple the management of block id and gen stamps from 
FSNamesystem. Contributed by Haohui Mai. (wheat9: rev 
571e9c623241106dad5521a870fb8daef3f2b00a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SequentialBlockIdGenerator.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SequentialBlockIdGenerator.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockIdManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java


> Decouple the management of block id and gen stamps from FSNamesystem
> 
>
> Key: HDFS-7381
> URL: https://issues.apache.org/jira/browse/HDFS-7381
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7381.000.patch
>
>
> The block layer should be responsible for managing block ids and generation 
> stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
> This jira proposes to decouple it from the {{FSNamesystem}} class.





[jira] [Commented] (HDFS-7389) Named user ACL cannot stop the user from accessing the FS entity.

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208051#comment-14208051
 ] 

Hudson commented on HDFS-7389:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1931 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1931/])
HDFS-7389. Named user ACL cannot stop the user from accessing the FS entity. 
Contributed by Vinayakumar B. (cnauroth: rev 
163bb55067bde71246b4030a08256ba9a8182dc8)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java


> Named user ACL cannot stop the user from accessing the FS entity.
> -
>
> Key: HDFS-7389
> URL: https://issues.apache.org/jira/browse/HDFS-7389
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Chunjun Xiao
>Assignee: Vinayakumar B
> Fix For: 2.7.0
>
> Attachments: HDFS-7389-001.patch, HDFS-7389-002.patch
>
>
> In 
> http://hortonworks.com/blog/hdfs-acls-fine-grained-permissions-hdfs-files-hadoop/:
> {quote}
> It’s important to keep in mind the order of evaluation for ACL entries when a 
> user attempts to access a file system object:
> 1. If the user is the file owner, then the owner permission bits are enforced.
> 2. Else if the user has a named user ACL entry, then those permissions are 
> enforced.
> 3. Else if the user is a member of the file’s group or any named group in an 
> ACL entry, then the union of permissions for all matching entries are 
> enforced.  (The user may be a member of multiple groups.)
> 4. If none of the above were applicable, then the other permission bits are 
> enforced.
> {quote}
> Assume we have a user UserA in group GroupA. If we configure a directory 
> with the following ACL entries:
> group:GroupA:rwx
> user:UserA:---
> According to the design spec above, UserA should have no access to the 
> directory, yet in practice UserA still has rwx access to it.





[jira] [Commented] (HDFS-7389) Named user ACL cannot stop the user from accessing the FS entity.

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208042#comment-14208042
 ] 

Hudson commented on HDFS-7389:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/3/])
HDFS-7389. Named user ACL cannot stop the user from accessing the FS entity. 
Contributed by Vinayakumar B. (cnauroth: rev 
163bb55067bde71246b4030a08256ba9a8182dc8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java


> Named user ACL cannot stop the user from accessing the FS entity.
> -
>
> Key: HDFS-7389
> URL: https://issues.apache.org/jira/browse/HDFS-7389
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Chunjun Xiao
>Assignee: Vinayakumar B
> Fix For: 2.7.0
>
> Attachments: HDFS-7389-001.patch, HDFS-7389-002.patch
>
>
> In 
> http://hortonworks.com/blog/hdfs-acls-fine-grained-permissions-hdfs-files-hadoop/:
> {quote}
> It’s important to keep in mind the order of evaluation for ACL entries when a 
> user attempts to access a file system object:
> 1. If the user is the file owner, then the owner permission bits are enforced.
> 2. Else if the user has a named user ACL entry, then those permissions are 
> enforced.
> 3. Else if the user is a member of the file’s group or any named group in an 
> ACL entry, then the union of permissions for all matching entries are 
> enforced.  (The user may be a member of multiple groups.)
> 4. If none of the above were applicable, then the other permission bits are 
> enforced.
> {quote}
> Assume we have a user UserA in group GroupA. If we configure a directory 
> with the following ACL entries:
> group:GroupA:rwx
> user:UserA:---
> According to the design spec above, UserA should have no access to the 
> directory, yet in practice UserA still has rwx access to it.





[jira] [Commented] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208044#comment-14208044
 ] 

Hudson commented on HDFS-7381:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/3/])
HDFS-7381. Decouple the management of block id and gen stamps from 
FSNamesystem. Contributed by Haohui Mai. (wheat9: rev 
571e9c623241106dad5521a870fb8daef3f2b00a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SequentialBlockIdGenerator.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockIdManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SequentialBlockIdGenerator.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Decouple the management of block id and gen stamps from FSNamesystem
> 
>
> Key: HDFS-7381
> URL: https://issues.apache.org/jira/browse/HDFS-7381
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7381.000.patch
>
>
> The block layer should be responsible for managing block ids and generation 
> stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
> This jira proposes to decouple it from the {{FSNamesystem}} class.





[jira] [Commented] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208049#comment-14208049
 ] 

Hudson commented on HDFS-7387:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/3/])
HDFS-7387. NFS may only do partial commit due to a race between COMMIT and 
write. Contributed by Brandon Li (brandonli: rev 
99d9d0c2d19b9f161b765947f3fb64619ea58090)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java


> NFS may only do partial commit due to a race between COMMIT and write
> -
>
> Key: HDFS-7387
> URL: https://issues.apache.org/jira/browse/HDFS-7387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Critical
> Fix For: 2.7.0
>
> Attachments: HDFS-7387.001.patch, HDFS-7387.002.patch
>
>
> The requested range may not be committed when the following happens:
> 1. The last pending write is removed from the queue to be written to HDFS.
> 2. A commit request arrives; NFS sees there is no pending write, so it 
> issues a sync.
> 3. This sync request may flush only part of the last write to HDFS.
> 4. If a file read happens immediately after the above steps, the user may 
> not see all the data.





[jira] [Commented] (HDFS-7375) Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208045#comment-14208045
 ] 

Hudson commented on HDFS-7375:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/3/])
HDFS-7375. Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement. 
Contributed by Haohui Mai. (wheat9: rev 
46f6f9d60d0a2c1f441a0e81a071b08c24dbd6d6)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSClusterStats.java


> Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement
> --
>
> Key: HDFS-7375
> URL: https://issues.apache.org/jira/browse/HDFS-7375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7375.000.patch, HDFS-7375.001.patch
>
>
> {{FSClusterStats}} is a private class that exports statistics for 
> {{BlockPlacementPolicy}}. This jira proposes moving it to 
> {{o.a.h.h.hdfs.server.blockmanagement}} to simplify the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7392) org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever

2014-11-12 Thread Frantisek Vacek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frantisek Vacek updated HDFS-7392:
--
Description: 
In some specific circumstances, 
org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times out 
and hangs forever.

The specific circumstances are:
1) The HDFS URI (hdfs://share.merck.com:8020/someDir/someFile.txt) points to a 
valid IP address, but no name node service is running on it.
2) There are at least 2 IP addresses for such a URI. See the output below:
{quote}
[~/proj/quickbox]$ nslookup share.merck.com
Server: 127.0.1.1
Address:127.0.1.1#53

share.merck.com canonical name = 
internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com.
Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
Address: 54.40.29.223
Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
Address: 54.40.29.65
{quote}
In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
sometimes returns true (even though the address didn't actually change; see 
img. 1) and the timeoutFailures counter is reset to 0 (see img. 2). The 
maxRetriesOnSocketTimeouts limit (45) is never reached and the connection 
attempt is repeated forever.

  was:
In some specific circumstances, 
org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times out 
and hangs forever.

The specific circumstances are:
1) The HDFS URI (hdfs://share.merck.com:8020/someDir/someFile.txt) points to a 
valid IP address, but no name node service is running on it.
2) There are at least 2 IP addresses for such a URI. See the output below:

[~/proj/quickbox]$ nslookup share.merck.com
Server: 127.0.1.1
Address:127.0.1.1#53

share.merck.com canonical name = 
internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com.
Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
Address: 54.40.29.223
Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
Address: 54.40.29.65

In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
sometimes returns true (even though the address didn't actually change; see 
img. 1) and the timeoutFailures counter is reset to 0 (see img. 2). The 
maxRetriesOnSocketTimeouts limit (45) is never reached and the connection 
attempt is repeated forever.


> org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever
> -
>
> Key: HDFS-7392
> URL: https://issues.apache.org/jira/browse/HDFS-7392
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Frantisek Vacek
>Priority: Critical
> Attachments: 1.png, 2.png
>
>
> In some specific circumstances, 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times 
> out and hangs forever.
> The specific circumstances are:
> 1) The HDFS URI (hdfs://share.merck.com:8020/someDir/someFile.txt) points 
> to a valid IP address, but no name node service is running on it.
> 2) There are at least 2 IP addresses for such a URI. See the output below:
> {quote}
> [~/proj/quickbox]$ nslookup share.merck.com
> Server: 127.0.1.1
> Address:127.0.1.1#53
> share.merck.com canonical name = 
> internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com.
> Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
> Address: 54.40.29.223
> Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
> Address: 54.40.29.65
> {quote}
> In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
> sometimes returns true (even though the address didn't actually change; see 
> img. 1) and the timeoutFailures counter is reset to 0 (see img. 2). The 
> maxRetriesOnSocketTimeouts limit (45) is never reached and the connection 
> attempt is repeated forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7392) org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever

2014-11-12 Thread Frantisek Vacek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frantisek Vacek updated HDFS-7392:
--
Attachment: 2.png
1.png

> org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever
> -
>
> Key: HDFS-7392
> URL: https://issues.apache.org/jira/browse/HDFS-7392
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Frantisek Vacek
>Priority: Critical
> Attachments: 1.png, 2.png
>
>
> In some specific circumstances, 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times 
> out and hangs forever.
> The specific circumstances are:
> 1) The HDFS URI (hdfs://share.merck.com:8020/someDir/someFile.txt) points 
> to a valid IP address, but no name node service is running on it.
> 2) There are at least 2 IP addresses for such a URI. See the output below:
> [~/proj/quickbox]$ nslookup share.merck.com
> Server: 127.0.1.1
> Address:127.0.1.1#53
> share.merck.com canonical name = 
> internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com.
> Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
> Address: 54.40.29.223
> Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
> Address: 54.40.29.65
> In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
> sometimes returns true (even though the address didn't actually change; see 
> img. 1) and the timeoutFailures counter is reset to 0 (see img. 2). The 
> maxRetriesOnSocketTimeouts limit (45) is never reached and the connection 
> attempt is repeated forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7392) org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever

2014-11-12 Thread Frantisek Vacek (JIRA)
Frantisek Vacek created HDFS-7392:
-

 Summary: org.apache.hadoop.hdfs.DistributedFileSystem open invalid 
URI forever
 Key: HDFS-7392
 URL: https://issues.apache.org/jira/browse/HDFS-7392
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Frantisek Vacek
Priority: Critical


In some specific circumstances, 
org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times out 
and hangs forever.

The specific circumstances are:
1) The HDFS URI (hdfs://share.merck.com:8020/someDir/someFile.txt) points to a 
valid IP address, but no name node service is running on it.
2) There are at least 2 IP addresses for such a URI. See the output below:

[~/proj/quickbox]$ nslookup share.merck.com
Server: 127.0.1.1
Address:127.0.1.1#53

share.merck.com canonical name = 
internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com.
Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
Address: 54.40.29.223
Name:   internal-gicprg-share-merck-com-1538706884.us-east-1.elb.amazonaws.com
Address: 54.40.29.65

In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() 
sometimes returns true (even though the address didn't actually change; see 
img. 1) and the timeoutFailures counter is reset to 0 (see img. 2). The 
maxRetriesOnSocketTimeouts limit (45) is never reached and the connection 
attempt is repeated forever.
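The endless-retry behavior can be modeled in a few lines (this is a simplified stand-in for org.apache.hadoop.ipc.Client, not the real code; updateAddress() here is a hypothetical re-resolution that flips between the two ELB addresses):

```java
// Simplified model of the retry loop; field and method names mirror the
// description above but are hypothetical, not the actual Client internals.
class EndlessRetryModel {
    static final int MAX_RETRIES_ON_SOCKET_TIMEOUTS = 45;

    final String[] resolvedAddresses;  // e.g. the two ELB IPs from nslookup
    int next = 0;
    int timeoutFailures = 0;

    EndlessRetryModel(String... addrs) { this.resolvedAddresses = addrs; }

    // Re-resolves the hostname; with round-robin DNS the returned address
    // may differ from the previous one even though nothing really changed.
    boolean updateAddress() {
        next = (next + 1) % resolvedAddresses.length;
        return resolvedAddresses.length > 1;  // looks like "address changed"
    }

    // Handles one socket timeout; returns true when the client gives up.
    boolean onSocketTimeout() {
        if (updateAddress()) {
            timeoutFailures = 0;  // the reset that defeats the retry limit
        }
        return ++timeoutFailures > MAX_RETRIES_ON_SOCKET_TIMEOUTS;
    }
}
```

With two resolved addresses the counter is reset on every timeout, so the 45-retry limit is never reached; with a single address the client gives up after 45 retries as intended.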



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207961#comment-14207961
 ] 

Hudson commented on HDFS-7381:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #741 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/741/])
HDFS-7381. Decouple the management of block id and gen stamps from 
FSNamesystem. Contributed by Haohui Mai. (wheat9: rev 
571e9c623241106dad5521a870fb8daef3f2b00a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SequentialBlockIdGenerator.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SequentialBlockIdGenerator.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockIdManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java


> Decouple the management of block id and gen stamps from FSNamesystem
> 
>
> Key: HDFS-7381
> URL: https://issues.apache.org/jira/browse/HDFS-7381
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7381.000.patch
>
>
> The block layer should be responsible for managing block ids and generation 
> stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
> This jira proposes to decouple them from the {{FSNamesystem}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7375) Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207962#comment-14207962
 ] 

Hudson commented on HDFS-7375:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #741 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/741/])
HDFS-7375. Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement. 
Contributed by Haohui Mai. (wheat9: rev 
46f6f9d60d0a2c1f441a0e81a071b08c24dbd6d6)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/FSClusterStats.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java


> Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement
> --
>
> Key: HDFS-7375
> URL: https://issues.apache.org/jira/browse/HDFS-7375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7375.000.patch, HDFS-7375.001.patch
>
>
> {{FSClusterStats}} is a private class that exports statistics for 
> {{BlockPlacementPolicy}}. This jira proposes moving it to 
> {{o.a.h.h.hdfs.server.blockmanagement}} to simplify the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7389) Named user ACL cannot stop the user from accessing the FS entity.

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207959#comment-14207959
 ] 

Hudson commented on HDFS-7389:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #741 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/741/])
HDFS-7389. Named user ACL cannot stop the user from accessing the FS entity. 
Contributed by Vinayakumar B. (cnauroth: rev 
163bb55067bde71246b4030a08256ba9a8182dc8)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java


> Named user ACL cannot stop the user from accessing the FS entity.
> -
>
> Key: HDFS-7389
> URL: https://issues.apache.org/jira/browse/HDFS-7389
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Chunjun Xiao
>Assignee: Vinayakumar B
> Fix For: 2.7.0
>
> Attachments: HDFS-7389-001.patch, HDFS-7389-002.patch
>
>
> In 
> http://hortonworks.com/blog/hdfs-acls-fine-grained-permissions-hdfs-files-hadoop/:
> {quote}
> It’s important to keep in mind the order of evaluation for ACL entries when a 
> user attempts to access a file system object:
> 1. If the user is the file owner, then the owner permission bits are enforced.
> 2. Else if the user has a named user ACL entry, then those permissions are 
> enforced.
> 3. Else if the user is a member of the file’s group or any named group in an 
> ACL entry, then the union of permissions for all matching entries are 
> enforced.  (The user may be a member of multiple groups.)
> 4. If none of the above were applicable, then the other permission bits are 
> enforced.
> {quote}
> Assume we have a user UserA in group GroupA, and we configure a directory 
> with the following ACL entries:
> group:GroupA:rwx
> user:UserA:---
> According to the design spec above, UserA should have no access to the file 
> object, but in practice UserA still has rwx access to the dir.
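The four evaluation steps quoted above can be sketched as follows (a hedged illustration of the documented ordering only; class and field names are assumptions, not the actual FSPermissionChecker logic):

```java
import java.util.*;

// Sketch of the documented ACL evaluation order; hypothetical names,
// not the real FSPermissionChecker implementation.
class AclOrderSketch {
    final String owner;
    final String fileGroup;
    final Map<String, String> namedUserPerms = new HashMap<>();   // user -> "rwx"/"---"
    final Map<String, String> namedGroupPerms = new HashMap<>();  // group -> perms
    String ownerPerms = "rwx", groupPerms = "r-x", otherPerms = "r--";

    AclOrderSketch(String owner, String fileGroup) {
        this.owner = owner;
        this.fileGroup = fileGroup;
    }

    // Returns the permission string that should be enforced for the caller.
    String effectivePerms(String user, Set<String> groups) {
        if (user.equals(owner)) return ownerPerms;              // step 1
        if (namedUserPerms.containsKey(user))
            return namedUserPerms.get(user);                    // step 2: stop here
        // step 3: union of all matching group entries
        StringBuilder union = new StringBuilder("---");
        boolean matched = false;
        for (String g : groups) {
            String p = g.equals(fileGroup) ? groupPerms : namedGroupPerms.get(g);
            if (p != null) {
                matched = true;
                for (int i = 0; i < 3; i++)
                    if (p.charAt(i) != '-') union.setCharAt(i, p.charAt(i));
            }
        }
        if (matched) return union.toString();
        return otherPerms;                                      // step 4
    }
}
```

For the scenario above, step 2 must win: UserA's named-user entry "---" is enforced and the GroupA entry is never consulted, which is the behavior the patch restores.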



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207966#comment-14207966
 ] 

Hudson commented on HDFS-7387:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #741 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/741/])
HDFS-7387. NFS may only do partial commit due to a race between COMMIT and 
write. Contributed by Brandon Li (brandonli: rev 
99d9d0c2d19b9f161b765947f3fb64619ea58090)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java


> NFS may only do partial commit due to a race between COMMIT and write
> -
>
> Key: HDFS-7387
> URL: https://issues.apache.org/jira/browse/HDFS-7387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Critical
> Fix For: 2.7.0
>
> Attachments: HDFS-7387.001.patch, HDFS-7387.002.patch
>
>
> The requested range may not be committed when the following happens:
> 1. the last pending write is removed from the queue to be written to HDFS
> 2. a commit request arrives; NFS sees there is no pending write, so it does 
> a sync
> 3. this sync request could flush only part of the last write to HDFS
> 4. if a file read happens immediately after the above steps, the user may 
> not see all the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7389) Named user ACL cannot stop the user from accessing the FS entity.

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207947#comment-14207947
 ] 

Hudson commented on HDFS-7389:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/3/])
HDFS-7389. Named user ACL cannot stop the user from accessing the FS entity. 
Contributed by Vinayakumar B. (cnauroth: rev 
163bb55067bde71246b4030a08256ba9a8182dc8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSAclBaseTest.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSPermissionChecker.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Named user ACL cannot stop the user from accessing the FS entity.
> -
>
> Key: HDFS-7389
> URL: https://issues.apache.org/jira/browse/HDFS-7389
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Chunjun Xiao
>Assignee: Vinayakumar B
> Fix For: 2.7.0
>
> Attachments: HDFS-7389-001.patch, HDFS-7389-002.patch
>
>
> In 
> http://hortonworks.com/blog/hdfs-acls-fine-grained-permissions-hdfs-files-hadoop/:
> {quote}
> It’s important to keep in mind the order of evaluation for ACL entries when a 
> user attempts to access a file system object:
> 1. If the user is the file owner, then the owner permission bits are enforced.
> 2. Else if the user has a named user ACL entry, then those permissions are 
> enforced.
> 3. Else if the user is a member of the file’s group or any named group in an 
> ACL entry, then the union of permissions for all matching entries are 
> enforced.  (The user may be a member of multiple groups.)
> 4. If none of the above were applicable, then the other permission bits are 
> enforced.
> {quote}
> Assume we have a user UserA in group GroupA, and we configure a directory 
> with the following ACL entries:
> group:GroupA:rwx
> user:UserA:---
> According to the design spec above, UserA should have no access to the file 
> object, but in practice UserA still has rwx access to the dir.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7387) NFS may only do partial commit due to a race between COMMIT and write

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207954#comment-14207954
 ] 

Hudson commented on HDFS-7387:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/3/])
HDFS-7387. NFS may only do partial commit due to a race between COMMIT and 
write. Contributed by Brandon Li (brandonli: rev 
99d9d0c2d19b9f161b765947f3fb64619ea58090)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java


> NFS may only do partial commit due to a race between COMMIT and write
> -
>
> Key: HDFS-7387
> URL: https://issues.apache.org/jira/browse/HDFS-7387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.6.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Critical
> Fix For: 2.7.0
>
> Attachments: HDFS-7387.001.patch, HDFS-7387.002.patch
>
>
> The requested range may not be committed when the following happens:
> 1. the last pending write is removed from the queue to be written to HDFS
> 2. a commit request arrives; NFS sees there is no pending write, so it does 
> a sync
> 3. this sync request could flush only part of the last write to HDFS
> 4. if a file read happens immediately after the above steps, the user may 
> not see all the data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7381) Decouple the management of block id and gen stamps from FSNamesystem

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207949#comment-14207949
 ] 

Hudson commented on HDFS-7381:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/3/])
HDFS-7381. Decouple the management of block id and gen stamps from 
FSNamesystem. Contributed by Haohui Mai. (wheat9: rev 
571e9c623241106dad5521a870fb8daef3f2b00a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSequentialBlockId.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SequentialBlockIdGenerator.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockIdManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSaveNamespace.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SequentialBlockIdGenerator.java


> Decouple the management of block id and gen stamps from FSNamesystem
> 
>
> Key: HDFS-7381
> URL: https://issues.apache.org/jira/browse/HDFS-7381
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7381.000.patch
>
>
> The block layer should be responsible for managing block ids and generation 
> stamps. Currently this functionality is misplaced in {{FSNamesystem}}.
> This jira proposes to decouple them from the {{FSNamesystem}} class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7375) Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement

2014-11-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207950#comment-14207950
 ] 

Hudson commented on HDFS-7375:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #3 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/3/])
HDFS-7375. Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement. 
Contributed by Haohui Mai. (wheat9: rev 
46f6f9d60d0a2c1f441a0e81a071b08c24dbd6d6)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeCapacityReport.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSClusterStats.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java


> Move FSClusterStats to o.a.h.h.hdfs.server.blockmanagement
> --
>
> Key: HDFS-7375
> URL: https://issues.apache.org/jira/browse/HDFS-7375
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.7.0
>
> Attachments: HDFS-7375.000.patch, HDFS-7375.001.patch
>
>
> {{FSClusterStats}} is a private class that exports statistics for 
> {{BlockPlacementPolicy}}. This jira proposes moving it to 
> {{o.a.h.h.hdfs.server.blockmanagement}} to simplify the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7017) Implement OutputStream for libhdfs3

2014-11-12 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207937#comment-14207937
 ] 

Zhanwei Wang commented on HDFS-7017:


Sorry for coming back to this jira late.

I added the LeaseManager interface so that a mock object can be implemented 
for LeaseManagerImpl in the unit tests. I use Google Mock to implement the 
mock object; using an interface class is the approach Google Mock recommends.

https://code.google.com/p/googlemock/wiki/V1_7_ForDummies

Google Mock is a good mock framework and works well with Google Test. 
Hand-writing another mock framework would be duplicated work and a waste of 
time.

Fault injection is a separate matter: I use a tool that does fault injection 
tests without modifying the source code, by hooking the functions at runtime. 
Since it is an internal tool, I cannot open source the related code.

I agree that this indirection makes the code hard to follow. Colin, would you 
please recommend a better way to do such unit tests?

Making LeaseManager owned by the hdfsFS instance is better. Although it may 
introduce more threads if the client connects to many file system instances, 
I think that is OK.

I will make this change and separate the packet memory pool into another jira.






> Implement OutputStream for libhdfs3
> ---
>
> Key: HDFS-7017
> URL: https://issues.apache.org/jira/browse/HDFS-7017
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
>Assignee: Zhanwei Wang
> Attachments: HDFS-7017-pnative.002.patch, 
> HDFS-7017-pnative.003.patch, HDFS-7017-pnative.004.patch, HDFS-7017.patch
>
>
> Implement pipeline and OutputStream C++ interface



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-7363) Pluggable algorithms to form block groups in erasure coding

2014-11-12 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng reopened HDFS-7363:
-
  Assignee: Kai Zheng

Let me reuse this item as a sub-task of HDFS-7337, where BlockGrouper is 
defined for this purpose as part of a codec. Its role:
Given the desired data blocks, BlockGrouper calculates and arranges a 
BlockGroup for encoding. Different codes can have different BlockGroup 
layouts: in LRC(6, 2, 2) we have 3 child block groups (2 local groups plus 1 
global group); in RS we have 1 block group. Given a BlockGroup with some 
blocks missing, BlockGrouper also calculates and determines how to recover, 
if recoverable, e.g. which blocks to use to recover a missing block. With 
such information the corresponding ErasureCoder can perform the concrete 
decoding work.
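As a hedged illustration of that role, a BlockGrouper for LRC(6, 2, 2) could lay out the three child groups and pick a recovery set like this (block indices, group names, and methods are assumptions for the sketch, not the HDFS-7337 API):

```java
import java.util.*;

// Hypothetical sketch of a BlockGrouper layout for LRC(6, 2, 2):
// 6 data blocks (0-5), 2 local parities (6-7), 2 global parities (8-9),
// arranged as 2 local child groups plus 1 global group.
class LrcBlockGrouper {
    static Map<String, List<Integer>> makeBlockGroup(int dataBlocks) {
        if (dataBlocks != 6) throw new IllegalArgumentException("sketch covers LRC(6,2,2)");
        Map<String, List<Integer>> group = new LinkedHashMap<>();
        group.put("local-0", List.of(0, 1, 2, 6));           // data 0-2 + local parity 6
        group.put("local-1", List.of(3, 4, 5, 7));           // data 3-5 + local parity 7
        group.put("global", List.of(0, 1, 2, 3, 4, 5, 8, 9)); // all data + global parities
        return group;
    }

    // Decides which blocks to read to recover one missing block: prefer the
    // small local group, falling back to the global group.
    static List<Integer> recoveryPlan(int missing) {
        for (List<Integer> g : makeBlockGroup(6).values()) {
            if (g.contains(missing)) {
                List<Integer> plan = new ArrayList<>(g);
                plan.remove(Integer.valueOf(missing));
                return plan;  // first (smallest) group containing the block
            }
        }
        throw new IllegalArgumentException("unknown block " + missing);
    }
}
```

The point of the local groups is visible in the plan sizes: recovering a data block needs only the 3 other members of its local group, while a global parity needs the full global group.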

> Pluggable algorithms to form block groups in erasure coding
> ---
>
> Key: HDFS-7363
> URL: https://issues.apache.org/jira/browse/HDFS-7363
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Kai Zheng
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7337) Configurable and pluggable Erasure Codec and schema

2014-11-12 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-7337:

Attachment: PluggableErasureCodec.pdf

To save time, I composed this doc to illustrate the erasure codec framework 
for review. Your feedback is welcome.

> Configurable and pluggable Erasure Codec and schema
> ---
>
> Key: HDFS-7337
> URL: https://issues.apache.org/jira/browse/HDFS-7337
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Kai Zheng
> Attachments: HDFS-7337-prototype-v1.patch, PluggableErasureCodec.pdf
>
>
> According to HDFS-7285 and the design, this considers supporting multiple 
> Erasure Codecs via a pluggable approach. It allows defining and configuring 
> multiple codec schemas with different coding algorithms and parameters. The 
> resulting codec schemas can be utilized and specified via a command tool for 
> different file folders. While designing and implementing such a pluggable 
> framework, we will also implement a concrete codec by default (Reed-Solomon) 
> to prove the framework is useful and workable. A separate JIRA could be 
> opened for the RS codec implementation.
> Note HDFS-7353 will focus on the very low-level codec API and implementation 
> to make concrete vendor libraries transparent to the upper layer. This JIRA 
> focuses on high-level concerns that interact with configuration, schemas, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7344) Erasure Coding worker and support in DataNode

2014-11-12 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7344:

Attachment: HDFS ECWorker Design.pdf

This is the first draft of the low-level design doc based on HDFS-7285, and 
we're happy to incorporate feedback under this JIRA.

> Erasure Coding worker and support in DataNode
> -
>
> Key: HDFS-7344
> URL: https://issues.apache.org/jira/browse/HDFS-7344
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Li Bo
> Attachments: HDFS ECWorker Design.pdf
>
>
> According to HDFS-7285 and the design, this handles the DataNode-side 
> extensions and related support for Erasure Coding, and implements ECWorker. 
> It mainly covers the following aspects, and separate tasks may be opened to 
> handle each of them.
> * Process encoding work, calculating parity blocks as specified in block 
> groups and the codec schema;
> * Process decoding work, recovering data blocks according to block groups 
> and the codec schema;
> * Handle client requests for passive recovery of block data, serving data 
> on demand while reconstructing;
> * Write parity blocks according to the storage policy.
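The first two work items can be sketched with a minimal codec (using simple XOR parity as a stand-in for the Reed-Solomon codec the design targets; class and method names are illustrative assumptions):

```java
// Minimal stand-in for ECWorker's encode/decode work items, using XOR
// parity instead of Reed-Solomon; blocks are equal-length byte arrays.
class XorEcWorkerSketch {
    // Encoding work: compute one parity block over the data blocks.
    static byte[] encodeParity(byte[][] dataBlocks) {
        byte[] parity = new byte[dataBlocks[0].length];
        for (byte[] block : dataBlocks)
            for (int i = 0; i < parity.length; i++)
                parity[i] ^= block[i];
        return parity;
    }

    // Decoding work: recover the single missing data block by XOR-ing the
    // surviving data blocks with the parity block.
    static byte[] recover(byte[][] survivingBlocks, byte[] parity) {
        byte[] missing = parity.clone();
        for (byte[] block : survivingBlocks)
            for (int i = 0; i < missing.length; i++)
                missing[i] ^= block[i];
        return missing;
    }
}
```

XOR tolerates only a single missing block per group; Reed-Solomon generalizes the same encode/recover shape to multiple parities and multiple failures.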



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)