[jira] [Commented] (HDFS-6942) Fix typos in log messages

2014-09-02 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117924#comment-14117924
 ] 

Akira AJISAKA commented on HDFS-6942:
-

Thanks [~rchiang] for the patch. Looks good to me.
By the way, I found another typo 'targests' in DataNode.java.
{code}
  if (DataTransferProtocol.LOG.isDebugEnabled()) {
    DataTransferProtocol.LOG.debug(getClass().getSimpleName() + ": "
        + b + " (numBytes=" + b.getNumBytes() + ")"
        + ", stage=" + stage
        + ", clientname=" + clientname
        + ", targests=" + Arrays.asList(targets));
  }
{code}
Would you include fixing the typo in the patch?

 Fix typos in log messages
 -

 Key: HDFS-6942
 URL: https://issues.apache.org/jira/browse/HDFS-6942
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial
  Labels: newbie
 Attachments: HDFS-6942-01.patch


 There are a bunch of typos in log messages. HADOOP-10946 was initially 
 created, but may have failed due to covering multiple components. Try fixing 
 typos on a per-component basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.

2014-09-02 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118004#comment-14118004
 ] 

Yi Liu commented on HDFS-6886:
--

TestOfflineEditsViewer passes with {{editsStored}}; the other three failures 
are unrelated.

 Use single editlog record for creating file + overwrite.
 

 Key: HDFS-6886
 URL: https://issues.apache.org/jira/browse/HDFS-6886
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
 Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, 
 HDFS-6886.003.patch, editsStored


 As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we 
 could make a further improvement in this JIRA: use one editlog record for 
 creating a file + overwrite, by recording the overwrite flag in the editlog 
 when creating the file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6980) TestWebHdfsFileSystemContract fails in trunk

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118028#comment-14118028
 ] 

Hadoop QA commented on HDFS-6980:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12665841/HDFS-6980.1-2.patch
  against trunk revision 258c7d0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
  org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7869//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7869//console

This message is automatically generated.

 TestWebHdfsFileSystemContract fails in trunk
 

 Key: HDFS-6980
 URL: https://issues.apache.org/jira/browse/HDFS-6980
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Akira AJISAKA
Assignee: Tsuyoshi OZAWA
 Attachments: HDFS-6980.1-2.patch, HDFS-6980.1.patch


 Many tests in TestWebHdfsFileSystemContract fail by too many open files 
 error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6962) ACLs inheritance conflict with umaskmode

2014-09-02 Thread LINTE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118080#comment-14118080
 ] 

LINTE commented on HDFS-6962:
-

Any update on this issue?

 ACLs inheritance conflict with umaskmode
 

 Key: HDFS-6962
 URL: https://issues.apache.org/jira/browse/HDFS-6962
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security
Affects Versions: 2.4.1
 Environment: CentOS release 6.5 (Final)
Reporter: LINTE
  Labels: hadoop, security

 In hdfs-site.xml 
 <property>
 <name>dfs.umaskmode</name>
 <value>027</value>
 </property>
 1/ Create a directory as superuser
 bash# hdfs dfs -mkdir  /tmp/ACLS
 2/ Set default ACLs on this directory: rwx access for group readwrite and 
 user toto
 bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
 bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
 3/ check ACLs /tmp/ACLS/
 bash# hdfs dfs -getfacl /tmp/ACLS/
 # file: /tmp/ACLS
 # owner: hdfs
 # group: hadoop
 user::rwx
 group::r-x
 other::---
 default:user::rwx
 default:user:toto:rwx
 default:group::r-x
 default:group:readwrite:rwx
 default:mask::rwx
 default:other::---
 user::rwx | group::r-x | other::--- matches the umaskmode defined in 
 hdfs-site.xml, everything is OK!
 default:group:readwrite:rwx allows the readwrite group rwx access through 
 inheritance.
 default:user:toto:rwx allows the toto user rwx access through inheritance.
 default:mask::rwx means the inheritance mask is rwx, so effectively no mask.
 4/ Create a subdir to test inheritance of ACL
 bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs
 5/ check ACLs /tmp/ACLS/hdfs
 bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
 # file: /tmp/ACLS/hdfs
 # owner: hdfs
 # group: hadoop
 user::rwx
 user:toto:rwx   #effective:r-x
 group::r-x
 group:readwrite:rwx #effective:r-x
 mask::r-x
 other::---
 default:user::rwx
 default:user:toto:rwx
 default:group::r-x
 default:group:readwrite:rwx
 default:mask::rwx
 default:other::---
 Here we can see that the readwrite group has the rwx ACL but only r-x is 
 effective, because the mask is r-x (mask::r-x), even though the default mask 
 for inheritance is set to default:mask::rwx on /tmp/ACLS/
 6/ Modify hdfs-site.xml and restart the namenode
 <property>
 <name>dfs.umaskmode</name>
 <value>010</value>
 </property>
 7/ Create a subdir to test inheritance of ACL with new parameter umaskmode
 bash# hdfs dfs -mkdir  /tmp/ACLS/hdfs2
 8/ Check ACL on /tmp/ACLS/hdfs2
 bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
 # file: /tmp/ACLS/hdfs2
 # owner: hdfs
 # group: hadoop
 user::rwx
 user:toto:rwx   #effective:rw-
 group::r-x  #effective:r--
 group:readwrite:rwx #effective:rw-
 mask::rw-
 other::---
 default:user::rwx
 default:user:toto:rwx
 default:group::r-x
 default:group:readwrite:rwx
 default:mask::rwx
 default:other::---
 So HDFS masks the ACL values (user, group and other, except the POSIX owner) 
 with the group bits of the dfs.umaskmode property when creating a directory 
 with inherited ACLs.
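The effect reported above boils down to a bit computation: a named entry's effective rights are its permission bits ANDed with the mask, and the mask of a newly created child appears to be derived from the group digit of dfs.umaskmode rather than from the parent's default:mask. A minimal, self-contained sketch of that arithmetic (illustrative only, not HDFS code; the method names are invented, and the octal values come from the report above):

```java
public class AclMaskSketch {
    // Permission bits encoded as one octal digit: r=4, w=2, x=1.
    static int effective(int entryPerms, int mask) {
        return entryPerms & mask; // named entries are filtered through the mask
    }

    // The child's mask is derived from the umask's group digit, not from
    // the parent's default:mask (this is the behavior being reported).
    static int childMaskFromUmask(int umask) {
        int groupDigit = (umask >> 3) & 7; // middle octal digit
        return 7 & ~groupDigit;            // complement gives the allowed bits
    }

    public static void main(String[] args) {
        // umask 027 -> group digit 2 (w) -> mask r-x (5)
        int mask = childMaskFromUmask(027);
        System.out.println(mask);                // 5, i.e. r-x
        System.out.println(effective(7, mask));  // rwx masked to r-x = 5
        // umask 010 -> group digit 1 (x) -> mask rw- (6)
        int mask2 = childMaskFromUmask(010);
        System.out.println(effective(7, mask2)); // rwx masked to rw- = 6
    }
}
```

This reproduces the observed #effective lines: mask::r-x under umask 027 and mask::rw- under umask 010, regardless of the default:mask::rwx set on the parent.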



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-09-02 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118149#comment-14118149
 ] 

Yi Liu commented on HDFS-6951:
--

Thanks [~clamb], looks good to me, just one comment:

Yes, this method declares {{this method is always called with writeLock of 
FSDirectory held}}, but image loading breaks that assumption. Just adding 
_writeLock_ is not ideal; we could define a new method, something like 
_unprotectedAddEncryptionZone_, for image loading, per Andrew's suggestion. Or 
is there a better way?
{code}
@@ -2074,8 +2074,13 @@ public final void addToInodeMap(INode inode) {
   for (XAttr xattr : xattrs) {
 final String xaName = XAttrHelper.getPrefixName(xattr);
 if (CRYPTO_XATTR_ENCRYPTION_ZONE.equals(xaName)) {
-  ezManager.addEncryptionZone(inode.getId(),
-  new String(xattr.getValue()));
+  writeLock();
+  try {
+ezManager.addEncryptionZone(inode.getId(),
+new String(xattr.getValue()));
+  } finally {
+writeUnlock();
+  }
 }
   }
 }
{code}
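The _unprotected*_ convention discussed here can be sketched outside HDFS: the public method takes the write lock and delegates to an unprotected variant, while callers that already hold the lock (or run single-threaded, as image loading does) call the unprotected variant directly. A minimal illustration with invented names, not the actual FSDirectory/EncryptionZoneManager code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ZoneRegistrySketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<Long, String> zones = new ConcurrentHashMap<>();

    // Public entry point: acquires the write lock itself.
    public void addZone(long inodeId, String keyName) {
        lock.writeLock().lock();
        try {
            unprotectedAddZone(inodeId, keyName);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // For callers that already hold the write lock, or for single-threaded
    // startup paths such as image loading.
    void unprotectedAddZone(long inodeId, String keyName) {
        zones.put(inodeId, keyName);
    }

    public String getZone(long inodeId) {
        lock.readLock().lock();
        try {
            return zones.get(inodeId);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ZoneRegistrySketch r = new ZoneRegistrySketch();
        r.addZone(1L, "key1");                  // normal, locked path
        r.unprotectedAddZone(2L, "key2");       // e.g. during image loading
        System.out.println(r.getZone(1L) + " " + r.getZone(2L));
    }
}
```

Compared with wrapping the call site in writeLock()/writeUnlock() as in the diff above, this keeps the locking contract in one place and avoids re-entrant locking questions on the image-loading path.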

 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Charles Lamb
 Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, 
 HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, editsStored


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 I could reproduce this on a pseudo-distributed cluster:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file

2014-09-02 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118156#comment-14118156
 ] 

Yi Liu commented on HDFS-6705:
--

Hi [~clamb] and [~andrew.wang], could this xattr be something like 
{{SECURITY_CRYPTO_UNREADABLE_BY_SUPERUSER}}, and only be settable inside 
encryption zones? Then normal files would not be affected.

 Create an XAttr that disallows the HDFS admin from accessing a file
 ---

 Key: HDFS-6705
 URL: https://issues.apache.org/jira/browse/HDFS-6705
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode, security
Affects Versions: 3.0.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: HDFS-6705.001.patch


 There needs to be an xattr that specifies that the HDFS admin cannot access 
 a file. This is needed for m/r delegation tokens and data-at-rest encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).

2014-09-02 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HDFS-2975:
-

Assignee: Uma Maheswara Rao G  (was: Yi Liu)

 Rename with overwrite flag true can make NameNode to stuck in safemode on NN 
 (crash + restart).
 ---

 Key: HDFS-2975
 URL: https://issues.apache.org/jira/browse/HDFS-2975
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Attachments: HDFS-2975.001.patch


 When we rename a file with the overwrite flag set to true, the destination 
 file's blocks are deleted. After deleting the blocks, whenever the 
 fsNameSystem lock is released, the NN can hand the invalidation work to the 
 corresponding DNs so that they delete the blocks.
 In parallel, the rename-related edits are synced to the editlog file. If the 
 NN crashes at this step, before the edits are synced, the NN can get stuck in 
 safemode on restart. This is because the blocks were already deleted from the 
 DNs as part of the invalidations, but the dst file still exists, since the 
 rename edits were not persisted in the log file, so no DN will report those 
 blocks now.
 This is similar to HDFS-2815
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).

2014-09-02 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118283#comment-14118283
 ] 

Uma Maheswara Rao G commented on HDFS-2975:
---

Yi,  Thanks a lot for the explanation.
+1 on the patch. [~vinayrpet], Do you have any comments?

 Rename with overwrite flag true can make NameNode to stuck in safemode on NN 
 (crash + restart).
 ---

 Key: HDFS-2975
 URL: https://issues.apache.org/jira/browse/HDFS-2975
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Attachments: HDFS-2975.001.patch


 When we rename a file with the overwrite flag set to true, the destination 
 file's blocks are deleted. After deleting the blocks, whenever the 
 fsNameSystem lock is released, the NN can hand the invalidation work to the 
 corresponding DNs so that they delete the blocks.
 In parallel, the rename-related edits are synced to the editlog file. If the 
 NN crashes at this step, before the edits are synced, the NN can get stuck in 
 safemode on restart. This is because the blocks were already deleted from the 
 DNs as part of the invalidations, but the dst file still exists, since the 
 rename edits were not persisted in the log file, so no DN will report those 
 blocks now.
 This is similar to HDFS-2815
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-09-02 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6951:
---
Attachment: HDFS-6951.004.patch

[~hitliuyi], [~andrew.wang],

The .004 patch adds an unprotectedAddEncryptionZone method per your comment.



 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Charles Lamb
 Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, 
 HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, 
 HDFS-6951.004.patch, editsStored


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 I could reproduce this on a pseudo-distributed cluster:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6867) For DFSOutputStream, do pipeline recovery for a single block in the background

2014-09-02 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118352#comment-14118352
 ] 

Zhe Zhang commented on HDFS-6867:
-

[~cmccabe] Could you help review the patch? In the patch I created a new 
{{ReplaceDatanodeOnFailure}} policy named {{BACKGROUND}}, for the user to 
specify that background recovery should be used -- which we didn't cover in the 
offline discussion but I think is necessary.

 For DFSOutputStream, do pipeline recovery for a single block in the background
 --

 Key: HDFS-6867
 URL: https://issues.apache.org/jira/browse/HDFS-6867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Colin Patrick McCabe
Assignee: Zhe Zhang
 Attachments: HDFS-6867-20140827-2.patch, HDFS-6867-20140827-3.patch, 
 HDFS-6867-20140827.patch, HDFS-6867-20140828-1.patch, 
 HDFS-6867-20140828-2.patch, HDFS-6867-design-20140820.pdf, 
 HDFS-6867-design-20140821.pdf, HDFS-6867-design-20140822.pdf, 
 HDFS-6867-design-20140827.pdf


 For DFSOutputStream, we should be able to do pipeline recovery in the 
 background, while the user is continuing to write to the file.  This is 
 especially useful for long-lived clients that write to an HDFS file slowly. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).

2014-09-02 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2975:
--
Assignee: Yi Liu  (was: Uma Maheswara Rao G)

 Rename with overwrite flag true can make NameNode to stuck in safemode on NN 
 (crash + restart).
 ---

 Key: HDFS-2975
 URL: https://issues.apache.org/jira/browse/HDFS-2975
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Attachments: HDFS-2975.001.patch


 When we rename a file with the overwrite flag set to true, the destination 
 file's blocks are deleted. After deleting the blocks, whenever the 
 fsNameSystem lock is released, the NN can hand the invalidation work to the 
 corresponding DNs so that they delete the blocks.
 In parallel, the rename-related edits are synced to the editlog file. If the 
 NN crashes at this step, before the edits are synced, the NN can get stuck in 
 safemode on restart. This is because the blocks were already deleted from the 
 DNs as part of the invalidations, but the dst file still exists, since the 
 rename edits were not persisted in the log file, so no DN will report those 
 blocks now.
 This is similar to HDFS-2815
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option

2014-09-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118393#comment-14118393
 ] 

Yongjun Zhang commented on HDFS-4257:
-

Hi [~szetszwo], thanks for the rev, it looks good! A few very minor comments:

1. I wonder if we can add a log right after calling 
{{this.dtpReplaceDatanodeOnFailure = ReplaceDatanodeOnFailure.get(conf);}} to 
indicate which policy is used. My concern is that a user may change the policy 
between sessions; it'd be nice to have a record in the log so we can tell 
which policy was in effect.

2. About the method {{satisfy(...)}} in the Condition interface: {{DEFAULT}} 
has the final qualifier on all parameters, but the others don't. It'd be nice 
to be consistent; keeping final gives both its benefit and code consistency.

3. The comment section and parameter specification for {{static final 
Condition DEFAULT = new Condition() {}} use the names r, n and replication, 
nExistings interchangeably. Can we use replication and nExistings, to be 
consistent with other places in the same file?

Thanks a lot.
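Comment 1 above is just about recording the effective policy once at client construction time. A self-contained sketch with invented names (the real client reads the dfs.client.block.write.replace-datanode-on-failure.* keys from its Configuration; the key and class below are stand-ins):

```java
import java.util.Properties;
import java.util.logging.Logger;

public class PolicyLoggingSketch {
    enum Policy { DISABLE, NEVER, DEFAULT, ALWAYS }

    private static final Logger LOG =
        Logger.getLogger(PolicyLoggingSketch.class.getName());

    static Policy get(Properties conf) {
        // Hypothetical key; stands in for the real dfs.client.* configuration.
        String name =
            conf.getProperty("replace-datanode-on-failure.policy", "DEFAULT");
        Policy p = Policy.valueOf(name.toUpperCase());
        // The review suggestion: log the chosen policy so each session's
        // effective setting is recorded and can be found later.
        LOG.info("Using ReplaceDatanodeOnFailure policy: " + p);
        return p;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("replace-datanode-on-failure.policy", "ALWAYS");
        System.out.println(get(conf));
    }
}
```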


 The ReplaceDatanodeOnFailure policies could have a forgiving option
 ---

 Key: HDFS-4257
 URL: https://issues.apache.org/jira/browse/HDFS-4257
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Harsh J
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h4257_20140325.patch, h4257_20140325b.patch, 
 h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch


 Similar question has previously come over HDFS-3091 and friends, but the 
 essential problem is: "Why can't I write to my cluster of 3 nodes, when I 
 just have 1 node available at a point in time?"
 The policies cover the 4 options, with {{Default}} being default:
 {{Disable}} - Disables the whole replacement concept by throwing out an 
 error (at the server) or acts as {{Never}} at the client.
 {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in 
 many cases).
 {{Default}} - Replace based on a few conditions, but whose minimum never 
 touches 1. We always fail if only one DN remains and none others can be added.
 {{Always}} - Replace no matter what. Fail if can't replace.
 Would it not make sense to have an option similar to Always/Default where, 
 despite _trying_, if it isn't possible to have > 1 DN in the pipeline, we do 
 not fail? I think that is what the former write behavior was, and what fit 
 with the minimum replication factor allowed value.
 Why is it grossly wrong to pass a write from a client for a block with just 1 
 remaining replica in the pipeline (the minimum of 1 grows with the 
 replication factor demanded from the write), when replication is taken care 
 of immediately afterwards? How often have we seen missing blocks arise out of 
 allowing this + facing a big rack(s) failure or so?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5114) getMaxNodesPerRack() in BlockPlacementPolicyDefault does not take decommissioning nodes into account.

2014-09-02 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118394#comment-14118394
 ] 

Zhe Zhang commented on HDFS-5114:
-

[~kihwal] Since this was created a year ago, do you happen to know if it has 
been resolved in the latest code? If not I'm happy to work on it. Thanks!

 getMaxNodesPerRack() in BlockPlacementPolicyDefault does not take 
 decommissioning nodes into account.
 -

 Key: HDFS-5114
 URL: https://issues.apache.org/jira/browse/HDFS-5114
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Kihwal Lee
Assignee: Zhe Zhang

 If a large proportion of data nodes are being decommissioned, one or more 
 racks may not be writable. However, this is not taken into account when the 
 default block placement policy module invokes getMaxNodesPerRack(). Some 
 blocks, especially the ones with a high replication factor, may not be fully 
 replicated until those nodes are taken out of dfs.include. It can actually 
 block decommissioning itself.
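For reference, the cap in question is computed along these lines (a simplified sketch modeled on the shape of BlockPlacementPolicyDefault#getMaxNodesPerRack, not the actual HDFS source). When racks consisting only of decommissioning nodes are still counted, the cap is based on more racks than are actually writable:

```java
public class MaxNodesPerRackSketch {

    // Modeled on BlockPlacementPolicyDefault#getMaxNodesPerRack: cap the
    // replicas per rack so replicas spread across racks.
    static int maxNodesPerRack(int totalReplicas, int numRacks) {
        return (totalReplicas - 1) / numRacks + 2;
    }

    public static void main(String[] args) {
        int replication = 10;
        int totalRacks = 5;     // racks known to the cluster map
        int writableRacks = 2;  // racks remaining once decommissioning is excluded

        // Cap computed from all racks: (10-1)/5 + 2 = 3 per rack, but
        // 2 writable racks * 3 = 6 < 10; the block cannot be fully placed,
        // which is the situation described in this issue.
        int cap = maxNodesPerRack(replication, totalRacks);
        System.out.println(cap * writableRacks >= replication);

        // Counting only writable racks would loosen the cap accordingly:
        int adjusted = maxNodesPerRack(replication, writableRacks);
        System.out.println(adjusted * writableRacks >= replication);
    }
}
```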



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).

2014-09-02 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118413#comment-14118413
 ] 

Vinayakumar B commented on HDFS-2975:
-

Thanks a lot for the patch [~hitliuyi],
+1 from me too.

 Rename with overwrite flag true can make NameNode to stuck in safemode on NN 
 (crash + restart).
 ---

 Key: HDFS-2975
 URL: https://issues.apache.org/jira/browse/HDFS-2975
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Attachments: HDFS-2975.001.patch


 When we rename a file with the overwrite flag set to true, the destination 
 file's blocks are deleted. After deleting the blocks, whenever the 
 fsNameSystem lock is released, the NN can hand the invalidation work to the 
 corresponding DNs so that they delete the blocks.
 In parallel, the rename-related edits are synced to the editlog file. If the 
 NN crashes at this step, before the edits are synced, the NN can get stuck in 
 safemode on restart. This is because the blocks were already deleted from the 
 DNs as part of the invalidations, but the dst file still exists, since the 
 rename edits were not persisted in the log file, so no DN will report those 
 blocks now.
 This is similar to HDFS-2815
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.

2014-09-02 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118432#comment-14118432
 ] 

Vinayakumar B commented on HDFS-6886:
-

bq.   this.overwrite = Boolean.parseBoolean(st.getValue(OVERWRITE));
Here you may need to use {{st.getValueOrNull(..)}}; otherwise an 
InvalidXmlException will be thrown when trying to convert old edits.
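The point is that old edit logs won't contain an OVERWRITE element at all, so a lookup that throws on a missing child must be replaced by one that returns null; {{Boolean.parseBoolean(null)}} then simply yields false. A simplified, self-contained illustration of the two behaviors (the Stanza class below is an invented stand-in for the hadoop XML helper, not the real one):

```java
import java.util.HashMap;
import java.util.Map;

public class OptionalXmlValueSketch {
    static class InvalidXmlException extends RuntimeException {
        InvalidXmlException(String msg) { super(msg); }
    }

    // Simplified stand-in for an edit-log XML stanza.
    static class Stanza {
        private final Map<String, String> values = new HashMap<>();

        void set(String key, String value) { values.put(key, value); }

        String getValue(String key) {
            if (!values.containsKey(key)) {
                throw new InvalidXmlException("no entry found for " + key);
            }
            return values.get(key);
        }

        String getValueOrNull(String key) {
            return values.get(key); // null when absent, no exception
        }
    }

    public static void main(String[] args) {
        Stanza oldEdits = new Stanza(); // an old record: no OVERWRITE element
        // Boolean.parseBoolean(null) is false, so absent means "no overwrite".
        boolean overwrite =
            Boolean.parseBoolean(oldEdits.getValueOrNull("OVERWRITE"));
        System.out.println(overwrite);
        try {
            oldEdits.getValue("OVERWRITE"); // this is what breaks old edits
        } catch (InvalidXmlException e) {
            System.out.println("getValue threw: " + e.getMessage());
        }
    }
}
```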


 Use single editlog record for creating file + overwrite.
 

 Key: HDFS-6886
 URL: https://issues.apache.org/jira/browse/HDFS-6886
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
 Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, 
 HDFS-6886.003.patch, editsStored


 As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we 
 could make a further improvement in this JIRA: use one editlog record for 
 creating a file + overwrite, by recording the overwrite flag in the editlog 
 when creating the file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6831) Inconsistency between 'hdfs dfsadmin' and 'hdfs dfsadmin -help'

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6831:

Assignee: Xiaoyu Yao

 Inconsistency between 'hdfs dfsadmin' and 'hdfs dfsadmin -help'
 ---

 Key: HDFS-6831
 URL: https://issues.apache.org/jira/browse/HDFS-6831
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Akira AJISAKA
Assignee: Xiaoyu Yao
Priority: Minor
  Labels: newbie
 Attachments: HDFS-6831.0.patch, HDFS-6831.1.patch


 There is an inconsistency between the console outputs of the 'hdfs dfsadmin' 
 command and the 'hdfs dfsadmin -help' command.
 {code}
 [root@trunk ~]# hdfs dfsadmin
 Usage: java DFSAdmin
 Note: Administrative commands can only be run as the HDFS superuser.
[-report]
[-safemode enter | leave | get | wait]
    [-allowSnapshot <snapshotDir>]
    [-disallowSnapshot <snapshotDir>]
[-saveNamespace]
[-rollEdits]
[-restoreFailedStorage true|false|check]
[-refreshNodes]
[-finalizeUpgrade]
[-rollingUpgrade [query|prepare|finalize]]
[-metasave filename]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshCallQueue]
[-refresh]
[-printTopology]
[-refreshNamenodes datanodehost:port]
[-deleteBlockPool datanode-host:port blockpoolId [force]]
    [-setQuota <quota> <dirname>...<dirname>]
    [-clrQuota <dirname>...<dirname>]
    [-setSpaceQuota <quota> <dirname>...<dirname>]
    [-clrSpaceQuota <dirname>...<dirname>]
    [-setBalancerBandwidth <bandwidth in bytes per second>]
    [-fetchImage <local directory>]
    [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
    [-getDatanodeInfo <datanode_host:ipc_port>]
[-help [cmd]]
 {code}
 {code}
 [root@trunk ~]# hdfs dfsadmin -help
 hadoop dfsadmin performs DFS administrative commands.
 The full syntax is: 
 hadoop dfsadmin
   [-report [-live] [-dead] [-decommissioning]]
   [-safemode enter | leave | get | wait]
   [-saveNamespace]
   [-rollEdits]
   [-restoreFailedStorage true|false|check]
   [-refreshNodes]
   [-setQuota <quota> <dirname>...<dirname>]
   [-clrQuota <dirname>...<dirname>]
   [-setSpaceQuota <quota> <dirname>...<dirname>]
   [-clrSpaceQuota <dirname>...<dirname>]
   [-finalizeUpgrade]
   [-rollingUpgrade [query|prepare|finalize]]
   [-refreshServiceAcl]
   [-refreshUserToGroupsMappings]
   [-refreshSuperUserGroupsConfiguration]
   [-refreshCallQueue]
   [-refresh <host:ipc_port> <key> [arg1..argn]
   [-printTopology]
   [-refreshNamenodes datanodehost:port]
   [-deleteBlockPool datanodehost:port blockpoolId [force]]
   [-setBalancerBandwidth <bandwidth>]
   [-fetchImage <local directory>]
   [-allowSnapshot <snapshotDir>]
   [-disallowSnapshot <snapshotDir>]
   [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
   [-getDatanodeInfo <datanode_host:ipc_port>
   [-help [cmd]
 {code}
 These two outputs should be the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6848) Lack of synchronization on access to datanodeUuid in DataStorage#format()

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6848:

Assignee: Xiaoyu Yao

 Lack of synchronization on access to datanodeUuid in DataStorage#format() 
 --

 Key: HDFS-6848
 URL: https://issues.apache.org/jira/browse/HDFS-6848
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Xiaoyu Yao
Priority: Minor
 Attachments: HDFS-6848.0.patch


 {code}
 this.datanodeUuid = datanodeUuid;
 {code}
 The above assignment should be done while holding the lock on 
 DataStorage.this, as is done in two other places.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6942) Fix typos in log messages

2014-09-02 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated HDFS-6942:
-
Attachment: HDFS-6942-02.patch

Adding fix from [~ajisakaa].  Thanks for finding it.

 Fix typos in log messages
 -

 Key: HDFS-6942
 URL: https://issues.apache.org/jira/browse/HDFS-6942
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial
  Labels: newbie
 Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch


 There are a bunch of typos in log messages. HADOOP-10946 was initially 
 created, but may have failed due to covering multiple components. Try fixing 
 typos on a per-component basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118488#comment-14118488
 ] 

Hadoop QA commented on HDFS-6951:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12665928/HDFS-6951.004.patch
  against trunk revision 258c7d0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestFsck
  org.apache.hadoop.hdfs.server.namenode.TestParallelImageWrite
  org.apache.hadoop.hdfs.TestAppendDifferentChecksum
  org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat
  org.apache.hadoop.hdfs.server.datanode.TestReadOnlySharedStorage
  org.apache.hadoop.fs.TestSymlinkHdfsFileContext
  org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot
  org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics
  org.apache.hadoop.hdfs.TestDFSMkdirs
  org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
  org.apache.hadoop.hdfs.server.namenode.TestNameNodeMXBean
  org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer
  org.apache.hadoop.hdfs.server.namenode.TestEditLogJournalFailures
  org.apache.hadoop.fs.TestGlobPaths
  org.apache.hadoop.hdfs.server.namenode.TestEditLogRace
  org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
  org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMkdir
  org.apache.hadoop.fs.TestHDFSFileContextMainOperations
  org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
  org.apache.hadoop.hdfs.server.namenode.TestFSEditLogLoader
  org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
  org.apache.hadoop.hdfs.TestDFSRename
  org.apache.hadoop.hdfs.server.namenode.ha.TestXAttrsWithHA
  org.apache.hadoop.hdfs.server.namenode.ha.TestDelegationTokensWithHA
  org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica
  org.apache.hadoop.hdfs.web.TestWebHDFS
  org.apache.hadoop.hdfs.server.namenode.TestAddBlock
  org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks
  org.apache.hadoop.hdfs.server.namenode.ha.TestHAMetrics
  org.apache.hadoop.hdfs.server.namenode.ha.TestNNHealthCheck
  org.apache.hadoop.fs.viewfs.TestViewFsWithAcls
  org.apache.hadoop.hdfs.server.datanode.TestBlockHasMultipleReplicasOnSameDN
  org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
  org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
  org.apache.hadoop.fs.contract.hdfs.TestHDFSContractDelete
  org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes
  org.apache.hadoop.hdfs.server.namenode.TestBackupNode
  org.apache.hadoop.hdfs.TestDFSUpgrade
  org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage
  org.apache.hadoop.hdfs.server.namenode.TestHostsFiles
  org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
  org.apache.hadoop.hdfs.server.namenode.TestDeleteRace
  org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade
  org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogsDuringFailover
  org.apache.hadoop.fs.viewfs.TestViewFileSystemWithXAttrs
  org.apache.hadoop.fs.viewfs.TestViewFsHdfs
  org.apache.hadoop.hdfs.web.TestHttpsFileSystem
  org.apache.hadoop.fs.TestResolveHdfsSymlink
  org.apache.hadoop.hdfs.server.namenode.snapshot.TestDisallowModifyROSnapshot
  

[jira] [Commented] (HDFS-6954) With crypto, no native lib systems are too verbose

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118489#comment-14118489
 ] 

Hadoop QA commented on HDFS-6954:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12665324/HDFS-6954.003.patch
  against trunk revision e1109fb.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl

  The test build failed in hadoop-hdfs-project/hadoop-hdfs

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7871//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7871//console

This message is automatically generated.

 With crypto, no native lib systems are too verbose
 --

 Key: HDFS-6954
 URL: https://issues.apache.org/jira/browse/HDFS-6954
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Assignee: Charles Lamb
 Attachments: HDFS-6954.001.patch, HDFS-6954.002.patch, 
 HDFS-6954.003.patch


 Running commands on a machine without a native library results in:
 {code}
 $ bin/hdfs dfs -put /etc/hosts /tmp
 14/08/27 07:16:10 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 14/08/27 07:16:11 WARN crypto.CryptoCodec: Crypto codec 
 org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
 14/08/27 07:16:11 INFO hdfs.DFSClient: No KeyProvider found.
 {code}
 This is way too much.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6831) Inconsistency between 'hdfs dfsadmin' and 'hdfs dfsadmin -help'

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6831:

Assignee: Xiaoyu Yao  (was: Xiaoyu Yao)

 Inconsistency between 'hdfs dfsadmin' and 'hdfs dfsadmin -help'
 ---

 Key: HDFS-6831
 URL: https://issues.apache.org/jira/browse/HDFS-6831
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Akira AJISAKA
Assignee: Xiaoyu Yao
Priority: Minor
  Labels: newbie
 Attachments: HDFS-6831.0.patch, HDFS-6831.1.patch


 There is an inconsistency between the console outputs of 'hdfs dfsadmin' 
 command and 'hdfs dfsadmin -help' command.
 {code}
 [root@trunk ~]# hdfs dfsadmin
 Usage: java DFSAdmin
 Note: Administrative commands can only be run as the HDFS superuser.
[-report]
[-safemode enter | leave | get | wait]
[-allowSnapshot snapshotDir]
[-disallowSnapshot snapshotDir]
[-saveNamespace]
[-rollEdits]
[-restoreFailedStorage true|false|check]
[-refreshNodes]
[-finalizeUpgrade]
[-rollingUpgrade [query|prepare|finalize]]
[-metasave filename]
[-refreshServiceAcl]
[-refreshUserToGroupsMappings]
[-refreshSuperUserGroupsConfiguration]
[-refreshCallQueue]
[-refresh]
[-printTopology]
[-refreshNamenodes datanodehost:port]
[-deleteBlockPool datanode-host:port blockpoolId [force]]
[-setQuota quota dirname...dirname]
[-clrQuota dirname...dirname]
[-setSpaceQuota quota dirname...dirname]
[-clrSpaceQuota dirname...dirname]
[-setBalancerBandwidth bandwidth in bytes per second]
[-fetchImage local directory]
[-shutdownDatanode datanode_host:ipc_port [upgrade]]
[-getDatanodeInfo datanode_host:ipc_port]
[-help [cmd]]
 {code}
 {code}
 [root@trunk ~]# hdfs dfsadmin -help
 hadoop dfsadmin performs DFS administrative commands.
 The full syntax is: 
 hadoop dfsadmin
   [-report [-live] [-dead] [-decommissioning]]
   [-safemode enter | leave | get | wait]
   [-saveNamespace]
   [-rollEdits]
   [-restoreFailedStorage true|false|check]
   [-refreshNodes]
   [-setQuota quota dirname...dirname]
   [-clrQuota dirname...dirname]
   [-setSpaceQuota quota dirname...dirname]
   [-clrSpaceQuota dirname...dirname]
   [-finalizeUpgrade]
   [-rollingUpgrade [query|prepare|finalize]]
   [-refreshServiceAcl]
   [-refreshUserToGroupsMappings]
   [-refreshSuperUserGroupsConfiguration]
   [-refreshCallQueue]
   [-refresh host:ipc_port key [arg1..argn]
   [-printTopology]
   [-refreshNamenodes datanodehost:port]
   [-deleteBlockPool datanodehost:port blockpoolId [force]]
   [-setBalancerBandwidth bandwidth]
   [-fetchImage local directory]
   [-allowSnapshot snapshotDir]
   [-disallowSnapshot snapshotDir]
   [-shutdownDatanode datanode_host:ipc_port [upgrade]]
   [-getDatanodeInfo datanode_host:ipc_port
   [-help [cmd]
 {code}
 These two outputs should be the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6848) Lack of synchronization on access to datanodeUuid in DataStorage#format()

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6848:

Assignee: Xiaoyu Yao  (was: Xiaoyu Yao)

 Lack of synchronization on access to datanodeUuid in DataStorage#format() 
 --

 Key: HDFS-6848
 URL: https://issues.apache.org/jira/browse/HDFS-6848
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Xiaoyu Yao
Priority: Minor
 Attachments: HDFS-6848.0.patch


 {code}
 this.datanodeUuid = datanodeUuid;
 {code}
 The above assignment should be done holding lock DataStorage.this - as is 
 done in two other places.
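 The fix amounts to taking the same monitor the other two writers already
 hold. A minimal sketch (an illustrative stand-in for DataStorage, not the
 actual Hadoop code):
 {code}
// Illustrative stand-in for DataStorage: guard every write to datanodeUuid
// with the same monitor ("this") that the other two assignments already hold.
public class DataStorageSketch {
    private String datanodeUuid;

    public void format(String newUuid) {
        synchronized (this) {  // matches the lock held by the other writers
            this.datanodeUuid = newUuid;
        }
    }

    public synchronized String getDatanodeUuid() {  // readers take the same lock
        return datanodeUuid;
    }
}
 {code}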



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118513#comment-14118513
 ] 

Colin Patrick McCabe commented on HDFS-6482:


Yeah, it would be great to have this in 2.6.  Is HDFS-6981 blocking merging 
this to 2.6?

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 3.0.0

 Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, 
 HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, 
 HDFS-6482.7.patch, HDFS-6482.8.patch, HDFS-6482.9.patch, HDFS-6482.patch, 
 hadoop-24-datanode-dir.tgz


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.
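 The core idea can be sketched like this (the bit widths and subdir naming
 below are illustrative assumptions, not necessarily the layout the patch
 settled on): two directory levels are derived deterministically from bits of
 the block ID, so no in-memory bookkeeping or directory splitting is needed.
 {code}
// Illustrative sketch: derive a fixed two-level directory from bits of the
// block ID instead of tracking a replica's directory (the LDir structure).
public class BlockIdLayout {
    // Masks and shifts are assumptions for illustration; the real layout
    // may use different widths.
    public static String idToBlockDir(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xFF);
        int d2 = (int) ((blockId >> 8) & 0xFF);
        return "subdir" + d1 + "/subdir" + d2;
    }
}
 {code}
 Because the mapping is a pure function of the ID, locating a replica's file
 never requires scanning or splitting directories.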



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes

2014-09-02 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118554#comment-14118554
 ] 

Arpit Agarwal commented on HDFS-6482:
-

That is the known issue, yes.

 Use block ID-based block layout on datanodes
 

 Key: HDFS-6482
 URL: https://issues.apache.org/jira/browse/HDFS-6482
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.0.0
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 3.0.0

 Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, 
 HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, 
 HDFS-6482.7.patch, HDFS-6482.8.patch, HDFS-6482.9.patch, HDFS-6482.patch, 
 hadoop-24-datanode-dir.tgz


 Right now blocks are placed into directories that are split into many 
 subdirectories when capacity is reached. Instead we can use a block's ID to 
 determine the path it should go in. This eliminates the need for the LDir 
 data structure that facilitates the splitting of directories when they reach 
 capacity as well as fields in ReplicaInfo that keep track of a replica's 
 location.
 An extension of the work in HDFS-3290.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6942) Fix typos in log messages

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118615#comment-14118615
 ] 

Hadoop QA commented on HDFS-6942:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12665940/HDFS-6942-02.patch
  against trunk revision 329b659.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-nfs:

  org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7872//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7872//console

This message is automatically generated.

 Fix typos in log messages
 -

 Key: HDFS-6942
 URL: https://issues.apache.org/jira/browse/HDFS-6942
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial
  Labels: newbie
 Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch


 There are a bunch of typos in log messages. HADOOP-10946 was initially 
 created, but may have failed due to being in multiple components. Try fixing 
 typos on a per-component basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118657#comment-14118657
 ] 

Colin Patrick McCabe commented on HDFS-6930:


bq. Eviction is done when we have less than 10% free space or insufficient 
space for 3 default-length blocks.

One thing that might be suboptimal here is that we're using the 
{{dfs.blocksize}} configuration key on the DataNode and assuming that will be 
the same value used by the client.  Clearly, the client could use 256 MB 
blocks, whereas the DN could use 128 MB blocks.  Etc.

Also, we don't really know how big the ramdisks are going to be.  I can easily 
see a 300 GB ramdisk being used in a few years.  Just defaulting to keeping 10% 
free seems like too much.

So, why not just have a minimum free space configuration key?  It could be 
specified as a number of bytes, rather than as a percentage.  We could then 
default it to 128 MB * 3 to get your current default of leaving space for 3 
blocks.  This would work better for bigger ramdisks (unlike a percentage-based 
scheme) and wouldn't assume that the client's and DN's block size 
configurations are the same.
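The byte-based watermark suggested above can be sketched as follows. This is an illustrative stand-in, not actual HDFS code; the default value (3 x 128 MB) follows the proposal in this comment, and the class and method names are assumptions:

```java
// Illustrative sketch of a byte-based low watermark for RAM disk eviction.
// The default (3 x 128 MB) mirrors the proposal above; names are assumed.
public class RamDiskEvictionPolicy {
    public static final long DEFAULT_MIN_FREE_BYTES = 3L * 128 * 1024 * 1024;

    private final long minFreeBytes;

    public RamDiskEvictionPolicy(long minFreeBytes) {
        this.minFreeBytes = minFreeBytes;
    }

    /** Trigger asynchronous eviction once free space drops below the watermark. */
    public boolean shouldEvict(long freeBytes) {
        return freeBytes < minFreeBytes;
    }
}
```

Unlike a 10% rule, an absolute byte threshold stays constant as ramdisks grow, and it makes no assumption that client and DN agree on {{dfs.blocksize}}.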

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6974) MiniHDFScluster breaks if there is an out of date hadoop.lib on the lib path

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118668#comment-14118668
 ] 

Colin Patrick McCabe commented on HDFS-6974:


Is this really any different than needing to set {{HADOOP_CLASSPATH}} 
correctly?  We don't handle mixing old jars into the classpath, so why should 
we handle mixing old {{hadoop.dll}} files into the path?  It seems 
inconsistent.  But maybe I'm missing something that makes this case different.

bq. There's another extension too: have a getVersion() call that returns 
version info (build info etc), which can be used to help in diags. I'd add 
that, but still look for hadoop-2.6.lib so that you could have 1 lib on the 
path

We don't make any guarantees that the libhadoop supplied with 2.6 will work 
with Hadoop 2.6.1.  libhadoop doesn't have a fixed or standardized API; it's 
just the C half of random bits of Hadoop code.

Think about making changes to the JNI code and redeploying: you need to 
redeploy with the correct, new JNI code, not the old stuff.  This is, again, 
the same as with jar files... you wouldn't mix jar files from Hadoop 2.6 and 
Hadoop 2.6.1 in the same directory.  So I would argue for your solution #1.

We could perhaps give a better error message here.  We might be able to inject 
the git hash into the library, and error out if it didn't match the git hash in 
the jar files.  But then that means that partial rebuilds of the source tree no 
longer work, so maybe not.

 MiniHDFScluster breaks if there is an out of date hadoop.lib on the lib path 
 -

 Key: HDFS-6974
 URL: https://issues.apache.org/jira/browse/HDFS-6974
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
 Environment: Windows with a version of Hadoop (HDP2.1) installed 
 somewhere via an MSI
Reporter: Steve Loughran
Priority: Minor

 SLIDER-377 shows the trace of a MiniHDFSCluster test failing on native 
 library calls ... the root cause appears to be that the 2.4.1 hadoop lib on 
 the path doesn't have all the methods needed by branch-2.
 When this situation arises, MiniHDFSCluster fails to work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6942) Fix typos in log messages

2014-09-02 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118670#comment-14118670
 ] 

Ray Chiang commented on HDFS-6942:
--

RE: TestRenameWithSnapshots test.  Works fine in my tree.

 Fix typos in log messages
 -

 Key: HDFS-6942
 URL: https://issues.apache.org/jira/browse/HDFS-6942
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial
  Labels: newbie
 Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch


 There are a bunch of typos in log messages. HADOOP-10946 was initially 
 created, but may have failed due to being in multiple components. Try fixing 
 typos on a per-component basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-09-02 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6951:
---
Attachment: (was: HDFS-6951.004.patch)

 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Charles Lamb
 Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, 
 HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, editsStored


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 I could reproduce this on a pseudo-distributed cluster:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-09-02 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6951:
---
Attachment: HDFS-6951.004.patch

Resubmitting the exact same HDFS-6951.004.patch to see if the weird test-patch 
failures disappear.


 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Charles Lamb
 Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, 
 HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, 
 HDFS-6951.004.patch, editsStored


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 I could reproduce this on a pseudo-distributed cluster:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-6982) nntop: top-like tool for name node users

2014-09-02 Thread Maysam Yabandeh (JIRA)
Maysam Yabandeh created HDFS-6982:
-

 Summary: nntop: top-like tool for name node users
 Key: HDFS-6982
 URL: https://issues.apache.org/jira/browse/HDFS-6982
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Maysam Yabandeh


In this jira we motivate the need for nntop, a tool that, similarly to what top 
does in Linux, gives the list of top users of the HDFS name node and gives 
insight into which users are sending the majority of each traffic type to the 
name node. This information turns out to be most critical when the name node is 
under pressure and the HDFS admin needs to know which user is hammering the 
name node and with what kind of requests. Here we present the design of nntop, 
which has been in production at Twitter for the past 10 months. nntop proved to 
have low cpu overhead (< 2% in a cluster of 4K nodes), a low memory footprint 
(less than a few MB), and a quite efficient write path (only two hash 
lookups to update a metric).
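The "two hash lookups per update" write path can be sketched as nested maps keyed by user and operation. The class and method names below are illustrative, not nntop's actual code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch of nntop's write path: one lookup for the user,
// one for the operation, then a cheap counter increment.
public class TopMetrics {
    private final Map<String, Map<String, LongAdder>> counts =
        new ConcurrentHashMap<>();

    public void report(String user, String op) {
        counts.computeIfAbsent(user, u -> new ConcurrentHashMap<>())  // lookup 1
              .computeIfAbsent(op, o -> new LongAdder())              // lookup 2
              .increment();
    }

    public long get(String user, String op) {
        Map<String, LongAdder> ops = counts.get(user);
        if (ops == null) return 0;
        LongAdder a = ops.get(op);
        return a == null ? 0 : a.sum();
    }
}
```

LongAdder keeps the increment cheap under contention, which matters when the name node is handling thousands of requests per second.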



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6982) nntop: top-like tool for name node users

2014-09-02 Thread Maysam Yabandeh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated HDFS-6982:
--
Attachment: nntop-design-v1.pdf

A design doc that also shows what the tool looks like in action is attached. I 
will try to polish our code and prepare a patch in the next few days. 

Comments are highly appreciated.

 nntop: top-like tool for name node users
 -

 Key: HDFS-6982
 URL: https://issues.apache.org/jira/browse/HDFS-6982
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Maysam Yabandeh
 Attachments: nntop-design-v1.pdf


 In this jira we motivate the need for nntop, a tool that, similarly to what 
 top does in Linux, gives the list of top users of the HDFS name node and 
 gives insight into which users are sending the majority of each traffic type 
 to the name node. This information turns out to be most critical when the 
 name node is under pressure and the HDFS admin needs to know which user is 
 hammering the name node and with what kind of requests. Here we present the 
 design of nntop, which has been in production at Twitter for the past 10 
 months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
 nodes), a low memory footprint (less than a few MB), and a quite efficient 
 write path (only two hash lookups to update a metric).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-6983) TestBalancer#testExitZeroOnSuccess fails intermittently

2014-09-02 Thread Mit Desai (JIRA)
Mit Desai created HDFS-6983:
---

 Summary: TestBalancer#testExitZeroOnSuccess fails intermittently
 Key: HDFS-6983
 URL: https://issues.apache.org/jira/browse/HDFS-6983
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.1
Reporter: Mit Desai


TestBalancer#testExitZeroOnSuccess fails intermittently on branch-2. And 
probably fails on trunk too.

The test fails 1 in 20 times when I run it in a loop. Here is how it fails.

{noformat}
org.apache.hadoop.hdfs.server.balancer.TestBalancer
testExitZeroOnSuccess(org.apache.hadoop.hdfs.server.balancer.TestBalancer)  
Time elapsed: 53.965 sec   ERROR!
java.util.concurrent.TimeoutException: Rebalancing expected avg utilization to 
become 0.2, but on datanode 127.0.0.1:35502 it remains at 0.08 after more than 
4 msec.
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForBalancer(TestBalancer.java:321)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancerCli(TestBalancer.java:632)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:549)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.testExitZeroOnSuccess(TestBalancer.java:845)


Results :

Tests in error: 
  
TestBalancer.testExitZeroOnSuccess:845-oneNodeTest:645-doTest:437-doTest:549-runBalancerCli:632-waitForBalancer:321
 Timeout
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable

2014-09-02 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-6984:
--

 Summary: In Hadoop 3, make FileStatus no longer a Writable
 Key: HDFS-6984
 URL: https://issues.apache.org/jira/browse/HDFS-6984
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this to 
serialize it and send it over the wire.  But in Hadoop 2 and later, we have the 
protobuf {{HdfsFileStatusProto}}, which serves to serialize this information.  
The protobuf form is preferable, since it allows us to add new fields in a 
backwards-compatible way.  Another issue is that a lot of subclasses of 
FileStatus already don't override the Writable methods of the superclass, 
breaking the interface contract that read(status.write) should equal the 
original status.

In Hadoop 3, we should just make FileStatus no longer a Writable so that we 
don't have to deal with these issues.  It's probably too late to do this in 
Hadoop 2, since user code may be relying on the ability to use the Writable 
methods on FileStatus objects there.
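The broken contract can be illustrated with a self-contained sketch (a stand-in for the Writable pattern, not the real Hadoop classes): a subclass that adds a field but does not override write/readFields silently loses that field on a round trip.

```java
import java.io.*;

// Self-contained stand-in for the Writable pattern: a subclass that adds a
// field but inherits write()/readFields() unchanged loses it on a round trip.
public class WritableContractDemo {
    static class Status {
        long length;
        void write(DataOutput out) throws IOException { out.writeLong(length); }
        void readFields(DataInput in) throws IOException { length = in.readLong(); }
    }

    // Adds a field but does not override write()/readFields() -- the bug.
    static class LocatedStatus extends Status {
        String location = "";
    }

    static void roundTrip(Status src, Status dst) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        src.write(new DataOutputStream(buf));
        dst.readFields(new DataInputStream(
            new ByteArrayInputStream(buf.toByteArray())));
    }

    public static void main(String[] args) throws IOException {
        LocatedStatus in = new LocatedStatus();
        in.length = 42;
        in.location = "rack1/dn3";
        LocatedStatus out = new LocatedStatus();
        roundTrip(in, out);
        // length survives the round trip, location does not:
        // read(status.write) is not equal to the original status.
        System.out.println(out.length + " '" + out.location + "'");
    }
}
```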



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6634) inotify in HDFS

2014-09-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6634:
--
   Resolution: Fixed
Fix Version/s: 2.6.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, nice work James!

 inotify in HDFS
 ---

 Key: HDFS-6634
 URL: https://issues.apache.org/jira/browse/HDFS-6634
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client, namenode, qjm
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 2.6.0

 Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, 
 HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.7.patch, HDFS-6634.8.patch, 
 HDFS-6634.9.patch, HDFS-6634.patch, inotify-design.2.pdf, 
 inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, 
 inotify-intro.2.pdf, inotify-intro.pdf


 Design a mechanism for applications like search engines to access the HDFS 
 edit stream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6954) With crypto, no native lib systems are too verbose

2014-09-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6954:
--
   Resolution: Fixed
Fix Version/s: 2.6.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks Charles.

 With crypto, no native lib systems are too verbose
 --

 Key: HDFS-6954
 URL: https://issues.apache.org/jira/browse/HDFS-6954
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Assignee: Charles Lamb
 Fix For: 2.6.0

 Attachments: HDFS-6954.001.patch, HDFS-6954.002.patch, 
 HDFS-6954.003.patch


 Running commands on a machine without a native library results in:
 {code}
 $ bin/hdfs dfs -put /etc/hosts /tmp
 14/08/27 07:16:10 WARN util.NativeCodeLoader: Unable to load native-hadoop 
 library for your platform... using builtin-java classes where applicable
 14/08/27 07:16:11 WARN crypto.CryptoCodec: Crypto codec 
 org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
 14/08/27 07:16:11 INFO hdfs.DFSClient: No KeyProvider found.
 {code}
 This is way too much.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable

2014-09-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6984:
---
Status: Patch Available  (was: Open)

 In Hadoop 3, make FileStatus no longer a Writable
 -

 Key: HDFS-6984
 URL: https://issues.apache.org/jira/browse/HDFS-6984
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6984.001.patch


 FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
 to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
 have the protobuf {{HdfsFileStatusProto}}, which serves to serialize this 
 information.  The protobuf form is preferable, since it allows us to add new 
 fields in a backwards-compatible way.  Another issue is that a lot of 
 subclasses of FileStatus already don't override the Writable methods of the 
 superclass, breaking the interface contract that read(status.write) should 
 equal the original status.
 In Hadoop 3, we should just make FileStatus no longer a Writable so that we 
 don't have to deal with these issues.  It's probably too late to do this in 
 Hadoop 2, since user code may be relying on the ability to use the Writable 
 methods on FileStatus objects there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable

2014-09-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6984:
---
Attachment: HDFS-6984.001.patch

 In Hadoop 3, make FileStatus no longer a Writable
 -

 Key: HDFS-6984
 URL: https://issues.apache.org/jira/browse/HDFS-6984
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6984.001.patch


 FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
 to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
 have the protobuf {{HdfsFileStatusProto}} which serves to serialize this 
 information.  The protobuf form is preferable, since it allows us to add new 
 fields in a backwards-compatible way.  Another issue is that a lot of 
 subclasses of FileStatus already don't override the Writable methods of the 
 superclass, breaking the interface contract that read(status.write) should be 
 equal to the original status.
 In Hadoop 3, we should just make FileStatus no longer a writable so that we 
 don't have to deal with these issues.  It's probably too late to do this in 
 Hadoop 2, since user code may be relying on the ability to use the Writable 
 methods on FileStatus objects there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118784#comment-14118784
 ] 

Colin Patrick McCabe commented on HDFS-6984:


I don't anticipate any maintenance issues from having this change in Hadoop 3 
but not in Hadoop 2.x.  We already are unable to change the write/read methods 
of that class due to compatibility woes, so that code is effectively frozen.  
This patch just drops the frozen code out of Hadoop 3.  The main motivation is 
that this will make it easier for us to add more stuff to FileStatus in the 
future without worrying about the read/write methods of the Writable.

 In Hadoop 3, make FileStatus no longer a Writable
 -

 Key: HDFS-6984
 URL: https://issues.apache.org/jira/browse/HDFS-6984
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6984.001.patch


 FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
 to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
 have the protobuf {{HdfsFileStatusProto}} which serves to serialize this 
 information.  The protobuf form is preferable, since it allows us to add new 
 fields in a backwards-compatible way.  Another issue is that a lot of 
 subclasses of FileStatus already don't override the Writable methods of the 
 superclass, breaking the interface contract that read(status.write) should be 
 equal to the original status.
 In Hadoop 3, we should just make FileStatus no longer a writable so that we 
 don't have to deal with these issues.  It's probably too late to do this in 
 Hadoop 2, since user code may be relying on the ability to use the Writable 
 methods on FileStatus objects there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6982) nntop: top­-like tool for name node users

2014-09-02 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118786#comment-14118786
 ] 

Philip Zeyliger commented on HDFS-6982:
---

Looks neat!

Since you're proposing this for inclusion in HDFS proper, I'd suggest that the 
implementation skips the auditing stuff and just works directly.  Obviously, 
you would introduce a configuration property turning the feature on or off as 
desired, since people may have memory management concerns or performance 
concerns.  It would also be useful to have a structured way to get the output 
for monitoring tools, which it sounds like you already have.  Could you give 
some sample output for that mechanism?


 nntop: top­-like tool for name node users
 -

 Key: HDFS-6982
 URL: https://issues.apache.org/jira/browse/HDFS-6982
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Maysam Yabandeh
 Attachments: nntop-design-v1.pdf


 In this jira we motivate the need for nntop, a tool that, similarly to what 
 top does in Linux, gives the list of top users of the HDFS name node and 
 gives insight about which users are sending majority of each traffic type to 
 the name node. This information turns out to be the most critical when the 
 name node is under pressure and the HDFS admin needs to know which user is 
 hammering the name node and with what kind of requests. Here we present the 
 design of nntop, which has been in production at Twitter for the past 10 
 months. nntop proved to have low CPU overhead (< 2% in a cluster of 4K 
 nodes), a low memory footprint (less than a few MB), and an efficient write 
 path (only two hash lookups for updating a metric).
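The "two hash lookups per metric update" write path described above can be sketched in plain Java; the class and method names here are hypothetical illustrations, not nntop's actual code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of a per-user, per-operation counter whose write path
 *  costs exactly one lookup for the user plus one for the operation type. */
class TopUserMetrics {
    private final ConcurrentHashMap<String, ConcurrentHashMap<String, LongAdder>> counts =
        new ConcurrentHashMap<>();

    /** Record one operation: lookup #1 finds the user's map, lookup #2 the op counter. */
    void record(String user, String op) {
        counts.computeIfAbsent(user, u -> new ConcurrentHashMap<>())
              .computeIfAbsent(op, o -> new LongAdder())
              .increment();
    }

    /** Read a counter; returns 0 for users/ops never seen. */
    long get(String user, String op) {
        ConcurrentHashMap<String, LongAdder> perUser = counts.get(user);
        if (perUser == null) return 0;
        LongAdder a = perUser.get(op);
        return a == null ? 0 : a.sum();
    }
}
```

LongAdder keeps the increment cheap under contention, which matters when the name node is under the kind of pressure the description talks about.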



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6921) Add LazyPersist flag to FileStatus

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118798#comment-14118798
 ] 

Colin Patrick McCabe commented on HDFS-6921:


Interesting discussion.  I don't think adding this field will cause DistCp to 
fail.  DistCp doesn't currently check this field, so it will have no idea 
whether it's there or not.

It is a little concerning that FileStatus#read(FileStatus.write) will no longer 
return the original object (we can't round trip it), but this is already true 
of many (all?) of the subclasses of FileStatus, like LocatedFileStatus.  They 
just don't bother serializing the new fields they add, so they already have 
this problem.

I filed HDFS-6984 to remove the Writable interface from FileStatus completely 
in Hadoop 3.0.

In the meantime, we could support round tripping FileStatus by packing the 
isLazyPersist bit into the sign bit of the replication field.  Would that 
address the compatibility concerns?
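The sign-bit idea floated above can be sketched like this; the class and method names are hypothetical, but the bit arithmetic is the standard trick: replication is always non-negative, so the top bit of the 16-bit field is free to carry the flag.

```java
/** Hypothetical sketch of packing an isLazyPersist bit into the sign bit
 *  of a short replication field, so the wire format stays the same width. */
class ReplicationPacking {
    /** Set the sign bit when lazyPersist is true; replication must be >= 0. */
    static short pack(short replication, boolean lazyPersist) {
        return lazyPersist ? (short) (replication | 0x8000) : replication;
    }

    /** Mask off the sign bit to recover the original replication factor. */
    static short unpackReplication(short packed) {
        return (short) (packed & 0x7FFF);
    }

    /** The flag is simply whether the sign bit is set. */
    static boolean unpackLazyPersist(short packed) {
        return (packed & 0x8000) != 0;
    }
}
```

Old readers would see a negative replication value for lazy-persist files, which is the compatibility question such a scheme would have to answer.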

bq. Another issue is that lazy persist should be internal to the HDFS itself, 
it is much better to keep it fully inside.

If there is nothing in FileStatus, how can users find out this information?  
Perhaps by using an extended attribute?  That might actually be a good choice.

 Add LazyPersist flag to FileStatus
 --

 Key: HDFS-6921
 URL: https://issues.apache.org/jira/browse/HDFS-6921
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: HDFS-6581

 Attachments: HDFS-6921.01.patch, HDFS-6921.02.patch


 A new flag will be added to FileStatus to indicate that a file can be lazily 
 persisted to disk i.e. trading reduced durability for better write 
 performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118807#comment-14118807
 ] 

Hadoop QA commented on HDFS-6984:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12665993/HDFS-6984.001.patch
  against trunk revision a0ccf83.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7874//console

This message is automatically generated.

 In Hadoop 3, make FileStatus no longer a Writable
 -

 Key: HDFS-6984
 URL: https://issues.apache.org/jira/browse/HDFS-6984
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6984.001.patch


 FileStatus was a Writable in Hadoop 2 and earlier.  Originally, we used this 
 to serialize it and send it over the wire.  But in Hadoop 2 and later, we 
 have the protobuf {{HdfsFileStatusProto}} which serves to serialize this 
 information.  The protobuf form is preferable, since it allows us to add new 
 fields in a backwards-compatible way.  Another issue is that already a lot of 
 subclasses of FileStatus don't override the Writable methods of the 
 superclass, breaking the interface contract that read(status.write) should be 
 equal to the original status.
 In Hadoop 3, we should just make FileStatus no longer a writable so that we 
 don't have to deal with these issues.  It's probably too late to do this in 
 Hadoop 2, since user code may be relying on the ability to use the Writable 
 methods on FileStatus objects there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6634) inotify in HDFS

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118813#comment-14118813
 ] 

Colin Patrick McCabe commented on HDFS-6634:


Great work, James!

 inotify in HDFS
 ---

 Key: HDFS-6634
 URL: https://issues.apache.org/jira/browse/HDFS-6634
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client, namenode, qjm
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 2.6.0

 Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, 
 HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.7.patch, HDFS-6634.8.patch, 
 HDFS-6634.9.patch, HDFS-6634.patch, inotify-design.2.pdf, 
 inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, 
 inotify-intro.2.pdf, inotify-intro.pdf


 Design a mechanism for applications like search engines to access the HDFS 
 edit stream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118827#comment-14118827
 ] 

Colin Patrick McCabe commented on HDFS-4257:


Yongjun, I'm going to file a follow-up JIRA to address your comments.

+1, will commit shortly.

 The ReplaceDatanodeOnFailure policies could have a forgiving option
 ---

 Key: HDFS-4257
 URL: https://issues.apache.org/jira/browse/HDFS-4257
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Harsh J
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h4257_20140325.patch, h4257_20140325b.patch, 
 h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch


 Similar question has previously come over HDFS-3091 and friends, but the 
 essential problem is: "Why can't I write to my cluster of 3 nodes, when I 
 just have 1 node available at a point in time?"
 The policies cover the 4 options, with {{Default}} being default:
 {{Disable}} - Disables the whole replacement concept by throwing out an 
 error (at the server) or acts as {{Never}} at the client.
 {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in 
 many cases).
 {{Default}} - Replace based on a few conditions, but whose minimum never 
 touches 1. We always fail if only one DN remains and none others can be added.
 {{Always}} - Replace no matter what. Fail if can't replace.
 Would it not make sense to have an option similar to Always/Default, where 
 despite _trying_, if it isn't possible to have > 1 DN in the pipeline, do not 
 fail. I think that is what the former write behavior was, and what fit with 
 the minimum replication factor allowed value.
 Why is it grossly wrong to pass a write from a client for a block with just 1 
 remaining replica in the pipeline (the minimum of 1 grows with the 
 replication factor demanded from the write), when replication is taken care 
 of immediately afterwards? How often have we seen missing blocks arise out of 
 allowing this + facing a big rack(s) failure or so?
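The four policies listed above can be sketched as an enum; the names match the description, but the {{Default}} condition below is an illustrative assumption, not the actual ReplaceDatanodeOnFailure logic (the real condition also considers hflush/append state).

```java
/** Hypothetical sketch of the pipeline-replacement policies described above. */
enum ReplacePolicy {
    DISABLE, NEVER, DEFAULT, ALWAYS;

    /** Should a failed datanode in the write pipeline be replaced?
     *  @param replication the requested replication factor
     *  @param remaining   datanodes still alive in the pipeline */
    boolean shouldReplace(int replication, int remaining) {
        switch (this) {
            case DISABLE:
            case NEVER:
                return false;      // never replace (DISABLE errors server-side)
            case ALWAYS:
                return true;       // replace no matter what; fail if we can't
            default:
                // DEFAULT (illustrative): replace once the pipeline has
                // shrunk to half or less of the requested replication.
                return remaining <= replication / 2;
        }
    }
}
```

The "forgiving" option proposed in this JIRA would sit between these: try to replace, but don't fail the write when replacement is impossible and at least one DN remains.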



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-6985) Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure

2014-09-02 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-6985:
--

 Summary: Add final keywords, documentation, etc. to 
ReplaceDatanodeOnFailure
 Key: HDFS-6985
 URL: https://issues.apache.org/jira/browse/HDFS-6985
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor


* use final qualifier consistently in {{ReplaceDatanodeOnFailure#Condition 
classes}}

* add a debug log message in the DFSClient explaining which pipeline failure 
policy is being used.

* add JavaDoc for ReplaceDatanodeOnFailure

* documentation dfs.client.block.write.replace-datanode-on-failure.best-effort 
should make it clear that the configuration key refers to pipeline 
recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option

2014-09-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4257:
---
  Resolution: Fixed
   Fix Version/s: 2.6.0
Target Version/s: 2.6.0
  Status: Resolved  (was: Patch Available)

 The ReplaceDatanodeOnFailure policies could have a forgiving option
 ---

 Key: HDFS-4257
 URL: https://issues.apache.org/jira/browse/HDFS-4257
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Harsh J
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.6.0

 Attachments: h4257_20140325.patch, h4257_20140325b.patch, 
 h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch


 Similar question has previously come over HDFS-3091 and friends, but the 
 essential problem is: "Why can't I write to my cluster of 3 nodes, when I 
 just have 1 node available at a point in time?"
 The policies cover the 4 options, with {{Default}} being default:
 {{Disable}} - Disables the whole replacement concept by throwing out an 
 error (at the server) or acts as {{Never}} at the client.
 {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in 
 many cases).
 {{Default}} - Replace based on a few conditions, but whose minimum never 
 touches 1. We always fail if only one DN remains and none others can be added.
 {{Always}} - Replace no matter what. Fail if can't replace.
 Would it not make sense to have an option similar to Always/Default, where 
 despite _trying_, if it isn't possible to have > 1 DN in the pipeline, do not 
 fail. I think that is what the former write behavior was, and what fit with 
 the minimum replication factor allowed value.
 Why is it grossly wrong to pass a write from a client for a block with just 1 
 remaining replica in the pipeline (the minimum of 1 grows with the 
 replication factor demanded from the write), when replication is taken care 
 of immediately afterwards? How often have we seen missing blocks arise out of 
 allowing this + facing a big rack(s) failure or so?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-6986) DistributedFileSystem must get deletagiontokens from configured KeyProvider

2014-09-02 Thread Alejandro Abdelnur (JIRA)
Alejandro Abdelnur created HDFS-6986:


 Summary: DistributedFileSystem must get deletagiontokens from 
configured KeyProvider
 Key: HDFS-6986
 URL: https://issues.apache.org/jira/browse/HDFS-6986
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.6.0
Reporter: Alejandro Abdelnur


{{KeyProvider}} via {{KeyProviderDelegationTokenExtension}} provides delegation 
tokens. {{DistributedFileSystem}} should augment the HDFS delegation tokens 
with the keyprovider ones so tasks can interact with keyprovider when it is a 
client/server impl (KMS).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-6987) Move CipherSuite xattr information up to the encryption zone root

2014-09-02 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-6987:
-

 Summary: Move CipherSuite xattr information up to the encryption 
zone root
 Key: HDFS-6987
 URL: https://issues.apache.org/jira/browse/HDFS-6987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 2.6.0
Reporter: Andrew Wang
Assignee: Zhe Zhang


All files within a single EZ need to be encrypted with the same CipherSuite. 
Because of this, I think we can store the CipherSuite once in the EZ rather 
than on each file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6986) DistributedFileSystem must get delegation tokens from configured KeyProvider

2014-09-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6986:
--
Summary: DistributedFileSystem must get delegation tokens from configured 
KeyProvider  (was: DistributedFileSystem must get deletagiontokens from 
configured KeyProvider)

 DistributedFileSystem must get delegation tokens from configured KeyProvider
 

 Key: HDFS-6986
 URL: https://issues.apache.org/jira/browse/HDFS-6986
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.6.0
Reporter: Alejandro Abdelnur

 {{KeyProvider}} via {{KeyProviderDelegationTokenExtension}} provides 
 delegation tokens. {{DistributedFileSystem}} should augment the HDFS 
 delegation tokens with the keyprovider ones so tasks can interact with 
 keyprovider when it is a client/server impl (KMS).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-6971) Bounded staleness of EDEK caches on the NN

2014-09-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang reassigned HDFS-6971:
-

Assignee: Zhe Zhang  (was: Andrew Wang)

 Bounded staleness of EDEK caches on the NN
 --

 Key: HDFS-6971
 URL: https://issues.apache.org/jira/browse/HDFS-6971
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 2.5.0
Reporter: Andrew Wang
Assignee: Zhe Zhang

 The EDEK cache on the NN can hold onto keys after the admin has rolled the 
 key. It'd be good to time-bound the caches, perhaps also providing an 
 explicit flush command.
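The time-bounding plus explicit-flush behavior described above can be sketched with a plain map of timestamped entries; the class is a hypothetical illustration (the real NN cache would sit behind the KeyProvider extension), but it shows the two operations the description asks for.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical time-bounded cache: each entry remembers when it was filled,
 *  and reads older than maxAgeMs are treated as misses and evicted lazily. */
class TimeBoundedCache<K, V> {
    private static class Entry<V> {
        final V value;
        final long filledAtMs;
        Entry(V value, long filledAtMs) {
            this.value = value;
            this.filledAtMs = filledAtMs;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long maxAgeMs;

    TimeBoundedCache(long maxAgeMs) { this.maxAgeMs = maxAgeMs; }

    void put(K key, V value) {
        map.put(key, new Entry<>(value, System.currentTimeMillis()));
    }

    /** Returns null for both a true miss and a stale entry. */
    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null || System.currentTimeMillis() - e.filledAtMs > maxAgeMs) {
            map.remove(key);   // bound staleness: drop entries past maxAgeMs
            return null;
        }
        return e.value;
    }

    /** The explicit flush command the description suggests, e.g. after a key roll. */
    void flush() { map.clear(); }
}
```

A production version would likely use an existing expiring cache (e.g. Guava's CacheBuilder with expireAfterWrite) rather than hand-rolling this.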



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6986) DistributedFileSystem must get delegation tokens from configured KeyProvider

2014-09-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-6986:
--
Assignee: Zhe Zhang

 DistributedFileSystem must get delegation tokens from configured KeyProvider
 

 Key: HDFS-6986
 URL: https://issues.apache.org/jira/browse/HDFS-6986
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.6.0
Reporter: Alejandro Abdelnur
Assignee: Zhe Zhang

 {{KeyProvider}} via {{KeyProviderDelegationTokenExtension}} provides 
 delegation tokens. {{DistributedFileSystem}} should augment the HDFS 
 delegation tokens with the keyprovider ones so tasks can interact with 
 keyprovider when it is a client/server impl (KMS).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6987) Move CipherSuite xattr information up to the encryption zone root

2014-09-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118884#comment-14118884
 ] 

Andrew Wang commented on HDFS-6987:
---

It'd be good to also protobuf the EZ root xattr as part of this, since that's 
worth doing anyway.

 Move CipherSuite xattr information up to the encryption zone root
 -

 Key: HDFS-6987
 URL: https://issues.apache.org/jira/browse/HDFS-6987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 2.6.0
Reporter: Andrew Wang
Assignee: Zhe Zhang

 All files within a single EZ need to be encrypted with the same CipherSuite. 
 Because of this, I think we can store the CipherSuite once in the EZ rather 
 than on each file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6985) Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure

2014-09-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6985:
---
Description: 
* use final qualifier and variable names consistently in 
{{ReplaceDatanodeOnFailure#Condition classes}}

* add a debug log message in the DFSClient explaining which pipeline failure 
policy is being used.

* add JavaDoc for ReplaceDatanodeOnFailure

* documentation dfs.client.block.write.replace-datanode-on-failure.best-effort 
should make it clear that the configuration key refers to pipeline 
recovery.

  was:
* use final qualifier consistently in {{ReplaceDatanodeOnFailure#Condition 
classes}}

* add a debug log message in the DFSClient explaining which pipeline failure 
policy is being used.

* add JavaDoc for ReplaceDatanodeOnFailure

* documentation dfs.client.block.write.replace-datanode-on-failure.best-effort 
should make it clear that the configuration key refers to pipeline 
recovery.


 Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure
 ---

 Key: HDFS-6985
 URL: https://issues.apache.org/jira/browse/HDFS-6985
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6985.001.patch


 * use final qualifier and variable names consistently in 
 {{ReplaceDatanodeOnFailure#Condition classes}}
 * add a debug log message in the DFSClient explaining which pipeline failure 
 policy is being used.
 * add JavaDoc for ReplaceDatanodeOnFailure
 * documentation 
 dfs.client.block.write.replace-datanode-on-failure.best-effort should make it 
 clear that the configuration key refers to pipeline recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6985) Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure

2014-09-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6985:
---
Attachment: HDFS-6985.001.patch

I changed it so that all implementations of 
{{ReplaceDatanodeOnFailure#Condition}} use the same names for variables (i.e. 
short replication, final DatanodeInfo[] existings, int nExistings rather than 
short replication, final DatanodeInfo[] existings, int n)

I didn't use the 'final' qualifier on primitives, since most Hadoop code 
doesn't do that.  I added the 'final' qualifier to all uses of the 'existings' 
array.

I added some JavaDoc to {{ReplaceDatanodeOnFailure#Condition}}.

Clarified that dfs.client.block.write.replace-datanode-on-failure.best-effort 
applies to pipeline recovery.

 Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure
 ---

 Key: HDFS-6985
 URL: https://issues.apache.org/jira/browse/HDFS-6985
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-6985.001.patch


 * use final qualifier consistently in {{ReplaceDatanodeOnFailure#Condition 
 classes}}
 * add a debug log message in the DFSClient explaining which pipeline failure 
 policy is being used.
 * add JavaDoc for ReplaceDatanodeOnFailure
 * documentation 
 dfs.client.block.write.replace-datanode-on-failure.best-effort should make it 
 clear that the configuration key refers to pipeline recovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118895#comment-14118895
 ] 

Colin Patrick McCabe commented on HDFS-4257:


Yongjun, check out HDFS-6985 where I addressed your comments.  Thanks, all.

 The ReplaceDatanodeOnFailure policies could have a forgiving option
 ---

 Key: HDFS-4257
 URL: https://issues.apache.org/jira/browse/HDFS-4257
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Harsh J
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.6.0

 Attachments: h4257_20140325.patch, h4257_20140325b.patch, 
 h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch


 Similar question has previously come over HDFS-3091 and friends, but the 
 essential problem is: "Why can't I write to my cluster of 3 nodes, when I 
 just have 1 node available at a point in time?"
 The policies cover the 4 options, with {{Default}} being default:
 {{Disable}} - Disables the whole replacement concept by throwing out an 
 error (at the server) or acts as {{Never}} at the client.
 {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in 
 many cases).
 {{Default}} - Replace based on a few conditions, but whose minimum never 
 touches 1. We always fail if only one DN remains and none others can be added.
 {{Always}} - Replace no matter what. Fail if can't replace.
 Would it not make sense to have an option similar to Always/Default, where 
 despite _trying_, if it isn't possible to have > 1 DN in the pipeline, do not 
 fail. I think that is what the former write behavior was, and what fit with 
 the minimum replication factor allowed value.
 Why is it grossly wrong to pass a write from a client for a block with just 1 
 remaining replica in the pipeline (the minimum of 1 grows with the 
 replication factor demanded from the write), when replication is taken care 
 of immediately afterwards? How often have we seen missing blocks arise out of 
 allowing this + facing a big rack(s) failure or so?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6942) Fix typos in log messages

2014-09-02 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118897#comment-14118897
 ] 

Haohui Mai commented on HDFS-6942:
--

+1. I'll commit it shortly.

 Fix typos in log messages
 -

 Key: HDFS-6942
 URL: https://issues.apache.org/jira/browse/HDFS-6942
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial
  Labels: newbie
 Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch


 There are a bunch of typos in log messages. HADOOP-10946 was initially 
 created, but may have failed due to being in multiple components. Try fixing 
 typos on a per-component basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6986) DistributedFileSystem must get delegation tokens from configured KeyProvider

2014-09-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118902#comment-14118902
 ] 

Alejandro Abdelnur commented on HDFS-6986:
--

The changes in {{DistributedFileSystem.java}} should be something like:

{code}
  @Override
  public Token<?>[] addDelegationTokens(String renewer, Credentials credentials)
      throws IOException {
    Token<?>[] tokens = super.addDelegationTokens(renewer, credentials);
    if (dfs.getKeyProvider() != null) {
      KeyProviderDelegationTokenExtension keyProviderDelegationTokenExtension =
          KeyProviderDelegationTokenExtension.
              createKeyProviderDelegationTokenExtension(dfs.getKeyProvider());
      Token<?>[] kpTokens = keyProviderDelegationTokenExtension.
          addDelegationTokens(renewer, credentials);
      if (tokens != null && kpTokens != null) {
        // concatenate the HDFS tokens and the KeyProvider tokens
        Token<?>[] all = new Token<?>[tokens.length + kpTokens.length];
        System.arraycopy(tokens, 0, all, 0, tokens.length);
        System.arraycopy(kpTokens, 0, all, tokens.length, kpTokens.length);
        tokens = all;
      } else {
        tokens = (tokens != null) ? tokens : kpTokens;
      }
    }
    return tokens;
  }
{code}

And {{DFSClient}} should expose the keyprovider via a {{getKeyProvider()}} 
method.



 DistributedFileSystem must get delegation tokens from configured KeyProvider
 

 Key: HDFS-6986
 URL: https://issues.apache.org/jira/browse/HDFS-6986
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: security
Affects Versions: 2.6.0
Reporter: Alejandro Abdelnur
Assignee: Zhe Zhang

 {{KeyProvider}} via {{KeyProviderDelegationTokenExtension}} provides 
 delegation tokens. {{DistributedFileSystem}} should augment the HDFS 
 delegation tokens with the keyprovider ones so tasks can interact with 
 keyprovider when it is a client/server impl (KMS).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.

2014-09-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118915#comment-14118915
 ] 

Jing Zhao commented on HDFS-6886:
-

The patch looks pretty good to me. Besides the comment from Vinay, only two 
minor comments:
# Nit: Let's still keep toLogRpcIds as the last parameter in 
FSEditLog#logOpenFile (i.e., move overwrite before toLogRpcIds)
# It will be better to have an overwrite transaction for TestOfflineEditsViewer

 Use single editlog record for creating file + overwrite.
 

 Key: HDFS-6886
 URL: https://issues.apache.org/jira/browse/HDFS-6886
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
 Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, 
 HDFS-6886.003.patch, editsStored


 As discussed in HDFS-6871, as [~jingzhao] and [~cmccabe]'s suggestion, we 
 could do further improvement to use one editlog record for creating file + 
 overwrite in this JIRA. We could record the overwrite flag in editlog for 
 creating file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118932#comment-14118932
 ] 

Xiaoyu Yao commented on HDFS-6930:
--

+1

Can you check if capacity > 0? It can be 0 when the RAM_DISK volume is allowed 
to be added/removed dynamically:

{code}
 int percentFree = (int) (free * 100 / capacity);
{code}
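The quoted division throws ArithmeticException when capacity is 0. A guard along these lines (the class and method names here are hypothetical, not the patch's actual code) keeps the computation safe:

```java
/** Hypothetical guard for the division quoted above: a RAM_DISK volume that
 *  has been removed dynamically can report capacity == 0, which would make
 *  the raw division throw ArithmeticException. */
class FreeSpace {
    static int percentFree(long free, long capacity) {
        if (capacity <= 0) {
            return 0;   // treat an empty/removed volume as having no free space
        }
        return (int) (free * 100 / capacity);
    }
}
```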

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118936#comment-14118936
 ] 

Hadoop QA commented on HDFS-6951:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12665981/HDFS-6951.004.patch
  against trunk revision 6595e92.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
  
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7873//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7873//console

This message is automatically generated.

 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Charles Lamb
 Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, 
 HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, 
 HDFS-6951.004.patch, editsStored


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 I could reproduce this on a pseudo-distributed cluster:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6977) Delete all copies when a block is deleted from the block space

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6977:

Attachment: HDFS-6977.02.patch

Slight update to remove one call to {{Block.metaToBlockFile}}.

 Delete all copies when a block is deleted from the block space
 --

 Key: HDFS-6977
 URL: https://issues.apache.org/jira/browse/HDFS-6977
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Xiaoyu Yao
Assignee: Arpit Agarwal
 Attachments: HDFS-6977.01.patch, HDFS-6977.02.patch


 When a block is deleted from RAM disk we should also delete the copies 
 written to lazyPersist/.
 Reported by [~xyao]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6942) Fix typos in log messages

2014-09-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6942:
-
   Resolution: Fixed
Fix Version/s: 2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~rchiang] for the 
contribution.

 Fix typos in log messages
 -

 Key: HDFS-6942
 URL: https://issues.apache.org/jira/browse/HDFS-6942
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial
  Labels: newbie
 Fix For: 2.6.0

 Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch


 There are a bunch of typos in log messages. HADOOP-10946 was initially 
 created, but may have failed due to being in multiple components. Try fixing 
 typos on a per-component basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option

2014-09-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118949#comment-14118949
 ] 

Yongjun Zhang commented on HDFS-4257:
-

Thanks, Colin, for reviewing and following up.

Hi [~szetszwo], thanks for fixing the problem here. Colin created HDFS-6985 to 
address the comments I made earlier. Would you please take a look and see 
whether it looks good to you when you have time?  Thanks.



 The ReplaceDatanodeOnFailure policies could have a forgiving option
 ---

 Key: HDFS-4257
 URL: https://issues.apache.org/jira/browse/HDFS-4257
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Harsh J
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.6.0

 Attachments: h4257_20140325.patch, h4257_20140325b.patch, 
 h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch


 A similar question has previously come up in HDFS-3091 and friends, but the 
 essential problem is: "Why can't I write to my cluster of 3 nodes, when I 
 just have 1 node available at a point in time?"
 The policies cover the 4 options, with {{Default}} being default:
 {{Disable}} - Disables the whole replacement concept by throwing out an 
 error (at the server) or acts as {{Never}} at the client.
 {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in 
 many cases).
 {{Default}} - Replace based on a few conditions, but whose minimum never 
 touches 1. We always fail if only one DN remains and none others can be added.
 {{Always}} - Replace no matter what. Fail if can't replace.
 Would it not make sense to have an option similar to Always/Default, where 
 despite _trying_, if it isn't possible to have > 1 DN in the pipeline, do not 
 fail. I think that is what the former write behavior was, and what fit with 
 the minimum replication factor allowed value.
 Why is it grossly wrong to pass a write from a client for a block with just 1 
 remaining replica in the pipeline (the minimum of 1 grows with the 
 replication factor demanded from the write), when replication is taken care 
 of immediately afterwards? How often have we seen missing blocks arise out of 
 allowing this + facing a big rack(s) failure or so?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6956) Allow dynamically changing the tracing level in Hadoop servers

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118963#comment-14118963
 ] 

Colin Patrick McCabe commented on HDFS-6956:


daemonlog is about log4j; this is about tracing.  HTrace can send trace events 
to a system like Zipkin, which is more useful than just catting them to a file 
or to log4j.

I imagine the implementation might be kind of like daemonlog, though: a 
command that you could run to enable tracing while in production.
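A minimal sketch of the runtime-toggleable flag such an RPC could flip; all names here are hypothetical illustrations, not Hadoop's or HTrace's actual API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch only: a tracing switch that a superuser-only RPC
// could toggle at runtime without restarting the NameNode or DataNode.
public class TraceSwitch {
    private static final AtomicBoolean tracingEnabled = new AtomicBoolean(false);

    // Would be invoked by the hypothetical RPC handler.
    static void setTracing(boolean on) { tracingEnabled.set(on); }

    // Cheap check on hot paths before emitting a trace span.
    static boolean isTracing() { return tracingEnabled.get(); }
}
```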

 Allow dynamically changing the tracing level in Hadoop servers
 --

 Key: HDFS-6956
 URL: https://issues.apache.org/jira/browse/HDFS-6956
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Colin Patrick McCabe

 We should allow users to dynamically change the tracing level in Hadoop 
 servers.  The easiest way to do this is probably to have an RPC accessible 
 only to the superuser that changes tracing settings.  This would allow us to 
 turn on and off tracing on the NameNode, DataNode, etc. at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6982) nntop: top­-like tool for name node users

2014-09-02 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118971#comment-14118971
 ] 

Maysam Yabandeh commented on HDFS-6982:
---

Thanks [~philip]. I agree with you. I was actually planning to skip the audit 
log tailing altogether to keep the patch simple. If there is interest in the 
future, I can submit a separate patch for that.

The metric key format is {{operation.user}}. Here is a sample output from the 
JMX interface:
{code}
[myabandeh@smf1-aro-39-sr1(hadoop-tst-nn) ~]$ curl localhost:12333/jmx | grep 
Hadoop:service=nntop,name=topusers -B1 -A8
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
}, {
    "name" : "Hadoop:service=nntop,name=topusers",
    "modelerType" : "topusers",
    "tag.Context" : "namenode",
    "tag.ProcessName" : "DummyProcessName",
    "tag.SessionId" : "DummySessionId",
    "tag.Hostname" : "hhh",
    "delete.xxx" : 1,
    "setPermission.ALL" : 0,
    "getfileinfo.ALL" : 3159,
{code}
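A hypothetical sketch of how {{operation.user}}-keyed counters like the ones above could be kept; the class and method names are illustrative, not the actual nntop code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative per-user, per-operation counters keyed as "operation.user".
public class TopUserMetrics {
    private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // One map lookup plus one counter update per reported operation,
    // keeping the write path cheap.
    void report(String operation, String user) {
        counts.computeIfAbsent(operation + "." + user, k -> new AtomicLong())
              .incrementAndGet();
    }

    long count(String operation, String user) {
        AtomicLong c = counts.get(operation + "." + user);
        return c == null ? 0 : c.get();
    }
}
```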

 nntop: top­-like tool for name node users
 -

 Key: HDFS-6982
 URL: https://issues.apache.org/jira/browse/HDFS-6982
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Maysam Yabandeh
 Attachments: nntop-design-v1.pdf


 In this jira we motivate the need for nntop, a tool that, similarly to what 
 top does in Linux, gives the list of top users of the HDFS name node and 
 gives insight about which users are sending the majority of each traffic type to 
 the name node. This information turns out to be the most critical when the 
 name node is under pressure and the HDFS admin needs to know which user is 
 hammering the name node and with what kind of requests. Here we present the 
 design of nntop which has been in production at Twitter in the past 10 
 months. nntop proved to have low CPU overhead (< 2% in a cluster of 4K 
 nodes), low memory footprint (less than a few MB), and is quite efficient on 
 the write path (only two hash lookups for updating a metric).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6942) Fix typos in log messages

2014-09-02 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118996#comment-14118996
 ] 

Ray Chiang commented on HDFS-6942:
--

Thanks for the commit!  Glad to get this first batch of typos fixed.

 Fix typos in log messages
 -

 Key: HDFS-6942
 URL: https://issues.apache.org/jira/browse/HDFS-6942
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial
  Labels: newbie
 Fix For: 2.6.0

 Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch


 There are a bunch of typos in log messages. HADOOP-10946 was initially 
 created, but may have failed due to being in multiple components. Try fixing 
 typos on a per-component basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-02 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-6966:
--
Attachment: HDFS-6966.2.patch

Adding a new patch with more tests.

In the most recent patch we:

* Add HA test to verify standby NN tracks encryption zones.
* Assert null when calling getEncryptionZoneForPath on a nonexistent path.
* Verify success of renaming a dir and file within an encryption zone
* Run fsck on a system with encryption zones
* Add more snapshot unit testing. In particular, after snapshotting an 
encryption zone, remove the encryption zone and recreate the dir and take a 
snapshot. Verify that the new snapshot does not have an encryption zone. Delete 
the snapshots out of order and verify that the remaining snapshots have the 
correct encryption zone paths.
* Add tests for symlinks within the same encryption zone and within different 
encryption zones.
* Add test to run the OfflineImageViewer on a system of encryption zones.

Again, if it's better, I can merge some tests to save on MiniDFSCluster spin up 
and shutdown time.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch


 There are some more unit tests that can be added for test encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file

2014-09-02 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6705:
---
Attachment: HDFS-6705.002.patch

Submitting for a test run.


 Create an XAttr that disallows the HDFS admin from accessing a file
 ---

 Key: HDFS-6705
 URL: https://issues.apache.org/jira/browse/HDFS-6705
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode, security
Affects Versions: 3.0.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: HDFS-6705.001.patch, HDFS-6705.002.patch


 There needs to be an xattr that specifies that the HDFS admin can not access 
 a file. This is needed for m/r delegation tokens and data at rest encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-09-02 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119015#comment-14119015
 ] 

Charles Lamb commented on HDFS-6951:


org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

both fail on my local machine with and without the patch, so they are unrelated.

org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer - this 
is an expected failure until the testEdits file gets checked in. It passes on 
my machine.


 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Charles Lamb
 Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, 
 HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, 
 HDFS-6951.004.patch, editsStored


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 I could reproduce this on a pseudo-distributed cluster:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6959) make user home directory customizable

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119033#comment-14119033
 ] 

Colin Patrick McCabe commented on HDFS-6959:


bq. In that case, would you please take a look at rev 001 instead? the change 
there is restricted to HDFS. Thanks

OK, reviewing v1.

bq. +  private String home_dir_base = 
DFSConfigKeys.DFS_USER_HOME_BASE_DIR_DEFAULT;

Should be final

{code}
+<property>
+  <name>dfs.user.home.base.dir</name>
+  <value>/user</value>
+  <description>Base directory of user home.</description>
+</property>
{code}

This description is a bit terse.  Maybe something like: "The directory to 
prepend to the user name to get the user's home directory."

looks good aside from that
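The configurable lookup under review could be sketched as below; the default value and config key name mirror the patch, but the helper itself is a hypothetical simplification, not the actual HDFS code.

```java
// Illustrative sketch of a configurable home-directory base. The real patch
// reads the base from dfs.user.home.base.dir in the client configuration.
public class HomeDir {
    static final String DFS_USER_HOME_BASE_DIR_DEFAULT = "/user";

    static String homeDirectory(String baseDir, String shortUserName) {
        // Fall back to the default base when the key is unset.
        String base = (baseDir == null || baseDir.isEmpty())
            ? DFS_USER_HOME_BASE_DIR_DEFAULT : baseDir;
        return base + "/" + shortUserName;
    }
}
```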

 make user home directory customizable
 -

 Key: HDFS-6959
 URL: https://issues.apache.org/jira/browse/HDFS-6959
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Kevin Odell
Assignee: Yongjun Zhang
Priority: Minor
 Attachments: HADOOP-10334.001.patch, HADOOP-10334.002.patch, 
 HADOOP-10334.002.patch


 The path is currently hardcoded:
  public Path getHomeDirectory() {
    return makeQualified(new Path("/user/" + dfs.ugi.getShortUserName()));
  }
 It would be nice to have that as a customizable value.  
 Thank you



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119041#comment-14119041
 ] 

Colin Patrick McCabe commented on HDFS-6658:


I didn't realize the patches were up; don't forget to hit "Submit Patch"!

I'll try to check it out in the next few days-- thanks for your patience, guys.

 Namenode memory optimization - Block replicas list 
 ---

 Key: HDFS-6658
 URL: https://issues.apache.org/jira/browse/HDFS-6658
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.4.1
Reporter: Amir Langer
Assignee: Amir Langer
 Attachments: BlockListOptimizationComparison.xlsx, Namenode Memory 
 Optimizations - Block replicas list.docx


 Part of the memory consumed by every BlockInfo object in the Namenode is a 
 linked list of block references for every DatanodeStorageInfo (called 
 triplets). 
 We propose to change the way we store the list in memory. 
 Using primitive integer indexes instead of object references will reduce the 
 memory needed for every block replica (when compressed oops is disabled) and 
 in our new design the list overhead will be per DatanodeStorageInfo and not 
 per block replica.
 see attached design doc. for details and evaluation results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones

2014-09-02 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119097#comment-14119097
 ] 

Yi Liu commented on HDFS-6951:
--

Thanks [~clamb], it's OK for me.  +1 (non-binding).

 Saving namespace and restarting NameNode will remove existing encryption zones
 --

 Key: HDFS-6951
 URL: https://issues.apache.org/jira/browse/HDFS-6951
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0
Reporter: Stephen Chu
Assignee: Charles Lamb
 Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, 
 HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, 
 HDFS-6951.004.patch, editsStored


 Currently, when users save namespace and restart the NameNode, pre-existing 
 encryption zones will be wiped out.
 I could reproduce this on a pseudo-distributed cluster:
 * Create an encryption zone
 * List encryption zones and verify the newly created zone is present
 * Save the namespace
 * Kill and restart the NameNode
 * List the encryption zones and you'll find the encryption zone is missing
 I've attached a test case for {{TestEncryptionZones}} that reproduces this as 
 well. Removing the saveNamespace call will get the test to pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.

2014-09-02 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119105#comment-14119105
 ] 

Yi Liu commented on HDFS-6886:
--

Thanks [~vinayrpet] and [~jingzhao] for review, I will update the patch for 
your comments later.

 Use single editlog record for creating file + overwrite.
 

 Key: HDFS-6886
 URL: https://issues.apache.org/jira/browse/HDFS-6886
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
 Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, 
 HDFS-6886.003.patch, editsStored


 As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we 
 could make a further improvement and use one editlog record for creating file + 
 overwrite in this JIRA. We could record the overwrite flag in the editlog when 
 creating a file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6634) inotify in HDFS

2014-09-02 Thread James Thomas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119104#comment-14119104
 ] 

James Thomas commented on HDFS-6634:


Thanks guys

 inotify in HDFS
 ---

 Key: HDFS-6634
 URL: https://issues.apache.org/jira/browse/HDFS-6634
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client, namenode, qjm
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 2.6.0

 Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, 
 HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.7.patch, HDFS-6634.8.patch, 
 HDFS-6634.9.patch, HDFS-6634.patch, inotify-design.2.pdf, 
 inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, 
 inotify-intro.2.pdf, inotify-intro.pdf


 Design a mechanism for applications like search engines to access the HDFS 
 edit stream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).

2014-09-02 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119111#comment-14119111
 ] 

Yi Liu commented on HDFS-2975:
--

Thanks [~umamaheswararao] and [~vinayrpet] for the review.

 Rename with overwrite flag true can make NameNode to stuck in safemode on NN 
 (crash + restart).
 ---

 Key: HDFS-2975
 URL: https://issues.apache.org/jira/browse/HDFS-2975
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Yi Liu
 Attachments: HDFS-2975.001.patch


 When we rename the file with overwrite flag as true, it will delete the 
 destination file blocks. After deleting the blocks, whenever it releases the 
 fsNameSystem lock, NN can give the invalidation work to corresponding DNs to 
 delete the blocks.
 In parallel, it will sync the rename-related edits to the editlog file. If the 
 NN crashes at this step, before it syncs the edits, the NN can get stuck in 
 safemode on restart. This is because the blocks were already deleted from the 
 DNs as part of the invalidations, but the dst file still exists since the 
 rename edits were not persisted to the log file, and no DN will report those 
 blocks now.
 This is similar to HDFS-2815
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long

2014-09-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119140#comment-14119140
 ] 

Colin Patrick McCabe commented on HDFS-6036:


bq. The slf4j style uses {} as a template to avoid string concatenation, let's 
make sure that's used for all the LOG calls.

ok
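To illustrate the "{}" template style referred to above, here is a toy re-implementation of the placeholder substitution; slf4j's real formatter lives in org.slf4j.helpers.MessageFormatter, and the point of the style is that formatting only happens when the log level is enabled, avoiding eager string concatenation.

```java
// Toy sketch of slf4j-style "{}" placeholder substitution.
public class Slf4jStyle {
    static String format(String template, Object... args) {
        StringBuilder sb = new StringBuilder();
        int from = 0, argIdx = 0, at;
        // Replace each "{}" in order with the next argument.
        while ((at = template.indexOf("{}", from)) >= 0 && argIdx < args.length) {
            sb.append(template, from, at).append(args[argIdx++]);
            from = at + 2; // skip past the "{}" placeholder
        }
        return sb.append(template.substring(from)).toString();
    }
}
```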

bq. shouldDefer, the !anchored case, could we lower the LOG to debug?

ok

bq. In UncachingTask#run, there's a little ternary to add "Deferred " before. We 
could have it switch between "Deferred u" and "U" so the capitalization of 
"Uncaching" is always correct.

I just added an 'if' statement, to avoid making this too complex :)

bq. The default is set to 15 hours, isn't this a really long time? I expected 
something like a few mins.

Sorry, this was supposed to be 15 minutes.  Fixed.

bq. New keys should be added to hdfs-default.xml as well.

added

bq. Regarding the minimum polling rate, I'd prefer to abort if it's not 
configured correctly. Silent correction means bad conf values live a continued 
existence, and confs get copy pasted around.

ok

bq. Having the min be revocation/2 is also somewhat arbitrary, but I'll go 
along with it. Nyquist-ish?

Yeah, that was the motivation.

bq. It'd also be nice to print which client is holding on to anchors for too 
long.

Yeah, very good idea.  I implemented this...

 Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that 
 extend too long
 -

 Key: HDFS-6036
 URL: https://issues.apache.org/jira/browse/HDFS-6036
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, datanode
Affects Versions: 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6036.001.patch


 We should forcibly timeout misbehaving DFSClients that try to do no-checksum 
 reads that extend too long.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long

2014-09-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6036:
---
Attachment: HDFS-6036.002.patch

 Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that 
 extend too long
 -

 Key: HDFS-6036
 URL: https://issues.apache.org/jira/browse/HDFS-6036
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, datanode
Affects Versions: 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6036.001.patch, HDFS-6036.002.patch


 We should forcibly timeout misbehaving DFSClients that try to do no-checksum 
 reads that extend too long.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6886) Use single editlog record for creating file + overwrite.

2014-09-02 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-6886:
-
Attachment: (was: editsStored)

 Use single editlog record for creating file + overwrite.
 

 Key: HDFS-6886
 URL: https://issues.apache.org/jira/browse/HDFS-6886
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
 Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, 
 HDFS-6886.003.patch


 As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we 
 could make a further improvement and use one editlog record for creating file + 
 overwrite in this JIRA. We could record the overwrite flag in the editlog when 
 creating a file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6886) Use single editlog record for creating file + overwrite.

2014-09-02 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-6886:
-
Attachment: editsStored
HDFS-6886.004.patch

[~vinayrpet] and [~jingzhao], I update the patch for all your comments, thanks.

 Use single editlog record for creating file + overwrite.
 

 Key: HDFS-6886
 URL: https://issues.apache.org/jira/browse/HDFS-6886
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Critical
 Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, 
 HDFS-6886.003.patch, HDFS-6886.004.patch, editsStored


 As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we 
 could make a further improvement and use one editlog record for creating file + 
 overwrite in this JIRA. We could record the overwrite flag in the editlog when 
 creating a file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119203#comment-14119203
 ] 

Hadoop QA commented on HDFS-6036:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12666110/HDFS-6036.002.patch
  against trunk revision 08a9ac7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-hdfs-project/hadoop-hdfs 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7877//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7877//console

This message is automatically generated.

 Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that 
 extend too long
 -

 Key: HDFS-6036
 URL: https://issues.apache.org/jira/browse/HDFS-6036
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, datanode
Affects Versions: 2.5.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-6036.001.patch, HDFS-6036.002.patch


 We should forcibly timeout misbehaving DFSClients that try to do no-checksum 
 reads that extend too long.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119202#comment-14119202
 ] 

Hadoop QA commented on HDFS-6966:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12666087/HDFS-6966.2.patch
  against trunk revision 08a9ac7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7876//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7876//console

This message is automatically generated.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch


 There are some more unit tests that can be added for testing encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.





[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones

2014-09-02 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119216#comment-14119216
 ] 

Stephen Chu commented on HDFS-6966:
---

TestPipelinesFailover is unrelated to this patch.

 Add additional unit tests for encryption zones
 --

 Key: HDFS-6966
 URL: https://issues.apache.org/jira/browse/HDFS-6966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: encryption
Affects Versions: 3.0.0, 2.6.0
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch


 There are some more unit tests that can be added for testing encryption zones. 
 For example, more encryption zone + snapshot tests, running fsck on 
 encryption zones, and more.





[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119232#comment-14119232
 ] 

Hadoop QA commented on HDFS-6705:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12666088/HDFS-6705.002.patch
  against trunk revision 08a9ac7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
  org.apache.hadoop.hdfs.TestDistributedFileSystem
  org.apache.hadoop.fs.TestSymlinkHdfsFileSystem
  org.apache.hadoop.hdfs.web.TestWebHDFSXAttr
  org.apache.hadoop.hdfs.TestRollingUpgrade
  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7875//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7875//console

This message is automatically generated.

 Create an XAttr that disallows the HDFS admin from accessing a file
 ---

 Key: HDFS-6705
 URL: https://issues.apache.org/jira/browse/HDFS-6705
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode, security
Affects Versions: 3.0.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: HDFS-6705.001.patch, HDFS-6705.002.patch


 There needs to be an xattr that specifies that the HDFS admin can not access 
 a file. This is needed for m/r delegation tokens and data at rest encryption.





[jira] [Work started] (HDFS-6981) DN upgrade with layout version change should not use trash

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-6981 started by Arpit Agarwal.
---
 DN upgrade with layout version change should not use trash
 --

 Key: HDFS-6981
 URL: https://issues.apache.org/jira/browse/HDFS-6981
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: James Thomas
Assignee: Arpit Agarwal

 Post HDFS-6800, we can encounter the following scenario:
 # We start with DN software version -55 and initiate a rolling upgrade to 
 version -56
 # We delete some blocks, and they are moved to trash
 # We roll back to DN software version -55 using the -rollback flag – since we 
 are running the old code (prior to this patch), we will restore the previous 
 directory but will not delete the trash
 # We append to some of the blocks that were deleted in step 2
 # We then restart a DN that contains blocks that were appended to – since the 
 trash still exists, it will be restored at this point, the appended-to blocks 
 will be overwritten, and we will lose the appended data
 So I think we need to avoid writing anything to the trash directory if we 
 have a previous directory.
 Thanks to [~james.thomas] for reporting this.
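The proposed fix could be sketched as follows. This is a hypothetical, self-contained illustration of the guard (the class and method names are invented for this sketch and are not the actual DataNode code): if a rolling-upgrade {{previous}} directory exists, trash is skipped entirely, so a later restart cannot restore stale trash blocks over appended-to replicas.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/**
 * Hypothetical sketch of the proposed guard: never write to trash when a
 * rolling-upgrade "previous" directory is present, since a rollback has
 * happened and the trash contents may be stale.
 */
public class TrashGuard {

    /** Returns the trash directory to use, or null if trash must be skipped. */
    static File getTrashDirectoryIfAllowed(File storageDir) {
        // "previous" is left behind by the rollback path; its presence means
        // restoring trash could overwrite blocks appended to after rollback.
        File previous = new File(storageDir, "previous");
        if (previous.exists()) {
            return null; // skip trash entirely
        }
        return new File(storageDir, "trash");
    }

    public static void main(String[] args) throws IOException {
        File storage = Files.createTempDirectory("dn-storage").toFile();

        // No previous directory: trash is allowed.
        System.out.println(getTrashDirectoryIfAllowed(storage) != null);

        // After a rollback leaves a previous directory, trash is skipped.
        new File(storage, "previous").mkdir();
        System.out.println(getTrashDirectoryIfAllowed(storage) == null);
    }
}
```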





[jira] [Assigned] (HDFS-6981) DN upgrade with layout version change should not use trash

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned HDFS-6981:
---

Assignee: Arpit Agarwal

 DN upgrade with layout version change should not use trash
 --

 Key: HDFS-6981
 URL: https://issues.apache.org/jira/browse/HDFS-6981
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: James Thomas
Assignee: Arpit Agarwal

 Post HDFS-6800, we can encounter the following scenario:
 # We start with DN software version -55 and initiate a rolling upgrade to 
 version -56
 # We delete some blocks, and they are moved to trash
 # We roll back to DN software version -55 using the -rollback flag – since we 
 are running the old code (prior to this patch), we will restore the previous 
 directory but will not delete the trash
 # We append to some of the blocks that were deleted in step 2
 # We then restart a DN that contains blocks that were appended to – since the 
 trash still exists, it will be restored at this point, the appended-to blocks 
 will be overwritten, and we will lose the appended data
 So I think we need to avoid writing anything to the trash directory if we 
 have a previous directory.
 Thanks to [~james.thomas] for reporting this.





[jira] [Created] (HDFS-6988) Make RAM disk eviction thresholds configurable

2014-09-02 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-6988:
---

 Summary: Make RAM disk eviction thresholds configurable
 Key: HDFS-6988
 URL: https://issues.apache.org/jira/browse/HDFS-6988
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal


Per feedback from [~cmccabe] on HDFS-6930, we can make the eviction thresholds 
configurable. The hard-coded thresholds may not be appropriate for very large 
RAM disks.





[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6930:

Attachment: HDFS-6930.02.patch

Thanks for the reviews.

Colin, I filed HDFS-6988 to make the thresholds configurable.

Xiaoyu, updated patch attached to check capacity before division.

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch, HDFS-6930.02.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.
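The watermark-driven scheme described above can be sketched in miniature. This is a hypothetical single-threaded model (the class, field, and method names are invented here, not taken from the patch): allocation only does accounting, while a separate eviction pass drains the oldest replicas until free space is back above the low watermark, keeping file operations out of the allocation path.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch of low-watermark eviction from a RAM disk volume.
 * Allocation is a fast accounting-only path; evictIfNeeded() models the
 * asynchronous evictor that runs outside block allocation.
 */
public class RamDiskEvictor {
    private final long capacity;
    private final long lowWatermark; // evict while free() < lowWatermark
    private long used;
    // Replica sizes, oldest first; eviction drops the oldest replicas.
    private final Deque<Long> replicas = new ArrayDeque<>();

    RamDiskEvictor(long capacity, long lowWatermark) {
        this.capacity = capacity;
        this.lowWatermark = lowWatermark;
    }

    long free() {
        return capacity - used;
    }

    /** Fast path: just account for the new replica, no file operations. */
    void allocate(long size) {
        replicas.addLast(size);
        used += size;
    }

    /**
     * Background path: evict oldest replicas (here simply dropped; the real
     * scheme would move them to persistent storage) until free space is
     * above the low watermark. Returns the number of replicas evicted.
     */
    int evictIfNeeded() {
        int evicted = 0;
        while (free() < lowWatermark && !replicas.isEmpty()) {
            used -= replicas.removeFirst();
            evicted++;
        }
        return evicted;
    }
}
```

With a capacity of 100 and a low watermark of 20, allocating replicas of 50 and 40 leaves 10 free; a single eviction of the oldest (50-byte) replica restores free space to 60.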





[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6950:

Attachment: (was: HDFS-6930.02.patch)

 Add Additional unit tests for HDFS-6581
 ---

 Key: HDFS-6950
 URL: https://issues.apache.org/jira/browse/HDFS-6950
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-6950.0.patch


 Create additional unit tests for HDFS-6581 in addition to existing ones in 
 HDFS-6927.





[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6950:

Attachment: HDFS-6930.02.patch

 Add Additional unit tests for HDFS-6581
 ---

 Key: HDFS-6950
 URL: https://issues.apache.org/jira/browse/HDFS-6950
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-6950.0.patch


 Create additional unit tests for HDFS-6581 in addition to existing ones in 
 HDFS-6927.





[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6950:

Attachment: (was: HDFS-6950.1.patch)

 Add Additional unit tests for HDFS-6581
 ---

 Key: HDFS-6950
 URL: https://issues.apache.org/jira/browse/HDFS-6950
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-6950.0.patch


 Create additional unit tests for HDFS-6581 in addition to existing ones in 
 HDFS-6927.





[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6930:

Attachment: (was: HDFS-6930.02.patch)

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.





[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6930:

Attachment: (was: HDFS-6930.02.patch)

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.





[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6930:

Attachment: HDFS-6930.02.patch

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.





[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6950:

Attachment: HDFS-6950.1.patch

Reattaching the patch I mistakenly deleted. Sorry about that.

 Add Additional unit tests for HDFS-6581
 ---

 Key: HDFS-6950
 URL: https://issues.apache.org/jira/browse/HDFS-6950
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-6950.0.patch, HDFS-6950.1.patch


 Create additional unit tests for HDFS-6581 in addition to existing ones in 
 HDFS-6927.





[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6930:

Attachment: (was: HDFS-6950.1.patch)

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch, HDFS-6930.02.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.





[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6930:

Attachment: HDFS-6950.1.patch

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch, HDFS-6930.02.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.





[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk

2014-09-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-6930:

Attachment: HDFS-6930.02.patch

 Improve replica eviction from RAM disk
 --

 Key: HDFS-6930
 URL: https://issues.apache.org/jira/browse/HDFS-6930
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-6930.01.patch, HDFS-6930.02.patch


 The current replica eviction scheme is inefficient since it performs multiple 
 file operations in the context of block allocation.
 A better implementation would be asynchronous eviction when free space on RAM 
 disk falls below a low watermark to make block allocation faster.





[jira] [Assigned] (HDFS-6950) Add Additional unit tests for HDFS-6581

2014-09-02 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao reassigned HDFS-6950:


Assignee: Xiaoyu Yao  (was: Xiaoyu Yao)

 Add Additional unit tests for HDFS-6581
 ---

 Key: HDFS-6950
 URL: https://issues.apache.org/jira/browse/HDFS-6950
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-6950.0.patch, HDFS-6950.1.patch


 Create additional unit tests for HDFS-6581 in addition to existing ones in 
 HDFS-6927.





[jira] [Work started] (HDFS-6950) Add Additional unit tests for HDFS-6581

2014-09-02 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-6950 started by Xiaoyu Yao.

 Add Additional unit tests for HDFS-6581
 ---

 Key: HDFS-6950
 URL: https://issues.apache.org/jira/browse/HDFS-6950
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-6950.0.patch, HDFS-6950.1.patch


 Create additional unit tests for HDFS-6581 in addition to existing ones in 
 HDFS-6927.





[jira] [Commented] (HDFS-6950) Add Additional unit tests for HDFS-6581

2014-09-02 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119338#comment-14119338
 ] 

Arpit Agarwal commented on HDFS-6950:
-

Thanks for adding these test cases [~xyao]!

A few comments:
# testFallbackToDiskPartial is failing. The test looks fine, so I think it has 
uncovered a bug. I'll investigate.
# testScopeWriteSameNodeRamDiskOnly - the test case looks incomplete. I don't 
think there is an easy way to do multi-node testing with our unit tests. Let's 
move this to another Jira; we can investigate whether there is a way to add the 
test.
# testRamDiskEvictionBeforePersist - The comment _Ensure that both paths exist 
even after eviction and are readable_ looks unrelated to the test. We can 
delete it.
# testRamDiskEvictionWithOpenHandle - Let's move this to a separate Jira too. 
The test will not work as expected.
# testDeleteWithOpenHandle - Same as the previous one.
# testDfsUsageCreateDelete - I think the test is not doing what you expect. The 
DFSUsage is 
# verifyDeletedBlocks - You can reduce the sleep interval in the loop to 
1000ms. Also, this function can verify that there is no block file copy in the 
finalized directory of the transient volume.
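The shorter sleep interval suggested for verifyDeletedBlocks can be written as a generic polling helper. This is a hypothetical sketch (the class and method names are invented for illustration, not the actual test code): poll the condition every 1000 ms until it holds or a deadline passes, rather than sleeping once for a long fixed interval.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical polling helper: repeatedly check a condition with a short
 * sleep between attempts, bounded by an overall timeout.
 */
public class PollUntil {

    /**
     * Returns true as soon as the condition holds; checks one last time
     * after the deadline so a slow final state change is still observed.
     */
    static boolean poll(BooleanSupplier condition, long timeoutMs, long intervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs); // e.g. 1000 ms as suggested above
        }
        return condition.getAsBoolean();
    }
}
```

A test would call something like {{poll(() -> blocksDeleted(volume), 30000, 1000)}} and assert the result, so it finishes as soon as the deletion is visible instead of always waiting the full interval.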

 Add Additional unit tests for HDFS-6581
 ---

 Key: HDFS-6950
 URL: https://issues.apache.org/jira/browse/HDFS-6950
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-6950.0.patch, HDFS-6950.1.patch


 Create additional unit tests for HDFS-6581 in addition to existing ones in 
 HDFS-6927.




