[jira] [Commented] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117924#comment-14117924 ] Akira AJISAKA commented on HDFS-6942: - Thanks [~rchiang] for the patch. Looks good to me. By the way, I found another typo 'targests' in DataNode.java.
{code}
if (DataTransferProtocol.LOG.isDebugEnabled()) {
  DataTransferProtocol.LOG.debug(getClass().getSimpleName() + ": " + b
      + " (numBytes=" + b.getNumBytes() + ")"
      + ", stage=" + stage
      + ", clientname=" + clientname
      + ", targests=" + Arrays.asList(targets));
}
{code}
Would you include fixing that typo in the patch? Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Attachments: HDFS-6942-01.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118004#comment-14118004 ] Yi Liu commented on HDFS-6886: -- TestOfflineEditsViewer is successful with {{editsStored}}; the other three failures are not related. Use single editlog record for creating file + overwrite. Key: HDFS-6886 URL: https://issues.apache.org/jira/browse/HDFS-6886 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, HDFS-6886.003.patch, editsStored As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we could do a further improvement in this JIRA to use one editlog record for creating a file + overwrite. We could record the overwrite flag in the editlog for creating a file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6980) TestWebHdfsFileSystemContract fails in trunk
[ https://issues.apache.org/jira/browse/HDFS-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118028#comment-14118028 ] Hadoop QA commented on HDFS-6980: -
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12665841/HDFS-6980.1-2.patch against trunk revision 258c7d0.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
    org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
    org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7869//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7869//console This message is automatically generated. TestWebHdfsFileSystemContract fails in trunk Key: HDFS-6980 URL: https://issues.apache.org/jira/browse/HDFS-6980 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Akira AJISAKA Assignee: Tsuyoshi OZAWA Attachments: HDFS-6980.1-2.patch, HDFS-6980.1.patch Many tests in TestWebHdfsFileSystemContract fail with a 'too many open files' error. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6962) ACLs inheritance conflict with umaskmode
[ https://issues.apache.org/jira/browse/HDFS-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118080#comment-14118080 ] LINTE commented on HDFS-6962: - Any update on this issue? ACLs inheritance conflict with umaskmode Key: HDFS-6962 URL: https://issues.apache.org/jira/browse/HDFS-6962 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 2.4.1 Environment: CentOS release 6.5 (Final) Reporter: LINTE Labels: hadoop, security In hdfs-site.xml:
{code}
<property>
  <name>dfs.umaskmode</name>
  <value>027</value>
</property>
{code}
1/ Create a directory as superuser
bash# hdfs dfs -mkdir /tmp/ACLS
2/ Set default ACLs on this directory: rwx access for group readwrite and user toto
bash# hdfs dfs -setfacl -m default:group:readwrite:rwx /tmp/ACLS
bash# hdfs dfs -setfacl -m default:user:toto:rwx /tmp/ACLS
3/ Check ACLs on /tmp/ACLS/
bash# hdfs dfs -getfacl /tmp/ACLS/
# file: /tmp/ACLS
# owner: hdfs
# group: hadoop
user::rwx
group::r-x
other::---
default:user::rwx
default:user:toto:rwx
default:group::r-x
default:group:readwrite:rwx
default:mask::rwx
default:other::---
user::rwx | group::r-x | other::--- matches the umaskmode defined in hdfs-site.xml, everything OK! default:group:readwrite:rwx allows the readwrite group rwx access through inheritance. default:user:toto:rwx allows the toto user rwx access through inheritance. default:mask::rwx means the inheritance mask is rwx, so no effective masking.
4/ Create a subdir to test inheritance of ACLs
bash# hdfs dfs -mkdir /tmp/ACLS/hdfs
5/ Check ACLs on /tmp/ACLS/hdfs
bash# hdfs dfs -getfacl /tmp/ACLS/hdfs
# file: /tmp/ACLS/hdfs
# owner: hdfs
# group: hadoop
user::rwx
user:toto:rwx #effective:r-x
group::r-x
group:readwrite:rwx #effective:r-x
mask::r-x
other::---
default:user::rwx
default:user:toto:rwx
default:group::r-x
default:group:readwrite:rwx
default:mask::rwx
default:other::---
Here we can see that the readwrite group has an rwx ACL but only r-x is effective, because the mask is r-x (mask::r-x), even though the default mask for inheritance is set to default:mask::rwx on /tmp/ACLS/.
6/ Modify hdfs-site.xml and restart the namenode:
{code}
<property>
  <name>dfs.umaskmode</name>
  <value>010</value>
</property>
{code}
7/ Create a subdir to test inheritance of ACLs with the new umaskmode parameter
bash# hdfs dfs -mkdir /tmp/ACLS/hdfs2
8/ Check ACLs on /tmp/ACLS/hdfs2
bash# hdfs dfs -getfacl /tmp/ACLS/hdfs2
# file: /tmp/ACLS/hdfs2
# owner: hdfs
# group: hadoop
user::rwx
user:toto:rwx #effective:rw-
group::r-x #effective:r--
group:readwrite:rwx #effective:rw-
mask::rw-
other::---
default:user::rwx
default:user:toto:rwx
default:group::r-x
default:group:readwrite:rwx
default:mask::rwx
default:other::---
So HDFS masks the ACL values (user, group and other, except the POSIX owner) with the group digit of the dfs.umaskmode property when creating a directory with inherited ACLs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
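The masking behavior described above can be reproduced with plain permission arithmetic. A minimal, hypothetical Java sketch (not HDFS code; all names are invented) of how a POSIX-style effective permission is derived, and why clipping the inherited mask with the umask's group digit turns rwx into r-x:

```java
public class AclMaskSketch {
    // One octal digit per ACL entry: r=4, w=2, x=1.
    static int effective(int namedPerm, int mask) {
        // POSIX ACL rule: a named entry's effective permission is its
        // permission bits ANDed with the mask entry.
        return namedPerm & mask;
    }

    public static void main(String[] args) {
        int defaultMask = 07;     // default:mask::rwx on /tmp/ACLS
        int umaskGroup = 02;      // group digit of dfs.umaskmode=027

        // Behavior reported above: the child's mask is still clipped by the
        // umask even though a default ACL exists, so rwx becomes r-x.
        int childMask = defaultMask & ~umaskGroup;   // 05, i.e. r-x

        int readwrite = 07;       // default:group:readwrite:rwx
        System.out.println(effective(readwrite, childMask)); // 5, i.e. r-x
    }
}
```

With dfs.umaskmode=010 the same arithmetic yields a mask of 06 (rw-), which matches the #effective:rw- lines shown in step 8.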
[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118149#comment-14118149 ] Yi Liu commented on HDFS-6951: -- Thanks [~clamb], looks good to me, just one comment: yes, this method declares {{this method is always called with writeLock of FSDirectory held}}, but image loading breaks that. Just adding _writeLock_ is not good; we could define a new method, something like _unprotectedAddEncryptionZone_, for image loading, as Andrew suggested. Or find a better way?
{code}
@@ -2074,8 +2074,13 @@ public final void addToInodeMap(INode inode) {
       for (XAttr xattr : xattrs) {
         final String xaName = XAttrHelper.getPrefixName(xattr);
         if (CRYPTO_XATTR_ENCRYPTION_ZONE.equals(xaName)) {
-          ezManager.addEncryptionZone(inode.getId(),
-              new String(xattr.getValue()));
+          writeLock();
+          try {
+            ezManager.addEncryptionZone(inode.getId(),
+                new String(xattr.getValue()));
+          } finally {
+            writeUnlock();
+          }
         }
       }
     }
{code}
Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, editsStored Currently, when users save the namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster:
* Create an encryption zone
* List encryption zones and verify the newly created zone is present
* Save the namespace
* Kill and restart the NameNode
* List the encryption zones and you'll find the encryption zone is missing
I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
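The _unprotected_ naming discussed in this thread is a common locking idiom: one variant assumes the caller already holds the lock, and a public wrapper acquires it around the unprotected call. A generic, hypothetical Java sketch of the pattern, not the actual FSDirectory code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockIdiomSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int zones = 0;

    // "Unprotected" variant: the caller must already hold the write lock,
    // e.g. image loading, which wraps the entire load in one lock section.
    void unprotectedAddZone() {
        assert lock.isWriteLockedByCurrentThread();
        zones++;
    }

    // Public variant: takes the lock itself, mirroring the diff above.
    void addZone() {
        lock.writeLock().lock();
        try {
            unprotectedAddZone();
        } finally {
            lock.writeLock().unlock();
        }
    }

    int zoneCount() {
        return zones;
    }

    public static void main(String[] args) {
        LockIdiomSketch dir = new LockIdiomSketch();
        dir.addZone();
        System.out.println(dir.zoneCount());
    }
}
```

The split avoids self-deadlock and double-locking when a bulk operation (like loading a whole image) already holds the lock and calls the per-item method many times.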
[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file
[ https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118156#comment-14118156 ] Yi Liu commented on HDFS-6705: -- Hi [~clamb] and [~andrew.wang], could this xattr be something like {{SECURITY_CRYPTO_UNREADABLE_BY_SUPERUSER}}, and only be settable in encryption zones? Then normal files would not be affected. Create an XAttr that disallows the HDFS admin from accessing a file --- Key: HDFS-6705 URL: https://issues.apache.org/jira/browse/HDFS-6705 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6705.001.patch There needs to be an xattr that specifies that the HDFS admin cannot access a file. This is needed for m/r delegation tokens and data at rest encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G reassigned HDFS-2975: - Assignee: Uma Maheswara Rao G (was: Yi Liu) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, it will delete the destination file's blocks. After deleting the blocks, whenever it releases the fsNameSystem lock, the NN can hand the invalidation work to the corresponding DNs to delete the blocks. In parallel it will sync the rename-related edits to the editlog file. If the NN crashes at this step, before it syncs the edits, the NN can get stuck in safemode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted in the log file, and no DN will report those blocks now. This is similar to HDFS-2815. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
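The failure window in this description comes down to ordering: block invalidation reaching the DNs before the rename edit is durable. A toy Java model (hypothetical, not NameNode code) of the two states involved and the condition that leaves the NN stuck in safemode:

```java
public class RenameOrderingSketch {
    // Toy state model of the crash window described above.
    boolean editDurable = false;
    boolean blocksDeletedOnDNs = false;

    // The problematic ordering: DNs are told to delete the dst blocks
    // before the rename edit has been synced to the editlog.
    void unsafeOrder(boolean crashBeforeSync) {
        blocksDeletedOnDNs = true;     // invalidation work handed to DNs
        if (crashBeforeSync) {
            return;                    // NN crashes before logSync()
        }
        editDurable = true;            // rename edit reaches disk
    }

    // On restart the editlog is replayed: without the rename edit, the old
    // dst file still exists, but its blocks are gone, so no DN can report
    // them and safemode's block threshold is never reached.
    boolean stuckInSafemode() {
        return blocksDeletedOnDNs && !editDurable;
    }

    public static void main(String[] args) {
        RenameOrderingSketch nn = new RenameOrderingSketch();
        nn.unsafeOrder(true);
        System.out.println(nn.stuckInSafemode()); // true: the reported bug
    }
}
```

The fix direction implied by the description is to make the rename edit durable before any invalidation work can be handed to the DNs, which makes the stuck state unreachable.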
[jira] [Commented] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118283#comment-14118283 ] Uma Maheswara Rao G commented on HDFS-2975: --- Yi, thanks a lot for the explanation. +1 on the patch. [~vinayrpet], do you have any comments? Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, it will delete the destination file's blocks. After deleting the blocks, whenever it releases the fsNameSystem lock, the NN can hand the invalidation work to the corresponding DNs to delete the blocks. In parallel it will sync the rename-related edits to the editlog file. If the NN crashes at this step, before it syncs the edits, the NN can get stuck in safemode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted in the log file, and no DN will report those blocks now. This is similar to HDFS-2815. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6951: --- Attachment: HDFS-6951.004.patch [~hitliuyi], [~andrew.wang], The .004 patch adds an unprotectedAddEncryptionZone method per your comment. Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, HDFS-6951.004.patch, editsStored Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6867) For DFSOutputStream, do pipeline recovery for a single block in the background
[ https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118352#comment-14118352 ] Zhe Zhang commented on HDFS-6867: - [~cmccabe] Could you help review the patch? In the patch I created a new {{ReplaceDatanodeOnFailure}} policy named {{BACKGROUND}}, for the user to specify that background recovery should be used -- which we didn't cover in the offline discussion but I think is necessary. For DFSOutputStream, do pipeline recovery for a single block in the background -- Key: HDFS-6867 URL: https://issues.apache.org/jira/browse/HDFS-6867 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Reporter: Colin Patrick McCabe Assignee: Zhe Zhang Attachments: HDFS-6867-20140827-2.patch, HDFS-6867-20140827-3.patch, HDFS-6867-20140827.patch, HDFS-6867-20140828-1.patch, HDFS-6867-20140828-2.patch, HDFS-6867-design-20140820.pdf, HDFS-6867-design-20140821.pdf, HDFS-6867-design-20140822.pdf, HDFS-6867-design-20140827.pdf For DFSOutputStream, we should be able to do pipeline recovery in the background, while the user is continuing to write to the file. This is especially useful for long-lived clients that write to an HDFS file slowly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-2975: -- Assignee: Yi Liu (was: Uma Maheswara Rao G) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Uma Maheswara Rao G Assignee: Yi Liu Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, it will delete the destination file's blocks. After deleting the blocks, whenever it releases the fsNameSystem lock, the NN can hand the invalidation work to the corresponding DNs to delete the blocks. In parallel it will sync the rename-related edits to the editlog file. If the NN crashes at this step, before it syncs the edits, the NN can get stuck in safemode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted in the log file, and no DN will report those blocks now. This is similar to HDFS-2815. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option
[ https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118393#comment-14118393 ] Yongjun Zhang commented on HDFS-4257: - Hi [~szetszwo], thanks for the rev, it looks good! A few very minor comments: 1. I wonder if we can add a log right after calling {{this.dtpReplaceDatanodeOnFailure = ReplaceDatanodeOnFailure.get(conf);}} to indicate what policy is used? My concern is that a user may change the policy between sessions; it'd be nice to have a record in the log so we can tell which policy was in effect. 2. About the method {{satisfy(...)}} in the Condition interface: {{DEFAULT}} has the final qualifier on all parameters, but the others don't. It'd be nice to be consistent; having final gives both the benefit of final and code consistency. 3. The comments section and parameter specification for {{static final Condition DEFAULT = new Condition() {}} use the names r, n and replication, nExistings in a mixed way. Can we use replication, nExistings to be consistent with other places in the same file? Thanks a lot. The ReplaceDatanodeOnFailure policies could have a forgiving option --- Key: HDFS-4257 URL: https://issues.apache.org/jira/browse/HDFS-4257 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h4257_20140325.patch, h4257_20140325b.patch, h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch A similar question has previously come up in HDFS-3091 and friends, but the essential problem is: "Why can't I write to my cluster of 3 nodes when I just have 1 node available at a point in time?" The policies cover the 4 options, with {{Default}} being the default:
{{Disable}} - Disables the whole replacement concept by throwing out an error (at the server), or acts as {{Never}} at the client.
{{Never}} - Never replaces a DN upon pipeline failures (not too desirable in many cases).
{{Default}} - Replaces based on a few conditions, but whose minimum never touches 1. We always fail if only one DN remains and none others can be added.
{{Always}} - Replaces no matter what. Fails if it can't replace.
Would it not make sense to have an option similar to Always/Default where, despite _trying_, if it isn't possible to keep more than 1 DN in the pipeline, we do not fail? I think that is what the former write behavior was, and it fit with the minimum replication factor allowed value. Why is it grossly wrong to pass a write from a client for a block with just 1 remaining replica in the pipeline (the minimum of 1 grows with the replication factor demanded by the write), when replication is taken care of immediately afterwards? How often have we seen missing blocks arise out of allowing this plus facing a big rack(s) failure or so? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
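For reference, the {{Default}} condition can be sketched from the description above (replace when half or more of the pipeline is gone, or when data is already visible via append/hflush). This is a hedged reconstruction for illustration, not the exact HDFS source; parameter names follow the review comment's preferred names (replication, nExistings):

```java
public class ReplacePolicySketch {
    // Hedged reconstruction of the DEFAULT replace-datanode-on-failure
    // condition discussed above; not the actual Hadoop implementation.
    static boolean shouldReplace(int replication, int nExistings,
                                 boolean isAppend, boolean isHflushed) {
        if (replication < 3) {
            return false;              // small pipelines: never replace
        }
        if (nExistings <= replication / 2) {
            return true;               // half or more of the pipeline is gone
        }
        return isAppend || isHflushed; // data already visible: be strict
    }

    public static void main(String[] args) {
        // 3 replicas requested, only 1 DN left in the pipeline -> replace.
        System.out.println(shouldReplace(3, 1, false, false)); // true
    }
}
```

The "forgiving" option requested in this issue would sit on top of such a condition: still try to replace, but continue with a single DN instead of failing when no replacement can be found.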
[jira] [Commented] (HDFS-5114) getMaxNodesPerRack() in BlockPlacementPolicyDefault does not take decommissioning nodes into account.
[ https://issues.apache.org/jira/browse/HDFS-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118394#comment-14118394 ] Zhe Zhang commented on HDFS-5114: - [~kihwal] Since this was created a year ago, do you happen to know if it has been resolved in the latest code? If not, I'm happy to work on it. Thanks! getMaxNodesPerRack() in BlockPlacementPolicyDefault does not take decommissioning nodes into account. - Key: HDFS-5114 URL: https://issues.apache.org/jira/browse/HDFS-5114 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.1.0-beta Reporter: Kihwal Lee Assignee: Zhe Zhang If a large proportion of data nodes are being decommissioned, one or more racks may not be writable. However, this is not taken into account when the default block placement policy module invokes getMaxNodesPerRack(). Some blocks, especially the ones with a high replication factor, may not be fully replicated until those nodes are taken out of dfs.include. It can actually block decommissioning itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
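A sketch of why unwritable (decommissioning) racks matter for the per-rack cap. The formula below is an assumption modeled on the default placement policy's computation, used here only to illustrate the arithmetic, not quoted from the source:

```java
public class MaxNodesPerRackSketch {
    // Assumed shape of the cap: spread replicas across racks with a small
    // allowance per rack. The bug described above: numRacks still counts
    // racks whose nodes are all decommissioning and thus not writable.
    static int maxNodesPerRack(int totalReplicas, int numRacks) {
        return (totalReplicas - 1) / numRacks + 2;
    }

    public static void main(String[] args) {
        // 10 replicas over 5 racks -> cap of 3 per rack. If 3 of those
        // racks are fully decommissioning, the 2 writable racks can hold
        // at most 6 replicas under the cap, so the block stays
        // under-replicated, matching the description.
        System.out.println(maxNodesPerRack(10, 5));
    }
}
```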
[jira] [Commented] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118413#comment-14118413 ] Vinayakumar B commented on HDFS-2975: - Thanks a lot for the patch [~hitliuyi], +1 from me too. Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Uma Maheswara Rao G Assignee: Yi Liu Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, it will delete the destination file's blocks. After deleting the blocks, whenever it releases the fsNameSystem lock, the NN can hand the invalidation work to the corresponding DNs to delete the blocks. In parallel it will sync the rename-related edits to the editlog file. If the NN crashes at this step, before it syncs the edits, the NN can get stuck in safemode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted in the log file, and no DN will report those blocks now. This is similar to HDFS-2815. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118432#comment-14118432 ] Vinayakumar B commented on HDFS-6886: - bq. this.overwrite = Boolean.parseBoolean(st.getValue(OVERWRITE)); Here you may need to use {{st.getValueOrNull(..)}}; otherwise an InvalidXmlException will be thrown when trying to convert old edits. Use single editlog record for creating file + overwrite. Key: HDFS-6886 URL: https://issues.apache.org/jira/browse/HDFS-6886 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, HDFS-6886.003.patch, editsStored As discussed in HDFS-6871, per [~jingzhao] and [~cmccabe]'s suggestion, we could do a further improvement in this JIRA to use one editlog record for creating a file + overwrite. We could record the overwrite flag in the editlog for creating a file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
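The backward-compatibility point above can be shown with a tiny stand-in for the stanza API: treat a missing field as absent instead of failing the parse. This is a hypothetical sketch; the real Stanza class and OVERWRITE constant are only approximated here with a plain Map:

```java
import java.util.HashMap;
import java.util.Map;

public class OptionalFieldSketch {
    // Stand-in for st.getValueOrNull(..): returns null when an old edit
    // record lacks the field, rather than throwing.
    static String getValueOrNull(Map<String, String> st, String name) {
        return st.get(name);
    }

    // Backward-compatible read: old edit logs have no OVERWRITE field,
    // so a missing value defaults to false instead of failing conversion.
    static boolean readOverwrite(Map<String, String> st) {
        String v = getValueOrNull(st, "OVERWRITE");
        return v != null && Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Map<String, String> oldRecord = new HashMap<>(); // pre-upgrade edit
        System.out.println(readOverwrite(oldRecord));    // false: field absent

        Map<String, String> newRecord = new HashMap<>();
        newRecord.put("OVERWRITE", "true");
        System.out.println(readOverwrite(newRecord));    // true
    }
}
```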
[jira] [Updated] (HDFS-6831) Inconsistency between 'hdfs dfsadmin' and 'hdfs dfsadmin -help'
[ https://issues.apache.org/jira/browse/HDFS-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6831: Assignee: Xiaoyu Yao Inconsistency between 'hdfs dfsadmin' and 'hdfs dfsadmin -help' --- Key: HDFS-6831 URL: https://issues.apache.org/jira/browse/HDFS-6831 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Akira AJISAKA Assignee: Xiaoyu Yao Priority: Minor Labels: newbie Attachments: HDFS-6831.0.patch, HDFS-6831.1.patch There is an inconsistency between the console outputs of the 'hdfs dfsadmin' command and the 'hdfs dfsadmin -help' command.
{code}
[root@trunk ~]# hdfs dfsadmin
Usage: java DFSAdmin
Note: Administrative commands can only be run as the HDFS superuser.
    [-report]
    [-safemode enter | leave | get | wait]
    [-allowSnapshot snapshotDir]
    [-disallowSnapshot snapshotDir]
    [-saveNamespace]
    [-rollEdits]
    [-restoreFailedStorage true|false|check]
    [-refreshNodes]
    [-finalizeUpgrade]
    [-rollingUpgrade [query|prepare|finalize]]
    [-metasave filename]
    [-refreshServiceAcl]
    [-refreshUserToGroupsMappings]
    [-refreshSuperUserGroupsConfiguration]
    [-refreshCallQueue]
    [-refresh]
    [-printTopology]
    [-refreshNamenodes datanodehost:port]
    [-deleteBlockPool datanode-host:port blockpoolId [force]]
    [-setQuota quota dirname...dirname]
    [-clrQuota dirname...dirname]
    [-setSpaceQuota quota dirname...dirname]
    [-clrSpaceQuota dirname...dirname]
    [-setBalancerBandwidth bandwidth in bytes per second]
    [-fetchImage local directory]
    [-shutdownDatanode datanode_host:ipc_port [upgrade]]
    [-getDatanodeInfo datanode_host:ipc_port]
    [-help [cmd]]
{code}
{code}
[root@trunk ~]# hdfs dfsadmin -help
hadoop dfsadmin performs DFS administrative commands.
The full syntax is:
hadoop dfsadmin
    [-report [-live] [-dead] [-decommissioning]]
    [-safemode enter | leave | get | wait]
    [-saveNamespace]
    [-rollEdits]
    [-restoreFailedStorage true|false|check]
    [-refreshNodes]
    [-setQuota quota dirname...dirname]
    [-clrQuota dirname...dirname]
    [-setSpaceQuota quota dirname...dirname]
    [-clrSpaceQuota dirname...dirname]
    [-finalizeUpgrade]
    [-rollingUpgrade [query|prepare|finalize]]
    [-refreshServiceAcl]
    [-refreshUserToGroupsMappings]
    [-refreshSuperUserGroupsConfiguration]
    [-refreshCallQueue]
    [-refresh host:ipc_port key [arg1..argn]
    [-printTopology]
    [-refreshNamenodes datanodehost:port]
    [-deleteBlockPool datanodehost:port blockpoolId [force]]
    [-setBalancerBandwidth bandwidth]
    [-fetchImage local directory]
    [-allowSnapshot snapshotDir]
    [-disallowSnapshot snapshotDir]
    [-shutdownDatanode datanode_host:ipc_port [upgrade]]
    [-getDatanodeInfo datanode_host:ipc_port
    [-help [cmd]
{code}
These two outputs should be the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6848) Lack of synchronization on access to datanodeUuid in DataStorage#format()
[ https://issues.apache.org/jira/browse/HDFS-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6848: Assignee: Xiaoyu Yao Lack of synchronization on access to datanodeUuid in DataStorage#format() -- Key: HDFS-6848 URL: https://issues.apache.org/jira/browse/HDFS-6848 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Xiaoyu Yao Priority: Minor Attachments: HDFS-6848.0.patch
{code}
this.datanodeUuid = datanodeUuid;
{code}
The above assignment should be done while holding the lock on DataStorage.this, as is done in two other places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated HDFS-6942: - Attachment: HDFS-6942-02.patch Adding fix from [~ajisakaa]. Thanks for finding it. Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118488#comment-14118488 ] Hadoop QA commented on HDFS-6951: -
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12665928/HDFS-6951.004.patch against trunk revision 258c7d0.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.server.namenode.TestFsck
    org.apache.hadoop.hdfs.server.namenode.TestParallelImageWrite
    org.apache.hadoop.hdfs.TestAppendDifferentChecksum
    org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat
    org.apache.hadoop.hdfs.server.datanode.TestReadOnlySharedStorage
    org.apache.hadoop.fs.TestSymlinkHdfsFileContext
    org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot
    org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics
    org.apache.hadoop.hdfs.TestDFSMkdirs
    org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
    org.apache.hadoop.hdfs.server.namenode.TestNameNodeMXBean
    org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer
    org.apache.hadoop.hdfs.server.namenode.TestEditLogJournalFailures
    org.apache.hadoop.fs.TestGlobPaths
    org.apache.hadoop.hdfs.server.namenode.TestEditLogRace
    org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
    org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMkdir
    org.apache.hadoop.fs.TestHDFSFileContextMainOperations
    org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
    org.apache.hadoop.hdfs.server.namenode.TestFSEditLogLoader
    org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
    org.apache.hadoop.hdfs.TestDFSRename
    org.apache.hadoop.hdfs.server.namenode.ha.TestXAttrsWithHA
    org.apache.hadoop.hdfs.server.namenode.ha.TestDelegationTokensWithHA
    org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica
    org.apache.hadoop.hdfs.web.TestWebHDFS
    org.apache.hadoop.hdfs.server.namenode.TestAddBlock
    org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks
    org.apache.hadoop.hdfs.server.namenode.ha.TestHAMetrics
    org.apache.hadoop.hdfs.server.namenode.ha.TestNNHealthCheck
    org.apache.hadoop.fs.viewfs.TestViewFsWithAcls
    org.apache.hadoop.hdfs.server.datanode.TestBlockHasMultipleReplicasOnSameDN
    org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
    org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
    org.apache.hadoop.fs.contract.hdfs.TestHDFSContractDelete
    org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes
    org.apache.hadoop.hdfs.server.namenode.TestBackupNode
    org.apache.hadoop.hdfs.TestDFSUpgrade
    org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage
    org.apache.hadoop.hdfs.server.namenode.TestHostsFiles
    org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
    org.apache.hadoop.hdfs.server.namenode.TestDeleteRace
    org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade
    org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogsDuringFailover
    org.apache.hadoop.fs.viewfs.TestViewFileSystemWithXAttrs
    org.apache.hadoop.fs.viewfs.TestViewFsHdfs
    org.apache.hadoop.hdfs.web.TestHttpsFileSystem
    org.apache.hadoop.fs.TestResolveHdfsSymlink
    org.apache.hadoop.hdfs.server.namenode.snapshot.TestDisallowModifyROSnapshot
[jira] [Commented] (HDFS-6954) With crypto, no native lib systems are too verbose
[ https://issues.apache.org/jira/browse/HDFS-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118489#comment-14118489 ] Hadoop QA commented on HDFS-6954: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12665324/HDFS-6954.003.patch against trunk revision e1109fb. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl The test build failed in hadoop-hdfs-project/hadoop-hdfs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7871//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7871//console This message is automatically generated. 
With crypto, no native lib systems are too verbose -- Key: HDFS-6954 URL: https://issues.apache.org/jira/browse/HDFS-6954 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Charles Lamb Attachments: HDFS-6954.001.patch, HDFS-6954.002.patch, HDFS-6954.003.patch Running commands on a machine without a native library results in: {code} $ bin/hdfs dfs -put /etc/hosts /tmp 14/08/27 07:16:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/08/27 07:16:11 WARN crypto.CryptoCodec: Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available. 14/08/27 07:16:11 INFO hdfs.DFSClient: No KeyProvider found. {code} This is way too much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
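Until a fix lands, a deployment can silence the three lines quoted above by adjusting log4j levels; a sketch for log4j.properties (the logger names come straight from the quoted output; this is a per-site workaround, not the approach of the attached patches):

```properties
# Quiet the native-code-loader and crypto-codec fallback WARNs
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
log4j.logger.org.apache.hadoop.crypto.CryptoCodec=ERROR
# Drop the "No KeyProvider found" INFO line
log4j.logger.org.apache.hadoop.hdfs.DFSClient=WARN
```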
[jira] [Updated] (HDFS-6831) Inconsistency between 'hdfs dfsadmin' and 'hdfs dfsadmin -help'
[ https://issues.apache.org/jira/browse/HDFS-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6831: Assignee: Xiaoyu Yao (was: Xiaoyu Yao) Inconsistency between 'hdfs dfsadmin' and 'hdfs dfsadmin -help' --- Key: HDFS-6831 URL: https://issues.apache.org/jira/browse/HDFS-6831 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Akira AJISAKA Assignee: Xiaoyu Yao Priority: Minor Labels: newbie Attachments: HDFS-6831.0.patch, HDFS-6831.1.patch There is an inconsistency between the console outputs of 'hdfs dfsadmin' command and 'hdfs dfsadmin -help' command. {code} [root@trunk ~]# hdfs dfsadmin Usage: java DFSAdmin Note: Administrative commands can only be run as the HDFS superuser. [-report] [-safemode enter | leave | get | wait] [-allowSnapshot snapshotDir] [-disallowSnapshot snapshotDir] [-saveNamespace] [-rollEdits] [-restoreFailedStorage true|false|check] [-refreshNodes] [-finalizeUpgrade] [-rollingUpgrade [query|prepare|finalize]] [-metasave filename] [-refreshServiceAcl] [-refreshUserToGroupsMappings] [-refreshSuperUserGroupsConfiguration] [-refreshCallQueue] [-refresh] [-printTopology] [-refreshNamenodes datanodehost:port] [-deleteBlockPool datanode-host:port blockpoolId [force]] [-setQuota quota dirname...dirname] [-clrQuota dirname...dirname] [-setSpaceQuota quota dirname...dirname] [-clrSpaceQuota dirname...dirname] [-setBalancerBandwidth bandwidth in bytes per second] [-fetchImage local directory] [-shutdownDatanode datanode_host:ipc_port [upgrade]] [-getDatanodeInfo datanode_host:ipc_port] [-help [cmd]] {code} {code} [root@trunk ~]# hdfs dfsadmin -help hadoop dfsadmin performs DFS administrative commands. 
The full syntax is: hadoop dfsadmin [-report [-live] [-dead] [-decommissioning]] [-safemode enter | leave | get | wait] [-saveNamespace] [-rollEdits] [-restoreFailedStorage true|false|check] [-refreshNodes] [-setQuota quota dirname...dirname] [-clrQuota dirname...dirname] [-setSpaceQuota quota dirname...dirname] [-clrSpaceQuota dirname...dirname] [-finalizeUpgrade] [-rollingUpgrade [query|prepare|finalize]] [-refreshServiceAcl] [-refreshUserToGroupsMappings] [-refreshSuperUserGroupsConfiguration] [-refreshCallQueue] [-refresh host:ipc_port key [arg1..argn] [-printTopology] [-refreshNamenodes datanodehost:port] [-deleteBlockPool datanodehost:port blockpoolId [force]] [-setBalancerBandwidth bandwidth] [-fetchImage local directory] [-allowSnapshot snapshotDir] [-disallowSnapshot snapshotDir] [-shutdownDatanode datanode_host:ipc_port [upgrade]] [-getDatanodeInfo datanode_host:ipc_port [-help [cmd] {code} These two outputs should be the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6848) Lack of synchronization on access to datanodeUuid in DataStorage#format()
[ https://issues.apache.org/jira/browse/HDFS-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6848: Assignee: Xiaoyu Yao (was: Xiaoyu Yao) Lack of synchronization on access to datanodeUuid in DataStorage#format() -- Key: HDFS-6848 URL: https://issues.apache.org/jira/browse/HDFS-6848 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Xiaoyu Yao Priority: Minor Attachments: HDFS-6848.0.patch {code} this.datanodeUuid = datanodeUuid; {code} The above assignment should be done holding lock DataStorage.this - as is done in two other places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
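The fix the report asks for is the standard guarded-assignment pattern: take the same monitor the other accessors already hold before writing the field. A toy sketch (class and method names mirror the JIRA's DataStorage#format, but this is not Hadoop source):

```java
public class DataStorageSketch {
    private String datanodeUuid;

    // The reported bug: a bare assignment in format() while other code
    // paths read and write the field under the instance monitor.
    public void formatUnsafe(String uuid) {
        this.datanodeUuid = uuid;
    }

    // The fix: hold the same lock the other two accessors use.
    public void formatSafe(String uuid) {
        synchronized (this) {
            this.datanodeUuid = uuid;
        }
    }

    public synchronized String getDatanodeUuid() {
        return datanodeUuid;
    }
}
```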
[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118513#comment-14118513 ] Colin Patrick McCabe commented on HDFS-6482: Yeah, it would be great to have this in 2.6. Is HDFS-6981 blocking merging this to 2.6? Use block ID-based block layout on datanodes Key: HDFS-6482 URL: https://issues.apache.org/jira/browse/HDFS-6482 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.0.0 Reporter: James Thomas Assignee: James Thomas Fix For: 3.0.0 Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, HDFS-6482.7.patch, HDFS-6482.8.patch, HDFS-6482.9.patch, HDFS-6482.patch, hadoop-24-datanode-dir.tgz Right now blocks are placed into directories that are split into many subdirectories when capacity is reached. Instead we can use a block's ID to determine the path it should go in. This eliminates the need for the LDir data structure that facilitates the splitting of directories when they reach capacity as well as fields in ReplicaInfo that keep track of a replica's location. An extension of the work in HDFS-3290. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
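The core idea in the description — derive a replica's directory purely from its block ID, so no LDir bookkeeping or per-replica location field is needed — can be sketched as follows (the two-level, 32-way split follows the design doc's spirit, but the exact shifts and masks here are illustrative):

```java
public class BlockIdLayout {
    // Map a block ID to a fixed two-level directory, e.g. "subdir13/subdir7".
    // Because the path is a pure function of the ID, the DataNode never needs
    // to split directories on capacity or record where a replica lives.
    public static String idToBlockDir(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0x1F); // 32 first-level subdirs
        int d2 = (int) ((blockId >> 8) & 0x1F);  // 32 second-level subdirs
        return "subdir" + d1 + "/subdir" + d2;
    }
}
```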
[jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118554#comment-14118554 ] Arpit Agarwal commented on HDFS-6482: - That is the known issue, yes. Use block ID-based block layout on datanodes Key: HDFS-6482 URL: https://issues.apache.org/jira/browse/HDFS-6482 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 3.0.0 Reporter: James Thomas Assignee: James Thomas Fix For: 3.0.0 Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, HDFS-6482.7.patch, HDFS-6482.8.patch, HDFS-6482.9.patch, HDFS-6482.patch, hadoop-24-datanode-dir.tgz Right now blocks are placed into directories that are split into many subdirectories when capacity is reached. Instead we can use a block's ID to determine the path it should go in. This eliminates the need for the LDir data structure that facilitates the splitting of directories when they reach capacity as well as fields in ReplicaInfo that keep track of a replica's location. An extension of the work in HDFS-3290. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118615#comment-14118615 ] Hadoop QA commented on HDFS-6942: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12665940/HDFS-6942-02.patch against trunk revision 329b659. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-nfs: org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7872//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7872//console This message is automatically generated. Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. 
Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6930) Improve replica eviction from RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118657#comment-14118657 ] Colin Patrick McCabe commented on HDFS-6930: bq. Eviction is done when we have Less than 10% free space or Insufficient space for 3 default length blocks. One thing that might be suboptimal here is that we're using the {{dfs.blocksize}} configuration key on the DataNode and assuming that will be the same value used by the client. Clearly, the client could use 256 MB blocks, whereas the DN could use 128 MB blocks. Etc. Also, we don't really know how big the ramdisks are going to be. I can easily see a 300 GB ramdisk being used in a few years. Just defaulting to keeping 10% free seems like too much. So, why not just have a minimum free space configuration key. It could be specified as a number of bytes, rather than as a percentage. So we could default it to 128 MB * 3 to get your current default of leaving space for 3 blocks. This would work better for bigger ramdisks (unlike a percentage-based scheme) and wouldn't make assumptions about the client's and DN's block size configuration being the same. Improve replica eviction from RAM disk -- Key: HDFS-6930 URL: https://issues.apache.org/jira/browse/HDFS-6930 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6930.01.patch The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation. A better implementation would be asynchronous eviction when free space on RAM disk falls below a low watermark to make block allocation faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
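Colin's byte-valued watermark could look like this in hdfs-site.xml (the property name is invented for illustration — the JIRA only proposes the semantics; the value is 3 x the 128 MB default block size, matching his suggested default):

```xml
<!-- Hypothetical key name; only the byte-valued semantics come from the comment above. -->
<property>
  <name>dfs.datanode.ram.disk.low.watermark.bytes</name>
  <!-- 3 * 128 MB = 402653184 bytes: evict replicas asynchronously when
       RAM disk free space drops below this threshold. -->
  <value>402653184</value>
</property>
```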
[jira] [Commented] (HDFS-6974) MiniHDFScluster breaks if there is an out of date hadoop.lib on the lib path
[ https://issues.apache.org/jira/browse/HDFS-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118668#comment-14118668 ] Colin Patrick McCabe commented on HDFS-6974: Is this really any different than needing to set {{HADOOP_CLASSPATH}} correctly? We don't handle mixing old jars into the classpath, so why should we handle mixing old {{hadoop.dll}} files into the path? It seems inconsistent. But maybe I'm missing something that makes this case different. bq. There's another extension too: have a getVersion() call that returns version info (build info etc), which can be used to help in diags. I'd add that, but still look for hadoop-2.6.lib so that you could have 1 lib on the path We don't make any guarantees that the libhadoop supplied with 2.6 will work with Hadoop 2.6.1. libhadoop doesn't have a fixed or standardized API; it's just the C half of random bits of Hadoop code. Think if you were making changes to the JNI code and redeploying. You need to redeploy with the correct, new JNI code, not the old stuff. This is, again, the same as with jar files... you wouldn't mix jar files from Hadoop 2.6 and Hadoop 2.6.1 in the same directory. So I would argue for your solution #1. We could perhaps give a better error message here. We might be able to inject the git hash into the library, and error out if it didn't match the git hash in the jar files. But then that means that partial rebuilds of the source tree no longer work, so maybe not. MiniHDFScluster breaks if there is an out of date hadoop.lib on the lib path - Key: HDFS-6974 URL: https://issues.apache.org/jira/browse/HDFS-6974 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0 Environment: Windows with a version of Hadoop (HDP2.1) installed somewhere via an MSI Reporter: Steve Loughran Priority: Minor SLIDER-377 shows the trace of a MiniHDFSCluster test failing on native library calls ... 
the root cause appears to be that the 2.4.1 hadoop lib on the path doesn't have all the methods needed by branch-2. When this situation arises, MiniHDFSCluster fails to work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118670#comment-14118670 ] Ray Chiang commented on HDFS-6942: -- RE: TestRenameWithSnapshots test. Works fine in my tree. Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6951: --- Attachment: (was: HDFS-6951.004.patch) Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, editsStored Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6951: --- Attachment: HDFS-6951.004.patch Resubmitting the exact same HDFS-6951.004.patch to see if the weird test-patch failures disappear. Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, HDFS-6951.004.patch, editsStored Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-6982) nntop: top-like tool for name node users
Maysam Yabandeh created HDFS-6982: - Summary: nntop: top-like tool for name node users Key: HDFS-6982 URL: https://issues.apache.org/jira/browse/HDFS-6982 Project: Hadoop HDFS Issue Type: New Feature Reporter: Maysam Yabandeh In this jira we motivate the need for nntop, a tool that, similarly to what top does in Linux, gives the list of top users of the HDFS name node and gives insight into which users are sending the majority of each traffic type to the name node. This information turns out to be the most critical when the name node is under pressure and the HDFS admin needs to know which user is hammering the name node and with what kind of requests. Here we present the design of nntop, which has been in production at Twitter for the past 10 months. nntop proved to have low cpu overhead ( 2% in a cluster of 4K nodes) and a low memory footprint (less than a few MB), and to be quite efficient on the write path (only two hash lookups for updating a metric). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6982) nntop: top-like tool for name node users
[ https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maysam Yabandeh updated HDFS-6982: -- Attachment: nntop-design-v1.pdf A design doc that also shows how the tool looks in action is attached. I will try to polish our code and prepare a patch in the next few days. Comments are highly appreciated. nntop: top-like tool for name node users - Key: HDFS-6982 URL: https://issues.apache.org/jira/browse/HDFS-6982 Project: Hadoop HDFS Issue Type: New Feature Reporter: Maysam Yabandeh Attachments: nntop-design-v1.pdf In this jira we motivate the need for nntop, a tool that, similarly to what top does in Linux, gives the list of top users of the HDFS name node and gives insight into which users are sending the majority of each traffic type to the name node. This information turns out to be the most critical when the name node is under pressure and the HDFS admin needs to know which user is hammering the name node and with what kind of requests. Here we present the design of nntop, which has been in production at Twitter for the past 10 months. nntop proved to have low cpu overhead ( 2% in a cluster of 4K nodes) and a low memory footprint (less than a few MB), and to be quite efficient on the write path (only two hash lookups for updating a metric). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
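The "two hash lookups per metric update" claim can be illustrated with a toy per-operation, per-user counter table (the real nntop data structure is in the design doc; this sketch only shows how an update costs one lookup for the operation and one for the user):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class TopUserMetrics {
    // op name -> (user name -> counter)
    private final ConcurrentHashMap<String, ConcurrentHashMap<String, LongAdder>> byOp =
        new ConcurrentHashMap<>();

    // Write path: exactly two hash lookups on the hot path.
    public void report(String op, String user) {
        byOp.computeIfAbsent(op, k -> new ConcurrentHashMap<>())  // lookup 1: op
            .computeIfAbsent(user, k -> new LongAdder())          // lookup 2: user
            .increment();
    }

    public long count(String op, String user) {
        ConcurrentHashMap<String, LongAdder> users = byOp.get(op);
        LongAdder a = users == null ? null : users.get(user);
        return a == null ? 0L : a.sum();
    }
}
```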
[jira] [Created] (HDFS-6983) TestBalancer#testExitZeroOnSuccess fails intermittently
Mit Desai created HDFS-6983: --- Summary: TestBalancer#testExitZeroOnSuccess fails intermittently Key: HDFS-6983 URL: https://issues.apache.org/jira/browse/HDFS-6983 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.1 Reporter: Mit Desai TestBalancer#testExitZeroOnSuccess fails intermittently on branch-2. And probably fails on trunk too. The test fails 1 in 20 times when I ran it in a loop. Here is how it fails.
{noformat}
org.apache.hadoop.hdfs.server.balancer.TestBalancer
testExitZeroOnSuccess(org.apache.hadoop.hdfs.server.balancer.TestBalancer) Time elapsed: 53.965 sec ERROR!
java.util.concurrent.TimeoutException: Rebalancing expected avg utilization to become 0.2, but on datanode 127.0.0.1:35502 it remains at 0.08 after more than 4 msec.
at org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForBalancer(TestBalancer.java:321)
at org.apache.hadoop.hdfs.server.balancer.TestBalancer.runBalancerCli(TestBalancer.java:632)
at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:549)
at org.apache.hadoop.hdfs.server.balancer.TestBalancer.doTest(TestBalancer.java:437)
at org.apache.hadoop.hdfs.server.balancer.TestBalancer.oneNodeTest(TestBalancer.java:645)
at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testExitZeroOnSuccess(TestBalancer.java:845)
Results :
Tests in error:
TestBalancer.testExitZeroOnSuccess:845-oneNodeTest:645-doTest:437-doTest:549-runBalancerCli:632-waitForBalancer:321 Timeout
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable
Colin Patrick McCabe created HDFS-6984: -- Summary: In Hadoop 3, make FileStatus no longer a Writable Key: HDFS-6984 URL: https://issues.apache.org/jira/browse/HDFS-6984 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this to serialize it and send it over the wire. But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}} which serves to serialize this information. The protobuf form is preferable, since it allows us to add new fields in a backwards-compatible way. Another issue is that already a lot of subclasses of FileStatus don't override the Writable methods of the superclass, breaking the interface contract that read(status.write) should be equal to the original status. In Hadoop 3, we should just make FileStatus no longer a writable so that we don't have to deal with these issues. It's probably too late to do this in Hadoop 2, since user code may be relying on the ability to use the Writable methods on FileStatus objects there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
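The broken contract described above — read(status.write) no longer equals the original status when a subclass adds fields without overriding the Writable methods — can be modeled with two toy classes (names are illustrative, not Hadoop's):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Base class serializes only its own field, like FileStatus.
class ToyStatus {
    long length;
    void write(DataOutput out) throws IOException { out.writeLong(length); }
    void readFields(DataInput in) throws IOException { length = in.readLong(); }
}

// Subclass adds a field but inherits write()/readFields() unchanged,
// like many FileStatus subclasses (e.g. LocatedFileStatus).
class ToyLocatedStatus extends ToyStatus {
    String location = ""; // silently dropped on a round trip
}

public class RoundTrip {
    // True iff the round trip keeps length but loses location,
    // i.e. read(status.write) != original status.
    public static boolean lossyRoundTrip() {
        try {
            ToyLocatedStatus orig = new ToyLocatedStatus();
            orig.length = 42;
            orig.location = "dn1";
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            orig.write(new DataOutputStream(buf));
            ToyLocatedStatus copy = new ToyLocatedStatus();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())));
            return copy.length == 42 && copy.location.isEmpty();
        } catch (IOException e) {
            return false;
        }
    }
}
```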
[jira] [Updated] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6634: -- Resolution: Fixed Fix Version/s: 2.6.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2, nice work James! inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Fix For: 2.6.0 Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.7.patch, HDFS-6634.8.patch, HDFS-6634.9.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6954) With crypto, no native lib systems are too verbose
[ https://issues.apache.org/jira/browse/HDFS-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6954: -- Resolution: Fixed Fix Version/s: 2.6.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2, thanks Charles. With crypto, no native lib systems are too verbose -- Key: HDFS-6954 URL: https://issues.apache.org/jira/browse/HDFS-6954 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Charles Lamb Fix For: 2.6.0 Attachments: HDFS-6954.001.patch, HDFS-6954.002.patch, HDFS-6954.003.patch Running commands on a machine without a native library results in: {code} $ bin/hdfs dfs -put /etc/hosts /tmp 14/08/27 07:16:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/08/27 07:16:11 WARN crypto.CryptoCodec: Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available. 14/08/27 07:16:11 INFO hdfs.DFSClient: No KeyProvider found. {code} This is way too much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6984: --- Status: Patch Available (was: Open) In Hadoop 3, make FileStatus no longer a Writable - Key: HDFS-6984 URL: https://issues.apache.org/jira/browse/HDFS-6984 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6984.001.patch FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this to serialize it and send it over the wire. But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}} which serves to serialize this information. The protobuf form is preferable, since it allows us to add new fields in a backwards-compatible way. Another issue is that already a lot of subclasses of FileStatus don't override the Writable methods of the superclass, breaking the interface contract that read(status.write) should be equal to the original status. In Hadoop 3, we should just make FileStatus no longer a writable so that we don't have to deal with these issues. It's probably too late to do this in Hadoop 2, since user code may be relying on the ability to use the Writable methods on FileStatus objects there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6984: --- Attachment: HDFS-6984.001.patch In Hadoop 3, make FileStatus no longer a Writable - Key: HDFS-6984 URL: https://issues.apache.org/jira/browse/HDFS-6984 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6984.001.patch FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this to serialize it and send it over the wire. But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}} which serves to serialize this information. The protobuf form is preferable, since it allows us to add new fields in a backwards-compatible way. Another issue is that already a lot of subclasses of FileStatus don't override the Writable methods of the superclass, breaking the interface contract that read(status.write) should be equal to the original status. In Hadoop 3, we should just make FileStatus no longer a writable so that we don't have to deal with these issues. It's probably too late to do this in Hadoop 2, since user code may be relying on the ability to use the Writable methods on FileStatus objects there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118784#comment-14118784 ] Colin Patrick McCabe commented on HDFS-6984: I don't anticipate any maintenance issues from having this change in Hadoop 3 but not in Hadoop 2.x. We already are unable to change the write/read methods of that class due to compatibility woes, so that code is effectively frozen. This patch just drops the frozen code out of Hadoop 3. The main motivation is that this will make it easier for us to add more stuff to FileStatus in the future without worrying about the read/write methods of the Writable. In Hadoop 3, make FileStatus no longer a Writable - Key: HDFS-6984 URL: https://issues.apache.org/jira/browse/HDFS-6984 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6984.001.patch FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this to serialize it and send it over the wire. But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}} which serves to serialize this information. The protobuf form is preferable, since it allows us to add new fields in a backwards-compatible way. Another issue is that already a lot of subclasses of FileStatus don't override the Writable methods of the superclass, breaking the interface contract that read(status.write) should be equal to the original status. In Hadoop 3, we should just make FileStatus no longer a writable so that we don't have to deal with these issues. It's probably too late to do this in Hadoop 2, since user code may be relying on the ability to use the Writable methods on FileStatus objects there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6982) nntop: top-like tool for name node users
[ https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118786#comment-14118786 ] Philip Zeyliger commented on HDFS-6982: --- Looks neat! Since you're proposing this for inclusion in HDFS proper, I'd suggest that the implementation skips the auditing stuff and just works directly. Obviously, you would introduce a configuration property turning the feature on or off as desired, since people may have memory management concerns or performance concerns. It would also be useful to have a structured way to get the output for monitoring tools, which it sounds like you already have. Could you give some sample output for that mechanism? nntop: top-like tool for name node users - Key: HDFS-6982 URL: https://issues.apache.org/jira/browse/HDFS-6982 Project: Hadoop HDFS Issue Type: New Feature Reporter: Maysam Yabandeh Attachments: nntop-design-v1.pdf In this jira we motivate the need for nntop, a tool that, similarly to what top does in Linux, gives the list of top users of the HDFS name node and gives insight into which users are sending the majority of each traffic type to the name node. This information turns out to be the most critical when the name node is under pressure and the HDFS admin needs to know which user is hammering the name node and with what kind of requests. Here we present the design of nntop, which has been in production at Twitter for the past 10 months. nntop proved to have low cpu overhead ( 2% in a cluster of 4K nodes) and a low memory footprint (less than a few MB), and to be quite efficient on the write path (only two hash lookups for updating a metric). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6921) Add LazyPersist flag to FileStatus
[ https://issues.apache.org/jira/browse/HDFS-6921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118798#comment-14118798 ] Colin Patrick McCabe commented on HDFS-6921: Interesting discussion. I don't think adding this field will cause DistCp to fail. DistCp doesn't currently check this field, so it will have no idea whether it's there or not. It is a little concerning that FileStatus#read(FileStatus.write) will no longer return the original object (we can't round trip it), but this is already true of many (all?) of the subclasses of FileStatus, like LocatedFileStatus. They just don't bother serializing the new fields they add, so they already have this problem. I filed HDFS-6984 to remove the Writable interface from FileStatus completely in Hadoop 3.0. In the meantime, we could support round tripping FileStatus by packing the isLazyPersist bit into the sign bit of the replication field. Would that address the compatibility concerns? bq. Another issue is that lazy persist should be internal to the HDFS itself, it is much better to keep it fully inside. If there is nothing in FileStatus, how can users find out this information? Perhaps by using an extended attribute? That might actually be a good choice. Add LazyPersist flag to FileStatus -- Key: HDFS-6921 URL: https://issues.apache.org/jira/browse/HDFS-6921 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: HDFS-6581 Attachments: HDFS-6921.01.patch, HDFS-6921.02.patch A new flag will be added to FileStatus to indicate that a file can be lazily persisted to disk i.e. trading reduced durability for better write performance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
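The sign-bit packing idea mentioned in the comment could look roughly like this sketch. Since replication is always non-negative, the sign bit of the 16-bit field is free to carry the flag, keeping the existing Writable layout round-trippable. Method names are hypothetical, not actual FileStatus API.

```java
// Hypothetical helpers for stashing the isLazyPersist flag in the sign bit
// of the short replication field; not actual Hadoop code.
public class ReplicationPacking {
    // Set the sign bit when the file is lazy-persist.
    static short pack(short replication, boolean isLazyPersist) {
        return isLazyPersist ? (short) (replication | 0x8000) : replication;
    }

    // Clear the flag bit to recover the real replication factor.
    static short unpackReplication(short packed) {
        return (short) (packed & 0x7FFF);
    }

    // The flag is simply whether the sign bit is set.
    static boolean unpackLazyPersist(short packed) {
        return (packed & 0x8000) != 0;
    }
}
```

Old readers that treat the field as a plain replication count would need the bit masked off, which is the main compatibility question this trick raises.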
[jira] [Commented] (HDFS-6984) In Hadoop 3, make FileStatus no longer a Writable
[ https://issues.apache.org/jira/browse/HDFS-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118807#comment-14118807 ] Hadoop QA commented on HDFS-6984: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12665993/HDFS-6984.001.patch against trunk revision a0ccf83. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7874//console This message is automatically generated. In Hadoop 3, make FileStatus no longer a Writable - Key: HDFS-6984 URL: https://issues.apache.org/jira/browse/HDFS-6984 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6984.001.patch FileStatus was a Writable in Hadoop 2 and earlier. Originally, we used this to serialize it and send it over the wire. But in Hadoop 2 and later, we have the protobuf {{HdfsFileStatusProto}} which serves to serialize this information. The protobuf form is preferable, since it allows us to add new fields in a backwards-compatible way. Another issue is that already a lot of subclasses of FileStatus don't override the Writable methods of the superclass, breaking the interface contract that read(status.write) should be equal to the original status. In Hadoop 3, we should just make FileStatus no longer a writable so that we don't have to deal with these issues. It's probably too late to do this in Hadoop 2, since user code may be relying on the ability to use the Writable methods on FileStatus objects there. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118813#comment-14118813 ] Colin Patrick McCabe commented on HDFS-6634: Great work, James! inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Fix For: 2.6.0 Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.7.patch, HDFS-6634.8.patch, HDFS-6634.9.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option
[ https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118827#comment-14118827 ] Colin Patrick McCabe commented on HDFS-4257: Yongjun, I'm going to file a follow-up JIRA to address your comments. +1, will commit shortly. The ReplaceDatanodeOnFailure policies could have a forgiving option --- Key: HDFS-4257 URL: https://issues.apache.org/jira/browse/HDFS-4257 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h4257_20140325.patch, h4257_20140325b.patch, h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch Similar question has previously come over HDFS-3091 and friends, but the essential problem is: Why can't I write to my cluster of 3 nodes, when I just have 1 node available at a point in time.. The policies cover the 4 options, with {{Default}} being default: {{Disable}} - Disables the whole replacement concept by throwing out an error (at the server) or acts as {{Never}} at the client. {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in many cases). {{Default}} - Replace based on a few conditions, but whose minimum never touches 1. We always fail if only one DN remains and none others can be added. {{Always}} - Replace no matter what. Fail if can't replace. Would it not make sense to have an option similar to Always/Default, where despite _trying_, if it isn't possible to have 1 DN in the pipeline, do not fail. I think that is what the former write behavior was, and what fit with the minimum replication factor allowed value. Why is it grossly wrong to pass a write from a client for a block with just 1 remaining replica in the pipeline (the minimum of 1 grows with the replication factor demanded from the write), when replication is taken care of immediately afterwards? 
How often have we seen missing blocks arise out of allowing this + facing a big rack(s) failure or so? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-6985) Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure
Colin Patrick McCabe created HDFS-6985: -- Summary: Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure Key: HDFS-6985 URL: https://issues.apache.org/jira/browse/HDFS-6985 Project: Hadoop HDFS Issue Type: Improvement Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor * use final qualifier consistently in {{ReplaceDatanodeOnFailure#Condition classes}} * add a debug log message in the DFSClient explaining which pipeline failure policy is being used. * add JavaDoc for ReplaceDatanodeOnFailure * documentation for dfs.client.block.write.replace-datanode-on-failure.best-effort should make it clear that the configuration key refers to pipeline recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option
[ https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-4257: --- Resolution: Fixed Fix Version/s: 2.6.0 Target Version/s: 2.6.0 Status: Resolved (was: Patch Available) The ReplaceDatanodeOnFailure policies could have a forgiving option --- Key: HDFS-4257 URL: https://issues.apache.org/jira/browse/HDFS-4257 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.6.0 Attachments: h4257_20140325.patch, h4257_20140325b.patch, h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch Similar question has previously come over HDFS-3091 and friends, but the essential problem is: Why can't I write to my cluster of 3 nodes, when I just have 1 node available at a point in time.. The policies cover the 4 options, with {{Default}} being default: {{Disable}} - Disables the whole replacement concept by throwing out an error (at the server) or acts as {{Never}} at the client. {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in many cases). {{Default}} - Replace based on a few conditions, but whose minimum never touches 1. We always fail if only one DN remains and none others can be added. {{Always}} - Replace no matter what. Fail if can't replace. Would it not make sense to have an option similar to Always/Default, where despite _trying_, if it isn't possible to have 1 DN in the pipeline, do not fail. I think that is what the former write behavior was, and what fit with the minimum replication factor allowed value. Why is it grossly wrong to pass a write from a client for a block with just 1 remaining replica in the pipeline (the minimum of 1 grows with the replication factor demanded from the write), when replication is taken care of immediately afterwards? 
How often have we seen missing blocks arise out of allowing this + facing a big rack(s) failure or so? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-6986) DistributedFileSystem must get deletagiontokens from configured KeyProvider
Alejandro Abdelnur created HDFS-6986: Summary: DistributedFileSystem must get deletagiontokens from configured KeyProvider Key: HDFS-6986 URL: https://issues.apache.org/jira/browse/HDFS-6986 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: 2.6.0 Reporter: Alejandro Abdelnur {{KeyProvider}} via {{KeyProviderDelegationTokenExtension}} provides delegation tokens. {{DistributedFileSystem}} should augment the HDFS delegation tokens with the keyprovider ones so tasks can interact with keyprovider when it is a client/server impl (KMS). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-6987) Move CipherSuite xattr information up to the encryption zone root
Andrew Wang created HDFS-6987: - Summary: Move CipherSuite xattr information up to the encryption zone root Key: HDFS-6987 URL: https://issues.apache.org/jira/browse/HDFS-6987 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Zhe Zhang All files within a single EZ need to be encrypted with the same CipherSuite. Because of this, I think we can store the CipherSuite once in the EZ rather than on each file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6986) DistributedFileSystem must get delegation tokens from configured KeyProvider
[ https://issues.apache.org/jira/browse/HDFS-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6986: -- Summary: DistributedFileSystem must get delegation tokens from configured KeyProvider (was: DistributedFileSystem must get deletagiontokens from configured KeyProvider) DistributedFileSystem must get delegation tokens from configured KeyProvider Key: HDFS-6986 URL: https://issues.apache.org/jira/browse/HDFS-6986 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: 2.6.0 Reporter: Alejandro Abdelnur {{KeyProvider}} via {{KeyProviderDelegationTokenExtension}} provides delegation tokens. {{DistributedFileSystem}} should augment the HDFS delegation tokens with the keyprovider ones so tasks can interact with keyprovider when it is a client/server impl (KMS). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-6971) Bounded staleness of EDEK caches on the NN
[ https://issues.apache.org/jira/browse/HDFS-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-6971: - Assignee: Zhe Zhang (was: Andrew Wang) Bounded staleness of EDEK caches on the NN -- Key: HDFS-6971 URL: https://issues.apache.org/jira/browse/HDFS-6971 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.5.0 Reporter: Andrew Wang Assignee: Zhe Zhang The EDEK cache on the NN can hold onto keys after the admin has rolled the key. It'd be good to time-bound the caches, perhaps also providing an explicit flush command. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6986) DistributedFileSystem must get delegation tokens from configured KeyProvider
[ https://issues.apache.org/jira/browse/HDFS-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6986: -- Assignee: Zhe Zhang DistributedFileSystem must get delegation tokens from configured KeyProvider Key: HDFS-6986 URL: https://issues.apache.org/jira/browse/HDFS-6986 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: 2.6.0 Reporter: Alejandro Abdelnur Assignee: Zhe Zhang {{KeyProvider}} via {{KeyProviderDelegationTokenExtension}} provides delegation tokens. {{DistributedFileSystem}} should augment the HDFS delegation tokens with the keyprovider ones so tasks can interact with keyprovider when it is a client/server impl (KMS). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6987) Move CipherSuite xattr information up to the encryption zone root
[ https://issues.apache.org/jira/browse/HDFS-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118884#comment-14118884 ] Andrew Wang commented on HDFS-6987: --- It'd be good to also protobuf the EZ root xattr as part of this, since it'd be good to do anyway. Move CipherSuite xattr information up to the encryption zone root - Key: HDFS-6987 URL: https://issues.apache.org/jira/browse/HDFS-6987 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 2.6.0 Reporter: Andrew Wang Assignee: Zhe Zhang All files within a single EZ need to be encrypted with the same CipherSuite. Because of this, I think we can store the CipherSuite once in the EZ rather than on each file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6985) Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure
[ https://issues.apache.org/jira/browse/HDFS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6985: --- Description: * use final qualifier and variable names consistently in {{ReplaceDatanodeOnFailure#Condition classes}} * add a debug log message in the DFSClient explaining which pipeline failure policy is being used. * add JavaDoc for ReplaceDatanodeOnFailure * documentation for dfs.client.block.write.replace-datanode-on-failure.best-effort should make it clear that the configuration key refers to pipeline recovery. was: * use final qualifier consistently in {{ReplaceDatanodeOnFailure#Condition classes}} * add a debug log message in the DFSClient explaining which pipeline failure policy is being used. * add JavaDoc for ReplaceDatanodeOnFailure * documentation dfs.client.block.write.replace-datanode-on-failure.best-effort should make it clear that that the configuration key refers to pipeline recovery. Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure --- Key: HDFS-6985 URL: https://issues.apache.org/jira/browse/HDFS-6985 Project: Hadoop HDFS Issue Type: Improvement Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6985.001.patch * use final qualifier and variable names consistently in {{ReplaceDatanodeOnFailure#Condition classes}} * add a debug log message in the DFSClient explaining which pipeline failure policy is being used. * add JavaDoc for ReplaceDatanodeOnFailure * documentation for dfs.client.block.write.replace-datanode-on-failure.best-effort should make it clear that the configuration key refers to pipeline recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6985) Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure
[ https://issues.apache.org/jira/browse/HDFS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6985: --- Attachment: HDFS-6985.001.patch I changed it so that all implementations of {{ReplaceDatanodeOnFailure#Condition}} use the same names for variables (i.e. short replication, final DatanodeInfo[] existings, int nExistings rather than short replication, final DatanodeInfo[] existings, int n). I didn't use the 'final' qualifier on primitives, since most Hadoop code doesn't do that. I added the 'final' qualifier to all uses of the 'existings' array. I added some JavaDoc to {{ReplaceDatanodeOnFailure#Condition}}. Clarified that dfs.client.block.write.replace-datanode-on-failure.best-effort applies to pipeline recovery. Add final keywords, documentation, etc. to ReplaceDatanodeOnFailure --- Key: HDFS-6985 URL: https://issues.apache.org/jira/browse/HDFS-6985 Project: Hadoop HDFS Issue Type: Improvement Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-6985.001.patch * use final qualifier consistently in {{ReplaceDatanodeOnFailure#Condition classes}} * add a debug log message in the DFSClient explaining which pipeline failure policy is being used. * add JavaDoc for ReplaceDatanodeOnFailure * documentation for dfs.client.block.write.replace-datanode-on-failure.best-effort should make it clear that the configuration key refers to pipeline recovery. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option
[ https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118895#comment-14118895 ] Colin Patrick McCabe commented on HDFS-4257: Yongjun, check out HDFS-6985 where I addressed your comments. Thanks, all. The ReplaceDatanodeOnFailure policies could have a forgiving option --- Key: HDFS-4257 URL: https://issues.apache.org/jira/browse/HDFS-4257 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.6.0 Attachments: h4257_20140325.patch, h4257_20140325b.patch, h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch Similar question has previously come over HDFS-3091 and friends, but the essential problem is: Why can't I write to my cluster of 3 nodes, when I just have 1 node available at a point in time.. The policies cover the 4 options, with {{Default}} being default: {{Disable}} - Disables the whole replacement concept by throwing out an error (at the server) or acts as {{Never}} at the client. {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in many cases). {{Default}} - Replace based on a few conditions, but whose minimum never touches 1. We always fail if only one DN remains and none others can be added. {{Always}} - Replace no matter what. Fail if can't replace. Would it not make sense to have an option similar to Always/Default, where despite _trying_, if it isn't possible to have 1 DN in the pipeline, do not fail. I think that is what the former write behavior was, and what fit with the minimum replication factor allowed value. Why is it grossly wrong to pass a write from a client for a block with just 1 remaining replica in the pipeline (the minimum of 1 grows with the replication factor demanded from the write), when replication is taken care of immediately afterwards? 
How often have we seen missing blocks arise out of allowing this + facing a big rack(s) failure or so? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118897#comment-14118897 ] Haohui Mai commented on HDFS-6942: -- +1. I'll commit it shortly. Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6986) DistributedFileSystem must get delegation tokens from configured KeyProvider
[ https://issues.apache.org/jira/browse/HDFS-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118902#comment-14118902 ] Alejandro Abdelnur commented on HDFS-6986: -- The changes in {{DistributedFileSystem.java}} should be something like:
{code}
  @Override
  public Token<?>[] addDelegationTokens(String renewer, Credentials credentials)
      throws IOException {
    Token<?>[] tokens = super.addDelegationTokens(renewer, credentials);
    if (dfs.getKeyProvider() != null) {
      KeyProviderDelegationTokenExtension keyProviderDelegationTokenExtension =
          KeyProviderDelegationTokenExtension.
              createKeyProviderDelegationTokenExtension(dfs.getKeyProvider());
      Token<?>[] kpTokens = keyProviderDelegationTokenExtension.
          addDelegationTokens(renewer, credentials);
      if (tokens != null && kpTokens != null) {
        Token<?>[] all = new Token<?>[tokens.length + kpTokens.length];
        System.arraycopy(tokens, 0, all, 0, tokens.length);
        System.arraycopy(kpTokens, 0, all, tokens.length, kpTokens.length);
        tokens = all;
      } else {
        tokens = (tokens != null) ? tokens : kpTokens;
      }
    }
    return tokens;
  }
{code}
And {{DFSClient}} should expose the keyprovider via a {{getKeyProvider()}} method. DistributedFileSystem must get delegation tokens from configured KeyProvider Key: HDFS-6986 URL: https://issues.apache.org/jira/browse/HDFS-6986 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: 2.6.0 Reporter: Alejandro Abdelnur Assignee: Zhe Zhang {{KeyProvider}} via {{KeyProviderDelegationTokenExtension}} provides delegation tokens. {{DistributedFileSystem}} should augment the HDFS delegation tokens with the keyprovider ones so tasks can interact with keyprovider when it is a client/server impl (KMS). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118915#comment-14118915 ] Jing Zhao commented on HDFS-6886: - The patch looks pretty good to me. Besides the comment from Vinay, only two minor comments: # Nit: Let's still keep toLogRpcIds as the last parameter in FSEditLog#logOpenFile (i.e., move overwrite before toLogRpcIds) # It will be better to have an overwrite transaction for TestOfflineEditsViewer Use single editlog record for creating file + overwrite. Key: HDFS-6886 URL: https://issues.apache.org/jira/browse/HDFS-6886 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, HDFS-6886.003.patch, editsStored As discussed in HDFS-6871, as [~jingzhao] and [~cmccabe]'s suggestion, we could do further improvement to use one editlog record for creating file + overwrite in this JIRA. We could record the overwrite flag in editlog for creating file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6930) Improve replica eviction from RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118932#comment-14118932 ] Xiaoyu Yao commented on HDFS-6930: -- +1 Can you check if capacity > 0? It can be 0 when the RAM_DISK volume is allowed to be added/removed dynamically:
{code}
int percentFree = (int) (free * 100 / capacity);
{code}
Improve replica eviction from RAM disk -- Key: HDFS-6930 URL: https://issues.apache.org/jira/browse/HDFS-6930 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6930.01.patch The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation. A better implementation would be asynchronous eviction when free space on RAM disk falls below a low watermark to make block allocation faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118936#comment-14118936 ] Hadoop QA commented on HDFS-6951: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12665981/HDFS-6951.004.patch against trunk revision 6595e92. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7873//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7873//console This message is automatically generated. 
Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, HDFS-6951.004.patch, editsStored Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6977) Delete all copies when a block is deleted from the block space
[ https://issues.apache.org/jira/browse/HDFS-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6977: Attachment: HDFS-6977.02.patch Slight update to remove one call to {{Block.metaToBlockFile}}. Delete all copies when a block is deleted from the block space -- Key: HDFS-6977 URL: https://issues.apache.org/jira/browse/HDFS-6977 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Xiaoyu Yao Assignee: Arpit Agarwal Attachments: HDFS-6977.01.patch, HDFS-6977.02.patch When a block is deleted from RAM disk we should also delete the copies written to lazyPersist/. Reported by [~xyao] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6942: - Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~rchiang] for the contribution. Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Fix For: 2.6.0 Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4257) The ReplaceDatanodeOnFailure policies could have a forgiving option
[ https://issues.apache.org/jira/browse/HDFS-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118949#comment-14118949 ] Yongjun Zhang commented on HDFS-4257: - Thanks Colin, for reviewing and following-up. Hi [~szetszwo], thanks for fixing the problem here. Colin created HDFS-6985 to address the comments I made earlier. Would you please take a look whether it looks good to you when you have time? Thanks. The ReplaceDatanodeOnFailure policies could have a forgiving option --- Key: HDFS-4257 URL: https://issues.apache.org/jira/browse/HDFS-4257 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Tsz Wo Nicholas Sze Priority: Minor Fix For: 2.6.0 Attachments: h4257_20140325.patch, h4257_20140325b.patch, h4257_20140326.patch, h4257_20140819.patch, h4257_20140831.patch Similar question has previously come over HDFS-3091 and friends, but the essential problem is: Why can't I write to my cluster of 3 nodes, when I just have 1 node available at a point in time.. The policies cover the 4 options, with {{Default}} being default: {{Disable}} - Disables the whole replacement concept by throwing out an error (at the server) or acts as {{Never}} at the client. {{Never}} - Never replaces a DN upon pipeline failures (not too desirable in many cases). {{Default}} - Replace based on a few conditions, but whose minimum never touches 1. We always fail if only one DN remains and none others can be added. {{Always}} - Replace no matter what. Fail if can't replace. Would it not make sense to have an option similar to Always/Default, where despite _trying_, if it isn't possible to have 1 DN in the pipeline, do not fail. I think that is what the former write behavior was, and what fit with the minimum replication factor allowed value. 
Why is it grossly wrong to pass a write from a client for a block with just 1 remaining replica in the pipeline (the minimum of 1 grows with the replication factor demanded from the write), when replication is taken care of immediately afterwards? How often have we seen missing blocks arise out of allowing this + facing a big rack(s) failure or so?
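For reference, the client-side policy discussed above is chosen through configuration. A hedged sketch of the relevant hdfs-site.xml settings follows; the key names are the ones used by this feature family in hdfs-default.xml, and the values shown are illustrative, not recommendations:

```xml
<!-- Sketch of hdfs-site.xml settings for pipeline-recovery behavior.
     Values are illustrative. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>DEFAULT</value> <!-- NEVER, DEFAULT, or ALWAYS -->
</property>
<property>
  <!-- The "forgiving" option discussed in this JIRA: if a replacement
       datanode cannot be found, continue with the remaining ones
       instead of failing the write. -->
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>
```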
[jira] [Commented] (HDFS-6956) Allow dynamically changing the tracing level in Hadoop servers
[ https://issues.apache.org/jira/browse/HDFS-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118963#comment-14118963 ] Colin Patrick McCabe commented on HDFS-6956: daemonlog is about log4j, this is about tracing. htrace can send trace events to a system like zipkin, which is more useful than just catting them to a file or to log4j. I imagine the implementation might be kind of like daemonlog, though. A command that you could run to enable tracing while in production. Allow dynamically changing the tracing level in Hadoop servers -- Key: HDFS-6956 URL: https://issues.apache.org/jira/browse/HDFS-6956 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Colin Patrick McCabe We should allow users to dynamically change the tracing level in Hadoop servers. The easiest way to do this is probably to have an RPC accessible only to the superuser that changes tracing settings. This would allow us to turn on and off tracing on the NameNode, DataNode, etc. at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6982) nntop: top-like tool for name node users
[ https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118971#comment-14118971 ] Maysam Yabandeh commented on HDFS-6982: --- Thanks [~philip]. I agree with you. I actually was planning to skip the audit log tailing stuff altogether to keep the patch simple. If there is interest in the future I can submit a separate patch for that. The metric key format is operation.user. Here is a sample output from the jmx interface:
{code}
[myabandeh@smf1-aro-39-sr1(hadoop-tst-nn) ~]$ curl localhost:12333/jmx | grep Hadoop:service=nntop,name=topusers -B1 -A8
  }, {
    "name" : "Hadoop:service=nntop,name=topusers",
    "modelerType" : "topusers",
    "tag.Context" : "namenode",
    "tag.ProcessName" : "DummyProcessName",
    "tag.SessionId" : "DummySessionId",
    "tag.Hostname" : "hhh",
    "delete.xxx" : 1,
    "setPermission.ALL" : 0,
    "getfileinfo.ALL" : 3159,
{code}
nntop: top-like tool for name node users - Key: HDFS-6982 URL: https://issues.apache.org/jira/browse/HDFS-6982 Project: Hadoop HDFS Issue Type: New Feature Reporter: Maysam Yabandeh Attachments: nntop-design-v1.pdf In this jira we motivate the need for nntop, a tool that, similarly to what top does in Linux, gives the list of top users of the HDFS name node and gives insight about which users are sending the majority of each traffic type to the name node. This information turns out to be the most critical when the name node is under pressure and the HDFS admin needs to know which user is hammering the name node and with what kind of requests. Here we present the design of nntop, which has been in production at Twitter for the past 10 months. nntop proved to have low CPU overhead (< 2% in a cluster of 4K nodes), a low memory footprint (less than a few MB), and an efficient write path (only two hash lookups for updating a metric).
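The "two hash lookups per update" write path can be pictured with a toy counter keyed by the operation.user metric name. This is only an illustrative sketch of the idea, not nntop's actual code; the class and method names are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of an nntop-style per-user operation counter. */
public class TopUserMetrics {
    // Metric keys follow the "operation.user" format described in the comment.
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Record one name-node call on the hot path: a map lookup to find
     *  the counter, then an atomic add on it. */
    public void report(String operation, String user) {
        counts.computeIfAbsent(operation + "." + user, k -> new LongAdder())
              .increment();
    }

    /** Read side, e.g. for publishing the counters over JMX. */
    public long get(String operation, String user) {
        LongAdder adder = counts.get(operation + "." + user);
        return adder == null ? 0 : adder.sum();
    }
}
```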
[jira] [Commented] (HDFS-6942) Fix typos in log messages
[ https://issues.apache.org/jira/browse/HDFS-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118996#comment-14118996 ] Ray Chiang commented on HDFS-6942: -- Thanks for the commit! Glad to get this first batch of typos fixed. Fix typos in log messages - Key: HDFS-6942 URL: https://issues.apache.org/jira/browse/HDFS-6942 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Ray Chiang Assignee: Ray Chiang Priority: Trivial Labels: newbie Fix For: 2.6.0 Attachments: HDFS-6942-01.patch, HDFS-6942-02.patch There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to being in multiple components. Try fixing typos on a per-component basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6966) Add additional unit tests for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-6966: -- Attachment: HDFS-6966.2.patch Adding new patch with more tests. In the most recent patch we: * Add HA test to verify standby NN tracks encryption zones. * Assert null when calling getEncryptionZoneForPath on a nonexistent path. * Verify success of renaming a dir and file within an encryption zone * Run fsck on a system with encryption zones * Add more snapshot unit testing. In particular, after snapshotting an encryption zone, remove the encryption zone and recreate the dir and take a snapshot. Verify that the new snapshot does not have an encryption zone. Delete the snapshots out of order and verify that the remaining snapshots have the correct encryption zone paths. * Add tests for symlinks within the same encryption zone and within different encryption zones. * Add test to run the OfflineImageViewer on a system of encryption zones. Again, if it's better, I can merge some tests to save on MiniDFSCluster spin up and shutdown time. Add additional unit tests for encryption zones -- Key: HDFS-6966 URL: https://issues.apache.org/jira/browse/HDFS-6966 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0, 2.6.0 Reporter: Stephen Chu Assignee: Stephen Chu Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch There are some more unit tests that can be added for test encryption zones. For example, more encryption zone + snapshot tests, running fsck on encryption zones, and more. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file
[ https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6705: --- Attachment: HDFS-6705.002.patch Submitting for a test run. Create an XAttr that disallows the HDFS admin from accessing a file --- Key: HDFS-6705 URL: https://issues.apache.org/jira/browse/HDFS-6705 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6705.001.patch, HDFS-6705.002.patch There needs to be an xattr that specifies that the HDFS admin can not access a file. This is needed for m/r delegation tokens and data at rest encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119015#comment-14119015 ] Charles Lamb commented on HDFS-6951: org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover both fail on my local machine with and without the patch so is unrelated. org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer - this is an expected failure until the testEdits file gets checked in. It passes on my machine. Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, HDFS-6951.004.patch, editsStored Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6959) make user home directory customizable
[ https://issues.apache.org/jira/browse/HDFS-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119033#comment-14119033 ] Colin Patrick McCabe commented on HDFS-6959: bq. In that case, would you please take a look at rev 001 instead? the change there is restricted to HDFS. Thanks
OK, reviewing v1.
bq. + private String home_dir_base = DFSConfigKeys.DFS_USER_HOME_BASE_DIR_DEFAULT;
Should be final.
{code}
+<property>
+  <name>dfs.user.home.base.dir</name>
+  <value>/user</value>
+  <description>Base directory of user home.</description>
+</property>
{code}
This description is a bit terse. Maybe something like: "the directory to prepend to the user name to get the user's home directory". Looks good aside from that.
make user home directory customizable - Key: HDFS-6959 URL: https://issues.apache.org/jira/browse/HDFS-6959 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.2.0 Reporter: Kevin Odell Assignee: Yongjun Zhang Priority: Minor Attachments: HADOOP-10334.001.patch, HADOOP-10334.002.patch The path is currently hardcoded:
{code}
public Path getHomeDirectory() {
  return makeQualified(new Path("/user/" + dfs.ugi.getShortUserName()));
}
{code}
It would be nice to have that as a customizable value. Thank you
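The shape of the change under review can be sketched in isolation. A plain Map stands in for Hadoop's Configuration here; only the dfs.user.home.base.dir key and its /user default come from the patch quoted above, and everything else is illustrative:

```java
import java.util.Map;

/** Sketch of the HDFS-6959 idea: derive the home directory from a
 *  configurable base instead of the hardcoded "/user/" prefix.
 *  A plain Map stands in for Hadoop's Configuration object. */
public class HomeDirResolver {
    // Key and default taken from the patch excerpt in the review comment.
    static final String DFS_USER_HOME_BASE_DIR_KEY = "dfs.user.home.base.dir";
    static final String DFS_USER_HOME_BASE_DIR_DEFAULT = "/user";

    private final String homeDirBase; // final, per the review feedback

    public HomeDirResolver(Map<String, String> conf) {
        this.homeDirBase = conf.getOrDefault(DFS_USER_HOME_BASE_DIR_KEY,
                DFS_USER_HOME_BASE_DIR_DEFAULT);
    }

    /** Prepend the configured base directory to the user name. */
    public String getHomeDirectory(String shortUserName) {
        return homeDirBase + "/" + shortUserName;
    }
}
```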
[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119041#comment-14119041 ] Colin Patrick McCabe commented on HDFS-6658: I didn't realize the patches were up; don't forget to hit submit patch! I'll try to check it out in the next few days-- thanks for your patience, guys. Namenode memory optimization - Block replicas list --- Key: HDFS-6658 URL: https://issues.apache.org/jira/browse/HDFS-6658 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.4.1 Reporter: Amir Langer Assignee: Amir Langer Attachments: BlockListOptimizationComparison.xlsx, Namenode Memory Optimizations - Block replicas list.docx Part of the memory consumed by every BlockInfo object in the Namenode is a linked list of block references for every DatanodeStorageInfo (called triplets). We propose to change the way we store the list in memory. Using primitive integer indexes instead of object references will reduce the memory needed for every block replica (when compressed oops is disabled) and in our new design the list overhead will be per DatanodeStorageInfo and not per block replica. see attached design doc. for details and evaluation results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
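The proposed replacement for the triplets is easiest to see with a toy version: thread each storage's replica list through a primitive int[] of next indexes instead of per-replica object references. This is only an illustration of the indexing idea from the design doc, not the actual patch:

```java
/** Toy illustration of the HDFS-6658 idea: a per-storage replica list
 *  kept as primitive int indexes rather than object references, so the
 *  list overhead lives in one array per storage, not in every block. */
public class IntLinkedReplicaList {
    private final int[] next; // next[i] = index of the next block slot
    private int head = -1;    // -1 terminates the list

    public IntLinkedReplicaList(int capacity) {
        this.next = new int[capacity];
    }

    /** Prepend block slot i to this storage's replica list. */
    public void add(int i) {
        next[i] = head;
        head = i;
    }

    /** Count replicas by walking the index chain. */
    public int size() {
        int n = 0;
        for (int i = head; i != -1; i = next[i]) {
            n++;
        }
        return n;
    }
}
```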
[jira] [Commented] (HDFS-6951) Saving namespace and restarting NameNode will remove existing encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119097#comment-14119097 ] Yi Liu commented on HDFS-6951: -- Thanks [~clamb], It's OK for me. +1 (non-binding). Saving namespace and restarting NameNode will remove existing encryption zones -- Key: HDFS-6951 URL: https://issues.apache.org/jira/browse/HDFS-6951 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0 Reporter: Stephen Chu Assignee: Charles Lamb Attachments: HDFS-6951-prelim.002.patch, HDFS-6951-testrepo.patch, HDFS-6951.001.patch, HDFS-6951.002.patch, HDFS-6951.003.patch, HDFS-6951.004.patch, editsStored Currently, when users save namespace and restart the NameNode, pre-existing encryption zones will be wiped out. I could reproduce this on a pseudo-distributed cluster: * Create an encryption zone * List encryption zones and verify the newly created zone is present * Save the namespace * Kill and restart the NameNode * List the encryption zones and you'll find the encryption zone is missing I've attached a test case for {{TestEncryptionZones}} that reproduces this as well. Removing the saveNamespace call will get the test to pass. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119105#comment-14119105 ] Yi Liu commented on HDFS-6886: -- Thanks [~vinayrpet] and [~jingzhao] for review, I will update the patch for your comments later. Use single editlog record for creating file + overwrite. Key: HDFS-6886 URL: https://issues.apache.org/jira/browse/HDFS-6886 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, HDFS-6886.003.patch, editsStored As discussed in HDFS-6871, as [~jingzhao] and [~cmccabe]'s suggestion, we could do further improvement to use one editlog record for creating file + overwrite in this JIRA. We could record the overwrite flag in editlog for creating file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6634) inotify in HDFS
[ https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119104#comment-14119104 ] James Thomas commented on HDFS-6634: Thanks guys inotify in HDFS --- Key: HDFS-6634 URL: https://issues.apache.org/jira/browse/HDFS-6634 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: James Thomas Fix For: 2.6.0 Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.7.patch, HDFS-6634.8.patch, HDFS-6634.9.patch, HDFS-6634.patch, inotify-design.2.pdf, inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, inotify-intro.2.pdf, inotify-intro.pdf Design a mechanism for applications like search engines to access the HDFS edit stream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-2975) Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart).
[ https://issues.apache.org/jira/browse/HDFS-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119111#comment-14119111 ] Yi Liu commented on HDFS-2975: -- Thanks [~umamaheswararao] and [~vinayrpet] for the review. Rename with overwrite flag true can make NameNode to stuck in safemode on NN (crash + restart). --- Key: HDFS-2975 URL: https://issues.apache.org/jira/browse/HDFS-2975 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Uma Maheswara Rao G Assignee: Yi Liu Attachments: HDFS-2975.001.patch When we rename a file with the overwrite flag set to true, it will delete the destination file's blocks. After deleting the blocks, whenever it releases the fsNameSystem lock, the NN can give the invalidation work to the corresponding DNs to delete the blocks. In parallel, it will sync the rename-related edits to the editlog file. If the NN crashes at this step, before it syncs the edits, the NN can get stuck in safemode on restart. This is because the blocks were already deleted from the DNs as part of the invalidations, but the dst file still exists since the rename edits were not persisted in the log file, and no DN will report those blocks now. This is similar to HDFS-2815
[jira] [Commented] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long
[ https://issues.apache.org/jira/browse/HDFS-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119140#comment-14119140 ] Colin Patrick McCabe commented on HDFS-6036: bq. The slf4j style uses {} as a template to avoid string concatenation, let's make sure that's used for all the LOG calls. ok bq. shouldDefer, the !anchored case, could we lower the LOG to debug? ok bq. In UncachingTask#run, there's a little ternary to add Deferred before. We could have it switch between Deferred u and U so the capitalization of Uncaching is always correct. I just added an 'if' statement, to avoid making this too complex :) bq. The default is set to 15 hours, isn't this a really long time? I expected something like a few mins. Sorry, this was supposed to be 15 minutes. Fixed. bq. New keys should be added to hdfs-default.xml as well. added bq. Regarding the minimum polling rate, I'd prefer to abort if it's not configured correctly. Silent correction means bad conf values live a continued existence, and confs get copy pasted around. ok bq. Having the min be revocation/2 is also somewhat arbitrary, but I'll go along with it. Nyquist-ish? Yeah, that was the motivation. bq. It'd also be nice to print which client is holding on to anchors for too long. Yeah, very good idea. I implemented this... Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long - Key: HDFS-6036 URL: https://issues.apache.org/jira/browse/HDFS-6036 Project: Hadoop HDFS Issue Type: Sub-task Components: caching, datanode Affects Versions: 2.5.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6036.001.patch We should forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
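The review's "abort on misconfiguration" point, with the polling interval bounded by half the revocation timeout (the Nyquist-style reasoning above: poll at least twice per revocation period), can be sketched as a fail-fast check. The names here are hypothetical, not taken from the patch:

```java
/** Sketch of a fail-fast configuration check: rather than silently
 *  clamping a bad polling interval, abort so that bad conf values do
 *  not live on and get copy-pasted around. Illustrative names only. */
public class RevocationConfigCheck {
    /** Validate that the polling interval is positive and no more than
     *  half the revocation timeout; throw otherwise. */
    public static long validatePollingMs(long revocationMs, long pollingMs) {
        long maxPollingMs = revocationMs / 2;
        if (pollingMs <= 0 || pollingMs > maxPollingMs) {
            throw new IllegalArgumentException(
                "polling interval " + pollingMs + " ms must be in (0, "
                + maxPollingMs + "] ms for a revocation timeout of "
                + revocationMs + " ms");
        }
        return pollingMs;
    }
}
```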
[jira] [Updated] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long
[ https://issues.apache.org/jira/browse/HDFS-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6036: --- Attachment: HDFS-6036.002.patch Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long - Key: HDFS-6036 URL: https://issues.apache.org/jira/browse/HDFS-6036 Project: Hadoop HDFS Issue Type: Sub-task Components: caching, datanode Affects Versions: 2.5.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6036.001.patch, HDFS-6036.002.patch We should forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6886: - Attachment: (was: editsStored) Use single editlog record for creating file + overwrite. Key: HDFS-6886 URL: https://issues.apache.org/jira/browse/HDFS-6886 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, HDFS-6886.003.patch As discussed in HDFS-6871, as [~jingzhao] and [~cmccabe]'s suggestion, we could do further improvement to use one editlog record for creating file + overwrite in this JIRA. We could record the overwrite flag in editlog for creating file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6886) Use single editlog record for creating file + overwrite.
[ https://issues.apache.org/jira/browse/HDFS-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-6886: - Attachment: editsStored HDFS-6886.004.patch [~vinayrpet] and [~jingzhao], I update the patch for all your comments, thanks. Use single editlog record for creating file + overwrite. Key: HDFS-6886 URL: https://issues.apache.org/jira/browse/HDFS-6886 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-6886.001.patch, HDFS-6886.002.patch, HDFS-6886.003.patch, HDFS-6886.004.patch, editsStored As discussed in HDFS-6871, as [~jingzhao] and [~cmccabe]'s suggestion, we could do further improvement to use one editlog record for creating file + overwrite in this JIRA. We could record the overwrite flag in editlog for creating file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long
[ https://issues.apache.org/jira/browse/HDFS-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119203#comment-14119203 ] Hadoop QA commented on HDFS-6036: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666110/HDFS-6036.002.patch against trunk revision 08a9ac7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The test build failed in hadoop-hdfs-project/hadoop-hdfs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7877//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7877//console This message is automatically generated. Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long - Key: HDFS-6036 URL: https://issues.apache.org/jira/browse/HDFS-6036 Project: Hadoop HDFS Issue Type: Sub-task Components: caching, datanode Affects Versions: 2.5.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-6036.001.patch, HDFS-6036.002.patch We should forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119202#comment-14119202 ] Hadoop QA commented on HDFS-6966: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666087/HDFS-6966.2.patch against trunk revision 08a9ac7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7876//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7876//console This message is automatically generated. Add additional unit tests for encryption zones -- Key: HDFS-6966 URL: https://issues.apache.org/jira/browse/HDFS-6966 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0, 2.6.0 Reporter: Stephen Chu Assignee: Stephen Chu Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch There are some more unit tests that can be added for test encryption zones. For example, more encryption zone + snapshot tests, running fsck on encryption zones, and more. 
[jira] [Commented] (HDFS-6966) Add additional unit tests for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119216#comment-14119216 ] Stephen Chu commented on HDFS-6966: --- TestPipelinesFailover is unrelated to this patch. Add additional unit tests for encryption zones -- Key: HDFS-6966 URL: https://issues.apache.org/jira/browse/HDFS-6966 Project: Hadoop HDFS Issue Type: Sub-task Components: encryption Affects Versions: 3.0.0, 2.6.0 Reporter: Stephen Chu Assignee: Stephen Chu Attachments: HDFS-6966.1.patch, HDFS-6966.2.patch There are some more unit tests that can be added for test encryption zones. For example, more encryption zone + snapshot tests, running fsck on encryption zones, and more. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file
[ https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119232#comment-14119232 ] Hadoop QA commented on HDFS-6705: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12666088/HDFS-6705.002.patch against trunk revision 08a9ac7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract org.apache.hadoop.hdfs.TestDistributedFileSystem org.apache.hadoop.fs.TestSymlinkHdfsFileSystem org.apache.hadoop.hdfs.web.TestWebHDFSXAttr org.apache.hadoop.hdfs.TestRollingUpgrade org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7875//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7875//console This message is automatically generated. 
Create an XAttr that disallows the HDFS admin from accessing a file --- Key: HDFS-6705 URL: https://issues.apache.org/jira/browse/HDFS-6705 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Affects Versions: 3.0.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-6705.001.patch, HDFS-6705.002.patch There needs to be an xattr that specifies that the HDFS admin can not access a file. This is needed for m/r delegation tokens and data at rest encryption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-6981) DN upgrade with layout version change should not use trash
[ https://issues.apache.org/jira/browse/HDFS-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-6981 started by Arpit Agarwal. --- DN upgrade with layout version change should not use trash -- Key: HDFS-6981 URL: https://issues.apache.org/jira/browse/HDFS-6981 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: James Thomas Assignee: Arpit Agarwal Post HDFS-6800, we can encounter the following scenario: # We start with DN software version -55 and initiate a rolling upgrade to version -56 # We delete some blocks, and they are moved to trash # We roll back to DN software version -55 using the -rollback flag – since we are running the old code (prior to this patch), we will restore the previous directory but will not delete the trash # We append to some of the blocks that were deleted in step 2 # We then restart a DN that contains blocks that were appended to – since the trash still exists, it will be restored at this point, the appended-to blocks will be overwritten, and we will lose the appended data So I think we need to avoid writing anything to the trash directory if we have a previous directory. Thanks to [~james.thomas] for reporting this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-6981) DN upgrade with layout version change should not use trash
[ https://issues.apache.org/jira/browse/HDFS-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal reassigned HDFS-6981: --- Assignee: Arpit Agarwal DN upgrade with layout version change should not use trash -- Key: HDFS-6981 URL: https://issues.apache.org/jira/browse/HDFS-6981 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: James Thomas Assignee: Arpit Agarwal Post HDFS-6800, we can encounter the following scenario: # We start with DN software version -55 and initiate a rolling upgrade to version -56 # We delete some blocks, and they are moved to trash # We roll back to DN software version -55 using the -rollback flag – since we are running the old code (prior to this patch), we will restore the previous directory but will not delete the trash # We append to some of the blocks that were deleted in step 2 # We then restart a DN that contains blocks that were appended to – since the trash still exists, it will be restored at this point, the appended-to blocks will be overwritten, and we will lose the appended data So I think we need to avoid writing anything to the trash directory if we have a previous directory. Thanks to [~james.thomas] for reporting this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-6988) Make RAM disk eviction thresholds configurable
Arpit Agarwal created HDFS-6988: --- Summary: Make RAM disk eviction thresholds configurable Key: HDFS-6988 URL: https://issues.apache.org/jira/browse/HDFS-6988 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Per feedback from [~cmccabe] on HDFS-6930, we can make the eviction thresholds configurable. The hard-coded thresholds may not be appropriate for very large RAM disks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6930: Attachment: HDFS-6930.02.patch Thanks for the reviews. Colin, I filed HDFS-6988 to make the thresholds configurable. Xiaoyu, updated patch attached to check capacity before division. Improve replica eviction from RAM disk -- Key: HDFS-6930 URL: https://issues.apache.org/jira/browse/HDFS-6930 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6930.01.patch, HDFS-6930.02.patch The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation. A better implementation would be asynchronous eviction when free space on RAM disk falls below a low watermark to make block allocation faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
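The asynchronous-eviction idea in this JIRA can be sketched as a watermark loop: instead of evicting synchronously inside block allocation, a background pass frees the oldest RAM-disk replicas until free space rises back above the low watermark. All names, sizes, and the oldest-first policy below are illustrative assumptions, not the patch itself:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Minimal sketch of watermark-driven RAM-disk eviction. */
public class RamDiskEvictor {
    private final long capacityBytes;
    private long usedBytes;
    // Replica sizes in arrival order; the front is the oldest replica.
    private final Deque<Long> replicaSizes = new ArrayDeque<>();

    public RamDiskEvictor(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Account for a replica written to RAM disk. */
    public void add(long sizeBytes) {
        usedBytes += sizeBytes;
        replicaSizes.addLast(sizeBytes);
    }

    /** Evict oldest replicas until free space is at least lowWatermarkBytes.
     *  Returns the number of replicas evicted. */
    public int evictUntilFree(long lowWatermarkBytes) {
        int evicted = 0;
        while (free() < lowWatermarkBytes && !replicaSizes.isEmpty()) {
            usedBytes -= replicaSizes.removeFirst();
            evicted++;
        }
        return evicted;
    }

    public long free() {
        return capacityBytes - usedBytes;
    }
}
```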
[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6950: Attachment: (was: HDFS-6930.02.patch) Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-6950.0.patch Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6950: Attachment: HDFS-6930.02.patch Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-6950.0.patch Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6950: Attachment: (was: HDFS-6950.1.patch) Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-6950.0.patch Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6930: Attachment: (was: HDFS-6930.02.patch) Improve replica eviction from RAM disk -- Key: HDFS-6930 URL: https://issues.apache.org/jira/browse/HDFS-6930 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6930.01.patch The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation. A better implementation would be asynchronous eviction when free space on RAM disk falls below a low watermark to make block allocation faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6930: Attachment: HDFS-6930.02.patch Improve replica eviction from RAM disk -- Key: HDFS-6930 URL: https://issues.apache.org/jira/browse/HDFS-6930 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6930.01.patch The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation. A better implementation would be asynchronous eviction when free space on RAM disk falls below a low watermark to make block allocation faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6950: Attachment: HDFS-6950.1.patch Reattaching patch I deleted mistakenly. Sorry about that. Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-6950.0.patch, HDFS-6950.1.patch Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6930: Attachment: (was: HDFS-6950.1.patch) Improve replica eviction from RAM disk -- Key: HDFS-6930 URL: https://issues.apache.org/jira/browse/HDFS-6930 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6930.01.patch, HDFS-6930.02.patch The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation. A better implementation would be asynchronous eviction when free space on RAM disk falls below a low watermark to make block allocation faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6930: Attachment: HDFS-6950.1.patch Improve replica eviction from RAM disk -- Key: HDFS-6930 URL: https://issues.apache.org/jira/browse/HDFS-6930 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6930.01.patch, HDFS-6930.02.patch The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation. A better implementation would be asynchronous eviction when free space on RAM disk falls below a low watermark to make block allocation faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6930) Improve replica eviction from RAM disk
[ https://issues.apache.org/jira/browse/HDFS-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-6930: Attachment: HDFS-6930.02.patch Improve replica eviction from RAM disk -- Key: HDFS-6930 URL: https://issues.apache.org/jira/browse/HDFS-6930 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: HDFS-6581 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-6930.01.patch, HDFS-6930.02.patch The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation. A better implementation would be asynchronous eviction when free space on RAM disk falls below a low watermark to make block allocation faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao reassigned HDFS-6950: Assignee: Xiaoyu Yao (was: Xiaoyu Yao) Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-6950.0.patch, HDFS-6950.1.patch Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-6950 started by Xiaoyu Yao. Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-6950.0.patch, HDFS-6950.1.patch Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6950) Add Additional unit tests for HDFS-6581
[ https://issues.apache.org/jira/browse/HDFS-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14119338#comment-14119338 ] Arpit Agarwal commented on HDFS-6950: - Thanks for adding these test cases [~xyao]! A few comments:
1. testFallbackToDiskPartial is failing. The test looks fine, so I think it has uncovered a bug. I'll investigate.
2. testScopeWriteSameNodeRamDiskOnly - the test case looks incomplete. I don't think there is an easy way to do multi-node testing with our unit tests. Let's move this to another Jira; we can investigate if there is a way to add the test.
3. testRamDiskEvictionBeforePersist - the comment _Ensure that both paths exist even after eviction and are readable_ looks unrelated to the test. We can delete it.
4. testRamDiskEvictionWithOpenHandle - let's move this to a separate Jira too. The test will not work as expected.
5. testDeleteWithOpenHandle - same as the previous one.
6. testDfsUsageCreateDelete - I think the test is not doing what you expect. The DFSUsage is
7. verifyDeletedBlocks - you can reduce the sleep interval in the loop to 1000ms. Also, this function can verify that there is no block file copy in the finalized directory of the transient volume.
Add Additional unit tests for HDFS-6581 --- Key: HDFS-6950 URL: https://issues.apache.org/jira/browse/HDFS-6950 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-6950.0.patch, HDFS-6950.1.patch Create additional unit tests for HDFS-6581 in addition to existing ones in HDFS-6927. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
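The polling suggestion in the last review point (sleep in 1000ms increments rather than larger ones, bounded by an overall timeout) is a common test pattern. A generic sketch, with names and timeout values of our own choosing rather than anything from the patch:

```java
import java.util.function.BooleanSupplier;

// Generic poll-with-timeout helper in the spirit of the review comment:
// re-check the condition every intervalMs (e.g. 1000ms) instead of one long
// sleep, and give up once timeoutMs is exhausted.
public class PollUtil {
    static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        // One last check at the deadline before reporting failure.
        return condition.getAsBoolean();
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~50ms; poll every 10ms.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 50, 10, 1000);
        System.out.println(ok);
    }
}
```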