[jira] [Created] (HDFS-5665) Remove the unnecessary writeLock while initializing CacheManager in FsNameSystem Ctor
Uma Maheswara Rao G created HDFS-5665: - Summary: Remove the unnecessary writeLock while initializing CacheManager in FsNameSystem Ctor Key: HDFS-5665 URL: https://issues.apache.org/jira/browse/HDFS-5665 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0, 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G I just saw the below piece of code in the FSNamesystem constructor. {code} writeLock(); try { this.cacheManager = new CacheManager(this, conf, blockManager); } finally { writeUnlock(); } {code} It seems unnecessary to hold the writeLock here. I am not sure whether there is a clear reason to keep the lock. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5665) Remove the unnecessary writeLock while initializing CacheManager in FsNameSystem Ctor
[ https://issues.apache.org/jira/browse/HDFS-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847317#comment-13847317 ] Uma Maheswara Rao G commented on HDFS-5665: --- Feel free to close this JIRA if you have a valid reason. Remove the unnecessary writeLock while initializing CacheManager in FsNameSystem Ctor - Key: HDFS-5665 URL: https://issues.apache.org/jira/browse/HDFS-5665 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 3.0.0, 2.2.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G I just saw the below piece of code in the FSNamesystem constructor. {code} writeLock(); try { this.cacheManager = new CacheManager(this, conf, blockManager); } finally { writeUnlock(); } {code} It seems unnecessary to hold the writeLock here. I am not sure whether there is a clear reason to keep the lock. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
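A minimal sketch of the proposed simplification, assuming nothing else can contend for the namesystem lock while the constructor is still running (the FSNamesystem object has not yet been published to other threads):
{code}
// Hedged sketch, not the committed patch: inside the constructor no other
// thread can hold the namesystem lock, so the CacheManager can be created
// without the writeLock()/writeUnlock() pair.
this.cacheManager = new CacheManager(this, conf, blockManager);
{code}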
[jira] [Commented] (HDFS-5632) Add Snapshot feature to INodeDirectory
[ https://issues.apache.org/jira/browse/HDFS-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847385#comment-13847385 ] Tsz Wo (Nicholas), SZE commented on HDFS-5632: -- Thanks Jing for picking this up. Some minor comments on the patch: - In INodeDirectory.recordModification, add the snapshot feature only if !shouldRecordInSrcSnapshot(latest). - DirectoryWithSnapshotFeature.ChildrenDiff does not need to be public. - DirectoryDiff.getChildrenList can be private. - We still replace INodeDirectory with INodeDirectorySnapshottable, so the following comment is not accurate. {code} + /** + * Replace the given child with a new child. Note that we no longer need to + * replace a normal INodeDirectory or INodeFile into an + * INodeDirectoryWithSnapshot or INodeFileUnderConstruction. The only case + * for child replacement is for reference nodes. + */ public void replaceChild(INode oldChild, final INode newChild, final INodeMap inodeMap) { {code} - In the INodeReference.destroyAndCollectBlocks code below, the referred node must have the snapshot feature, so we don't need the if (dir.isWithSnapshot()) check, but we should check the precondition in DirectoryWithSnapshotFeature.destroyDstSubtree(..). BTW, the comment needs to be updated; it still refers to INodeDirectoryWithSnapshot. {code} // similarly, if referred is a directory, it must be an // INodeDirectoryWithSnapshot - INodeDirectoryWithSnapshot sdir = - (INodeDirectoryWithSnapshot) referred; - try { -INodeDirectoryWithSnapshot.destroyDstSubtree(sdir, snapshot, prior, -collectedBlocks, removedINodes); - } catch (QuotaExceededException e) { -LOG.error("should not exceed quota while snapshot deletion", e); + INodeDirectory dir = referred.asDirectory(); + if (dir.isWithSnapshot()) { +try { + DirectoryWithSnapshotFeature.destroyDstSubtree(dir, snapshot, + prior, collectedBlocks, removedINodes); +} catch (QuotaExceededException e) { + LOG.error("should not exceed quota while snapshot deletion", e); +} } {code} - In the DirectoryWithSnapshotFeature.cleanDeletedINode code below, we should probably call getDirectoryWithSnapshotFeature() first rather than calling isWithSnapshot(). Similarly for destroyDstSubtree. {code} +if (dir.isWithSnapshot()) { + // delete files/dirs created after prior. Note that these + // files/dirs, along with inode, were deleted right after post. + DirectoryDiff priorDiff = dir.getDirectoryWithSnapshotFeature().getDiffs() + .getDiff(prior); {code} Add Snapshot feature to INodeDirectory -- Key: HDFS-5632 URL: https://issues.apache.org/jira/browse/HDFS-5632 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5632.000.patch, HDFS-5632.001.patch, HDFS-5632.002.patch We will add the snapshot feature to INodeDirectory and remove INodeDirectoryWithSnapshot in this jira. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
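A hedged sketch of the suggested "fetch the feature once" pattern from the last comment (the null-return contract of getDirectoryWithSnapshotFeature() is an assumption here, not confirmed by the thread):
{code}
// Illustrative only: look up the snapshot feature a single time instead of
// calling isWithSnapshot() and then re-fetching the feature.
DirectoryWithSnapshotFeature sf = dir.getDirectoryWithSnapshotFeature();
if (sf != null) {
  // delete files/dirs created after prior, as in the original snippet
  DirectoryDiff priorDiff = sf.getDiffs().getDiff(prior);
}
{code}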
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847392#comment-13847392 ] Hudson commented on HDFS-2832: -- FAILURE: Integrated in Hadoop-Yarn-trunk #420 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/420/]) HDFS-2832: Update binary file editsStored for TestOfflineEditsViewer (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550364) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored svn merge --reintegrate https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-2832 for merging Heterogeneous Storage feature branch (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550363) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/core * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/StorageType.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/UnregisteredNodeException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java *
[jira] [Commented] (HDFS-5637) try to refetchToken while local read InvalidToken occurred
[ https://issues.apache.org/jira/browse/HDFS-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847398#comment-13847398 ] Hudson commented on HDFS-5637: -- FAILURE: Integrated in Hadoop-Yarn-trunk #420 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/420/]) HDFS-5637. Try to refetchToken while local read InvalidToken occurred. (Liang Xie via junping_du) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550335) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java try to refetchToken while local read InvalidToken occurred --- Key: HDFS-5637 URL: https://issues.apache.org/jira/browse/HDFS-5637 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, security Affects Versions: 2.0.5-alpha, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Fix For: 2.4.0 Attachments: HDFS-5637-v2.txt, HDFS-5637.txt We observed several warning logs like the one below from region server nodes: 2013-12-05,13:22:26,042 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.2.201.110:11402 for block, add to deadNodes and continue. org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with block_token_identifier (expiryDate=1386060141977, keyId=-333530248, userId=hbase_srv, blockPoolId=BP-1310313570-10.101.10.66-1373527541386, blockId=-190217754078101701, access modes=[READ]) is expired. at org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280) at org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:88) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkBlockToken(DataNode.java:1082) at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockLocalPathInfo(DataNode.java:1033) at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolServerSideTranslatorPB.java:112) at org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:5104) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687) org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with block_token_identifier (expiryDate=1386060141977, keyId=-333530248, userId=hbase_srv, blockPoolId=BP-1310313570-10.101.10.66-1373527541386, blockId=-190217754078101701, access modes=[READ]) is expired. 
at org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280) at org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:88) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkBlockToken(DataNode.java:1082) at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockLocalPathInfo(DataNode.java:1033) at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolServerSideTranslatorPB.java:112) at org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:5104) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at
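A hedged sketch of the retry idea behind this issue (helper names are hypothetical; this is not the committed DFSInputStream change):
{code}
// Illustrative only: on an InvalidToken failure during a local read, refetch
// the block token from the NameNode and retry once, instead of immediately
// adding the DataNode to deadNodes.
try {
  reader = getLocalBlockReader(block);   // hypothetical helper
} catch (InvalidToken e) {
  refetchBlockToken(block);              // hypothetical helper: obtain a fresh token
  reader = getLocalBlockReader(block);   // single retry with the new token
}
{code}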
[jira] [Commented] (HDFS-5652) refactoring/uniforming invalid block token exception handling in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847402#comment-13847402 ] Hudson commented on HDFS-5652: -- FAILURE: Integrated in Hadoop-Yarn-trunk #420 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/420/]) HDFS-5652. Refactor invalid block token exception handling in DFSInputStream. (Liang Xie via junping_du) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550620) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java refactoring/uniforming invalid block token exception handling in DFSInputStream --- Key: HDFS-5652 URL: https://issues.apache.org/jira/browse/HDFS-5652 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 2.4.0 Attachments: HDFS-5652.txt See Junping's and Colin's comments on HDFS-5637 -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-4201) NPE in BPServiceActor#sendHeartBeat
[ https://issues.apache.org/jira/browse/HDFS-4201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847404#comment-13847404 ] Hudson commented on HDFS-4201: -- FAILURE: Integrated in Hadoop-Yarn-trunk #420 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/420/]) HDFS-4201. NPE in BPServiceActor#sendHeartBeat (jxiang via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550269) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java NPE in BPServiceActor#sendHeartBeat --- Key: HDFS-4201 URL: https://issues.apache.org/jira/browse/HDFS-4201 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Eli Collins Assignee: Jimmy Xiang Priority: Critical Fix For: 2.3.0 Attachments: trunk-4201.patch, trunk-4201_v2.patch, trunk-4201_v3.patch Saw the following NPE in a log. Think this is likely due to {{dn}} or {{dn.getFSDataset()}} being null (not {{bpRegistration}}), due to a configuration or local directory failure. {code} 2012-09-25 04:33:20,782 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: For namenode svsrs00127/11.164.162.226:8020 using DELETEREPORT_INTERVAL of 30 msec BLOCKREPORT_INTERVAL of 2160msec Initial delay: 0msec; heartBeatInterval=3000 2012-09-25 04:33:20,782 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService for Block pool BP-1678908700-11.164.162.226-1342785481826 (storage id DS-1031100678-11.164.162.251-5010-1341933415989) service to svsrs00127/11.164.162.226:8020 java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:434) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:520) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:673) at java.lang.Thread.run(Thread.java:722) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
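A hedged sketch of the kind of guard implied by the diagnosis above (the committed fix is in the attached patches; this is only an illustration):
{code}
// Illustrative guard, not the committed fix: if the DataNode or its dataset
// is not fully initialized (e.g. after a configuration or local-directory
// failure), skip building the heartbeat instead of dereferencing null.
if (dn == null || dn.getFSDataset() == null) {
  return null; // caller treats this as "no heartbeat to send this cycle"
}
{code}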
[jira] [Commented] (HDFS-5647) Merge INodeDirectory.Feature and INodeFile.Feature
[ https://issues.apache.org/jira/browse/HDFS-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847397#comment-13847397 ] Hudson commented on HDFS-5647: -- FAILURE: Integrated in Hadoop-Yarn-trunk #420 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/420/]) HDFS-5647. Merge INodeDirectory.Feature and INodeFile.Feature. Contributed by Haohui Mai. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550469) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DirectoryWithQuotaFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileUnderConstructionFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeWithAdditionalFields.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileWithSnapshotFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java Merge INodeDirectory.Feature and INodeFile.Feature -- Key: HDFS-5647 URL: https://issues.apache.org/jira/browse/HDFS-5647 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 3.0.0 Attachments: HDFS-5647.000.patch, HDFS-5647.001.patch, HDFS-5647.002.patch, HDFS-5647.003.patch HDFS-4685 implements ACLs for HDFS, which can benefit from the INode features introduced in HDFS-5284. The current code separates the INode feature of INodeFile and INodeDirectory into two different class hierarchies. This hinders the implementation of ACL since ACL is a concept that applies to both INodeFile and INodeDirectory. This jira proposes to merge the two class hierarchies (i.e., INodeDirectory.Feature and INodeFile.Feature) to simplify the implementation of ACLs. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
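A hedged sketch of what a merged hierarchy enables (the interface shape and the AclFeature example are assumptions for illustration, not the committed design):
{code}
// Illustrative only: with one shared feature hierarchy, a cross-cutting
// feature such as an ACL feature can attach to files and directories alike.
public interface Feature {
  // marker shared by INodeFile and INodeDirectory features
}

public class AclFeature implements Feature {
  // hypothetical example: ACL entries stored here apply to either inode kind
}
{code}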
[jira] [Commented] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing with jdk7
[ https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847399#comment-13847399 ] Hudson commented on HDFS-5023: -- FAILURE: Integrated in Hadoop-Yarn-trunk #420 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/420/]) HDFS-5023. TestSnapshotPathINodes.testAllowSnapshot is failing with jdk7 (Mit Desai via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550261) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSnapshotPathINodes.java TestSnapshotPathINodes.testAllowSnapshot is failing with jdk7 - Key: HDFS-5023 URL: https://issues.apache.org/jira/browse/HDFS-5023 Project: Hadoop HDFS Issue Type: Bug Components: snapshots, test Affects Versions: 3.0.0, 2.4.0 Reporter: Ravi Prakash Assignee: Mit Desai Labels: java7, test Fix For: 3.0.0, 2.4.0 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt The assertion on line 91 is failing. I am using Fedora 19 + JDK7. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5647) Merge INodeDirectory.Feature and INodeFile.Feature
[ https://issues.apache.org/jira/browse/HDFS-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847485#comment-13847485 ] Hudson commented on HDFS-5647: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1611 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1611/]) HDFS-5647. Merge INodeDirectory.Feature and INodeFile.Feature. Contributed by Haohui Mai. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550469) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DirectoryWithQuotaFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileUnderConstructionFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeWithAdditionalFields.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileWithSnapshotFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java Merge INodeDirectory.Feature and INodeFile.Feature -- Key: HDFS-5647 URL: https://issues.apache.org/jira/browse/HDFS-5647 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 3.0.0 Attachments: HDFS-5647.000.patch, HDFS-5647.001.patch, HDFS-5647.002.patch, HDFS-5647.003.patch HDFS-4685 implements ACLs for HDFS, which can benefit from the INode features introduced in HDFS-5284. The current code separates the INode feature of INodeFile and INodeDirectory into two different class hierarchies. This hinders the implementation of ACL since ACL is a concept that applies to both INodeFile and INodeDirectory. This jira proposes to merge the two class hierarchies (i.e., INodeDirectory.Feature and INodeFile.Feature) to simplify the implementation of ACLs. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5652) refactoring/uniforming invalid block token exception handling in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847488#comment-13847488 ] Hudson commented on HDFS-5652: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1611 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1611/]) HDFS-5652. Refactor invalid block token exception handling in DFSInputStream. (Liang Xie via junping_du) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550620) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java refactoring/uniforming invalid block token exception handling in DFSInputStream --- Key: HDFS-5652 URL: https://issues.apache.org/jira/browse/HDFS-5652 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 2.4.0 Attachments: HDFS-5652.txt See Junping's and Colin's comments on HDFS-5637 -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5023) TestSnapshotPathINodes.testAllowSnapshot is failing with jdk7
[ https://issues.apache.org/jira/browse/HDFS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847545#comment-13847545 ] Hudson commented on HDFS-5023: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1637/]) HDFS-5023. TestSnapshotPathINodes.testAllowSnapshot is failing with jdk7 (Mit Desai via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550261) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSnapshotPathINodes.java TestSnapshotPathINodes.testAllowSnapshot is failing with jdk7 - Key: HDFS-5023 URL: https://issues.apache.org/jira/browse/HDFS-5023 Project: Hadoop HDFS Issue Type: Bug Components: snapshots, test Affects Versions: 3.0.0, 2.4.0 Reporter: Ravi Prakash Assignee: Mit Desai Labels: java7, test Fix For: 3.0.0, 2.4.0 Attachments: HDFS-5023.patch, HDFS-5023.patch, HDFS-5023.patch, TEST-org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes.xml, org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes-output.txt The assertion on line 91 is failing. I am using Fedora 19 + JDK7. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847538#comment-13847538 ] Hudson commented on HDFS-2832: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1637/]) HDFS-2832: Update binary file editsStored for TestOfflineEditsViewer (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550364) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored svn merge --reintegrate https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-2832 for merging Heterogeneous Storage feature branch (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550363) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/core * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/StorageType.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/UnregisteredNodeException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java *
[jira] [Commented] (HDFS-5647) Merge INodeDirectory.Feature and INodeFile.Feature
[ https://issues.apache.org/jira/browse/HDFS-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847543#comment-13847543 ] Hudson commented on HDFS-5647: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1637/]) HDFS-5647. Merge INodeDirectory.Feature and INodeFile.Feature. Contributed by Haohui Mai. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550469) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DirectoryWithQuotaFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileUnderConstructionFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeWithAdditionalFields.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileWithSnapshotFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java Merge INodeDirectory.Feature and INodeFile.Feature -- Key: HDFS-5647 URL: https://issues.apache.org/jira/browse/HDFS-5647 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 3.0.0 Attachments: HDFS-5647.000.patch, HDFS-5647.001.patch, HDFS-5647.002.patch, HDFS-5647.003.patch HDFS-4685 implements ACLs for HDFS, which can benefit from the INode features introduced in HDFS-5284. The current code separates the INode feature of INodeFile and INodeDirectory into two different class hierarchies. This hinders the implementation of ACL since ACL is a concept that applies to both INodeFile and INodeDirectory. This jira proposes to merge the two class hierarchies (i.e., INodeDirectory.Feature and INodeFile.Feature) to simplify the implementation of ACLs. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5637) try to refetchToken while local read InvalidToken occurred
[ https://issues.apache.org/jira/browse/HDFS-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847544#comment-13847544 ] Hudson commented on HDFS-5637: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1637/]) HDFS-5637. Try to refetchToken while local read InvalidToken occurred. (Liang Xie via junping_du) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550335) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java try to refetchToken while local read InvalidToken occurred --- Key: HDFS-5637 URL: https://issues.apache.org/jira/browse/HDFS-5637 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, security Affects Versions: 2.0.5-alpha, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Fix For: 2.4.0 Attachments: HDFS-5637-v2.txt, HDFS-5637.txt We observed several warning logs like the one below from region server nodes: 2013-12-05,13:22:26,042 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.2.201.110:11402 for block, add to deadNodes and continue. org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with block_token_identifier (expiryDate=1386060141977, keyId=-333530248, userId=hbase_srv, blockPoolId=BP-1310313570-10.101.10.66-1373527541386, blockId=-190217754078101701, access modes=[READ]) is expired. at org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280) at org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:88) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkBlockToken(DataNode.java:1082) at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockLocalPathInfo(DataNode.java:1033) at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolServerSideTranslatorPB.java:112) at org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:5104) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687) org.apache.hadoop.security.token.SecretManager$InvalidToken: Block token with block_token_identifier (expiryDate=1386060141977, keyId=-333530248, userId=hbase_srv, blockPoolId=BP-1310313570-10.101.10.66-1373527541386, blockId=-190217754078101701, access modes=[READ]) is expired. 
at org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.checkAccess(BlockTokenSecretManager.java:280) at org.apache.hadoop.hdfs.security.token.block.BlockPoolTokenSecretManager.checkAccess(BlockPoolTokenSecretManager.java:88) at org.apache.hadoop.hdfs.server.datanode.DataNode.checkBlockToken(DataNode.java:1082) at org.apache.hadoop.hdfs.server.datanode.DataNode.getBlockLocalPathInfo(DataNode.java:1033) at org.apache.hadoop.hdfs.protocolPB.ClientDatanodeProtocolServerSideTranslatorPB.getBlockLocalPathInfo(ClientDatanodeProtocolServerSideTranslatorPB.java:112) at org.apache.hadoop.hdfs.protocol.proto.ClientDatanodeProtocolProtos$ClientDatanodeProtocolService$2.callBlockingMethod(ClientDatanodeProtocolProtos.java:5104) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
[jira] [Commented] (HDFS-5652) refactoring/uniforming invalid block token exception handling in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847548#comment-13847548 ] Hudson commented on HDFS-5652: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1637/]) HDFS-5652. Refactor invalid block token exception handling in DFSInputStream. (Liang Xie via junping_du) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550620) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java refactoring/uniforming invalid block token exception handling in DFSInputStream --- Key: HDFS-5652 URL: https://issues.apache.org/jira/browse/HDFS-5652 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Priority: Minor Fix For: 2.4.0 Attachments: HDFS-5652.txt See Junping's and Colin's comments on HDFS-5637 -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-4201) NPE in BPServiceActor#sendHeartBeat
[ https://issues.apache.org/jira/browse/HDFS-4201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847550#comment-13847550 ] Hudson commented on HDFS-4201: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1637/]) HDFS-4201. NPE in BPServiceActor#sendHeartBeat (jxiang via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550269) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java NPE in BPServiceActor#sendHeartBeat --- Key: HDFS-4201 URL: https://issues.apache.org/jira/browse/HDFS-4201 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Eli Collins Assignee: Jimmy Xiang Priority: Critical Fix For: 2.3.0 Attachments: trunk-4201.patch, trunk-4201_v2.patch, trunk-4201_v3.patch Saw the following NPE in a log. Think this is likely due to {{dn}} or {{dn.getFSDataset()}} being null (not {{bpRegistration}}), due to a configuration or local directory failure. {code} 2012-09-25 04:33:20,782 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: For namenode svsrs00127/11.164.162.226:8020 using DELETEREPORT_INTERVAL of 30 msec BLOCKREPORT_INTERVAL of 2160msec Initial delay: 0msec; heartBeatInterval=3000 2012-09-25 04:33:20,782 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService for Block pool BP-1678908700-11.164.162.226-1342785481826 (storage id DS-1031100678-11.164.162.251-5010-1341933415989) service to svsrs00127/11.164.162.226:8020 java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:434) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:520) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:673) at java.lang.Thread.run(Thread.java:722) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5484) StorageType and State in DatanodeStorageInfo in NameNode is not accurate
[ https://issues.apache.org/jira/browse/HDFS-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847588#comment-13847588 ] Eric Sirianni commented on HDFS-5484: - This fix was basically nullified by the following change made via HDFS-5542: {code} DatanodeStorageInfo updateStorage(DatanodeStorage s) { synchronized (storageMap) { DatanodeStorageInfo storage = storageMap.get(s.getStorageID()); if (storage == null) { ... storage = new DatanodeStorageInfo(this, s); storageMap.put(s.getStorageID(), storage); - } else { - storage.setState(s.getState()); } return storage; } } {code} Is there a reason that 'else' was removed? By no longer updating the state in the {{BlockReport}} processing path, we effectively get the bogus state and type that are set via the first heartbeat (see the fix for HDFS-5455): {code} + if (storage == null) { +// This is seen during cluster initialization when the heartbeat +// is received before the initial block reports from each storage. +storage = updateStorage(new DatanodeStorage(report.getStorageID())); {code} Even reverting the change and reintroducing the 'else' leaves the state and type temporarily inaccurate until the first block report. Wouldn't a better fix be to simply include the full {{DatanodeStorage}} object in the {{StorageReport}} (as opposed to only the Storage ID)? As a matter of bookkeeping, should I reopen this JIRA, or would you prefer a new one be created? StorageType and State in DatanodeStorageInfo in NameNode is not accurate Key: HDFS-5484 URL: https://issues.apache.org/jira/browse/HDFS-5484 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Eric Sirianni Fix For: Heterogeneous Storage (HDFS-2832) Attachments: HDFS-5484-HDFS-2832--2.patch, HDFS-5484-HDFS-2832.patch The fields in DatanodeStorageInfo are updated from two distinct paths: # block reports # storage reports (via heartbeats) The {{state}} and {{storageType}} fields are updated via the Block Report. However, as seen in the code below, these fields are populated from a dummy {{DatanodeStorage}} object constructed in the DataNode: {code} BPServiceActor.blockReport() { //... // Dummy DatanodeStorage object just for sending the block report. DatanodeStorage dnStorage = new DatanodeStorage(storageID); //... } {code} The net effect is that the {{state}} and {{storageType}} fields are always the default of {{NORMAL}} and {{DISK}} in the NameNode. The recommended fix is to change {{FsDatasetSpi.getBlockReports()}} from: {code} public Map<String, BlockListAsLongs> getBlockReports(String bpid); {code} to: {code} public Map<DatanodeStorage, BlockListAsLongs> getBlockReports(String bpid); {code} thereby allowing {{BPServiceActor}} to send the real {{DatanodeStorage}} object with the block report. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
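A hedged sketch of the caller side under the proposed signature (variable and method names other than getBlockReports and DatanodeStorage are illustrative):
{code}
// Illustrative only: with the proposed signature, BPServiceActor can send the
// real DatanodeStorage (state + storage type) instead of a dummy object.
Map<DatanodeStorage, BlockListAsLongs> reports =
    dn.getFSDataset().getBlockReports(bpid);
for (Map.Entry<DatanodeStorage, BlockListAsLongs> e : reports.entrySet()) {
  sendBlockReport(bpRegistration, bpid, e.getKey(), e.getValue()); // hypothetical sender
}
{code}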
[jira] [Updated] (HDFS-5663) make the retry time and interval value configurable in openInfo()
[ https://issues.apache.org/jira/browse/HDFS-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-5663: Attachment: HDFS-5663.txt Attached is a trivial patch for it. make the retry time and interval value configurable in openInfo() - Key: HDFS-5663 URL: https://issues.apache.org/jira/browse/HDFS-5663 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-5663.txt The original idea was raised by Michael here: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13846972&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846972 It would be better to have a lower interval value, especially for online services like HBase. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
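A hedged sketch of what making these values configurable could look like (the key names, defaults, and helper method are assumptions, not necessarily what the attached patch does):
{code}
// Illustrative only: read the retry count and interval from the client conf
// instead of hard-coding them in openInfo().
int retries = conf.getInt("dfs.client.retry.times.get-last-block-length", 3);
int intervalMs = conf.getInt("dfs.client.retry.interval-ms.get-last-block-length", 4000);
while (retries-- > 0) {
  if (tryFetchLastBlockLength()) {  // hypothetical helper
    break;
  }
  try {
    Thread.sleep(intervalMs);
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    break;
  }
}
{code}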
[jira] [Work started] (HDFS-5660) When SSL is enabled, the Namenode WEBUI redirects to Infosecport, which could be 0
[ https://issues.apache.org/jira/browse/HDFS-5660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-5660 started by Benoy Antony. When SSL is enabled, the Namenode WEBUI redirects to Infosecport, which could be 0 --- Key: HDFS-5660 URL: https://issues.apache.org/jira/browse/HDFS-5660 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5660.patch (case 1) When SSL is enabled by setting hadoop.ssl.enabled, SSL will be enabled on the regular port (infoPort) on the Datanode. (case 2) When SSL on HDFS is enabled by setting dfs.https.enable, SSL will be enabled on a separate port (infoSecurePort) on the Datanode. If SSL is enabled, the Namenode always redirects to infoSecurePort. infoSecurePort will be 0 in case 1 above. This breaks file browsing via the web. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Work started] (HDFS-5661) Browsing FileSystem via web ui should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-5661 started by Benoy Antony. Browsing FileSystem via web ui should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web ui, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an ip address doesn't have a domain, the cookie will not be sent by the browser while making http calls with the ip address as the destination server. This will break browsing the file system via the web ui if authentication is enabled. Browsing FileSystem via web ui should use the datanode's hostname instead of its ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5663) make the retry time and interval value configurable in openInfo()
[ https://issues.apache.org/jira/browse/HDFS-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-5663: Status: Patch Available (was: Open) make the retry time and interval value configurable in openInfo() - Key: HDFS-5663 URL: https://issues.apache.org/jira/browse/HDFS-5663 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.2.0, 3.0.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-5663.txt The original idea was raised by Michael here: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13846972&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846972 It would be better to have a lower interval value, especially for online services like HBase. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5365) Fix libhdfs compile error on FreeBSD9
[ https://issues.apache.org/jira/browse/HDFS-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847649#comment-13847649 ] Chris Nauroth commented on HDFS-5365: - Hi Radim, Unfortunately, this can't be merged to branch-2.2. AFAIK, there are not any additional 2.2.x releases on the schedule, and the plan is to proceed with 2.3.x forward: https://wiki.apache.org/hadoop/Roadmap Therefore, if we committed code to branch-2.2 now, then it wouldn't actually go into a release. I have confirmed that your patch is in branch-2.3 for inclusion in the 2.3.0 release, currently targeted for mid-December. If the plan changes and we decide to add another 2.2.x release to the schedule, then you could nominate this patch for inclusion and we'd commit it at that time. It seems unlikely at this point, though. Fix libhdfs compile error on FreeBSD9 - Key: HDFS-5365 URL: https://issues.apache.org/jira/browse/HDFS-5365 Project: Hadoop HDFS Issue Type: Bug Components: build, libhdfs Affects Versions: 3.0.0, 2.2.0 Reporter: Radim Kolar Assignee: Radim Kolar Labels: build Fix For: 3.0.0, 2.3.0 Attachments: hdfs-bsd.txt The native library does not compile on FreeBSD because: * dlopen is in libc * the limits.h include is missing -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5663) make the retry time and interval value configurable in openInfo()
[ https://issues.apache.org/jira/browse/HDFS-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847661#comment-13847661 ] stack commented on HDFS-5663: - +1 make the retry time and interval value configurable in openInfo() - Key: HDFS-5663 URL: https://issues.apache.org/jira/browse/HDFS-5663 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-5663.txt The original idea was raised by Michael here: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13846972&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846972 It would be better to have a lower interval value, especially for online services like HBase. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5496) Make replication queue initialization asynchronous
[ https://issues.apache.org/jira/browse/HDFS-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847744#comment-13847744 ] Jing Zhao commented on HDFS-5496: - Thanks for the further explanation, Vinay. Now I see what you mean. {code} blockManager.clearQueues(); blockManager.processAllPendingDNMessages(); {code} In my previous comment, I thought you also planned to remove the clearQueues() call. So if we clear all the queues and continue with the ongoing initialization, the blocks that were processed before the startActiveService call will not be re-processed? I.e., we would end up generating the replication queues for only part of the blocks. Make replication queue initialization asynchronous -- Key: HDFS-5496 URL: https://issues.apache.org/jira/browse/HDFS-5496 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Kihwal Lee Assignee: Vinay Attachments: HDFS-5496.patch, HDFS-5496.patch Today, initialization of replication queues blocks safe mode exit and certain HA state transitions. For a big name space, this can take hundreds of seconds with the FSNamesystem write lock held. During this time, important requests (e.g. initial block reports, heartbeats, etc.) are blocked. The effect of delaying the initialization would be not starting replication right away, but I think the benefit outweighs the cost. If we make it asynchronous, the work per iteration should be limited so that the lock duration is capped. If full/incremental block reports and any other requests that modify block state properly perform replication checks while the blocks are scanned and the queues populated in the background, every block will be processed. (Some may be done twice.) The replication monitor should run even before all blocks are processed. This will allow the namenode to exit safe mode and start serving immediately, even with a big name space. It will also reduce the HA failover latency. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
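A hedged sketch of the bounded-lock iteration the description calls for (every name here is illustrative, not from the attached patches):
{code}
// Illustrative only: populate the replication queues in a background thread,
// capping how long the FSNamesystem write lock is held per iteration so that
// block reports and heartbeats can interleave with initialization.
final int BLOCKS_PER_ITERATION = 10000;   // hypothetical cap
while (!allBlocksProcessed()) {           // hypothetical progress check
  namesystem.writeLock();
  try {
    processMisReplicatedBlocksChunk(BLOCKS_PER_ITERATION); // hypothetical helper
  } finally {
    namesystem.writeUnlock();
  }
}
{code}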
[jira] [Commented] (HDFS-5660) When SSL is enabled, the Namenode WEBUI redirects to Infosecport, which could be 0
[ https://issues.apache.org/jira/browse/HDFS-5660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847776#comment-13847776 ] Benoy Antony commented on HDFS-5660: - I can get HDFS SSL to work using dfs.https.enable and 3 additional configs (NN port, DN port, and keystore config). With this scheme, two ports will be opened on the NN and on each datanode. These additional ports will not be used in our case. I am doing all these steps just so that infoSecurePort is set to 1006, which is essentially what this patch does. When SSL is enabled, the Namenode WEBUI redirects to Infosecport, which could be 0 --- Key: HDFS-5660 URL: https://issues.apache.org/jira/browse/HDFS-5660 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5660.patch (case 1) When SSL is enabled by setting hadoop.ssl.enabled, SSL will be enabled on the regular port (infoPort) on the Datanode. (case 2) When SSL on HDFS is enabled by setting dfs.https.enable, SSL will be enabled on a separate port (infoSecurePort) on the Datanode. If SSL is enabled, the Namenode always redirects to infoSecurePort. infoSecurePort will be 0 in case 1 above. This breaks file browsing via the web. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5661) Browsing FileSystem via web ui should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1384#comment-1384 ] Haohui Mai commented on HDFS-5661: -- The namenode and the datanode have different origins; therefore, the browser will not attach the cookies when making a request to the datanode. Redirecting using the domain name will not help. Browsing FileSystem via web ui should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web ui, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an ip address doesn't have a domain, the cookie will not be sent by the browser while making http calls with the ip address as the destination server. This will break browsing the file system via the web ui if authentication is enabled. Browsing FileSystem via web ui should use the datanode's hostname instead of its ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5660) When SSL is enabled, the Namenode WEBUI redirects to Infosecport, which could be 0
[ https://issues.apache.org/jira/browse/HDFS-5660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847779#comment-13847779 ] Haohui Mai commented on HDFS-5660: -- This is the expected behavior. Again, the bug no longer exists in 2.2. I suggest closing this bug as invalid. When SSL is enabled, the Namenode WEBUI redirects to Infosecport, which could be 0 --- Key: HDFS-5660 URL: https://issues.apache.org/jira/browse/HDFS-5660 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5660.patch (case 1) When SSL is enabled by setting hadoop.ssl.enabled, SSL will be enabled on the regular port (infoport) on the Datanode. (case 2) When SSL on HDFS is enabled by setting dfs.https.enable, SSL will be enabled on a separate port (infoSecurePort) on the Datanode. If SSL is enabled, the Namenode always redirects to infoSecurePort. infoSecurePort will be 0 in case 1 above. This breaks file browsing via the web. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Assigned] (HDFS-5659) dfsadmin -report doesn't output cache information properly
[ https://issues.apache.org/jira/browse/HDFS-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-5659: - Assignee: Andrew Wang dfsadmin -report doesn't output cache information properly -- Key: HDFS-5659 URL: https://issues.apache.org/jira/browse/HDFS-5659 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 3.0.0 Reporter: Akira AJISAKA Assignee: Andrew Wang I tried to cache a file with hdfs cacheadmin -addDirective. I thought the file was cached because CacheUsed in JMX was more than 0. {code} { "name" : "Hadoop:service=DataNode,name=FSDatasetState-DS-1043926324-172.28.0.102-50010-1385087929296", "modelerType" : "org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl", "Remaining" : 5604772597760, "StorageInfo" : "FSDataset{dirpath='[/hadoop/data1/dfs/data/current, /hadoop/data2/dfs/data/current, /hadoop/data3/dfs/data/current]'}", "Capacity" : 5905374474240, "DfsUsed" : 11628544, "CacheCapacity" : 1073741824, "CacheUsed" : 360448, "NumFailedVolumes" : 0, "NumBlocksCached" : 1, "NumBlocksFailedToCache" : 0, "NumBlocksFailedToUncache" : 0 }, {code} But dfsadmin -report didn't output the same values as JMX. {code} Configured Cache Capacity: 0 (0 B) Cache Used: 0 (0 B) Cache Remaining: 0 (0 B) Cache Used%: 100.00% Cache Remaining%: 0.00% {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5663) make the retry time and interval value configurable in openInfo()
[ https://issues.apache.org/jira/browse/HDFS-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847790#comment-13847790 ] Hadoop QA commented on HDFS-5663: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618620/HDFS-5663.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestStartup {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5712//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5712//console This message is automatically generated. make the retry time and interval value configurable in openInfo() - Key: HDFS-5663 URL: https://issues.apache.org/jira/browse/HDFS-5663 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Attachments: HDFS-5663.txt The original idea was raised by Michael here: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13846972page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846972 It would be better to have a lower interval value, especially for online services like HBase. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
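The shape of the proposed change is easy to sketch. This is an illustration only; the key names, defaults, and helper method are assumptions, not necessarily what the attached patch does:
{code}
// Hypothetical sketch: drive the retry loop in DFSInputStream#openInfo()
// from configuration instead of hard-coded constants.
int retriesLeft = conf.getInt(
    "dfs.client.retry.times.get-last-block-length", 3);           // assumed key
final long retryIntervalMs = conf.getLong(
    "dfs.client.retry.interval-ms.get-last-block-length", 4000L); // assumed key
while (retriesLeft > 0) {
  if (openInfoSucceeded()) {  // hypothetical stand-in for the re-fetch logic
    break;
  }
  retriesLeft--;
  try {
    Thread.sleep(retryIntervalMs);  // back off before asking the NameNode again
  } catch (InterruptedException ie) {
    Thread.currentThread().interrupt();
    break;
  }
}
{code}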
[jira] [Commented] (HDFS-5661) Browsing FileSystem via web ui should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847789#comment-13847789 ] Benoy Antony commented on HDFS-5661: The browser visits namenode.domainname.com:50070. The hadoopauth cookie is dropped with the domain set to domainname.com. The user clicks browse File system on the web UI and the browser gets redirected to ADataNode.domainname.com:1006. The browser will then send cookies to ADataNode.domainname.com. Without this patch, when the user clicks browse File system on the web UI, the browser gets redirected to datanodeipaddress:1006 and hence no cookies will be sent. Browsing FileSystem via web ui should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web UI, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an IP address doesn't have any domain, the cookie will not be sent by the browser when making HTTP calls with the IP address as the destination server. This will break browsing the file system via the web UI, if authentication is enabled. Browsing FileSystem via web ui should use datanode's hostname instead of ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5634) allow BlockReaderLocal to switch between checksumming and not
[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847815#comment-13847815 ] Colin Patrick McCabe commented on HDFS-5634: bq. I examined this code more carefully, and I found that it was actually using LIFO at the moment. Err, sorry, I mis-spoke. It is FIFO. Unfortunately, {{ConcurrentLinkedDeque}} is not in JDK6 (although it is in JDK7), so it will be hard to test with LIFO here. allow BlockReaderLocal to switch between checksumming and not - Key: HDFS-5634 URL: https://issues.apache.org/jira/browse/HDFS-5634 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, HDFS-5634.003.patch, HDFS-5634.004.patch BlockReaderLocal should be able to switch between checksumming and non-checksumming, so that when we get notifications that something is mlocked (see HDFS-5182), we can avoid checksumming when reading from that block. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
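As an aside, the FIFO-vs-LIFO distinction is easy to see with a deque that does exist in JDK6, {{java.util.concurrent.LinkedBlockingDeque}}. This fragment is purely illustrative and not from the patch:
{code}
// FIFO vs. LIFO reuse of cached descriptor pairs (illustrative only).
LinkedBlockingDeque<FileInputStream[]> cache =
    new LinkedBlockingDeque<FileInputStream[]>();
cache.addLast(streams);                      // insert at the tail
FileInputStream[] fifo = cache.pollFirst();  // FIFO: reuse the oldest entry
FileInputStream[] lifo = cache.pollLast();   // LIFO: reuse the newest entry
{code}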
[jira] [Commented] (HDFS-5484) StorageType and State in DatanodeStorageInfo in NameNode is not accurate
[ https://issues.apache.org/jira/browse/HDFS-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847823#comment-13847823 ] Arpit Agarwal commented on HDFS-5484: - Let's create a new Jira. I am fine with adding the {{DatanodeStorage}} object to {{StorageReportProto}}. It needs to be a new optional field and we cannot remove the existing {{StorageUuid}} for protocol compatibility. StorageType and State in DatanodeStorageInfo in NameNode is not accurate Key: HDFS-5484 URL: https://issues.apache.org/jira/browse/HDFS-5484 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Eric Sirianni Fix For: Heterogeneous Storage (HDFS-2832) Attachments: HDFS-5484-HDFS-2832--2.patch, HDFS-5484-HDFS-2832.patch The fields in DatanodeStorageInfo are updated from two distinct paths: # block reports # storage reports (via heartbeats) The {{state}} and {{storageType}} fields are updated via the Block Report. However, as seen in the code below, these fields are populated from a dummy {{DatanodeStorage}} object constructed in the DataNode: {code} BPServiceActor.blockReport() { //... // Dummy DatanodeStorage object just for sending the block report. DatanodeStorage dnStorage = new DatanodeStorage(storageID); //... } {code} The net effect is that the {{state}} and {{storageType}} fields are always the default of {{NORMAL}} and {{DISK}} in the NameNode. The recommended fix is to change {{FsDatasetSpi.getBlockReports()}} from: {code} public Map<String, BlockListAsLongs> getBlockReports(String bpid); {code} to: {code} public Map<DatanodeStorage, BlockListAsLongs> getBlockReports(String bpid); {code} thereby allowing {{BPServiceActor}} to send the real {{DatanodeStorage}} object with the block report. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5661) Browsing FileSystem via web ui should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847826#comment-13847826 ] Haohui Mai commented on HDFS-5661: -- I'm curious how you manage to pass the cookie to the datanode. Even with your patch, the cookies should not be passed between the namenode and the datanode. The browser is not supposed to share the cookies. An origin is defined by the scheme, host, and port of a URL [1], where two hosts with the same hostname but different ports are considered different origins. The browsers implement the same-origin policy, where the cookies are isolated between different origins [2]. [1] http://www.w3.org/Security/wiki/Same_Origin_Policy [2] https://code.google.com/p/browsersec/wiki/Part2#Same-origin_policy Browsing FileSystem via web ui should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web UI, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an IP address doesn't have any domain, the cookie will not be sent by the browser when making HTTP calls with the IP address as the destination server. This will break browsing the file system via the web UI, if authentication is enabled. Browsing FileSystem via web ui should use datanode's hostname instead of ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5661) Browsing FileSystem via web ui should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847852#comment-13847852 ] Jing Zhao commented on HDFS-5661: - For the redirect, I think we are using a DelegationToken, which is included in the redirect URL? Thus we do not need to worry about the hostname/IP address here, I guess. Browsing FileSystem via web ui should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web UI, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an IP address doesn't have any domain, the cookie will not be sent by the browser when making HTTP calls with the IP address as the destination server. This will break browsing the file system via the web UI, if authentication is enabled. Browsing FileSystem via web ui should use datanode's hostname instead of ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (HDFS-5666) TestBPOfferService#testBPInitErrorHandling fails intermittently
Colin Patrick McCabe created HDFS-5666: -- Summary: TestBPOfferService#testBPInitErrorHandling fails intermittently Key: HDFS-5666 URL: https://issues.apache.org/jira/browse/HDFS-5666 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Colin Patrick McCabe Priority: Minor Intermittent failure on this test: {code} Regression org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testBPInitErrorHandling Failing for the past 1 build (Since #5698 ) Took 0.16 sec. Error Message expected:<1> but was:<2> Stacktrace java.lang.AssertionError: expected:<1> but was:<2> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) {code} see https://builds.apache.org/job/PreCommit-HDFS-Build/5698//testReport/org.apache.hadoop.hdfs.server.datanode/TestBPOfferService/testBPInitErrorHandling/ -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5634) allow BlockReaderLocal to switch between checksumming and not
[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847854#comment-13847854 ] Andrew Wang commented on HDFS-5634: --- Thanks for being patient with me; I think I finally grok what's going on. bq. BRL builder Alright. It seems odd for datanodeID and block to ever be null since they're used to key the FISCache, but okay. Still, we're currently not setting the caching strategy in DFSInputStream#getBlockReader when pulling something out of the FISCache; should we? bq. We always buffer at least a single chunk, even if readahead is turned off. The mechanics of checksumming require this. In {{fillDataBuf(boolean)}}, it looks like we try to fill {{maxReadaheadLength}} modulo the slop. If this is zero, what happens? I think we need to take the checksum chunk size into account here somewhere. It'd also be good to have test coverage for all these different read paths and parameter combinations. It looks like TestParallel* already cover the ByteBuffer and array variants of read with and without checksums, but we also need to permute on different readahead sizes (e.g. 0, chunk-1, chunk+1). Unfortunately, even though the array and BB versions are almost the same, they don't share any code, so we should probably still permute on that axis. With the new getter, we could use that instead of passing around {{canSkipChecksum}} everywhere, but I'm unsure if mlock toggling has implications here. Your call. There's still BlockReaderFactory#newBlockReader using verifyChecksum rather than skipChecksum. Do you want to change this one too? There are only 6 usages from what I see. bq. deque Yea, let's just skip this idea for now. We really should use a profiler for guidance before trying optimizations anyway. That'd be interesting to do later. bq. readahead and buffer size linkage I was thinking user documentation in hdfs-default.xml for dfs.client.read.shortcircuit.buffer.size and dfs.datanode.readahead.bytes, but I see you filed a follow-on for that. It also feels inconsistent that the datanode defaults to 4MB readahead while the client looks like it defaults to 0, but maybe there's some reasoning there. allow BlockReaderLocal to switch between checksumming and not - Key: HDFS-5634 URL: https://issues.apache.org/jira/browse/HDFS-5634 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, HDFS-5634.003.patch, HDFS-5634.004.patch BlockReaderLocal should be able to switch between checksumming and non-checksumming, so that when we get notifications that something is mlocked (see HDFS-5182), we can avoid checksumming when reading from that block. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5454) DataNode UUID should be assigned prior to FsDataset initialization
[ https://issues.apache.org/jira/browse/HDFS-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-5454: Attachment: HDFS-5454.02.patch Updated patch. DataNode UUID should be assigned prior to FsDataset initialization -- Key: HDFS-5454 URL: https://issues.apache.org/jira/browse/HDFS-5454 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Eric Sirianni Priority: Minor Attachments: HDFS-5454.01.patch, HDFS-5454.02.patch The DataNode's UUID ({{DataStorage.getDatanodeUuid()}} field) is NULL at the point where the {{FsDataset}} object is created ({{DataNode.initStorage()}}). As the {{DataStorage}} object is an input to the {{FsDataset}} factory method, it is desirable for it to be fully populated with a UUID at this point. In particular, our {{FsDatasetSpi}} implementation relies upon the DataNode UUID as a key to access our underlying block storage device. This also appears to be a regression compared to Hadoop 1.x - our 1.x {{FSDatasetInterface}} plugin has a non-NULL UUID on startup. I haven't fully traced through the code, but I suspect this came from the {{BPOfferService}}/{{BPServiceActor}} refactoring to support federated namenodes. With HDFS-5448, the DataNode is now responsible for generating its own UUID. This greatly simplifies the fix. Move the UUID check/generation from {{DataNode.createBPRegistration()}} to {{DataNode.initStorage()}}. This more naturally co-locates UUID generation immediately subsequent to the read of the UUID from the {{DataStorage}} properties file. {code} private void initStorage(final NamespaceInfo nsInfo) throws IOException { // ... final String bpid = nsInfo.getBlockPoolID(); //read storage info, lock data dirs and transition fs state if necessary storage.recoverTransitionRead(this, bpid, nsInfo, dataDirs, startOpt); // SUGGESTED NEW PLACE TO CHECK DATANODE UUID checkDatanodeUuid(); // ... } {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5350) Name Node should report fsimage transfer time as a metric
[ https://issues.apache.org/jira/browse/HDFS-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847869#comment-13847869 ] Andrew Wang commented on HDFS-5350: --- Hey Jimmy, thanks for the patch. It looks pretty good, just a few review comments: * GetImageServlet is used to transfer both images and edits. Since these look like they should be image-only metrics, we need to move the timing out of serveFile (which is used for both) into the right if statement. I think this is also more clear, since now all the timing will be in the same function. * Could you add a new test to TestNameNodeMetrics showing that this gets updated as expected? Name Node should report fsimage transfer time as a metric - Key: HDFS-5350 URL: https://issues.apache.org/jira/browse/HDFS-5350 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Rob Weltman Assignee: Jimmy Xiang Priority: Minor Fix For: 3.0.0 Attachments: trunk-5350.patch If the (Secondary) Name Node reported fsimage transfer times (perhaps the last ten of them), monitoring tools could detect slowdowns that might jeopardize cluster stability. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
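A metric of the kind being discussed might look roughly like this with the metrics2 API; the class, field, and metric names are assumptions for illustration, not the contents of the patch:
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableRate;

// Hypothetical sketch of an image-transfer-time metric.
@Metrics(name = "ImageTransfer", context = "dfs")
class ImageTransferMetrics {
  @Metric("fsimage transfer time, ms")
  MutableRate fsImageTransferTime;

  // Would be called from the image-serving branch of GetImageServlet,
  // not from serveFile(), which also serves edits.
  void recordImageTransfer(long elapsedMs) {
    fsImageTransferTime.add(elapsedMs);
  }
}
{code}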
[jira] [Commented] (HDFS-5634) allow BlockReaderLocal to switch between checksumming and not
[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847873#comment-13847873 ] Colin Patrick McCabe commented on HDFS-5634: bq. Alright. It seems odd for datanodeID and block to ever be null since they're used to key the FISCache, but okay. Still, we're currently not setting the caching strategy in DFSInputStream#getBlockReader when pulling something out of the FISCache; should we? Definitely. Fixed. bq. In fillDataBuf(boolean), it looks like we try to fill maxReadaheadLength modulo the slop. If this is zero, what happens? I think we need to take the checksum chunk size into account here somewhere. This is a bug. Basically it's a corner case where we need to fudge the user's no-readahead request into a short-readahead request when doing a checksummed read. Will fix... bq. It'd also be good to have test coverage for all these different read paths and parameter combinations. It looks like TestParallel* already cover the ByteBuffer and array variants of read with and without checksums, but we also need to permute on different readahead sizes (e.g. 0, chunk-1, chunk+1). Unfortunately even though the array and BB versions are almost the same, they don't share any code, so we should probably still permute on that axis. I think the main thing is to have coverage for the no-readahead + checksum case, which we don't really have now. Having a test that did chunk-1 readahead, to verify that the round-up functionality worked, would also be a good idea. bq. With the new getter, we could use that instead of passing around canSkipChecksum everywhere, but I'm unsure if mlock toggling has implications here. Your call. Accessing the atomic boolean does have a performance overhead that I wanted to avoid. Also, as you said, it might change, which could create problems. bq. I was thinking user documentation in hdfs-default.xml for dfs.client.read.shortcircuit.buffer.size and dfs.datanode.readahead.bytes but I see you filed a follow-on for that. It also feels inconsistent that the datanode defaults to 4MB readahead while the client looks like it defaults to 0, but maybe there's some reasoning there. Actually, the client defaults to null readahead, which usually means the datanode gets to decide. I guess it's a bit of a confusing construct, but here it is: {code} Long readahead = (conf.get(DFS_CLIENT_CACHE_READAHEAD) == null) ? null : conf.getLong(DFS_CLIENT_CACHE_READAHEAD, 0); {code} In {{BlockReaderLocal}} I am defaulting to {{dfs.datanode.readahead.bytes}} when the client's value is null. If we wanted to be totally consistent with how remote reads work, we'd have to get the readahead default from the DataNode as part of the RPC. That seems a little bit like overkill, though. bq. There's still BlockReaderFactory#newBlockReader using verifyChecksum rather than skipChecksum. Do you want to change this one too? There are only 6 usages from what I see. OK. 
allow BlockReaderLocal to switch between checksumming and not - Key: HDFS-5634 URL: https://issues.apache.org/jira/browse/HDFS-5634 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, HDFS-5634.003.patch, HDFS-5634.004.patch BlockReaderLocal should be able to switch between checksumming and non-checksumming, so that when we get notifications that something is mlocked (see HDFS-5182), we can avoid checksumming when reading from that block. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5656) add some configuration keys to hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847874#comment-13847874 ] Colin Patrick McCabe commented on HDFS-5656: also: dfs.datanode.readahead.bytes, dfs.client.read.shortcircuit.buffer.size add some configuration keys to hdfs-default.xml --- Key: HDFS-5656 URL: https://issues.apache.org/jira/browse/HDFS-5656 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Colin Patrick McCabe Priority: Minor Some configuration keys like {{dfs.client.read.shortcircuit}} are not present in {{hdfs-default.xml}} as they should be. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5656) add some configuration keys to hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847875#comment-13847875 ] Colin Patrick McCabe commented on HDFS-5656: bq. I'll add that dfs.client.cache.readahead doesn't have a default value in DFSConfigKeys too. Its default value is null, meaning that the datanode decides what the readahead is (when doing remote reads). add some configuration keys to hdfs-default.xml --- Key: HDFS-5656 URL: https://issues.apache.org/jira/browse/HDFS-5656 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Colin Patrick McCabe Priority: Minor Some configuration keys like {{dfs.client.read.shortcircuit}} are not present in {{hdfs-default.xml}} as they should be. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5665) Remove the unnecessary writeLock while initializing CacheManager in FsNameSystem Ctor
[ https://issues.apache.org/jira/browse/HDFS-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847897#comment-13847897 ] Colin Patrick McCabe commented on HDFS-5665: I agree-- I think it would be fine to initialize this without holding the writeLock. Remove the unnecessary writeLock while initializing CacheManager in FsNameSystem Ctor - Key: HDFS-5665 URL: https://issues.apache.org/jira/browse/HDFS-5665 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 3.0.0, 2.2.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G I just saw the below piece of code in Fsnamesystem ctor. {code} writeLock(); try { this.cacheManager = new CacheManager(this, conf, blockManager); } finally { writeUnlock(); } {code} It seems unnecessary to keep writeLock here. I am not sure if there is a clear reason to keep the lock. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5664) try to relieve the BlockReaderLocal read() synchronized hotspot
[ https://issues.apache.org/jira/browse/HDFS-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847906#comment-13847906 ] Colin Patrick McCabe commented on HDFS-5664: A single {{DFSInputStream}} is designed to be used by a single thread only. We don't allow multiple reads from multiple threads on the same stream to go forward at the same time. So I don't see how making {{BlockReader}} (or a subclass like {{BlockReaderLocal}}) concurrent would help at all, since there would still be a big synchronized block around all the {{DFSInputStream#read}} methods that use the {{BlockReader}}. If multiple threads want to read the same file at the same time, they can open multiple distinct streams for it. At that point, they're not sharing the same {{BlockReader}}, so whether or not BRL is synchronized doesn't matter. try to relieve the BlockReaderLocal read() synchronized hotspot --- Key: HDFS-5664 URL: https://issues.apache.org/jira/browse/HDFS-5664 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0, 2.2.0 Reporter: Liang Xie Assignee: Liang Xie Currently, BlockReaderLocal's read has a synchronized modifier: {code} public synchronized int read(byte[] buf, int off, int len) throws IOException { {code} In an HBase physical-read-heavy cluster, we observed some hotspots in the dfsclient path; the detailed stack trace can be found at: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13843241page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13843241 I haven't looked into the details yet, so let me put some raw ideas here first: 1) replace synchronized with a try-lock-with-timeout pattern, so we could fail fast; 2) fall back to non-ssr mode if getting the local reader lock failed. There are at least two suitable scenarios for removing this hotspot: 1) local-physical-read-heavy workloads, e.g. when the HBase block cache miss ratio is high; 2) slow/bad disks. It would be helpful to achieve a lower 99th percentile HBase read latency somehow. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
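For concreteness, idea (1) from the description could look roughly like the sketch below. The timeout value, the exception type, and the fallback mechanism are all illustrative assumptions:
{code}
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical: replace the synchronized modifier with a bounded tryLock
// so a reader stuck behind a slow disk can fail fast and fall back to a
// non-short-circuit read.
private final ReentrantLock readLock = new ReentrantLock();

public int read(byte[] buf, int off, int len) throws IOException {
  boolean locked;
  try {
    locked = readLock.tryLock(50, TimeUnit.MILLISECONDS);  // assumed timeout
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new InterruptedIOException("interrupted waiting for read lock");
  }
  if (!locked) {
    // Hypothetical signal for the caller to retry via the remote-read path.
    throw new IOException("local reader busy; fall back to non-ssr read");
  }
  try {
    return doRead(buf, off, len);  // the existing read body, unchanged
  } finally {
    readLock.unlock();
  }
}
{code}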
[jira] [Created] (HDFS-5667) StorageType and State in DatanodeStorageInfo in NameNode is not accurate
Eric Sirianni created HDFS-5667: --- Summary: StorageType and State in DatanodeStorageInfo in NameNode is not accurate Key: HDFS-5667 URL: https://issues.apache.org/jira/browse/HDFS-5667 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Eric Sirianni Fix For: Heterogeneous Storage (HDFS-2832) The fields in DatanodeStorageInfo are updated from two distinct paths: # block reports # storage reports (via heartbeats) The {{state}} and {{storageType}} fields are updated via the Block Report. However, as seen in the code below, these fields are populated from a dummy {{DatanodeStorage}} object constructed in the DataNode: {code} BPServiceActor.blockReport() { //... // Dummy DatanodeStorage object just for sending the block report. DatanodeStorage dnStorage = new DatanodeStorage(storageID); //... } {code} The net effect is that the {{state}} and {{storageType}} fields are always the default of {{NORMAL}} and {{DISK}} in the NameNode. The recommended fix is to change {{FsDatasetSpi.getBlockReports()}} from: {code} public Map<String, BlockListAsLongs> getBlockReports(String bpid); {code} to: {code} public Map<DatanodeStorage, BlockListAsLongs> getBlockReports(String bpid); {code} thereby allowing {{BPServiceActor}} to send the real {{DatanodeStorage}} object with the block report. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5667) StorageType and State in DatanodeStorageInfo in NameNode is not accurate
[ https://issues.apache.org/jira/browse/HDFS-5667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Sirianni updated HDFS-5667: Description: The fix for HDFS-5484 was accidentally regressed by the following change made via HDFS-5542: {code} + DatanodeStorageInfo updateStorage(DatanodeStorage s) { synchronized (storageMap) { DatanodeStorageInfo storage = storageMap.get(s.getStorageID()); if (storage == null) { @@ -670,8 +658,6 @@ " for DN " + getXferAddr()); storage = new DatanodeStorageInfo(this, s); storageMap.put(s.getStorageID(), storage); - } else { - storage.setState(s.getState()); } return storage; } {code} By removing the 'else' and no longer updating the state in the BlockReport processing path, we effectively get the bogus state type that is set via the first heartbeat (see the fix for HDFS-5455): {code} + if (storage == null) { +// This is seen during cluster initialization when the heartbeat +// is received before the initial block reports from each storage. +storage = updateStorage(new DatanodeStorage(report.getStorageID())); {code} Even reverting the change and reintroducing the 'else' leaves the state type temporarily inaccurate until the first block report. As discussed with [~arpitagarwal], a better fix would be to simply include the full DatanodeStorage object in the StorageReport (as opposed to only the Storage ID). This requires adding the {{DatanodeStorage}} object to {{StorageReportProto}}. It needs to be a new optional field and we cannot remove the existing {{StorageUuid}} for protocol compatibility. was: The fields in DatanodeStorageInfo are updated from two distinct paths: # block reports # storage reports (via heartbeats) The {{state}} and {{storageType}} fields are updated via the Block Report. However, as seen in the code below, these fields are populated from a dummy {{DatanodeStorage}} object constructed in the DataNode: {code} BPServiceActor.blockReport() { //... // Dummy DatanodeStorage object just for sending the block report. DatanodeStorage dnStorage = new DatanodeStorage(storageID); //... } {code} The net effect is that the {{state}} and {{storageType}} fields are always the default of {{NORMAL}} and {{DISK}} in the NameNode. The recommended fix is to change {{FsDatasetSpi.getBlockReports()}} from: {code} public Map<String, BlockListAsLongs> getBlockReports(String bpid); {code} to: {code} public Map<DatanodeStorage, BlockListAsLongs> getBlockReports(String bpid); {code} thereby allowing {{BPServiceActor}} to send the real {{DatanodeStorage}} object with the block report. 
StorageType and State in DatanodeStorageInfo in NameNode is not accurate Key: HDFS-5667 URL: https://issues.apache.org/jira/browse/HDFS-5667 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Eric Sirianni Fix For: Heterogeneous Storage (HDFS-2832) The fix for HDFS-5484 was accidentally regressed by the following change made via HDFS-5542: {code} + DatanodeStorageInfo updateStorage(DatanodeStorage s) { synchronized (storageMap) { DatanodeStorageInfo storage = storageMap.get(s.getStorageID()); if (storage == null) { @@ -670,8 +658,6 @@ " for DN " + getXferAddr()); storage = new DatanodeStorageInfo(this, s); storageMap.put(s.getStorageID(), storage); - } else { - storage.setState(s.getState()); } return storage; } {code} By removing the 'else' and no longer updating the state in the BlockReport processing path, we effectively get the bogus state type that is set via the first heartbeat (see the fix for HDFS-5455): {code} + if (storage == null) { +// This is seen during cluster initialization when the heartbeat +// is received before the initial block reports from each storage. +storage = updateStorage(new DatanodeStorage(report.getStorageID())); {code} Even reverting the change and reintroducing the 'else' leaves the state type temporarily inaccurate until the first block report. As discussed with [~arpitagarwal], a better fix would be to simply include the full DatanodeStorage object in the StorageReport (as opposed to only the Storage ID). This requires adding the {{DatanodeStorage}} object to {{StorageReportProto}}. It needs to be a new optional field and we cannot remove the existing {{StorageUuid}} for protocol compatibility. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
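The direction being proposed can be sketched as follows; the constructor shape is an assumption about where the API could end up, not the committed change:
{code}
// Hypothetical: build the storage report from the real DatanodeStorage
// (id + state + storageType) instead of only its ID string.
DatanodeStorage storage = new DatanodeStorage(
    storageUuid, DatanodeStorage.State.NORMAL, StorageType.DISK);
StorageReport report = new StorageReport(
    storage,   // full object, so the NN no longer falls back to NORMAL/DISK
    false,     // failed
    capacity, dfsUsed, remaining, blockPoolUsed);
{code}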
[jira] [Updated] (HDFS-5667) StorageType and State in DatanodeStorageInfo in NameNode is not accurate
[ https://issues.apache.org/jira/browse/HDFS-5667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Sirianni updated HDFS-5667: Description: The fix for HDFS-5484 was accidentally regressed by the following change made via HDFS-5542: {code} + DatanodeStorageInfo updateStorage(DatanodeStorage s) { synchronized (storageMap) { DatanodeStorageInfo storage = storageMap.get(s.getStorageID()); if (storage == null) { @@ -670,8 +658,6 @@ " for DN " + getXferAddr()); storage = new DatanodeStorageInfo(this, s); storageMap.put(s.getStorageID(), storage); - } else { - storage.setState(s.getState()); } return storage; } {code} By removing the 'else' and no longer updating the state in the BlockReport processing path, we effectively get the bogus state type that is set via the first heartbeat (see the fix for HDFS-5455): {code} + if (storage == null) { +// This is seen during cluster initialization when the heartbeat +// is received before the initial block reports from each storage. +storage = updateStorage(new DatanodeStorage(report.getStorageID())); {code} Even reverting the change and reintroducing the 'else' leaves the state type temporarily inaccurate until the first block report. As discussed with [~arpitagarwal], a better fix would be to simply include the full {{DatanodeStorage}} object in the {{StorageReport}} (as opposed to only the Storage ID). This requires adding the {{DatanodeStorage}} object to {{StorageReportProto}}. It needs to be a new optional field and we cannot remove the existing {{StorageUuid}} for protocol compatibility. was: The fix for HDFS-5484 was accidentally regressed by the following change made via HDFS-5542: {code} + DatanodeStorageInfo updateStorage(DatanodeStorage s) { synchronized (storageMap) { DatanodeStorageInfo storage = storageMap.get(s.getStorageID()); if (storage == null) { @@ -670,8 +658,6 @@ " for DN " + getXferAddr()); storage = new DatanodeStorageInfo(this, s); storageMap.put(s.getStorageID(), storage); - } else { - storage.setState(s.getState()); } return storage; } {code} By removing the 'else' and no longer updating the state in the BlockReport processing path, we effectively get the bogus state type that is set via the first heartbeat (see the fix for HDFS-5455): {code} + if (storage == null) { +// This is seen during cluster initialization when the heartbeat +// is received before the initial block reports from each storage. +storage = updateStorage(new DatanodeStorage(report.getStorageID())); {code} Even reverting the change and reintroducing the 'else' leaves the state type temporarily inaccurate until the first block report. As discussed with [~arpitagarwal], a better fix would be to simply include the full DatanodeStorage object in the StorageReport (as opposed to only the Storage ID). This requires adding the {{DatanodeStorage}} object to {{StorageReportProto}}. It needs to be a new optional field and we cannot remove the existing {{StorageUuid}} for protocol compatibility. 
StorageType and State in DatanodeStorageInfo in NameNode is not accurate Key: HDFS-5667 URL: https://issues.apache.org/jira/browse/HDFS-5667 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Eric Sirianni Fix For: Heterogeneous Storage (HDFS-2832) The fix for HDFS-5484 was accidentally regressed by the following change made via HDFS-5542: {code} + DatanodeStorageInfo updateStorage(DatanodeStorage s) { synchronized (storageMap) { DatanodeStorageInfo storage = storageMap.get(s.getStorageID()); if (storage == null) { @@ -670,8 +658,6 @@ " for DN " + getXferAddr()); storage = new DatanodeStorageInfo(this, s); storageMap.put(s.getStorageID(), storage); - } else { - storage.setState(s.getState()); } return storage; } {code} By removing the 'else' and no longer updating the state in the BlockReport processing path, we effectively get the bogus state type that is set via the first heartbeat (see the fix for HDFS-5455): {code} + if (storage == null) { +// This is seen during cluster initialization when the heartbeat +// is received before the initial block reports from each storage. +storage = updateStorage(new DatanodeStorage(report.getStorageID())); {code} Even reverting the change and reintroducing the 'else' leaves the state type temporarily inaccurate until the first block report.
[jira] [Created] (HDFS-5668) TestBPOfferService.testBPInitErrorHandling fails intermittently
Jimmy Xiang created HDFS-5668: - Summary: TestBPOfferService.testBPInitErrorHandling fails intermittently Key: HDFS-5668 URL: https://issues.apache.org/jira/browse/HDFS-5668 Project: Hadoop HDFS Issue Type: Task Components: test Affects Versions: 3.0.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor The new test introduced in HDFS-4201 is a little flaky. I occasionally got failures locally. It could be related to how we did the mocking. {noformat} Exception in thread DataNode: [file:/home/.../hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/testBPInitErrorHandling/data] heartbeating to 0.0.0.0/0.0.0.0:0 org.mockito.exceptions.misusing.WrongTypeOfReturnValue: SimulatedFSDataset$$EnhancerByMockitoWithCGLIB$$5cb7c720 cannot be returned by getStorageId() getStorageId() should return String at org.apache.hadoop.hdfs.server.datanode.BPOfferService.toString(BPOfferService.java:178) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.toString(BPServiceActor.java:133) at java.lang.String.valueOf(String.java:2854) at java.lang.StringBuilder.append(StringBuilder.java:128) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:723) 2013-12-13 13:42:03,119 DEBUG datanode.DataNode (BPServiceActor.java:sendHeartBeat(468)) - Sending heartbeat from service actor: Block pool fake bpid (storage id null) service to 0.0.0.0/0.0.0.0:1 at java.lang.Thread.run(Thread.java:722) {noformat} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5634) allow BlockReaderLocal to switch between checksumming and not
[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5634: --- Attachment: HDFS-5634.005.patch allow BlockReaderLocal to switch between checksumming and not - Key: HDFS-5634 URL: https://issues.apache.org/jira/browse/HDFS-5634 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, HDFS-5634.003.patch, HDFS-5634.004.patch, HDFS-5634.005.patch BlockReaderLocal should be able to switch between checksumming and non-checksumming, so that when we get notifications that something is mlocked (see HDFS-5182), we can avoid checksumming when reading from that block. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5632) Add Snapshot feature to INodeDirectory
[ https://issues.apache.org/jira/browse/HDFS-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5632: Attachment: HDFS-5632.003.patch Thanks for the review, Nicholas! Updated the patch to address the comments. bq. but we should check precondition in DirectoryWithSnapshotFeature.destroyDstSubtree(..). The destroyDstSubtree may be invoked recursively, and in some cases the given directory may not be with snapshot. Thus I put the Precondition check in INodeReference#destroyAndCollectBlocks. Add Snapshot feature to INodeDirectory -- Key: HDFS-5632 URL: https://issues.apache.org/jira/browse/HDFS-5632 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5632.000.patch, HDFS-5632.001.patch, HDFS-5632.002.patch, HDFS-5632.003.patch We will add snapshot feature to INodeDirectory and remove INodeDirectoryWithSnapshot in this jira. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5634) allow BlockReaderLocal to switch between checksumming and not
[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847945#comment-13847945 ] Colin Patrick McCabe commented on HDFS-5634: I addressed the stuff discussed above. I also addressed the case where there are no checksums for the block. This can lead to some weird conditions like bytesPerChecksum being 0, so we want to avoid dividing by this number before we check it. This patch also sets the skipChecksum boolean when it detects one of these blocks, which should speed things up. This is an optimization that the old code didn't do. BlockReaderLocalTest is now an abstract class rather than an interface, so that we can have defaults. I added a bunch of new tests, including one where we set readahead just less than the block size. allow BlockReaderLocal to switch between checksumming and not - Key: HDFS-5634 URL: https://issues.apache.org/jira/browse/HDFS-5634 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, HDFS-5634.003.patch, HDFS-5634.004.patch, HDFS-5634.005.patch BlockReaderLocal should be able to switch between checksumming and non-checksumming, so that when we get notifications that something is mlocked (see HDFS-5182), we can avoid checksumming when reading from that block. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
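The no-checksum corner case reads roughly like this; the logic is reconstructed from the comment above, not copied from the patch:
{code}
// Hypothetical sketch: a NULL checksum type means there is nothing to
// verify, and bytesPerChecksum may be 0, so check before ever dividing by it.
DataChecksum checksum = header.getChecksum();
if (checksum.getChecksumType() == DataChecksum.Type.NULL
    || checksum.getBytesPerChecksum() == 0) {
  this.skipChecksum = true;  // fast path: skip per-chunk verification
}
{code}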
[jira] [Commented] (HDFS-5434) Write resiliency for replica count 1
[ https://issues.apache.org/jira/browse/HDFS-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847967#comment-13847967 ] Arpit Agarwal commented on HDFS-5434: - Eric, what is the advantage of retaining only one replica? Unless it is temporary data that can be easily regenerated, having one replica guarantees data loss in quick order. e.g. on a 5000-disk cluster with an AFR of 4%, you'd expect at least one disk failure every other day. This sounds like a somewhat dangerous setting, as it could give users a false sense of reliability. Write resiliency for replica count 1 Key: HDFS-5434 URL: https://issues.apache.org/jira/browse/HDFS-5434 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Buddy Priority: Minor If a file has a replica count of one, the HDFS client is exposed to write failures if the data node fails during a write. With a pipeline of size one, no recovery is possible if the sole data node dies. A simple fix is to force a minimum pipeline size of 2, while leaving the replication count as 1. The implementation for this is fairly non-invasive. Although the replica count is one, the block will be written to two data nodes instead of one. If one of the data nodes fails during the write, normal pipeline recovery will ensure that the write succeeds to the surviving data node. The existing code in the name node will prune the extra replica when it receives the block received reports for the finalized block from both data nodes. This results in the intended replica count of one for the block. This behavior should be controlled by a configuration option such as {{dfs.namenode.minPipelineSize}}. This behavior can be implemented in {{FSNameSystem.getAdditionalBlock()}} by ensuring that the pipeline size passed to {{BlockPlacementPolicy.chooseTarget()}} in the replication parameter is: {code} max(replication, ${dfs.namenode.minPipelineSize}) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
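The description's formula translates into a very small change; a sketch under the stated assumptions ({{dfs.namenode.minPipelineSize}} is the proposed key, not an existing one):
{code}
// Hypothetical sketch inside FSNamesystem.getAdditionalBlock().
final int minPipelineSize = conf.getInt("dfs.namenode.minPipelineSize", 1);
final short pipelineSize = (short) Math.max(replication, minPipelineSize);
// pipelineSize, rather than replication, is then passed to
// BlockPlacementPolicy.chooseTarget() when allocating the write pipeline.
{code}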
[jira] [Commented] (HDFS-5454) DataNode UUID should be assigned prior to FsDataset initialization
[ https://issues.apache.org/jira/browse/HDFS-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847985#comment-13847985 ] Hadoop QA commented on HDFS-5454: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618671/HDFS-5454.02.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5713//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5713//console This message is automatically generated. DataNode UUID should be assigned prior to FsDataset initialization -- Key: HDFS-5454 URL: https://issues.apache.org/jira/browse/HDFS-5454 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Eric Sirianni Priority: Minor Attachments: HDFS-5454.01.patch, HDFS-5454.02.patch The DataNode's UUID ({{DataStorage.getDatanodeUuid()}} field) is NULL at the point where the {{FsDataset}} object is created ({{DataNode.initStorage()}}). As the {{DataStorage}} object is an input to the {{FsDataset}} factory method, it is desirable for it to be fully populated with a UUID at this point. In particular, our {{FsDatasetSpi}} implementation relies upon the DataNode UUID as a key to access our underlying block storage device. This also appears to be a regression compared to Hadoop 1.x - our 1.x {{FSDatasetInterface}} plugin has a non-NULL UUID on startup. I haven't fully traced through the code, but I suspect this came from the {{BPOfferService}}/{{BPServiceActor}} refactoring to support federated namenodes. With HDFS-5448, the DataNode is now responsible for generating its own UUID. This greatly simplifies the fix. Move the UUID check/generation from {{DataNode.createBPRegistration()}} to {{DataNode.initStorage()}}. This more naturally co-locates UUID generation immediately subsequent to the read of the UUID from the {{DataStorage}} properties file. {code} private void initStorage(final NamespaceInfo nsInfo) throws IOException { // ... final String bpid = nsInfo.getBlockPoolID(); //read storage info, lock data dirs and transition fs state if necessary storage.recoverTransitionRead(this, bpid, nsInfo, dataDirs, startOpt); // SUGGESTED NEW PLACE TO CHECK DATANODE UUID checkDatanodeUuid(); // ... } {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5661) Browsing FileSystem via web ui should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848011#comment-13848011 ] Benoy Antony commented on HDFS-5661: Only the Namenode will authenticate a client using an HDFS Delegation Token (which is issued by the Namenode). The client still needs to authenticate to the datanode and cannot use the Delegation Token for this purpose. The configuration related to the cookie domain is hadoop.http.authentication.cookie.domain, and more details on the authentication of HTTP consoles are here: https://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/HttpAuthentication.html When browsing the filesystem via the original web UI, the DelegationToken is used by the DataNode to contact the Namenode. Browsing FileSystem via web ui should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web UI, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an IP address doesn't have any domain, the cookie will not be sent by the browser when making HTTP calls with the IP address as the destination server. This will break browsing the file system via the web UI, if authentication is enabled. Browsing FileSystem via web ui should use datanode's hostname instead of ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
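The settings referenced above, shown programmatically for illustration (they normally live in core-site.xml):
{code}
// Illustrative only: scope the HTTP auth cookie to the cluster's domain.
Configuration conf = new Configuration();
conf.set("hadoop.http.authentication.type", "kerberos");
conf.set("hadoop.http.authentication.cookie.domain", "domainname.com");
// The auth cookie is then scoped to domainname.com, so the browser sends it
// to ADataNode.domainname.com but never to a bare IP address.
{code}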
[jira] [Updated] (HDFS-5431) support cachepool-based limit management in path-based caching
[ https://issues.apache.org/jira/browse/HDFS-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5431: --- Description: We should support cachepool-based limit management in path-based caching. (was: We should support cachepool-based quota management in path-based caching.) support cachepool-based limit management in path-based caching -- Key: HDFS-5431 URL: https://issues.apache.org/jira/browse/HDFS-5431 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Andrew Wang Attachments: hdfs-5431-1.patch, hdfs-5431-2.patch, hdfs-5431-3.patch We should support cachepool-based limit management in path-based caching. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5431) support cachepool-based limit management in path-based caching
[ https://issues.apache.org/jira/browse/HDFS-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5431: --- Assignee: Andrew Wang (was: Colin Patrick McCabe) support cachepool-based limit management in path-based caching -- Key: HDFS-5431 URL: https://issues.apache.org/jira/browse/HDFS-5431 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Andrew Wang Attachments: hdfs-5431-1.patch, hdfs-5431-2.patch, hdfs-5431-3.patch We should support cachepool-based quota management in path-based caching. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5434) Write resiliency for replica count 1
[ https://issues.apache.org/jira/browse/HDFS-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848046#comment-13848046 ] Colin Patrick McCabe commented on HDFS-5434: I sort of assumed when I read this JIRA that Eric was considering situations where the admin knows that the storage is reliable despite having only one replica. For example, if there is RAID and we trust it, or if the backend storage system which the DataNode is using is doing replication underneath HDFS. This is definitely a little bit outside the normal Hadoop use-case, but it's something we can support without too much difficulty with a client-side config. Write resiliency for replica count 1 Key: HDFS-5434 URL: https://issues.apache.org/jira/browse/HDFS-5434 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Buddy Priority: Minor If a file has a replica count of one, the HDFS client is exposed to write failures if the data node fails during a write. With a pipeline of size of one, no recovery is possible if the sole data node dies. A simple fix is to force a minimum pipeline size of 2, while leaving the replication count as 1. The implementation for this is fairly non-invasive. Although the replica count is one, the block will be written to two data nodes instead of one. If one of the data nodes fails during the write, normal pipeline recovery will ensure that the write succeeds to the surviving data node. The existing code in the name node will prune the extra replica when it receives the block received reports for the finalized block from both data nodes. This results in the intended replica count of one for the block. This behavior should be controlled by a configuration option such as {{dfs.namenode.minPipelineSize}}. This behavior can be implemented in {{FSNameSystem.getAdditionalBlock()}} by ensuring that the pipeline size passed to {{BlockPlacementPolicy.chooseTarget()}} in the replication parameter is: {code} max(replication, ${dfs.namenode.minPipelineSize}) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5661) Browsing FileSystem via web ui should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848048#comment-13848048 ] Haohui Mai commented on HDFS-5661: -- The only way to access the data on a secure DN is to present a valid delegation token. The HTTP auth tokens do not contain the DT; presenting the HTTP auth tokens to the DN does not grant you access, thus it makes no sense to pass them around. Regardless of what UI you're using, the NN fetches the DT on behalf of the client, and the client presents this DT to authenticate with the DN. This should be the only way you can access the data. If you happen to get the data in your approach, this is a security hole; please file a jira to track it. Again, I'd encourage you to check out the new web UI. It accesses the data through WebHDFS, which is much more robust. Browsing FileSystem via web ui should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web UI, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an IP address doesn't have any domain, the cookie will not be sent by the browser when making HTTP calls with the IP address as the destination server. This will break browsing the file system via the web UI, if authentication is enabled. Browsing FileSystem via web ui should use datanode's hostname instead of ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5434) Write resiliency for replica count 1
[ https://issues.apache.org/jira/browse/HDFS-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848057#comment-13848057 ] Arpit Agarwal commented on HDFS-5434: - Thanks for the explanation, I didn't think of that. If the target storage is reliable, then what do we gain by adding an extra replica in the pipeline? Just want to understand the use case. Write resiliency for replica count 1 Key: HDFS-5434 URL: https://issues.apache.org/jira/browse/HDFS-5434 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Buddy Priority: Minor If a file has a replica count of one, the HDFS client is exposed to write failures if the data node fails during a write. With a pipeline size of one, no recovery is possible if the sole data node dies. A simple fix is to force a minimum pipeline size of 2, while leaving the replication count as 1. The implementation for this is fairly non-invasive. Although the replica count is one, the block will be written to two data nodes instead of one. If one of the data nodes fails during the write, normal pipeline recovery will ensure that the write succeeds to the surviving data node. The existing code in the name node will prune the extra replica when it receives the block received reports for the finalized block from both data nodes. This results in the intended replica count of one for the block. This behavior should be controlled by a configuration option such as {{dfs.namenode.minPipelineSize}}. This behavior can be implemented in {{FSNameSystem.getAdditionalBlock()}} by ensuring that the pipeline size passed to {{BlockPlacementPolicy.chooseTarget()}} via the replication parameter is: {code} max(replication, ${dfs.namenode.minPipelineSize}) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5651) remove dfs.namenode.caching.enabled
[ https://issues.apache.org/jira/browse/HDFS-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5651: --- Status: Patch Available (was: Open) remove dfs.namenode.caching.enabled --- Key: HDFS-5651 URL: https://issues.apache.org/jira/browse/HDFS-5651 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5651.001.patch We can remove dfs.namenode.caching.enabled and simply always enable caching, similar to how we do with snapshots and other features. The main overhead is the size of the cachedBlocks GSet. However, we can simply make the size of this GSet configurable, and people who don't want caching can set it to a very small value. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
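As a rough illustration of making the GSet size configurable as described above (the key name and fraction default are assumptions; {{LightWeightGSet.computeCapacity}} is the usual sizing helper in Hadoop):
{code}
// Sketch: size the cachedBlocks GSet from a configurable fraction of the heap,
// so clusters that don't use caching can keep it tiny.
float percent = conf.getFloat("dfs.namenode.cache.block.map.percent", 0.25f); // assumed key
int capacity = LightWeightGSet.computeCapacity(percent, "cachedBlocks");
GSet<CachedBlock, CachedBlock> cachedBlocks =
    new LightWeightGSet<CachedBlock, CachedBlock>(capacity);
{code}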
[jira] [Commented] (HDFS-5651) remove dfs.namenode.caching.enabled
[ https://issues.apache.org/jira/browse/HDFS-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848063#comment-13848063 ] Colin Patrick McCabe commented on HDFS-5651: It's worth noting that DNs don't send cache reports unless their configured mlockable capacity is greater than 0. So there's no overhead from that either. remove dfs.namenode.caching.enabled --- Key: HDFS-5651 URL: https://issues.apache.org/jira/browse/HDFS-5651 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5651.001.patch We can remove dfs.namenode.caching.enabled and simply always enable caching, similar to how we do with snapshots and other features. The main overhead is the size of the cachedBlocks GSet. However, we can simply make the size of this GSet configurable, and people who don't want caching can set it to a very small value. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
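An illustrative sketch of the guard described in the comment above; {{dfs.datanode.max.locked.memory}} is the real knob, but the surrounding names are assumed rather than the actual {{BPServiceActor}} code:
{code}
// Sketch: DNs with zero mlockable capacity never send cache reports.
long maxLockedMemory = dnConf.getMaxLockedMemory(); // backed by dfs.datanode.max.locked.memory
if (maxLockedMemory > 0) {
  List<Long> cachedBlockIds = dataset.getCacheReport(bpid); // assumed accessor
  bpNamenode.cacheReport(bpRegistration, bpid, cachedBlockIds);
}
{code}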
[jira] [Updated] (HDFS-5651) remove dfs.namenode.caching.enabled
[ https://issues.apache.org/jira/browse/HDFS-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5651: --- Attachment: HDFS-5651.001.patch remove dfs.namenode.caching.enabled --- Key: HDFS-5651 URL: https://issues.apache.org/jira/browse/HDFS-5651 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5651.001.patch We can remove dfs.namenode.caching.enabled and simply always enable caching, similar to how we do with snapshots and other features. The main overhead is the size of the cachedBlocks GSet. However, we can simply make the size of this GSet configurable, and people who don't want caching can set it to a very small value. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5651) remove dfs.namenode.caching.enabled
[ https://issues.apache.org/jira/browse/HDFS-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5651: --- Attachment: HDFS-5651.002.patch also update defaults and docs. remove dfs.namenode.caching.enabled --- Key: HDFS-5651 URL: https://issues.apache.org/jira/browse/HDFS-5651 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5651.001.patch, HDFS-5651.002.patch We can remove dfs.namenode.caching.enabled and simply always enable caching, similar to how we do with snapshots and other features. The main overhead is the size of the cachedBlocks GSet. However, we can simply make the size of this GSet configurable, and people who don't want caching can set it to a very small value. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5434) Write resiliency for replica count 1
[ https://issues.apache.org/jira/browse/HDFS-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848069#comment-13848069 ] Colin Patrick McCabe commented on HDFS-5434: bq. If the target storage is reliable, then what do we gain by adding an extra replica in the pipeline? Just want to understand the use case. Resiliency against transient network errors. Write resiliency for replica count 1 Key: HDFS-5434 URL: https://issues.apache.org/jira/browse/HDFS-5434 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.2.0 Reporter: Buddy Priority: Minor If a file has a replica count of one, the HDFS client is exposed to write failures if the data node fails during a write. With a pipeline size of one, no recovery is possible if the sole data node dies. A simple fix is to force a minimum pipeline size of 2, while leaving the replication count as 1. The implementation for this is fairly non-invasive. Although the replica count is one, the block will be written to two data nodes instead of one. If one of the data nodes fails during the write, normal pipeline recovery will ensure that the write succeeds to the surviving data node. The existing code in the name node will prune the extra replica when it receives the block received reports for the finalized block from both data nodes. This results in the intended replica count of one for the block. This behavior should be controlled by a configuration option such as {{dfs.namenode.minPipelineSize}}. This behavior can be implemented in {{FSNameSystem.getAdditionalBlock()}} by ensuring that the pipeline size passed to {{BlockPlacementPolicy.chooseTarget()}} via the replication parameter is: {code} max(replication, ${dfs.namenode.minPipelineSize}) {code} -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5661) Browsing FileSystem via web ui, should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848076#comment-13848076 ] Benoy Antony commented on HDFS-5661: -- bq. the client presents this DT to authenticate with DN. This is not correct. How can the DN validate and read the DT? The DT is issued by the NN; a client/agent can authenticate using the DT only with the NN. Both NN and DN use https://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/HttpAuthentication.html to authenticate a user's access to their http interfaces. Browsing FileSystem via web ui, should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web ui, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an ip address doesn't have a domain, the cookie will not be sent by the browser while making http calls with the ip address as the destination server. This will break browsing the file system via the web ui if authentication is enabled. Browsing FileSystem via web ui, should use datanode's hostname instead of ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5661) Browsing FileSystem via web ui, should use datanode's hostname instead of ip address
[ https://issues.apache.org/jira/browse/HDFS-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848085#comment-13848085 ] Jing Zhao commented on HDFS-5661: - In AuthFilter#doFilter, we have the following code:
{code}
public void doFilter(ServletRequest request, ServletResponse response,
    FilterChain filterChain) throws IOException, ServletException {
  final HttpServletRequest httpRequest = toLowerCase((HttpServletRequest)request);
  final String tokenString = httpRequest.getParameter(DelegationParam.NAME);
  if (tokenString != null) {
    //Token is present in the url, therefore token will be used for
    //authentication, bypass kerberos authentication.
    filterChain.doFilter(httpRequest, response);
    return;
  }
  super.doFilter(httpRequest, response, filterChain);
}
{code}
In DatanodeJspHelper#generateDirectoryStructure, we have
{code}
String tokenString = req.getParameter(JspHelper.DELEGATION_PARAMETER_NAME);
UserGroupInformation ugi = JspHelper.getUGI(req, conf);
...
DFSClient dfs = getDFSClient(ugi, nnAddr, conf);
{code}
So I think the whole process here is:
1. NN generates the DT and puts it into the redirect URL
2. DN receives the redirect request, finds that there is a DT in the request, so the corresponding SPNEGO filter bypasses the auth check
3. DN uses the DT and issues a getFileInfo RPC call to the NN
4. DN shows the result in the web ui
Browsing FileSystem via web ui, should use datanode's hostname instead of ip address Key: HDFS-5661 URL: https://issues.apache.org/jira/browse/HDFS-5661 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-5661.patch If authentication is enabled on the web ui, then a cookie is used to keep track of the authentication information. There is normally a domain associated with the cookie. Since an ip address doesn't have a domain, the cookie will not be sent by the browser while making http calls with the ip address as the destination server. This will break browsing the file system via the web ui if authentication is enabled. Browsing FileSystem via web ui, should use datanode's hostname instead of ip address. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
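For step 1 of the process in the comment above, a hedged sketch of the NN-side redirect (the helper and JSP path are illustrative; {{delegation}} is the URL parameter the filter checks):
{code}
// Sketch only: the NN embeds the DT in the redirect URL it hands to the browser.
String tokenString = getDelegationTokenForUi(ugi); // hypothetical helper
resp.sendRedirect("http://" + datanodeHostName + ":" + dnInfoPort
    + "/browseDirectory.jsp?dir=%2F&delegation=" + tokenString);
{code}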
[jira] [Commented] (HDFS-5636) Enforce a max TTL per cache pool
[ https://issues.apache.org/jira/browse/HDFS-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848090#comment-13848090 ] Andrew Wang commented on HDFS-5636: --- I was thinking a hard error. It's better for users too since the default expiry is never. Enforce a max TTL per cache pool Key: HDFS-5636 URL: https://issues.apache.org/jira/browse/HDFS-5636 Project: Hadoop HDFS Issue Type: Sub-task Components: caching, namenode Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang It'd be nice for administrators to be able to specify a maximum TTL for directives in a cache pool. This forces all directives to eventually age out. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
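A minimal sketch of the hard-error behavior being proposed, with assumed accessor names on the pool and directive:
{code}
// Sketch: reject directives whose requested expiry exceeds the pool's max TTL.
long maxTtlMs = pool.getMaxRelativeExpiryMs();           // assumed accessor
long requestedTtlMs = directive.getExpiryTimeMs() - now; // assumed accessor
if (requestedTtlMs > maxTtlMs) {
  throw new InvalidRequestException("Requested expiry of " + requestedTtlMs
      + "ms exceeds the pool's maximum TTL of " + maxTtlMs + "ms");
}
{code}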
[jira] [Commented] (HDFS-5634) allow BlockReaderLocal to switch between checksumming and not
[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848100#comment-13848100 ] Hadoop QA commented on HDFS-5634: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618694/HDFS-5634.005.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestCrcCorruption org.apache.hadoop.hdfs.TestClientReportBadBlock org.apache.hadoop.hdfs.TestDFSClientRetries org.apache.hadoop.hdfs.server.namenode.TestNameNodeHttpServer org.apache.hadoop.hdfs.server.namenode.TestCorruptFilesJsp org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks org.apache.hadoop.hdfs.server.namenode.TestFsck org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks org.apache.hadoop.hdfs.TestDFSShell The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDatanodeBlockScanner {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5714//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5714//console This message is automatically generated. allow BlockReaderLocal to switch between checksumming and not - Key: HDFS-5634 URL: https://issues.apache.org/jira/browse/HDFS-5634 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, HDFS-5634.003.patch, HDFS-5634.004.patch, HDFS-5634.005.patch BlockReaderLocal should be able to switch between checksumming and non-checksumming, so that when we get notifications that something is mlocked (see HDFS-5182), we can avoid checksumming when reading from that block. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5632) Add Snapshot feature to INodeDirectory
[ https://issues.apache.org/jira/browse/HDFS-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848101#comment-13848101 ] Hadoop QA commented on HDFS-5632: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618695/HDFS-5632.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestSafeMode {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5715//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5715//console This message is automatically generated. Add Snapshot feature to INodeDirectory -- Key: HDFS-5632 URL: https://issues.apache.org/jira/browse/HDFS-5632 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5632.000.patch, HDFS-5632.001.patch, HDFS-5632.002.patch, HDFS-5632.003.patch We will add snapshot feature to INodeDirectory and remove INodeDirectoryWithSnapshot in this jira. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5618) NameNode: persist ACLs in fsimage.
[ https://issues.apache.org/jira/browse/HDFS-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-5618: - Attachment: HDFS-5618.000.patch NameNode: persist ACLs in fsimage. -- Key: HDFS-5618 URL: https://issues.apache.org/jira/browse/HDFS-5618 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Assignee: Haohui Mai Attachments: HDFS-5618.000.patch Store ACLs in fsimage so that ACLs are retained across NameNode restarts. This requires encoding and saving the {{AclManager}} state as a new section of the fsimage, located after all existing sections (snapshot manager state, inodes, secret manager state, and cache manager state). -- This message was sent by Atlassian JIRA (v6.1.4#6159)
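The section ordering described above could look roughly like this in the fsimage saver (all helper names are hypothetical placeholders, not the real saver API):
{code}
// Sketch of the save order only.
saveSnapshotManagerState(out);
saveINodes(out);
saveSecretManagerState(out);
saveCacheManagerState(out);
saveAclManagerState(out); // new section, appended after all existing ones
{code}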
[jira] [Assigned] (HDFS-5619) NameNode: record ACL modifications to edit log.
[ https://issues.apache.org/jira/browse/HDFS-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai reassigned HDFS-5619: Assignee: Haohui Mai NameNode: record ACL modifications to edit log. --- Key: HDFS-5619 URL: https://issues.apache.org/jira/browse/HDFS-5619 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Assignee: Haohui Mai Implement a new edit log opcode, {{OP_SET_ACL}}, which fully replaces the ACL of a specific inode. For ACL operations that perform partial modification of the ACL, the NameNode must merge the modifications with the existing ACL to produce the final resulting ACL and encode it into an {{OP_SET_ACL}}. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
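A hedged sketch of the merge-then-log flow described above ({{mergeAclEntries}} and the other helpers are assumptions, not the actual NameNode API):
{code}
// Sketch: partial ACL modifications are merged into the full resulting ACL first,
// so the edit log only ever carries complete OP_SET_ACL records.
List<AclEntry> existing = getAclEntries(inode);             // assumed helper
List<AclEntry> merged = mergeAclEntries(existing, aclSpec); // assumed helper
setAclOnInode(inode, merged);
editLog.logSetAcl(src, merged); // one OP_SET_ACL with the final full ACL
{code}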
[jira] [Commented] (HDFS-5634) allow BlockReaderLocal to switch between checksumming and not
[ https://issues.apache.org/jira/browse/HDFS-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848117#comment-13848117 ] Colin Patrick McCabe commented on HDFS-5634: found a bug in the fastpath handling. will post an update soon allow BlockReaderLocal to switch between checksumming and not - Key: HDFS-5634 URL: https://issues.apache.org/jira/browse/HDFS-5634 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5634.001.patch, HDFS-5634.002.patch, HDFS-5634.003.patch, HDFS-5634.004.patch, HDFS-5634.005.patch BlockReaderLocal should be able to switch between checksumming and non-checksumming, so that when we get notifications that something is mlocked (see HDFS-5182), we can avoid checksumming when reading from that block. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5431) support cachepool-based limit management in path-based caching
[ https://issues.apache.org/jira/browse/HDFS-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5431: -- Attachment: hdfs-5431-4.patch Hi Colin, I made the suggested changes with the following notes: * With the new EnumSet, I also added a version of add/modify directive without a {{flags}} argument, since it's really annoying to put {{EnumSet.noneOf(CacheFlag.class)}} everywhere. * I don't like putting {{bytesOverlimit}} in {{CachePoolEntry}} because it feels inconsistent with {{CachePoolStats}}. Left it as is, unless you feel strongly on this one. * Tweaked the kick behavior somewhat. We now try to provide read-after-write consistency with the dirty bit; add/modify with force is essentially a read and a write. This also completes the test refactor, so we can close that other JIRA out after this one goes in. support cachepool-based limit management in path-based caching -- Key: HDFS-5431 URL: https://issues.apache.org/jira/browse/HDFS-5431 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Andrew Wang Attachments: hdfs-5431-1.patch, hdfs-5431-2.patch, hdfs-5431-3.patch, hdfs-5431-4.patch We should support cachepool-based limit management in path-based caching. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
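The no-flags overload mentioned in the first note might look like this (a sketch of the API shape, not necessarily the committed code):
{code}
// Sketch: convenience overloads so callers don't spell out the empty EnumSet.
public long addCacheDirective(CacheDirectiveInfo info) throws IOException {
  return addCacheDirective(info, EnumSet.noneOf(CacheFlag.class));
}
public void modifyCacheDirective(CacheDirectiveInfo info) throws IOException {
  modifyCacheDirective(info, EnumSet.noneOf(CacheFlag.class));
}
{code}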
[jira] [Updated] (HDFS-5406) Send incremental block reports for all storages in a single call
[ https://issues.apache.org/jira/browse/HDFS-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-5406: Attachment: h5406.04.patch Send incremental block reports for all storages in a single call Key: HDFS-5406 URL: https://issues.apache.org/jira/browse/HDFS-5406 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: h5406.01.patch, h5406.02.patch, h5406.04.patch Per code review feedback from [~szetszwo] on HDFS-5390, we can combine all incremental block reports in a single {{blockReceivedAndDeleted}} call. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5651) remove dfs.namenode.caching.enabled
[ https://issues.apache.org/jira/browse/HDFS-5651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848155#comment-13848155 ] Hadoop QA commented on HDFS-5651: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618723/HDFS-5651.002.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives org.apache.hadoop.hdfs.server.namenode.ha.TestHAStateTransitions {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5716//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5716//console This message is automatically generated. remove dfs.namenode.caching.enabled --- Key: HDFS-5651 URL: https://issues.apache.org/jira/browse/HDFS-5651 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5651.001.patch, HDFS-5651.002.patch We can remove dfs.namenode.caching.enabled and simply always enable caching, similar to how we do with snapshots and other features. The main overhead is the size of the cachedBlocks GSet. However, we can simply make the size of this GSet configurable, and people who don't want caching can set it to a very small value. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5632) Add Snapshot feature to INodeDirectory
[ https://issues.apache.org/jira/browse/HDFS-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848178#comment-13848178 ] Hadoop QA commented on HDFS-5632: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618695/HDFS-5632.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated -14 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5717//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5717//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5717//console This message is automatically generated. Add Snapshot feature to INodeDirectory -- Key: HDFS-5632 URL: https://issues.apache.org/jira/browse/HDFS-5632 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5632.000.patch, HDFS-5632.001.patch, HDFS-5632.002.patch, HDFS-5632.003.patch We will add snapshot feature to INodeDirectory and remove INodeDirectoryWithSnapshot in this jira. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5431) support cachepool-based limit management in path-based caching
[ https://issues.apache.org/jira/browse/HDFS-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848212#comment-13848212 ] Hadoop QA commented on HDFS-5431: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618742/hdfs-5431-4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA org.apache.hadoop.cli.TestCacheAdminCLI org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5718//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5718//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5718//console This message is automatically generated. support cachepool-based limit management in path-based caching -- Key: HDFS-5431 URL: https://issues.apache.org/jira/browse/HDFS-5431 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Andrew Wang Attachments: hdfs-5431-1.patch, hdfs-5431-2.patch, hdfs-5431-3.patch, hdfs-5431-4.patch We should support cachepool-based limit management in path-based caching. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5484) StorageType and State in DatanodeStorageInfo in NameNode is not accurate
[ https://issues.apache.org/jira/browse/HDFS-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-5484: - Assignee: Eric Sirianni StorageType and State in DatanodeStorageInfo in NameNode is not accurate Key: HDFS-5484 URL: https://issues.apache.org/jira/browse/HDFS-5484 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Eric Sirianni Assignee: Eric Sirianni Fix For: Heterogeneous Storage (HDFS-2832) Attachments: HDFS-5484-HDFS-2832--2.patch, HDFS-5484-HDFS-2832.patch The fields in DatanodeStorageInfo are updated from two distinct paths: # block reports # storage reports (via heartbeats) The {{state}} and {{storageType}} fields are updated via the Block Report. However, as seen in the code below, these fields are populated from a dummy {{DatanodeStorage}} object constructed in the DataNode:
{code}
BPServiceActor.blockReport() {
  //...
  // Dummy DatanodeStorage object just for sending the block report.
  DatanodeStorage dnStorage = new DatanodeStorage(storageID);
  //...
}
{code}
The net effect is that the {{state}} and {{storageType}} fields are always the defaults of {{NORMAL}} and {{DISK}} in the NameNode. The recommended fix is to change {{FsDatasetSpi.getBlockReports()}} from:
{code}
public Map<String, BlockListAsLongs> getBlockReports(String bpid);
{code}
to:
{code}
public Map<DatanodeStorage, BlockListAsLongs> getBlockReports(String bpid);
{code}
thereby allowing {{BPServiceActor}} to send the real {{DatanodeStorage}} object with the block report. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
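With the proposed signature, the caller side could look roughly like this (a sketch; the send helper is an assumption):
{code}
// Sketch: iterate per-storage reports and send the real DatanodeStorage objects,
// so state and storageType reach the NameNode instead of the NORMAL/DISK defaults.
Map<DatanodeStorage, BlockListAsLongs> reports = dataset.getBlockReports(bpid);
for (Map.Entry<DatanodeStorage, BlockListAsLongs> entry : reports.entrySet()) {
  sendBlockReport(entry.getKey(), entry.getValue()); // assumed helper
}
{code}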
[jira] [Commented] (HDFS-5566) HA namenode with QJM created from org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider should implement Closeable
[ https://issues.apache.org/jira/browse/HDFS-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848222#comment-13848222 ] Jimmy Xiang commented on HDFS-5566: --- In NameNodeProxies#createProxy, for the HA case it creates a proxy with the interface ClientProtocol, which is not Closeable, and a RetryInvocationHandler. However, ConfiguredFailoverProxyProvider doesn't have the field h, the InvocationHandler, which is the problem. I think this is a valid bug we need to fix. HA namenode with QJM created from org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider should implement Closeable -- Key: HDFS-5566 URL: https://issues.apache.org/jira/browse/HDFS-5566 Project: Hadoop HDFS Issue Type: Bug Environment: hadoop-2.2.0 hbase-0.96 Reporter: Henry Hung When using hbase-0.96 with hadoop-2.2.0, stopping a master/regionserver node will result in {{Cannot close proxy - is not Closeable or does not provide closeable invocation}}. [Mail Archive|https://drive.google.com/file/d/0B22pkxoqCdvWSGFIaEpfR3lnT2M/edit?usp=sharing] My hadoop-2.2.0 is configured as an HA namenode with QJM; the configuration is like this:
{code:xml}
<property>
  <name>dfs.nameservices</name>
  <value>hadoopdev</value>
</property>
<property>
  <name>dfs.ha.namenodes.hadoopdev</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoopdev.nn1</name>
  <value>fphd9.ctpilot1.com:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.hadoopdev.nn1</name>
  <value>fphd9.ctpilot1.com:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoopdev.nn2</name>
  <value>fphd10.ctpilot1.com:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.hadoopdev.nn2</name>
  <value>fphd10.ctpilot1.com:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://fphd8.ctpilot1.com:8485;fphd9.ctpilot1.com:8485;fphd10.ctpilot1.com:8485/hadoopdev</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.hadoopdev</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/bin/true)</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/data/hadoop/hadoop-data-2/journal</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>fphd1.ctpilot1.com:</value>
</property>
{code}
I traced the code and found out that when stopping the hbase master node, it will try to invoke the close method on the namenode, but the instance created from {{org.apache.hadoop.hdfs.NameNodeProxies.createProxy}} with the failoverProxyProviderClass {{org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}} does not implement the Closeable interface. In the Non-HA case, the created instance will be {{org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB}}, which implements Closeable. TL;DR: with hbase connecting to a hadoop HA namenode, when stopping the hbase master or regionserver, it couldn't find the {{close}} method to gracefully close the namenode session. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
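The failing check is essentially the Closeable probe when stopping an RPC proxy; a simplified sketch of that logic (not the exact Hadoop code):
{code}
// Sketch: stopping a proxy requires either the proxy or its invocation handler
// to be Closeable; the HA failover handler here is neither, hence the error.
InvocationHandler handler = Proxy.getInvocationHandler(proxy);
if (proxy instanceof Closeable) {
  ((Closeable) proxy).close();
} else if (handler instanceof Closeable) {
  ((Closeable) handler).close();
} else {
  throw new HadoopIllegalArgumentException(
      "Cannot close proxy - is not Closeable or does not provide closeable invocation handler");
}
{code}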
[jira] [Reopened] (HDFS-5566) HA namenode with QJM created from org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider should implement Closeable
[ https://issues.apache.org/jira/browse/HDFS-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reopened HDFS-5566: --- Assignee: Jimmy Xiang HA namenode with QJM created from org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider should implement Closeable -- Key: HDFS-5566 URL: https://issues.apache.org/jira/browse/HDFS-5566 Project: Hadoop HDFS Issue Type: Bug Environment: hadoop-2.2.0 hbase-0.96 Reporter: Henry Hung Assignee: Jimmy Xiang When using hbase-0.96 with hadoop-2.2.0, stopping a master/regionserver node will result in {{Cannot close proxy - is not Closeable or does not provide closeable invocation}}. [Mail Archive|https://drive.google.com/file/d/0B22pkxoqCdvWSGFIaEpfR3lnT2M/edit?usp=sharing] My hadoop-2.2.0 is configured as an HA namenode with QJM; the configuration is like this:
{code:xml}
<property>
  <name>dfs.nameservices</name>
  <value>hadoopdev</value>
</property>
<property>
  <name>dfs.ha.namenodes.hadoopdev</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoopdev.nn1</name>
  <value>fphd9.ctpilot1.com:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.hadoopdev.nn1</name>
  <value>fphd9.ctpilot1.com:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoopdev.nn2</name>
  <value>fphd10.ctpilot1.com:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.hadoopdev.nn2</name>
  <value>fphd10.ctpilot1.com:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://fphd8.ctpilot1.com:8485;fphd9.ctpilot1.com:8485;fphd10.ctpilot1.com:8485/hadoopdev</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.hadoopdev</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/bin/true)</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/data/hadoop/hadoop-data-2/journal</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>fphd1.ctpilot1.com:</value>
</property>
{code}
I traced the code and found out that when stopping the hbase master node, it will try to invoke the close method on the namenode, but the instance created from {{org.apache.hadoop.hdfs.NameNodeProxies.createProxy}} with the failoverProxyProviderClass {{org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}} does not implement the Closeable interface. In the Non-HA case, the created instance will be {{org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB}}, which implements Closeable. TL;DR: with hbase connecting to a hadoop HA namenode, when stopping the hbase master or regionserver, it couldn't find the {{close}} method to gracefully close the namenode session. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-5406) Send incremental block reports for all storages in a single call
[ https://issues.apache.org/jira/browse/HDFS-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848226#comment-13848226 ] Hadoop QA commented on HDFS-5406: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618744/h5406.04.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5719//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5719//console This message is automatically generated. Send incremental block reports for all storages in a single call Key: HDFS-5406 URL: https://issues.apache.org/jira/browse/HDFS-5406 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: h5406.01.patch, h5406.02.patch, h5406.04.patch Per code review feedback from [~szetszwo] on HDFS-5390, we can combine all incremental block reports in a single {{blockReceivedAndDeleted}} call. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HDFS-5632) Add Snapshot feature to INodeDirectory
[ https://issues.apache.org/jira/browse/HDFS-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5632: - Hadoop Flags: Reviewed +1 patch looks good. Add Snapshot feature to INodeDirectory -- Key: HDFS-5632 URL: https://issues.apache.org/jira/browse/HDFS-5632 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5632.000.patch, HDFS-5632.001.patch, HDFS-5632.002.patch, HDFS-5632.003.patch We will add snapshot feature to INodeDirectory and remove INodeDirectoryWithSnapshot in this jira. -- This message was sent by Atlassian JIRA (v6.1.4#6159)