[jira] [Commented] (HDFS-8900) Compact XAttrs to optimize memory footprint.
[ https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718106#comment-14718106 ] Hudson commented on HDFS-8900: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2264 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2264/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Compact XAttrs to optimize memory footprint. Key: HDFS-8900 URL: https://issues.apache.org/jira/browse/HDFS-8900 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8900.001.patch, HDFS-8900.002.patch, HDFS-8900.003.patch, HDFS-8900.004.patch, HDFS-8900.005.patch {code} private final ImmutableList<XAttr> xAttrs; {code} We currently use the above in XAttrFeature; it is not memory-efficient, since {{ImmutableList}} and {{XAttr}} each carry per-object memory overhead and alignment padding. We can use a {{byte[]}} in XAttrFeature instead and compact the {{XAttr}} encoding. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
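The compaction proposed above can be illustrated with a small sketch: serialize the name/value pairs into a single {{byte[]}} with length prefixes, masking each length byte with 0xff when reading it back (the sign-extension pitfall that HDFS-8963 later fixed). The {{XAttrPacker}} class and its on-disk layout are hypothetical illustrations, not the actual XAttrFormat code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pack xattr name/value pairs into one byte[] to avoid
// per-object overhead. Layout per entry: [nameLen:1][name][valueLen:2][value].
public class XAttrPacker {

    public static byte[] pack(List<String[]> xattrs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String[] kv : xattrs) {
            byte[] name = kv[0].getBytes(StandardCharsets.UTF_8);
            byte[] value = kv[1].getBytes(StandardCharsets.UTF_8);
            out.write(name.length);            // assumes names shorter than 256 bytes
            out.write(name, 0, name.length);
            out.write(value.length >>> 8);     // 2-byte big-endian value length
            out.write(value.length & 0xff);
            out.write(value, 0, value.length);
        }
        return out.toByteArray();
    }

    public static List<String[]> unpack(byte[] packed) {
        List<String[]> result = new ArrayList<>();
        int i = 0;
        while (i < packed.length) {
            int nameLen = packed[i++] & 0xff;  // mask to avoid sign extension
            String name = new String(packed, i, nameLen, StandardCharsets.UTF_8);
            i += nameLen;
            int valueLen = ((packed[i] & 0xff) << 8) | (packed[i + 1] & 0xff);
            i += 2;
            String value = new String(packed, i, valueLen, StandardCharsets.UTF_8);
            i += valueLen;
            result.add(new String[] { name, value });
        }
        return result;
    }
}
```

The whole feature then holds one array plus one object header, instead of one list node and one XAttr object per attribute.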
[jira] [Updated] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lifeng Wang updated HDFS-8987: -- Summary: Erasure coding: MapReduce job failed when I set the / folder to the EC zone (was: Erasure coding: MapReduce job failed when I set the / foler to the EC zone ) Erasure coding: MapReduce job failed when I set the / folder to the EC zone Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start the HDFS service. * After the HDFS service is started, there are no files in HDFS; I set the / folder to the EC zone, and the EC zone is created successfully. * Start the YARN and MR JobHistoryServer services. All the services start successfully. * Then I run the hadoop example pi program and it fails. The following is the exception. {noformat} org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
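The failure above (tracked as a duplicate of HDFS-8937) boils down to a client calling setReplication on a file inside an EC zone, which the NameNode rejects for striped blocks. A minimal sketch of the kind of client-side guard that avoids issuing the RPC for striped files; the class, the {{Layout}} enum, and the method names are illustrative, not Hadoop APIs:

```java
// Hypothetical sketch of the guard a client needs: only ask the NameNode to
// change replication for contiguous (replicated) files, since striped
// (erasure-coded) files have no replication factor to set.
public class ReplicationGuard {
    enum Layout { CONTIGUOUS, STRIPED }

    /** Returns true if a setReplication RPC should be issued. */
    static boolean shouldSetReplication(Layout layout, short current, short desired) {
        if (layout == Layout.STRIPED) {
            return false; // NameNode would throw UnsupportedActionException
        }
        return current != desired; // skip the RPC when nothing would change
    }
}
```

With such a check, the pi job's attempt to raise replication on its staging files in the / EC zone would be skipped instead of failing the job.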
[jira] [Updated] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / foler to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HDFS-8987: - Description: Test progress is as follows * For a new cluster, I format the namenode and then start the HDFS service. * After the HDFS service is started, there are no files in HDFS; I set the / folder to the EC zone, and the EC zone is created successfully. * Start the YARN and MR JobHistoryServer services. All the services start successfully. * Then I run the hadoop example pi program and it fails. The following is the exception. {noformat} org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) {noformat} was:
Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ```
[jira] [Commented] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / foler to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718113#comment-14718113 ] Zhe Zhang commented on HDFS-8987: - Thanks for testing and finding the issue [~Lifeng Wang]. Looks like it's a duplicate of HDFS-8937. If you also agree we can close this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8960) DFS client says no more good datanodes being available to try on a single drive failure
[ https://issues.apache.org/jira/browse/HDFS-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718146#comment-14718146 ] Yongjun Zhang commented on HDFS-8960: - Yes, it's trying to do pipeline recovery, see below: {code} [yzhang@localhost Downloads]$ grep -B 3 blk_1073817519 r12s16-datanode.log | grep firstbadlink 15/08/23 07:21:49 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.5:10110 15/08/23 07:21:49 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.5:10110 15/08/23 07:21:52 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.1:10110 15/08/23 07:21:52 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.1:10110 15/08/23 07:21:55 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.6:10110 15/08/23 07:21:55 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.6:10110 15/08/23 07:21:58 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.8:10110 15/08/23 07:21:58 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.8:10110 15/08/23 07:22:01 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.14:10110 15/08/23 07:22:01 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.14:10110 15/08/23 07:22:04 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.2:10110 15/08/23 07:22:04 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.2:10110 15/08/23 07:22:07 INFO 
datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.9:10110 15/08/23 07:22:07 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.9:10110 15/08/23 07:22:10 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.3:10110 15/08/23 07:22:10 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.3:10110 15/08/23 07:22:13 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.7:10110 15/08/23 07:22:13 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.7:10110 15/08/23 07:22:16 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.10:10110 15/08/23 07:22:16 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.10:10110 15/08/23 07:22:19 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.12:10110 15/08/23 07:22:19 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.12:10110 15/08/23 07:22:23 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.11:10110 15/08/23 07:22:23 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.11:10110 15/08/23 07:22:26 INFO datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.24.32.15:10110 15/08/23 07:22:26 INFO datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.24.32.15:10110 {code} It happens that the log you uploaded is from r12s13, which is not one of the nodes in the grepped messages (per your report, r12s13 is the last node in the
initial pipeline), and r12s16 is the source node. Would you please upload a few more DN logs? Thanks. DFS client says no more good datanodes being available to try on a single drive failure - Key: HDFS-8960 URL: https://issues.apache.org/jira/browse/HDFS-8960 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.1 Environment: openjdk version 1.8.0_45-internal OpenJDK Runtime Environment (build 1.8.0_45-internal-b14) OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode) Reporter: Benoit Sigoure Attachments: blk_1073817519_77099.log, r12s13-datanode.log, r12s16-datanode.log Since we upgraded to 2.7.1 we regularly see single-drive failures cause widespread problems at
[jira] [Updated] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8988: - Issue Type: Sub-task (was: Improvement) Parent: HDFS-8793 Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap - Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: HDFS-8988.001.patch {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap<>(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so we should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
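The trade-off described above can be demonstrated with the JDK analogues of the HDFS LightWeight* classes: a linked set preserves insertion order by carrying two extra references per entry (about 8 bytes with compressed oops), while a plain hash set does not. When iteration order is irrelevant, as for excess replicas, the plain set is strictly cheaper. This is an illustration of the design choice, not the HDFS classes themselves.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// LinkedHashSet threads a doubly-linked list through its entries (prev/next
// references) to remember insertion order; HashSet stores the same elements
// without that per-entry cost. Set equality ignores order, so callers that
// never rely on iteration order cannot observe the difference.
public class SetOverheadDemo {
    public static void main(String[] args) {
        Set<Integer> ordered = new LinkedHashSet<>();   // keeps order, bigger entries
        Set<Integer> unordered = new HashSet<>();       // no order, smaller entries
        for (int i = 0; i < 5; i++) {
            ordered.add(i);
            unordered.add(i);
        }
        System.out.println(ordered.equals(unordered)); // true
    }
}
```

The same reasoning applies per-block in excessReplicateMap: with millions of excess replicas tracked on a large cluster, 8 bytes per entry adds up.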
[jira] [Updated] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8988: - Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8704) Erasure Coding: client fails to write large file when one datanode fails
[ https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718121#comment-14718121 ] Hadoop QA commented on HDFS-8704: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 41s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 45s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 32s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 39s | The patch appears to introduce 5 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 5s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 188m 19s | Tests failed in hadoop-hdfs. 
| | | | 230m 30s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752923/HDFS-8704-HDFS-7285-006.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 164cbe6 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/artifact/patchprocess/patchReleaseAuditProblems.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12189/console | This message was automatically generated. Erasure Coding: client fails to write large file when one datanode fails Key: HDFS-8704 URL: https://issues.apache.org/jira/browse/HDFS-8704 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: HDFS-8704-000.patch, HDFS-8704-HDFS-7285-002.patch, HDFS-8704-HDFS-7285-003.patch, HDFS-8704-HDFS-7285-004.patch, HDFS-8704-HDFS-7285-005.patch, HDFS-8704-HDFS-7285-006.patch I test current code on a 5-node cluster using RS(3,2). When a datanode is corrupt, client succeeds to write a file smaller than a block group but fails to write a large one. {{TestDFSStripeOutputStreamWithFailure}} only tests files smaller than a block group, this jira will add more test situations. 
A streamer may encounter bad datanodes when writing the blocks allocated to it. When it fails to connect to a datanode or to send a packet, the streamer needs to prepare for the next block. First it removes the packets of the current block from its data queue. If the first packet of the next block is already in the data queue, the streamer resets its state and starts waiting for the next block allocated to it; otherwise it just waits for the first packet of the next block. While waiting, the streamer periodically checks whether it has been asked to terminate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
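The recovery step described in that paragraph can be sketched as a small queue operation: drop the failed block's queued packets, then decide whether the streamer can reset immediately (the next block's first packet is already queued) or must keep waiting. The {{Packet}} type and method names here are illustrative, not the real DFSStripedOutputStream internals.

```java
import java.util.Deque;

// Hypothetical sketch of the streamer's "prepare for next block" step.
public class StreamerRecovery {
    static final class Packet {
        final long blockId;
        final int seq;
        Packet(long blockId, int seq) { this.blockId = blockId; this.seq = seq; }
    }

    /**
     * Removes the failed block's packets from the data queue. Returns true if
     * the next block's first packet is already queued (reset state now);
     * false if the streamer must wait for it to arrive.
     */
    static boolean prepareForNextBlock(Deque<Packet> dataQueue, long failedBlockId) {
        // Drop all queued packets that belong to the failed block.
        dataQueue.removeIf(p -> p.blockId == failedBlockId);
        // Anything left at the head belongs to a later block.
        return !dataQueue.isEmpty();
    }
}
```

In the real streamer the waiting branch also polls a termination flag, as the description notes, so a closed stream does not leave the thread blocked forever.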
[jira] [Commented] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718116#comment-14718116 ] Lifeng Wang commented on HDFS-8987: --- OK. Please help to close this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang resolved HDFS-8987. - Resolution: Duplicate -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8965) Harden edit log reading code against out of memory errors
[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718102#comment-14718102 ] Hadoop QA commented on HDFS-8965: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 32s | Pre-patch trunk has 2 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 6s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 23s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 39s | The applied patch generated 12 new checkstyle issues (total was 401, now 407). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 27s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 22s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 191m 46s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 6m 33s | Tests passed in bkjournal. 
| | | | 246m 37s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.qjournal.server.TestJournal | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752906/HDFS-8965.004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 035ed26 | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/testrun_hadoop-hdfs.txt | | bkjournal test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/artifact/patchprocess/testrun_bkjournal.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12184/console | This message was automatically generated. Harden edit log reading code against out of memory errors - Key: HDFS-8965 URL: https://issues.apache.org/jira/browse/HDFS-8965 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. 
This should avoid out of memory errors when trying to load garbage data as Op data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
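The hardening idea in HDFS-8965 — sanity-check the op's length prefix and verify the checksum before trusting the record — can be sketched as follows. The {{SafeOpReader}} class, its record layout (4-byte length, 4-byte CRC32, body), and the size cap are illustrative assumptions, not the actual FSEditLogOp format.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch: validate a length-prefixed, checksummed record before
// parsing it, so a corrupt length can never trigger a multi-gigabyte
// allocation and an OutOfMemoryError on garbage data.
public class SafeOpReader {
    static final int MAX_OP_SIZE = 50 * 1024 * 1024; // illustrative sanity cap

    /** Returns the op body, or null if the record fails validation. */
    static byte[] readOp(ByteBuffer in) {
        if (in.remaining() < 8) {
            return null;                         // not even a header present
        }
        int len = in.getInt();
        if (len < 0 || len > MAX_OP_SIZE || len > in.remaining() - 4) {
            return null;                         // implausible length: corruption
        }
        long expected = in.getInt() & 0xffffffffL; // stored CRC32, read unsigned
        byte[] body = new byte[len];             // bounded allocation
        in.get(body);
        CRC32 crc = new CRC32();
        crc.update(body, 0, len);
        return crc.getValue() == expected ? body : null; // verify before parsing
    }
}
```

Only after this returns non-null would the reader hand the bytes to the op deserializer, so garbage data is rejected before any op-level allocation happens.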
[jira] [Commented] (HDFS-8963) Fix incorrect sign extension of xattr length in HDFS-8900
[ https://issues.apache.org/jira/browse/HDFS-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718105#comment-14718105 ] Hudson commented on HDFS-8963: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2264 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2264/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix incorrect sign extension of xattr length in HDFS-8900 - Key: HDFS-8963 URL: https://issues.apache.org/jira/browse/HDFS-8963 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Haohui Mai Assignee: Colin Patrick McCabe Priority: Critical Fix For: 2.8.0 Attachments: HDFS-8963.001.patch HDFS-8900 introduced two new findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/12120/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
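The class of bug fixed here is easy to reproduce in isolation: a length stored in a byte must be widened with {{& 0xff}}, otherwise any value of 128 or more sign-extends to a negative int. The demo class below is illustrative, not the actual XAttrFormat code.

```java
// Minimal demonstration of the sign-extension pitfall: Java's byte is
// signed, so widening it to int without masking corrupts lengths >= 128.
public class SignExtensionDemo {
    static int lengthWrong(byte b) { return b; }          // sign-extends
    static int lengthRight(byte b) { return b & 0xff; }   // zero-extends

    public static void main(String[] args) {
        byte stored = (byte) 200;                 // a length of 200 bytes
        System.out.println(lengthWrong(stored));  // -56: corrupted length
        System.out.println(lengthRight(stored));  // 200: correct
    }
}
```

A negative length then fails array allocation or bounds checks downstream, which is what the FindBugs warnings on HDFS-8900 flagged.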
[jira] [Updated] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8988: - Description: {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. was: {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references, totally 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap - Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Priority: Minor {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8988: - Attachment: HDFS-8988.001.patch Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap - Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: HDFS-8988.001.patch {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8987) MapReduce job failed when I set the / foler to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HDFS-8987: - Issue Type: Sub-task (was: Bug) Parent: HDFS-7285 MapReduce job failed when I set the / foler to the EC zone --- Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / foler to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HDFS-8987: - Summary: Erasure coding: MapReduce job failed when I set the / foler to the EC zone (was: MapReduce job failed when I set the / foler to the EC zone ) Erasure coding: MapReduce job failed when I set the / foler to the EC zone --- Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8689) move hasClusterEverBeenMultiRack to NetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su reopened HDFS-8689: - move hasClusterEverBeenMultiRack to NetworkTopology --- Key: HDFS-8689 URL: https://issues.apache.org/jira/browse/HDFS-8689 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8689) move hasClusterEverBeenMultiRack to NetworkTopology
[ https://issues.apache.org/jira/browse/HDFS-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su resolved HDFS-8689. - Resolution: Invalid move hasClusterEverBeenMultiRack to NetworkTopology --- Key: HDFS-8689 URL: https://issues.apache.org/jira/browse/HDFS-8689 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su Assignee: Walter Su -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
Yi Liu created HDFS-8988: Summary: Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yi Liu Assignee: Yi Liu Priority: Minor {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references, totally 8 bytes). We don't need to keep excess replicated blocks in order here, so should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
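A JDK analogue of the trade-off, for illustration only: `LinkedHashSet` maintains insertion order via two extra references per entry over `HashSet`, much as {{LightWeightLinkedSet}} does over {{LightWeightHashSet}}. When callers only add, remove, and test membership, as with excess replicas here, the plain hash set is functionally equivalent and smaller.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// JDK stand-ins for the two set flavors: the linked variant pays two extra
// references per entry just to remember insertion order.
public class SetChoiceDemo {
  // excessReplicateMap-style usage touches only add/remove/contains,
  // so both sets end up with identical membership.
  static boolean membershipEquivalent() {
    Set<Long> ordered = new LinkedHashSet<>();
    Set<Long> unordered = new HashSet<>();
    for (long blockId : new long[] {42L, 7L, 99L}) {
      ordered.add(blockId);
      unordered.add(blockId);
    }
    ordered.remove(7L);
    unordered.remove(7L);
    // Same membership; only iteration order could differ.
    return ordered.equals(unordered);
  }
}
```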
[jira] [Resolved] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang resolved HDFS-8987. - Resolution: Fixed Erasure coding: MapReduce job failed when I set the / folder to the EC zone Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. {noformat} ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8987) Erasure coding: MapReduce job failed when I set the / folder to the EC zone
[ https://issues.apache.org/jira/browse/HDFS-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang reopened HDFS-8987: - Erasure coding: MapReduce job failed when I set the / folder to the EC zone Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. {noformat} ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8963) Fix incorrect sign extension of xattr length in HDFS-8900
[ https://issues.apache.org/jira/browse/HDFS-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718147#comment-14718147 ] Hudson commented on HDFS-8963: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #307 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/307/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix incorrect sign extension of xattr length in HDFS-8900 - Key: HDFS-8963 URL: https://issues.apache.org/jira/browse/HDFS-8963 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Haohui Mai Assignee: Colin Patrick McCabe Priority: Critical Fix For: 2.8.0 Attachments: HDFS-8963.001.patch HDFS-8900 introduced two new findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/12120/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8900) Compact XAttrs to optimize memory footprint.
[ https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718148#comment-14718148 ] Hudson commented on HDFS-8900: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #307 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/307/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Compact XAttrs to optimize memory footprint. Key: HDFS-8900 URL: https://issues.apache.org/jira/browse/HDFS-8900 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8900.001.patch, HDFS-8900.002.patch, HDFS-8900.003.patch, HDFS-8900.004.patch, HDFS-8900.005.patch {code} private final ImmutableList<XAttr> xAttrs; {code} Currently we use above in XAttrFeature, it's not efficient from memory point of view, since {{ImmutableList}} and {{XAttr}} have object memory overhead, and each object has memory alignment. We can use a {{byte[]}} in XAttrFeature and do some compact in {{XAttr}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
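The compaction idea can be sketched as below. The layout and names are hypothetical, not the actual XAttrFormat encoding: all attributes are serialized into one byte[] of length-prefixed name/value pairs, trading N small objects (each with header and alignment padding) for a single array. Note the `& 0xff` masking on read, which is exactly the kind of detail HDFS-8963 later had to fix.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical compact encoding: [nameLen:2][name][valueLen:2][value] ...
// repeated for each attribute, all in one byte[].
public class PackedXAttrs {
  static byte[] pack(Map<String, byte[]> attrs) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (Map.Entry<String, byte[]> e : attrs.entrySet()) {
      byte[] name = e.getKey().getBytes(StandardCharsets.UTF_8);
      byte[] value = e.getValue();
      out.write(name.length >>> 8);   // big-endian 2-byte length
      out.write(name.length);
      out.write(name, 0, name.length);
      out.write(value.length >>> 8);
      out.write(value.length);
      out.write(value, 0, value.length);
    }
    return out.toByteArray();
  }

  static Map<String, byte[]> unpack(byte[] p) {
    Map<String, byte[]> attrs = new LinkedHashMap<>();
    int i = 0;
    while (i < p.length) {
      // Mask with 0xff so lengths >= 128 don't sign-extend to negatives.
      int nameLen = ((p[i] & 0xff) << 8) | (p[i + 1] & 0xff);
      i += 2;
      String name = new String(p, i, nameLen, StandardCharsets.UTF_8);
      i += nameLen;
      int valueLen = ((p[i] & 0xff) << 8) | (p[i + 1] & 0xff);
      i += 2;
      attrs.put(name, Arrays.copyOfRange(p, i, i + valueLen));
      i += valueLen;
    }
    return attrs;
  }

  static boolean roundTripDemo() {
    Map<String, byte[]> attrs = new LinkedHashMap<>();
    attrs.put("user.tag", new byte[] {1, 2, 3});
    attrs.put("user.note", "hi".getBytes(StandardCharsets.UTF_8));
    Map<String, byte[]> back = unpack(pack(attrs));
    return back.keySet().equals(attrs.keySet())
        && Arrays.equals(back.get("user.tag"), new byte[] {1, 2, 3});
  }
}
```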
[jira] [Updated] (HDFS-8964) Provide max TxId when validating in-progress edit log files
[ https://issues.apache.org/jira/browse/HDFS-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8964: Attachment: HDFS-8964.01.patch Thanks Colin for taking a look. Updating the patch to better handle cases with no provided {{maxTxId}}. I also found {{scanLog}} is identical to {{validateLog}} and removed it from two places. {{FileJournalManager#getRemoteEditLogs}} and {{selectInputStreams}} are already updated to provide {{maxTxId}}. Where else do you think we are trying to read an active in-progress edit file? Provide max TxId when validating in-progress edit log files --- Key: HDFS-8964 URL: https://issues.apache.org/jira/browse/HDFS-8964 Project: Hadoop HDFS Issue Type: Bug Components: journal-node, namenode Affects Versions: 2.7.1 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8964.00.patch, HDFS-8964.01.patch NN/JN validates in-progress edit log files in multiple scenarios, via {{EditLogFile#validateLog}}. The method scans through the edit log file to find the last transaction ID. However, an in-progress edit log file could be actively written to, which creates a race condition and causes incorrect data to be read (and later we attempt to interpret the data as ops). Currently {{validateLog}} is used in 3 places: # NN {{getEditsFromTxid}} # JN {{getEditLogManifest}} # NN/JN {{recoverUnfinalizedSegments}} In the first two scenarios we should provide a maximum TxId to validate in the in-progress file. The 3rd scenario won't cause a race condition because only non-current in-progress edit log files are validated. {{validateLog}} is actually only used with in-progress files, and could use a better name and Javadoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
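The core of the fix, a fence on how far validation may read, can be modeled abstractly. The toy below is an illustration under assumed names, not the FileJournalManager code: transactions past the provided maxTxId may still be mid-write by the active writer, so the scan must never interpret them.

```java
// Toy model of bounding in-progress edit log validation at maxTxId.
// Names and shapes here are illustrative only.
public class BoundedScan {
  static final long INVALID_TXID = -1; // stand-in for HDFS's invalid-txid constant

  // txIds: op ids as they appear in file order; returns the last one that
  // is safely at or below the fence.
  static long lastValidTxId(long[] txIds, long maxTxId) {
    long last = INVALID_TXID;
    for (long txid : txIds) {
      if (txid > maxTxId) {
        break; // beyond the fence: potentially concurrent, unvalidated bytes
      }
      last = txid;
    }
    return last;
  }
}
```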
[jira] [Commented] (HDFS-8978) Erasure coding: fix 2 failed tests of DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718158#comment-14718158 ] Hadoop QA commented on HDFS-8978: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 23s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 8 new or modified test files. | | {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 45s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 31s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 36s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 4s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 53m 52s | Tests failed in hadoop-hdfs. 
| | | | 95m 19s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestParallelShortCircuitRead | | | hadoop.fs.contract.hdfs.TestHDFSContractMkdir | | | hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit | | | hadoop.hdfs.server.namenode.TestAllowFormat | | | hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens | | | hadoop.hdfs.TestBlockStoragePolicy | | | hadoop.hdfs.server.datanode.TestRefreshNamenodes | | | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM | | | hadoop.hdfs.TestEncryptedTransfer | | | hadoop.hdfs.protocol.TestBlockListAsLongs | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing | | | hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory | | | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots | | | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.TestDFSPermission | | | hadoop.hdfs.server.namenode.TestCheckpoint | | | hadoop.hdfs.TestDFSUpgradeFromImage | | | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.tools.TestGetGroups | | | hadoop.hdfs.TestRemoteBlockReader2 | | | hadoop.hdfs.server.namenode.TestStartup | | | hadoop.hdfs.TestErasureCodingZones | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy | | | hadoop.hdfs.TestDFSStorageStateRecovery | | | hadoop.hdfs.server.namenode.TestFSImageWithXAttr | | | hadoop.hdfs.TestRemoteBlockReader | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.fs.contract.hdfs.TestHDFSContractRename | | | hadoop.hdfs.TestBlockReaderLocal | | | hadoop.cli.TestCacheAdminCLI | | | hadoop.hdfs.server.mover.TestMover | | | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation | | | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks | | | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits | | | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality | | | hadoop.hdfs.server.namenode.TestNameNodeRecovery | | | hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.fs.loadGenerator.TestLoadGenerator | | | hadoop.hdfs.server.namenode.TestFSImageWithAcl | | | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete | | | hadoop.fs.TestFcHdfsSetUMask | | | hadoop.hdfs.TestPread | | | hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | hadoop.hdfs.server.datanode.TestFsDatasetCacheRevocation | | | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA | | | hadoop.hdfs.crypto.TestHdfsCryptoStreams | | | hadoop.fs.viewfs.TestViewFsFileStatusHdfs | | | hadoop.hdfs.server.namenode.TestCommitBlockSynchronization | | | hadoop.hdfs.server.datanode.TestReadOnlySharedStorage | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl | | | hadoop.hdfs.server.namenode.ha.TestHAConfiguration | | | hadoop.hdfs.TestDFSAddressConfig | | | hadoop.tracing.TestTracingShortCircuitLocalRead | | |
[jira] [Created] (HDFS-8987) MapReduce job failed when I set the / foler to the EC zone
Lifeng Wang created HDFS-8987: - Summary: MapReduce job failed when I set the / foler to the EC zone Key: HDFS-8987 URL: https://issues.apache.org/jira/browse/HDFS-8987 Project: Hadoop HDFS Issue Type: Bug Components: HDFS Affects Versions: 3.0.0 Reporter: Lifeng Wang Test progress is as follows * For a new cluster, I format the namenode and then start HDFS service. * After HDFS service is started, there is no files in HDFS and set the / folder to the EC zone. the EC zone is created successfully. * Start the yarn and mr JobHistoryServer services. All the services start successfully. * Then run hadoop example pi program and it failed. The following is the exception. ``` org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.UnsupportedActionException): Cannot set replication to a file with striped blocks at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetReplication(FSDirAttrOp.java:391) at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setReplication(FSDirAttrOp.java:151) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:2231) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setReplication(NameNodeRpcServer.java:682) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setReplication(ClientNamenodeProtocolServerSideTranslatorPB.java:445) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:636) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2171) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1666) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2165) ``` -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718222#comment-14718222 ] Masatake Iwasaki commented on HDFS-8946: Thanks for working on this, [~Hitliuyi]. I read the code of BlockPlacementPolicyDefault for HDFS-8945 recently, so let me comment while my memory is fresh :-) bq. Besides, no need to shuffle the storages, since we only need to check according to the storage type on the datanode once. Here is my understanding of this. Please correct me if I'm wrong: the {{LocatedBlock}} returned by {{ClientProtocol#addBlock}}, {{ClientProtocol#getAdditionalDatanode}} and {{ClientProtocol#updateBlockForPipeline}} contains storageIDs given by {{BlockPlacementPolicy#chooseTarget}}, but the user of these APIs (which is only DataStreamer) does not use the storageIDs. DataStreamer just sends the storage type to the DataNode, and the DataNode decides which volume to use on its own via {{VolumeChoosingPolicy}}. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch This JIRA is to improve choosing a datanode storage for block placement: In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage on which to place a block. For a given storage type, we iterate over the storages of the datanode, but the datanode only cares about the storage type. In the loop, we check according to storage type and return the first storage if the storages of that type on the datanode fit the requirement. So we can remove the iteration over storages and do a single check to find a good storage of the given type; this is more efficient when the storages of that type on the datanode don't fit the requirement, since we don't need to loop over all storages repeating the same check. 
Besides, no need to shuffle the storages, since we only need to check according to the storage type on the datanode once. This also improves the logic and makes it clearer. {code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code} (current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
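The proposed simplification can be sketched as a single per-type probe. This is illustrative only, not the actual patch: it aggregates what the check actually depends on, the storage type's remaining capacity on the node, and tests it once instead of shuffling and scanning every storage.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch: since the placement check depends only on the storage
// type on the datanode, one lookup per requested type replaces the shuffled
// scan over all DatanodeStorageInfo instances.
public class StorageChoiceSketch {
  enum StorageType { DISK, SSD, ARCHIVE }

  // remainingByType: hypothetical per-type remaining capacity on one node.
  static boolean canPlaceBlock(Map<StorageType, Long> remainingByType,
                               StorageType wanted, long blockSize) {
    Long remaining = remainingByType.get(wanted);
    return remaining != null && remaining >= blockSize;
  }

  static boolean demo() {
    Map<StorageType, Long> node = new EnumMap<>(StorageType.class);
    node.put(StorageType.DISK, 512L * 1024 * 1024);
    node.put(StorageType.SSD, 16L * 1024 * 1024);
    long blockSize = 128L * 1024 * 1024;
    return canPlaceBlock(node, StorageType.DISK, blockSize)      // fits
        && !canPlaceBlock(node, StorageType.SSD, blockSize)      // too small
        && !canPlaceBlock(node, StorageType.ARCHIVE, blockSize); // absent
  }
}
```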
[jira] [Commented] (HDFS-5714) Use byte array to represent UnderConstruction feature and Snapshot feature for INodeFile
[ https://issues.apache.org/jira/browse/HDFS-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718219#comment-14718219 ] Yi Liu commented on HDFS-5714: -- Thanks [~jingzhao]. I think this is good, can you rebase the patch? Use byte array to represent UnderConstruction feature and Snapshot feature for INodeFile Key: HDFS-5714 URL: https://issues.apache.org/jira/browse/HDFS-5714 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-5714.000.patch Currently we define specific classes to represent different INode features, such as FileUnderConstructionFeature and FileWithSnapshotFeature. While recording these feature information in memory, the internal information and object references can still cost a lot of memory. For example, for FileWithSnapshotFeature, not considering the INode's local name, the whole FileDiff list (with size n) can cost around 120n bytes. In order to decrease the memory usage, we plan to use byte array to record the UnderConstruction feature and Snapshot feature for INodeFile. Specifically, if we use protobuf's encoding, the memory usage for a FileWithSnapshotFeature can be less than 56n bytes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8704) Erasure Coding: client fails to write large file when one datanode fails
[ https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718235#comment-14718235 ] Li Bo commented on HDFS-8704: - The two failed test cases are about insufficient datanodes. They also fail without the patch. We can handle them in a separate jira. Erasure Coding: client fails to write large file when one datanode fails Key: HDFS-8704 URL: https://issues.apache.org/jira/browse/HDFS-8704 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: HDFS-8704-000.patch, HDFS-8704-HDFS-7285-002.patch, HDFS-8704-HDFS-7285-003.patch, HDFS-8704-HDFS-7285-004.patch, HDFS-8704-HDFS-7285-005.patch, HDFS-8704-HDFS-7285-006.patch I test current code on a 5-node cluster using RS(3,2). When a datanode is corrupt, client succeeds to write a file smaller than a block group but fails to write a large one. {{TestDFSStripeOutputStreamWithFailure}} only tests files smaller than a block group, this jira will add more test situations. A streamer may encounter some bad datanodes when writing blocks allocated to it. When it fails to connect datanode or send a packet, the streamer needs to prepare for the next block. First it removes the packets of current block from its data queue. If the first packet of next block has already been in the data queue, the streamer will reset its state and start to wait for the next block allocated for it; otherwise it will just wait for the first packet of next block. The streamer will check periodically if it is asked to terminate during its waiting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8988) Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap
[ https://issues.apache.org/jira/browse/HDFS-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718343#comment-14718343 ] Hadoop QA commented on HDFS-8988: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 32s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 10s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 31s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 42s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 15s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 188m 53s | Tests failed in hadoop-hdfs. 
| | | | 232m 35s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles | | | hadoop.hdfs.TestRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752942/HDFS-8988.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e166c03 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12192/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12192/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12192/console | This message was automatically generated. Use LightWeightHashSet instead of LightWeightLinkedSet in BlockManager#excessReplicateMap - Key: HDFS-8988 URL: https://issues.apache.org/jira/browse/HDFS-8988 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Attachments: HDFS-8988.001.patch {code} public final Map<String, LightWeightLinkedSet<Block>> excessReplicateMap = new HashMap<String, LightWeightLinkedSet<Block>>(); {code} {{LightWeightLinkedSet}} extends {{LightWeightHashSet}} and keeps elements in order, but it requires more memory for each entry (2 references = 8 bytes). We don't need to keep excess replicated blocks in order here, so we should use {{LightWeightHashSet}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
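The ordering-vs-memory trade-off is the same one the JDK collections expose, which may help illustrate it (this analogy is mine, not from the patch): LinkedHashSet threads two extra references through every entry to preserve insertion order, which plain HashSet omits; the switch proposed here is the LightWeight analogue.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class OrderVsMemory {
    // Keeps insertion order; each entry carries ~2 extra references (prev/next).
    static Set<String> orderedExcess() {
        Set<String> s = new LinkedHashSet<>();
        s.add("blk_2"); s.add("blk_1");
        return s;
    }

    // No ordering guarantee, no per-entry linking overhead: the same trade
    // LightWeightHashSet makes relative to LightWeightLinkedSet.
    static Set<String> unorderedExcess() {
        Set<String> s = new HashSet<>();
        s.add("blk_2"); s.add("blk_1");
        return s;
    }
}
```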
[jira] [Updated] (HDFS-8501) Erasure Coding: Improve memory efficiency of BlockInfoStriped
[ https://issues.apache.org/jira/browse/HDFS-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8501: Attachment: HDFS-8501-HDFS-7285.01.patch Erasure Coding: Improve memory efficiency of BlockInfoStriped - Key: HDFS-8501 URL: https://issues.apache.org/jira/browse/HDFS-8501 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8501-HDFS-7285.01.patch Erasure Coding: Improve memory efficiency of BlockInfoStriped Assume we have a BlockInfoStriped: {noformat} triplets[] = {s0, s1, s2, s3} indices[] = {0, 1, 2, 3} {noformat} {{s0}} means {{storage_0}}. When we run the balancer/mover to re-locate the replica on s2, it first becomes: {noformat} triplets[] = {s0, s1, s2, s3, s4} indices[] = {0, 1, 2, 3, 2} {noformat} Then the replica on s2 is removed, and it finally becomes: {noformat} triplets[] = {s0, s1, null, s3, s4} indices[] = {0, 1, -1, 3, 2} {noformat} The worst case is: {noformat} triplets[] = {null, null, null, null, s4, s5, s6, s7} indices[] = {-1, -1, -1, -1, 0, 1, 2, 3} {noformat} We should learn from {{BlockInfoContiguous.removeStorage(..)}}: when a storage is removed, we move the last item into the freed slot. With the improvement, the worst case becomes: {noformat} triplets[] = {s4, s5, s6, s7, null} indices[] = {0, 1, 2, 3, -1} {noformat} We have one empty slot. Notes: Assume we copy 4 storages first, then delete 4. Even with the improvement, the worst case could be: {noformat} triplets[] = {s4, s5, s6, s7, null, null, null, null} indices[] = {0, 1, 2, 3, -1, -1, -1, -1} {noformat} But the Balancer uses {{delHint}}, so adding one will always delete one. So this case won't happen for striped and contiguous blocks. *idx_i must be moved to slot_i.* So slot_i will have idx_i, and we can do further improvement in HDFS-8032. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
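The compaction described in this issue can be sketched with a simplified model. This is hypothetical illustration code (the real BlockInfoStriped keeps three triplet entries per storage and lives inside the block manager), showing only the move-the-last-item-into-the-freed-slot step:

```java
// Hypothetical simplified model of the compaction: on removal, the last used
// slot is moved into the freed slot, so nulls never accumulate in the middle.
public class StripedSlots {
    final String[] storages; // stand-in for the triplets array
    final byte[] indices;    // block index held by each slot, -1 == empty

    StripedSlots(String[] storages, byte[] indices) {
        this.storages = storages;
        this.indices = indices;
    }

    void removeStorage(int slot) {
        int last = storages.length - 1;
        while (last > slot && storages[last] == null) {
            last--; // skip trailing empty slots
        }
        // move the last used entry into the freed slot, mirroring what
        // BlockInfoContiguous.removeStorage(..) does for contiguous blocks
        storages[slot] = storages[last];
        indices[slot] = indices[last];
        storages[last] = null;
        indices[last] = -1;
    }
}
```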
[jira] [Commented] (HDFS-8964) Provide max TxId when validating in-progress edit log files
[ https://issues.apache.org/jira/browse/HDFS-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718390#comment-14718390 ] Hadoop QA commented on HDFS-8964: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 7s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 22s | The applied patch generated 2 new checkstyle issues (total was 162, now 162). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 37s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 21s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 189m 35s | Tests failed in hadoop-hdfs. 
| | | | 235m 24s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.qjournal.server.TestJournal | | | hadoop.hdfs.server.namenode.TestFSNamesystem | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.server.namenode.TestFileJournalManager | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752950/HDFS-8964.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e166c03 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12193/console | This message was automatically generated. Provide max TxId when validating in-progress edit log files --- Key: HDFS-8964 URL: https://issues.apache.org/jira/browse/HDFS-8964 Project: Hadoop HDFS Issue Type: Bug Components: journal-node, namenode Affects Versions: 2.7.1 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8964.00.patch, HDFS-8964.01.patch NN/JN validates in-progress edit log files in multiple scenarios, via {{EditLogFile#validateLog}}. 
The method scans through the edit log file to find the last transaction ID. However, an in-progress edit log file could be actively written to, which creates a race condition and causes incorrect data to be read (and later we attempt to interpret the data as ops). Currently {{validateLog}} is used in 3 places: # NN {{getEditsFromTxid}} # JN {{getEditLogManifest}} # NN/JN {{recoverUnfinalizedSegments}} In the first two scenarios we should provide a maximum TxId to validate in the in-progress file. The 3rd scenario won't cause a race condition because only non-current in-progress edit log files are validated. {{validateLog}} is actually only used with in-progress files, and could use a better name and Javadoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
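The bounded scan described above can be sketched as follows. This is a hypothetical simplification (names are stand-ins; the real validation decodes ops from the file rather than from an array), showing only the stop-at-maxTxId behavior:

```java
// Hypothetical sketch: when scanning an in-progress segment, stop at the
// caller-supplied maxTxId so bytes written concurrently past that point are
// never interpreted as edit ops.
public class EditLogScan {
    static final long INVALID_TXID = -1; // stand-in for HdfsConstants.INVALID_TXID

    static long findLastValidTxId(long[] txIdsInFile, long maxTxId) {
        long last = INVALID_TXID;
        for (long txid : txIdsInFile) {
            if (txid > maxTxId) {
                break; // the concurrently-written tail is ignored
            }
            last = txid;
        }
        return last;
    }
}
```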
[jira] [Updated] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8946: - Attachment: HDFS-8946.003.patch Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch, HDFS-8946.003.patch This JIRA is to improve choosing datanode storage for block placement: In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode. But for a datanode, we only care about the storage type: in the loop, we check according to storage type and return the first storage if the storages of that type on the datanode fit the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type; this is efficient when the storages of that type on the datanode don't fit the requirement, since we don't need to loop over all storages and repeat the same check. Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer. {code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code} (current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
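The per-type check argued for in this issue can be sketched with a reduced model. Everything here is a hypothetical stand-in (not the HDFS classes); the point is that one lookup per (datanode, storage type) replaces the shuffled per-storage loop:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reduced model: the datanode aggregates remaining space per
// storage type, so suitability for a given type is answered in one lookup
// rather than a shuffled loop over every storage.
public class DatanodeSketch {
    final Map<String, Long> remainingByType = new HashMap<>();

    // One check per (datanode, storage type): the shape of the proposed logic.
    boolean canHostBlock(String storageType, long blockSize) {
        Long remaining = remainingByType.get(storageType);
        return remaining != null && remaining >= blockSize;
    }
}
```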
[jira] [Updated] (HDFS-8501) Erasure Coding: Improve memory efficiency of BlockInfoStriped
[ https://issues.apache.org/jira/browse/HDFS-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-8501: Status: Patch Available (was: Open) Erasure Coding: Improve memory efficiency of BlockInfoStriped - Key: HDFS-8501 URL: https://issues.apache.org/jira/browse/HDFS-8501 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8501-HDFS-7285.01.patch Erasure Coding: Improve memory efficiency of BlockInfoStriped Assume we have a BlockInfoStriped: {noformat} triplets[] = {s0, s1, s2, s3} indices[] = {0, 1, 2, 3} {noformat} {{s0}} means {{storage_0}}. When we run the balancer/mover to re-locate the replica on s2, it first becomes: {noformat} triplets[] = {s0, s1, s2, s3, s4} indices[] = {0, 1, 2, 3, 2} {noformat} Then the replica on s2 is removed, and it finally becomes: {noformat} triplets[] = {s0, s1, null, s3, s4} indices[] = {0, 1, -1, 3, 2} {noformat} The worst case is: {noformat} triplets[] = {null, null, null, null, s4, s5, s6, s7} indices[] = {-1, -1, -1, -1, 0, 1, 2, 3} {noformat} We should learn from {{BlockInfoContiguous.removeStorage(..)}}: when a storage is removed, we move the last item into the freed slot. With the improvement, the worst case becomes: {noformat} triplets[] = {s4, s5, s6, s7, null} indices[] = {0, 1, 2, 3, -1} {noformat} We have one empty slot. Notes: Assume we copy 4 storages first, then delete 4. Even with the improvement, the worst case could be: {noformat} triplets[] = {s4, s5, s6, s7, null, null, null, null} indices[] = {0, 1, 2, 3, -1, -1, -1, -1} {noformat} But the Balancer uses {{delHint}}, so adding one will always delete one. So this case won't happen for striped and contiguous blocks. *idx_i must be moved to slot_i.* So slot_i will have idx_i, and we can do further improvement in HDFS-8032. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718197#comment-14718197 ] GAO Rui commented on HDFS-7285: --- Thank you very much [~brahmareddy], [~zhz]. {quote} 1) snapshot feature 2) balancer feature {quote} These may be developed in future EC work; we could add them to the system test plan and implement the tests later. {quote} 4) parallel writes 5) parallel reads {quote} I think {{parallel reads}} means more than one client trying to read the same EC file from HDFS, right? What does {{parallel writes}} refer to in EC system testing? Could you explain the scenario? {quote} 1. Good points from Brahma Reddy Battula, I suggest that we also add HSM/mover tests to the list. 2. In reading tests we can distinguish stateful read and pread. Maybe we should test the seek-and-read scenario too. 3. It seems each test scenario in the Tips for EC Writing/Reading section is systematically labeled. Will the labels be used to drive automatic testing? {quote} We can also add {{HSM/mover}} to the test plan and implement it in future work. Regarding distinguishing the reads, we currently implement the system test using FSShell commands in a terminal, like {{CopyFromLocal}} and {{CopyToLocal}}. Can we make the client read an EC file with a particular mechanism, like stateful read or pread, from a terminal command? The labels in the EC Writing/Reading tests were generated by the test script during the test process, but it is also possible to drive automatic testing from the scenario labels, vice versa.
Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: Compare-consolidated-20150824.diff, Consolidated-20150707.patch, Consolidated-20150806.patch, Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, HDFS-7285-merge-consolidated-01.patch, HDFS-7285-merge-consolidated-trunk-01.patch, HDFS-7285-merge-consolidated.trunk.03.patch, HDFS-7285-merge-consolidated.trunk.04.patch, HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, HDFSErasureCodingSystemTestPlan-20150824.pdf, HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with a storage overhead of only 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed as of Hadoop 2.0 for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are not intended to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back.
We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies and makes it self-contained and independently maintained. This design lays the EC feature on top of the storage type support and aims to be compatible with existing HDFS features like caching, snapshots, encryption, and high availability. The design will also support different EC coding schemes, implementations, and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718232#comment-14718232 ] Yi Liu commented on HDFS-8946: -- Thanks Masatake for the comments. {quote} LocatedBlock returned by ClientProtocol#addBlock, ClientProtocol#getAdditionalDatanode and ClientProtocol#updateBlockForPipeline contains storageIDs given by BlockPlacementPolicy#chooseTarget but the user of these APIs (which is only DataStreamer) does not use storageIDs. DataStreamer just sends the storage type to the DataNode and the DataNode decides which volume to use on its own by using VolumeChoosingPolicy. {quote} Yes, {{storageIDs}} is not used. {quote} Any reason to change the logic of remaining size checking? {quote} Nice find. It was my fault to forget to add {{blockSize *}} while copying this logic from the original {{isGoodTarget}}. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch This JIRA is to improve choosing datanode storage for block placement: In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode. But for a datanode, we only care about the storage type: in the loop, we check according to storage type and return the first storage if the storages of that type on the datanode fit the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type; this is efficient when the storages of that type on the datanode don't fit the requirement, since we don't need to loop over all storages and repeat the same check.
Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer. {code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code} (current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8963) Fix incorrect sign extension of xattr length in HDFS-8900
[ https://issues.apache.org/jira/browse/HDFS-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718199#comment-14718199 ] Hudson commented on HDFS-8963: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2245 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2245/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java Fix incorrect sign extension of xattr length in HDFS-8900 - Key: HDFS-8963 URL: https://issues.apache.org/jira/browse/HDFS-8963 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Haohui Mai Assignee: Colin Patrick McCabe Priority: Critical Fix For: 2.8.0 Attachments: HDFS-8963.001.patch HDFS-8900 introduced two new findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/12120/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8900) Compact XAttrs to optimize memory footprint.
[ https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718200#comment-14718200 ] Hudson commented on HDFS-8900: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2245 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2245/]) HDFS-8963. Fix incorrect sign extension of xattr length in HDFS-8900. (Colin Patrick McCabe via yliu) (yliu: rev e166c038c0aaa57b245f985a1c0fadd5fe33c384) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/XAttrFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestXAttrFeature.java Compact XAttrs to optimize memory footprint. Key: HDFS-8900 URL: https://issues.apache.org/jira/browse/HDFS-8900 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.8.0 Attachments: HDFS-8900.001.patch, HDFS-8900.002.patch, HDFS-8900.003.patch, HDFS-8900.004.patch, HDFS-8900.005.patch {code} private final ImmutableListXAttr xAttrs; {code} Currently we use above in XAttrFeature, it's not efficient from memory point of view, since {{ImmutableList}} and {{XAttr}} have object memory overhead, and each object has memory alignment. We can use a {{byte[]}} in XAttrFeature and do some compact in {{XAttr}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718234#comment-14718234 ] Yi Liu commented on HDFS-8946: -- Will update the patch later. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch This JIRA is to improve choosing datanode storage for block placement: In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode. But for a datanode, we only care about the storage type: in the loop, we check according to storage type and return the first storage if the storages of that type on the datanode fit the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type; this is efficient when the storages of that type on the datanode don't fit the requirement, since we don't need to loop over all storages and repeat the same check. Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer. {code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code} (current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8973) NameNode exit without any exception log
[ https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718325#comment-14718325 ] kanaka kumar avvaru commented on HDFS-8973: --- Regarding the logs not being printed: it looks like log4j reports [only the first error in an appender by default|http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/helpers/OnlyOnceErrorHandler.html] and does not recover the log file. So it's recommended to configure {{FallbackErrorHandler}} or some other alternative to ensure logs are not missed. Regarding the process exit, we are missing something about the cause. Even after the log4j error, the system functioned well for some time; the actual reason may not be visible since the logs are not present. {quote}it seems to be caused by a log4j ERROR.{quote} IMO we can't conclude this is the reason for the process exit, as the NN appears to have kept functioning for some time after this message. NameNode exit without any exception log --- Key: HDFS-8973 URL: https://issues.apache.org/jira/browse/HDFS-8973 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.1 Reporter: He Xiaoqiao Priority: Critical The namenode process exited without any useful WARN/ERROR log. After output to the .log file was interrupted, the .out file continued to show about 5 minutes of GC logs. When the .log file was interrupted, the .out file printed the following ERROR, which may hint at the cause; it seems to be caused by a log4j ERROR.
{code:title=namenode.out|borderStyle=solid} log4j:ERROR Failed to flush writer, java.io.IOException: Bad file descriptor (错误的文件描述符) at java.io.FileOutputStream.writeBytes(Native Method) at java.io.FileOutputStream.write(FileOutputStream.java:318) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291) at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295) at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141) at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229) at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324) at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276) at org.apache.log4j.WriterAppender.append(WriterAppender.java:162) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.log(Category.java:856) at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432) at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061) at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209) at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-7285: -- Attachment: HDFSErasureCodingSystemTestReport-20150826.pdf Based on the latest version of branch HDFS-7285, we implemented the system test according to the test plan [^HDFSErasureCodingSystemTestPlan-20150824.pdf]. We failed to test some scenarios in the EC file writing/reading test case because of problems that are not related to HDFS but are ssh issues in the test script. We will figure out the problem and implement the remaining test scenarios ASAP. Thanks to [~jingzhao] and [~szetszwo] for helping [~tfukudom] and our team during the whole process of test planning and implementation. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: Compare-consolidated-20150824.diff, Consolidated-20150707.patch, Consolidated-20150806.patch, Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, HDFS-7285-merge-consolidated-01.patch, HDFS-7285-merge-consolidated-trunk-01.patch, HDFS-7285-merge-consolidated.trunk.03.patch, HDFS-7285-merge-consolidated.trunk.04.patch, HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, HDFSErasureCodingSystemTestPlan-20150824.pdf, HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with a storage overhead of only 40%. 
This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed after Hadoop 2.0 for maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that will not be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. For these reasons, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies, making it self-contained and independently maintainable. This design lays the EC feature on top of the storage type support and aims to be compatible with existing HDFS features such as caching, snapshots, encryption, and high availability. The design will also support different EC coding schemes, implementations, and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding, making the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
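The overhead comparison in the description above is simple arithmetic: a (k data + m parity) Reed-Solomon scheme adds m/k extra storage, versus 2x extra for 3-replica. A minimal sketch of that arithmetic (class and method names are illustrative, not from the HDFS codebase):

```java
public class EcOverhead {
    // Extra storage as a fraction of the raw data size for a
    // (dataBlocks + parityBlocks) Reed-Solomon scheme.
    static double ecOverhead(int dataBlocks, int parityBlocks) {
        return (double) parityBlocks / dataBlocks;
    }

    // 3-replica keeps 3 copies: 2 extra copies = 200% overhead.
    static double replicationOverhead(int replicas) {
        return replicas - 1.0;
    }

    public static void main(String[] args) {
        System.out.println(ecOverhead(10, 4));      // 0.4 -> 40% extra
        System.out.println(replicationOverhead(3)); // 2.0 -> 200% extra
    }
}
```

With 10+4 RS the cluster can also survive any 4 block losses per stripe, whereas 3-replica only survives 2 losses per block, which is what makes the trade attractive for cold data.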
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718226#comment-14718226 ] Masatake Iwasaki commented on HDFS-8946: DatanodeDescriptor.java
{code}
final long requiredSize =
    blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE;
final long scheduledSize = getBlocksScheduled(t);
long remaining = 0;
DatanodeStorageInfo storage = null;
for (DatanodeStorageInfo s : getStorageInfos()) {
  if (s.getState() == State.NORMAL
      && s.getStorageType() == t) {
    if (storage == null) {
      storage = s;
    }
    long r = s.getRemaining();
    if (r >= requiredSize) {
      remaining += r;
    }
  }
}
if (requiredSize > remaining - scheduledSize) {
  return null;
{code}
{{scheduledSize}} is a number of blocks but is used as if it were bytes. Any reason to change the logic of the remaining-size check? Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch This JIRA is to improve choosing a datanode storage for block placement. In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode, but the check only depends on the storage type: in the loop we check according to the storage type and return the first storage of that type that meets the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type. This is more efficient when the storages of that type on the datanode don't meet the requirement, since we don't need to loop over all storages and repeat the same check. 
Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer.
{code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code}
(current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
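The single-pass idea proposed for HDFS-8946 above can be sketched with a toy model: the datanode answers "give me one storage of type T with room for this block" in one scan, instead of the caller shuffling and re-checking every storage. All the types and the {{chooseStorage4Block}} name below are simplified stand-ins, not the actual {{DatanodeDescriptor}} API:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of choosing a storage by type in a single pass.
public class ChooseStorageSketch {
    enum StorageType { DISK, SSD }

    static class Storage {
        final StorageType type;
        final long remaining;
        Storage(StorageType type, long remaining) {
            this.type = type;
            this.remaining = remaining;
        }
    }

    static class Datanode {
        final List<Storage> storages = new ArrayList<>();

        // Single pass: return the first storage of the given type with
        // enough remaining space, or null if none qualifies.  No
        // shuffling, no repeated per-storage checks by the caller.
        Storage chooseStorage4Block(StorageType t, long blockSize) {
            for (Storage s : storages) {
                if (s.type == t && s.remaining >= blockSize) {
                    return s;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Datanode dn = new Datanode();
        dn.storages.add(new Storage(StorageType.DISK, 10));
        dn.storages.add(new Storage(StorageType.SSD, 1000));
        System.out.println(dn.chooseStorage4Block(StorageType.SSD, 128) != null);  // true
        System.out.println(dn.chooseStorage4Block(StorageType.DISK, 128) == null); // true
    }
}
```

The real patch also aggregates remaining space across storages of the type; the point here is only the shape of the change: one query per (datanode, storage type) rather than an iteration per candidate storage.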
[jira] [Commented] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
[ https://issues.apache.org/jira/browse/HDFS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718308#comment-14718308 ] kanaka kumar avvaru commented on HDFS-8892: --- Hi [~Ravikumar], are you planning to update the code and produce a patch as per [~cmccabe]'s suggestion? If yes, feel free to assign the JIRA to yourself. Otherwise, I will create the patch. ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too - Key: HDFS-8892 URL: https://issues.apache.org/jira/browse/HDFS-8892 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.1 Reporter: Ravikumar Assignee: kanaka kumar avvaru Priority: Minor Currently the CacheCleaner thread checks only for cache-expiry times. It would be nice if it handled an invalid slot too, in an extra pass over the evictable map:
{code}
for (ShortCircuitReplica replica : evictable.values()) {
  if (!replica.getSlot().isValid()) {
    purge(replica);
  }
}
// Existing code...
int numDemoted = demoteOldEvictableMmaped(curMs);
int numPurged = 0;
Long evictionTimeNs = Long.valueOf(0);
…
{code}
Apps like HBase can tweak the expiry/staleness/cache-size params in the DFS client so that a ShortCircuitReplica will never be closed except when its Slot is declared invalid. I assume slot invalidation will happen during block invalidation/deletes (primarily triggered by compaction/shard-takeover, etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
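Cleaned up, the extra pass suggested above looks roughly like the following. Note that removing entries while iterating the map's values needs an explicit iterator (a plain for-each plus a map removal would throw {{ConcurrentModificationException}}). The {{Replica}}/{{Slot}} types here are simplified stand-ins for the ShortCircuitCache classes:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of an extra CacheCleaner pass that purges replicas whose
// shared-memory slot has been invalidated, independent of expiry time.
public class SlotSweepSketch {
    static class Slot {
        private boolean valid = true;
        void makeInvalid() { valid = false; }
        boolean isValid()  { return valid; }
    }

    static class Replica {
        final Slot slot = new Slot();
    }

    // Remove every replica whose slot is no longer valid; return the
    // number purged, mirroring the cleaner's numPurged bookkeeping.
    static int purgeInvalidSlots(List<Replica> evictable) {
        int numPurged = 0;
        for (Iterator<Replica> it = evictable.iterator(); it.hasNext(); ) {
            if (!it.next().slot.isValid()) {
                it.remove(); // stands in for purge(replica)
                numPurged++;
            }
        }
        return numPurged;
    }

    public static void main(String[] args) {
        List<Replica> evictable = new ArrayList<>();
        Replica stale = new Replica();
        stale.slot.makeInvalid();
        evictable.add(new Replica());
        evictable.add(stale);
        System.out.println(purgeInvalidSlots(evictable)); // 1
        System.out.println(evictable.size());             // 1
    }
}
```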
[jira] [Commented] (HDFS-8967) Create a BlockManagerLock class to represent the lock used in the BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718588#comment-14718588 ] Daryn Sharp commented on HDFS-8967: --- This mimics a subset of what I've been working on, so I'm ok with it after pre-commit succeeds I'll re-review. Create a BlockManagerLock class to represent the lock used in the BlockManager -- Key: HDFS-8967 URL: https://issues.apache.org/jira/browse/HDFS-8967 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8967.000.patch, HDFS-8967.001.patch This jira proposes to create a {{BlockManagerLock}} class to represent the lock used in {{BlockManager}}. Currently it directly points to the {{FSNamesystem}} lock thus there are no functionality changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8833) Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones
[ https://issues.apache.org/jira/browse/HDFS-8833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718396#comment-14718396 ] Walter Su commented on HDFS-8833: - ...encode replication and ecPolicy together (Zhe Zhang) Good thought, Zhe! ... Well it depends on how small (relative to the cell size). We should certainly skip files smaller than a full stripe. (Zhe Zhang) Yes, cellSize is relevant. ...I find the above usecase very compelling, which is why I've been advocating for using the file header bits. I haven't seen much competition for the bits either, and we can also start conservatively when using bits (only as many as we need). (Andrew Wang) Agree. So, have we reached a consensus? Any other thoughts, guys? Erasure coding: store EC schema and cell size in INodeFile and eliminate notion of EC zones --- Key: HDFS-8833 URL: https://issues.apache.org/jira/browse/HDFS-8833 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-8833-HDFS-7285-merge.00.patch, HDFS-8833-HDFS-7285-merge.01.patch, HDFS-8833-HDFS-7285.02.patch We have [discussed | https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14357754page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14357754] storing the EC schema with files instead of EC zones and recently revisited the discussion under HDFS-8059. As a recap, the _zone_ concept has severe limitations, including renaming and nested configuration. Those limitations are valid in encryption for security reasons, but it doesn't make sense to carry them over to EC. This JIRA aims to store the EC schema and cell size at the {{INodeFile}} level. For simplicity, we should first implement it as an xattr and consider memory optimizations (such as moving it to the file header) as a follow-on. We should also disable changing the EC policy on a non-empty file / dir in the first phase. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
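The "encode replication and ecPolicy together in the file header bits" idea discussed above amounts to packing either value into the same bit field, distinguished by a flag bit. A minimal sketch; the field widths (1 flag bit + 11 value bits) are made up for illustration and are not the actual INodeFile header layout:

```java
// Sketch: pack either a replication factor or an EC policy id into the
// same header bits, distinguished by a 1-bit striped flag.  Widths are
// illustrative only, not the real INodeFile layout.
public class HeaderBitsSketch {
    static final int  VALUE_BITS = 11;
    static final long EC_FLAG    = 1L << VALUE_BITS;
    static final long VALUE_MASK = EC_FLAG - 1;

    static long packReplication(int replication) {
        return replication & VALUE_MASK;          // flag bit clear
    }

    static long packEcPolicy(int ecPolicyId) {
        return EC_FLAG | (ecPolicyId & VALUE_MASK); // flag bit set
    }

    static boolean isStriped(long header) {
        return (header & EC_FLAG) != 0;
    }

    static int value(long header) {
        return (int) (header & VALUE_MASK);
    }

    public static void main(String[] args) {
        long replicated = packReplication(3);
        long striped    = packEcPolicy(1);
        System.out.println(isStriped(replicated) + " " + value(replicated)); // false 3
        System.out.println(isStriped(striped) + " " + value(striped));       // true 1
    }
}
```

This is also why "starting conservatively when using bits" is cheap: unused high bits stay zero and can be claimed later without changing the on-disk meaning of existing headers.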
[jira] [Updated] (HDFS-8950) NameNode refresh doesn't remove DataNodes
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-8950: --- Attachment: HDFS-8950.005.patch Tests are all passing. Again. NameNode refresh doesn't remove DataNodes - Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from the NN's allowed host list (HDFS was HA) and then do an NN refresh, it doesn't actually remove the node, and the NN UI keeps showing it. The NN may also try to allocate some blocks to that DN during an MR job. This issue is independent of DN decommission. To reproduce: 1. Add a DN to dfs_hosts_allow 2. Refresh NN 3. Start DN. Now NN starts seeing DN. 4. Stop DN 5. Remove DN from dfs_hosts_allow 6. Refresh NN - NN is still reporting DN as being used by HDFS. This is different from decom because there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8501) Erasure Coding: Improve memory efficiency of BlockInfoStriped
[ https://issues.apache.org/jira/browse/HDFS-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14719964#comment-14719964 ] Hadoop QA commented on HDFS-8501: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 39s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 45s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 58s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 41s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 6s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 184m 49s | Tests failed in hadoop-hdfs. 
| | | | 226m 55s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.TestWriteStripedFileWithFailure | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752982/HDFS-8501-HDFS-7285.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 164cbe6 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12195/console | This message was automatically generated. 
Erasure Coding: Improve memory efficiency of BlockInfoStriped - Key: HDFS-8501 URL: https://issues.apache.org/jira/browse/HDFS-8501 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8501-HDFS-7285.01.patch Erasure Coding: Improve memory efficiency of BlockInfoStriped Assume we have a BlockInfoStriped:
{noformat}
triplets[] = {s0, s1, s2, s3}
indices[] = {0, 1, 2, 3}
{noformat}
{{s0}} means {{storage_0}}. When we run the balancer/mover to re-locate the replica on s2, it first becomes:
{noformat}
triplets[] = {s0, s1, s2, s3, s4}
indices[] = {0, 1, 2, 3, 2}
{noformat}
Then the replica on s2 is removed, and finally it becomes:
{noformat}
triplets[] = {s0, s1, null, s3, s4}
indices[] = {0, 1, -1, 3, 2}
{noformat}
The worst case is:
{noformat}
triplets[] = {null, null, null, null, s4, s5, s6, s7}
indices[] = {-1, -1, -1, -1, 0, 1, 2, 3}
{noformat}
We should learn from {{BlockInfoContiguous.removeStorage(..)}}: when a storage is removed, we move the last item into the freed slot. With this improvement, the worst case becomes:
{noformat}
triplets[] = {s4, s5, s6, s7, null}
indices[] = {0, 1, 2, 3, -1}
{noformat}
We have one empty slot. Notes: assume we copy 4 storages first, then delete 4. Even with the improvement, the worst case could be:
{noformat}
triplets[] = {s4, s5, s6, s7, null, null, null, null}
indices[] = {0, 1, 2, 3, -1, -1, -1, -1}
{noformat}
But the Balancer uses {{delHint}}, so adding one replica is always paired with deleting one, and this case won't happen for striped or contiguous blocks. *idx_i must be moved to slot_i.* So slot_i will have idx_i, and we can do a further improvement in HDFS-8032. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
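The move-the-last-item-into-the-hole compaction described above can be sketched with plain arrays. Storages are represented as strings and indices as bytes, which only approximates the real triplets/indices layout in {{BlockInfoStriped}}:

```java
import java.util.Arrays;

// Sketch of BlockInfoContiguous.removeStorage-style compaction applied
// to a striped block: on removal, move the last occupied slot into the
// hole so empty slots collect at the tail instead of fragmenting.
public class RemoveStorageSketch {
    String[] storages = {"s0", "s1", "s2", "s3", "s4"};
    byte[] indices   = {0, 1, 2, 3, 2}; // s4 holds a new copy of block idx 2

    void removeStorage(int slot) {
        int last = storages.length - 1;
        while (last > slot && storages[last] == null) {
            last--; // skip trailing empty slots
        }
        // Move the last occupied entry into the freed slot, then clear it.
        storages[slot] = storages[last];
        indices[slot]  = indices[last];
        storages[last] = null;
        indices[last]  = -1;
    }

    public static void main(String[] args) {
        RemoveStorageSketch b = new RemoveStorageSketch();
        b.removeStorage(2); // drop the stale copy on s2
        System.out.println(Arrays.toString(b.storages)); // [s0, s1, s4, s3, null]
        System.out.println(Arrays.toString(b.indices));  // [0, 1, 2, 3, -1]
    }
}
```

Compared to leaving the hole at slot 2 (the {{null}}/-1 pattern in the JIRA description), the empty slot stays at the tail, so the arrays can be trimmed.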
[jira] [Updated] (HDFS-2070) A lack of auto-test for FsShell getmerge in hdfs
[ https://issues.apache.org/jira/browse/HDFS-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-2070: --- Status: Patch Available (was: In Progress) Ready for review. This should be a quick and easy one. A lack of auto-test for FsShell getmerge in hdfs Key: HDFS-2070 URL: https://issues.apache.org/jira/browse/HDFS-2070 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: Daniel Templeton Labels: newbie Attachments: HDFS-2070.001.patch There are no automated tests for FsShell getmerge in HDFS. For reliability and reuse, some automated tests should be added to the test set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14719294#comment-14719294 ] Hadoop QA commented on HDFS-8946: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 5s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 6s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 20s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 21s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 36s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 18s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 185m 32s | Tests failed in hadoop-hdfs. 
| | | | 231m 51s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager | | | hadoop.hdfs.TestRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752975/HDFS-8946.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e166c03 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12194/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12194/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12194/console | This message was automatically generated. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch, HDFS-8946.003.patch This JIRA is to improve choosing a datanode storage for block placement. In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage to place a block. For a given storage type, we iterate over the storages of the datanode, but the check only depends on the storage type: in the loop we check according to the storage type and return the first storage of that type that meets the requirement. So we can remove the iteration over storages and just check once to find a good storage of the given type. This is more efficient when the storages of that type on the datanode don't meet the requirement, since we don't need to loop over all storages and repeat the same check. 
Besides, there is no need to shuffle the storages, since we only need to check once per storage type on the datanode. This also improves the logic and makes it clearer.
{code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false, results,
        avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code}
(current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-2070) A lack of auto-test for FsShell getmerge in hdfs
[ https://issues.apache.org/jira/browse/HDFS-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-2070: --- Attachment: HDFS-2070.001.patch Turns out there was already one getmerge test in testHDFSConf.xml, but this patch adds a few more to cover all the bases. A lack of auto-test for FsShell getmerge in hdfs Key: HDFS-2070 URL: https://issues.apache.org/jira/browse/HDFS-2070 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: Daniel Templeton Labels: newbie Attachments: HDFS-2070.001.patch There are no automated tests for FsShell getmerge in HDFS. For reliability and reuse, some automated tests should be added to the test set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8925) Move BlockReaderLocal to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720929#comment-14720929 ] Hudson commented on HDFS-8925: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2249 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2249/]) HDFS-8925. Move BlockReaderLocal to hdfs-client. Contributed by Mingliang Liu. (wheat9: rev e2c9b288b223b9fd82dc12018936e13128413492) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/InvalidEncryptionKeyException.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ClientContext.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/BlockLocalPathInfo.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSelector.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/KeyProviderCache.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReader.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderUtil.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/KeyProviderCache.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/Receiver.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSelector.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderUtil.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/InvalidEncryptionKeyException.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitLocalRead.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockLocalPathInfo.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/PeerCache.java *
[jira] [Commented] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720913#comment-14720913 ] Hadoop QA commented on HDFS-8983: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 32s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 46s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 4s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 43s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 23m 34s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 171m 1s | Tests failed in hadoop-hdfs. 
| | | | 238m 49s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.web.TestWebHDFSForHA | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestFsck | | | org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753080/HDFS-8983.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e2c9b28 | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12206/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12206/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12206/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12206/console | This message was automatically generated. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
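The protected-directories rule above (a marked directory may only be deleted when it is empty) can be sketched as follows. The class, the {{childCount}} parameter, and the use of {{SecurityException}} are illustrative stand-ins for the FSDirectory checks in the patch, which are not shown in this thread:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: refuse deletion of a protected directory unless it is empty.
public class ProtectedDirsSketch {
    final Set<String> protectedDirs = new HashSet<>();

    // Throws if the delete should be refused; otherwise returns normally.
    void checkDelete(String path, int childCount) {
        if (protectedDirs.contains(path) && childCount > 0) {
            throw new SecurityException(
                "Cannot delete non-empty protected directory " + path);
        }
    }

    public static void main(String[] args) {
        ProtectedDirsSketch fsd = new ProtectedDirsSketch();
        fsd.protectedDirs.add("/Users");
        fsd.checkDelete("/tmp/scratch", 5); // unprotected: allowed
        fsd.checkDelete("/Users", 0);       // protected but empty: allowed
        try {
            fsd.checkDelete("/Users", 12);  // protected, non-empty: refused
        } catch (SecurityException e) {
            System.out.println("refused");
        }
    }
}
```

Allowing empty protected directories to be deleted keeps the feature from blocking legitimate cleanup while still stopping an inadvertent recursive delete of something like /Users.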
[jira] [Commented] (HDFS-8965) Harden edit log reading code against out of memory errors
[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720907#comment-14720907 ] Hadoop QA commented on HDFS-8965: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 46s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 7s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 33s | The applied patch generated 7 new checkstyle issues (total was 401, now 402). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 26s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 10s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 146m 58s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 3m 56s | Tests passed in bkjournal. 
| | | | 198m 31s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics | | | org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753081/HDFS-8965.005.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e2c9b28 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/artifact/patchprocess/testrun_hadoop-hdfs.txt | | bkjournal test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/artifact/patchprocess/testrun_bkjournal.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12207/console | This message was automatically generated. Harden edit log reading code against out of memory errors - Key: HDFS-8965 URL: https://issues.apache.org/jira/browse/HDFS-8965 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. 
This should avoid out of memory errors when trying to load garbage data as Op data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
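The hardening described above can be sketched roughly as follows. This is a minimal illustration, not the actual FSEditLogOp reader: the class and method names are invented, and the real edit log's framing and checksum algorithm may differ. The point is only that the length prefix is sanity-checked and the checksum verified over the raw bytes before any large allocation or deserialization happens.

```java
import java.util.zip.CRC32;

// Hypothetical sketch: validate the op's length prefix and checksum before
// allocating a buffer for, or deserializing, the op body. Names are
// illustrative stand-ins for the real edit log reader.
class OpFrameCheck {
    static final int MAX_OP_SIZE = 50 * 1024 * 1024; // illustrative cap

    // Reject implausible lengths before any large allocation happens.
    static boolean isPlausibleLength(int len) {
        return len > 0 && len <= MAX_OP_SIZE;
    }

    // Verify the checksum over the raw bytes before deserializing them.
    static boolean checksumMatches(byte[] body, long expected) {
        CRC32 crc = new CRC32();
        crc.update(body, 0, body.length);
        return crc.getValue() == expected;
    }
}
```

A reader following this pattern would skip or fail the op cleanly instead of attempting to allocate a buffer sized by garbage data.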
[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement
[ https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720899#comment-14720899 ] Yi Liu commented on HDFS-8946: -- The test failures are not related; they pass in my local environment. Improve choosing datanode storage for block placement - Key: HDFS-8946 URL: https://issues.apache.org/jira/browse/HDFS-8946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch, HDFS-8946.003.patch This JIRA is to improve choosing a datanode storage for block placement. In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, {{chooseRandom}}), we have the following logic to choose a datanode storage for a block: for a given storage type, we iterate over the storages of the datanode, check each one against the storage type, and return the first storage of that type that satisfies the requirement. Since the check depends only on the storage type, we can remove the iteration over storages and find a good storage of the given type in one step. This is more efficient when no storage of that type on the datanode satisfies the requirement, because we no longer loop over all storages repeating the same check. Besides, there is no need to shuffle the storages, since we only need one check per storage type on the datanode. This also makes the logic clearer.
{code}
if (excludedNodes.add(localMachine) // was not in the excluded list
    && isGoodDatanode(localDatanode, maxNodesPerRack, false,
        results, avoidStaleNodes)) {
  for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
      .entrySet().iterator(); iter.hasNext(); ) {
    Map.Entry<StorageType, Integer> entry = iter.next();
    for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
        localDatanode.getStorageInfos())) {
      StorageType type = entry.getKey();
      if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
          results, type) >= 0) {
        int num = entry.getValue();
        ...
{code}
(current logic above) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
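A rough sketch of the proposed single-pass selection. The types here are simplified stand-ins, not the real {{BlockPlacementPolicyDefault}} API: the point is that one linear scan per requested type replaces the shuffled nested loops.

```java
import java.util.List;

// Illustrative sketch (types simplified, not the real HDFS classes): instead
// of shuffling and re-scanning every storage for each requested type, find a
// single good storage of the requested type in one pass.
class StoragePicker {
    enum StorageType { DISK, SSD, ARCHIVE }

    static class StorageInfo {
        final StorageType type;
        final long remaining;
        StorageInfo(StorageType type, long remaining) {
            this.type = type;
            this.remaining = remaining;
        }
    }

    // One pass: return the first storage of the requested type with enough
    // remaining space, or null if none fits. No shuffle, no repeated scans.
    static StorageInfo chooseStorage(List<StorageInfo> storages,
                                     StorageType type, long blocksize) {
        for (StorageInfo s : storages) {
            if (s.type == type && s.remaining >= blocksize) {
                return s;
            }
        }
        return null;
    }
}
```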
[jira] [Commented] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720967#comment-14720967 ] Jitendra Nath Pandey commented on HDFS-8983: Thanks for addressing my comments [~arpitagarwal]. +1 for the patch. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch, HDFS-8983.04.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8978) Erasure coding: fix 2 failed tests of DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720954#comment-14720954 ] Hadoop QA commented on HDFS-8978: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 14s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 8 new or modified test files. | | {color:green}+1{color} | javac | 7m 53s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 43s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 12s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 101m 26s | Tests failed in hadoop-hdfs. 
| | | | 144m 40s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Timed out tests | org.apache.hadoop.hdfs.TestParallelShortCircuitReadUnCached | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752917/HDFS-8978-HDFS-7285.02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 164cbe6 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12208/console | This message was automatically generated. Erasure coding: fix 2 failed tests of DFSStripedOutputStream Key: HDFS-8978 URL: https://issues.apache.org/jira/browse/HDFS-8978 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8978-HDFS-7285.01.patch, HDFS-8978-HDFS-7285.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720982#comment-14720982 ] Hadoop QA commented on HDFS-8983: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 27s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 44s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 59s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 28s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 21s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 23m 20s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 111m 1s | Tests failed in hadoop-hdfs. 
| | | | 180m 54s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager | | Timed out tests | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753112/HDFS-8983.04.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e2c9b28 | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12210/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12210/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12210/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12210/console | This message was automatically generated. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch, HDFS-8983.04.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8983: Attachment: HDFS-8983.04.patch Thanks [~jnp]! Addressed in the .04 patch. Also updated {{FSDirectory#normalizePaths}} with more checks. Delta from .03 to .04:
{code}
- checkProtectedDescendants(fsd, src.endsWith(Path.SEPARATOR) ?
-     src.substring(0, src.length() - 1) : src);
+ checkProtectedDescendants(fsd, fsd.normalizePath(src));
{code}
{code}
-// {@link Path#SEPARATOR} is "/".
+// {@link Path#SEPARATOR} is "/" and '0' is the next ASCII
+// character after '/'.
{code}
{code}
- * Reserved paths are ignored.
+ * Reserved paths, relative paths and paths with scheme are ignored.
{code}
{code}
-final Collection<String> normalized = new ArrayList<String>(paths.size());
-for (String path : paths) {
-  if (isReservedName(path)) {
-    LOG.error("{} ignoring reserved path {}", errorString, path);
+final Collection<String> normalized = new ArrayList<>(paths.size());
+for (String dir : paths) {
+  if (isReservedName(dir)) {
+    LOG.error("{} ignoring reserved path {}", errorString, dir);
   } else {
-    normalized.add(normalizePath(path));
+    final Path path = new Path(dir);
+    if (!path.isAbsolute()) {
+      LOG.error("{} ignoring relative path {}", errorString, dir);
+    } else if (path.toUri().getScheme() != null) {
+      LOG.error("{} ignoring path {} with scheme", errorString, dir);
+    } else {
+      normalized.add(path.toString());
+    }
{code}
And three new tests for the additional checks in {{normalizePaths}}. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch, HDFS-8983.04.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
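The extra path checks added to {{normalizePaths}} can be approximated in a standalone sketch. This uses {{java.net.URI}} in place of Hadoop's {{Path}} and an illustrative reserved-path prefix, so it is an assumption-laden approximation of the filtering logic, not the patch itself: reserved paths, relative paths, and paths carrying a scheme are all dropped.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Standalone approximation of the normalizePaths filtering (java.net.URI
// stands in for Hadoop's Path; the reserved prefix is illustrative).
class ProtectedDirFilter {
    static boolean isReserved(String path) {
        return path.startsWith("/.reserved"); // illustrative reserved prefix
    }

    static Collection<String> normalizePaths(List<String> paths) {
        Collection<String> normalized = new ArrayList<>(paths.size());
        for (String dir : paths) {
            URI uri = URI.create(dir);
            if (isReserved(dir)) {
                continue;                     // reserved path: ignored
            } else if (uri.getScheme() != null) {
                continue;                     // path with scheme: ignored
            } else if (!dir.startsWith("/")) {
                continue;                     // relative path: ignored
            }
            normalized.add(dir);
        }
        return normalized;
    }
}
```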
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720070#comment-14720070 ] Bob Hansen commented on HDFS-8855: -- Xiaobing - is there a race condition in initializing the ugiCache? If two threads make simultaneous requests, one of them will succeed in the CAS for ugiCacheInit, and the other will proceed ahead. If the latter thread immediately tries to reference the ugiCache while the first is still initializing it, we will get an NPE or a partially-constructed object. See http://www.journaldev.com/1377/java-singleton-design-pattern-best-practices-with-examples for a nice little discussion of idiomatic singletons in Java; if we're supporting JRE >= 1.5, the Pugh construction is clean and works well for concurrent access. Webhdfs client leaks active NameNode connections Key: HDFS-8855 URL: https://issues.apache.org/jira/browse/HDFS-8855 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Environment: HDP 2.2 Reporter: Bob Hansen Assignee: Xiaobing Zhou Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS_8855.prototype.patch The attached script simulates a process opening ~50 files via webhdfs and performing random reads. Note that there are at most 50 concurrent reads, and all webhdfs sessions are kept open. Each read is ~64k at a random position. The script periodically (once per second) shells into the NameNode and produces a summary of the socket states. For my test cluster with 5 nodes, it took ~30 seconds for the NameNode to reach ~25000 active connections and fail. It appears that each request to the webhdfs client is opening a new connection to the NameNode and keeping it open after the request is complete. If the process continues to run, eventually (~30-60 seconds), all of the open connections are closed and the NameNode recovers. This smells like SoftReference reaping.
Are we using SoftReferences in the webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
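The initialization-on-demand holder (Pugh) idiom Bob suggests could look like the sketch below. {{UgiCache}} here is a stand-in for the real webhdfs cache type; the idiom itself is standard: the JVM guarantees the holder class is initialized exactly once, before any thread can observe {{INSTANCE}}, so no CAS flag or partially-constructed cache is possible.

```java
// Stand-in for the real webhdfs UGI cache type.
class UgiCache {
    // ... cache state would live here ...
}

class UgiCacheSingleton {
    private UgiCacheSingleton() {}

    // The holder class is not loaded until getCache() is first called;
    // class initialization is serialized by the JVM, which provides the
    // thread safety for free.
    private static class Holder {
        static final UgiCache INSTANCE = new UgiCache();
    }

    static UgiCache getCache() {
        return Holder.INSTANCE;
    }
}
```

Every caller sees the same fully-constructed instance, with no explicit locking on the hot path.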
[jira] [Updated] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8938: - Status: Patch Available (was: Reopened) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}}, from {{BlockManager}} into standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications into corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720296#comment-14720296 ] Hadoop QA commented on HDFS-8950: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 30s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 49s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 56s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 21s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 27s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 7s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 161m 37s | Tests passed in hadoop-hdfs. 
| | | | 206m 14s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12752995/HDFS-8950.005.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / beb65c9 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12196/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12196/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12196/console | This message was automatically generated. NameNode refresh doesn't remove DataNodes - Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from the NN's allowed host list (HDFS was HA) and then do an NN refresh, the DN is not actually removed and the NN UI keeps showing that node. The NN may try to allocate some blocks to that DN as well during an MR job. This issue is independent of DN decommission. To reproduce: 1. Add a DN to dfs_hosts_allow 2. Refresh NN 3. Start DN. Now NN starts seeing DN. 4. Stop DN 5. Remove DN from dfs_hosts_allow 6. Refresh NN - NN is still reporting DN as being used by HDFS. This is different from decom because there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8155) Support OAuth2 in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720043#comment-14720043 ] Chris Nauroth commented on HDFS-8155: - {code} client.setConnectTimeout(URLConnectionFactory.DEFAULT_SOCKET_TIMEOUT, TimeUnit.MILLISECONDS); {code} Sorry if my earlier comment was unclear, but I think we need to call both {{client.setConnectTimeout}} and {{client.setReadTimeout}}. Otherwise, we could have a successful connection, but then hang indefinitely on a non-responsive server. +1 after that. I don't know what happened with that last Jenkins run. It's building fine for me locally. Support OAuth2 in WebHDFS - Key: HDFS-8155 URL: https://issues.apache.org/jira/browse/HDFS-8155 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-8155-1.patch, HDFS-8155.002.patch, HDFS-8155.003.patch, HDFS-8155.004.patch, HDFS-8155.005.patch WebHDFS should be able to accept OAuth2 credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
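Chris's point about needing both timeouts can be illustrated with the stdlib {{HttpURLConnection}}. This is only an analogy: the patch itself uses an OkHttp-style client, where the corresponding calls are {{setConnectTimeout}} and {{setReadTimeout}}, and {{DEFAULT_SOCKET_TIMEOUT_MS}} below is an illustrative stand-in for {{URLConnectionFactory.DEFAULT_SOCKET_TIMEOUT}}. Setting only the connect timeout still allows an indefinite hang after the TCP connection succeeds, which is why both must be set.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Stdlib analogy for the OkHttp-style timeout configuration in the patch.
class TimeoutDemo {
    static final int DEFAULT_SOCKET_TIMEOUT_MS = 60_000; // illustrative value

    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(DEFAULT_SOCKET_TIMEOUT_MS); // bounds connect()
            conn.setReadTimeout(DEFAULT_SOCKET_TIMEOUT_MS);    // bounds each read()
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```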
[jira] [Commented] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
[ https://issues.apache.org/jira/browse/HDFS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720116#comment-14720116 ] Ravikumar commented on HDFS-8892: - Please feel free to contribute the patch [~kanaka]. I am not currently looking to submit it. ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too - Key: HDFS-8892 URL: https://issues.apache.org/jira/browse/HDFS-8892 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.1 Reporter: Ravikumar Assignee: kanaka kumar avvaru Priority: Minor Currently the CacheCleaner thread checks only for cache-expiry times. It would be nice if it also handled invalid slots in an extra pass over the evictable map: for (ShortCircuitReplica replica : evictable.values()) { if (!replica.getSlot().isValid()) { purge(replica); } } //Existing code... int numDemoted = demoteOldEvictableMmaped(curMs); int numPurged = 0; Long evictionTimeNs = Long.valueOf(0); ... Apps like HBase can tweak the expiry/staleness/cache-size params in the DFS client, so that a ShortCircuitReplica will never be closed except when its Slot is declared invalid. I assume slot invalidation will happen during block invalidation/deletes {primarily triggered by compaction/shard-takeover etc.} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8155) Support OAuth2 in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8155: -- Attachment: HDFS-8155.006.patch Fixed ChrisN's final comment. Yeah, Jenkins is being weird for me. I ran through all the HDFS tests manually and, except for a couple of non-repeatable, unrelated failures, everything passed. I'll let Jenkins run again, but unless it's something real, I'll go ahead and commit this later today. Thanks. Support OAuth2 in WebHDFS - Key: HDFS-8155 URL: https://issues.apache.org/jira/browse/HDFS-8155 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-8155-1.patch, HDFS-8155.002.patch, HDFS-8155.003.patch, HDFS-8155.004.patch, HDFS-8155.005.patch, HDFS-8155.006.patch WebHDFS should be able to accept OAuth2 credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8155) Support OAuth2 in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8155: -- Status: Open (was: Patch Available) Support OAuth2 in WebHDFS - Key: HDFS-8155 URL: https://issues.apache.org/jira/browse/HDFS-8155 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-8155-1.patch, HDFS-8155.002.patch, HDFS-8155.003.patch, HDFS-8155.004.patch, HDFS-8155.005.patch, HDFS-8155.006.patch WebHDFS should be able to accept OAuth2 credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8155) Support OAuth2 in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8155: -- Status: Patch Available (was: Open) Support OAuth2 in WebHDFS - Key: HDFS-8155 URL: https://issues.apache.org/jira/browse/HDFS-8155 Project: Hadoop HDFS Issue Type: New Feature Components: webhdfs Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-8155-1.patch, HDFS-8155.002.patch, HDFS-8155.003.patch, HDFS-8155.004.patch, HDFS-8155.005.patch, HDFS-8155.006.patch WebHDFS should be able to accept OAuth2 credentials. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-2070) A lack of auto-test for FsShell getmerge in hdfs
[ https://issues.apache.org/jira/browse/HDFS-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720417#comment-14720417 ] Hadoop QA commented on HDFS-2070: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 30s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 26s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 1m 2s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 163m 14s | Tests failed in hadoop-hdfs. 
| | | | 180m 4s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestBPOfferService | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753015/HDFS-2070.001.patch | | Optional Tests | javac unit | | git revision | trunk / beb65c9 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12197/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12197/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12197/console | This message was automatically generated. A lack of auto-test for FsShell getmerge in hdfs Key: HDFS-2070 URL: https://issues.apache.org/jira/browse/HDFS-2070 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: XieXianshan Assignee: Daniel Templeton Labels: newbie Attachments: HDFS-2070.001.patch There is no automated test for FsShell getmerge in hdfs. With regard to reliability and reuse, some automated tests should be added to the test set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720344#comment-14720344 ] Daryn Sharp commented on HDFS-8865: --- +1 This has made a huge difference, and all the possible style warnings were addressed. Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For a big namespace, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds || | 1 (existing) | 55 | | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
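A minimal sketch of the fork-join idea behind the patch: descend the directory tree in parallel and aggregate usage counts bottom-up. {{Node}} and the single {{space}} count are simplified stand-ins for the real INode tree and QuotaCounts, and the task layout is illustrative rather than the actual patch code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative fork-join tree scan; Node stands in for the INode tree and
// the long total stands in for QuotaCounts.
class QuotaInit {
    static class Node {
        final long space;                       // space consumed by this inode
        final List<Node> children = new ArrayList<>();
        Node(long space) { this.space = space; }
    }

    static class CountTask extends RecursiveTask<Long> {
        private final Node node;
        CountTask(Node node) { this.node = node; }

        @Override
        protected Long compute() {
            long total = node.space;
            List<CountTask> subtasks = new ArrayList<>();
            for (Node child : node.children) {
                CountTask t = new CountTask(child);
                t.fork();                       // scan subtrees in parallel
                subtasks.add(t);
            }
            for (CountTask t : subtasks) {
                total += t.join();              // aggregate child counts
            }
            return total;
        }
    }

    static long initQuota(Node root, int threads) {
        return new ForkJoinPool(threads).invoke(new CountTask(root));
    }
}
```

As the table above suggests, a single fork-join thread pays task overhead versus the plain recursive scan, but the work parallelizes well as threads are added.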
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720363#comment-14720363 ] Kihwal Lee commented on HDFS-8865: -- Thanks [~xyao] and [~daryn] for reviews. I've committed this to trunk and branch-2. Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds|| | 1 (existing) | 55| | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-8865. -- Resolution: Fixed Fix Version/s: 2.8.0 3.0.0 Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.8.0 Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds|| | 1 (existing) | 55| | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8865: - Hadoop Flags: Reviewed Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.8.0 Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds|| | 1 (existing) | 55| | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720368#comment-14720368 ] Hudson commented on HDFS-8865: -- FAILURE: Integrated in Hadoop-trunk-Commit #8364 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8364/]) HDFS-8865. Improve quota initialization performance. Contributed by Kihwal Lee. (kihwal: rev b6ceee9bf42eec15891f60a014bbfa47e03f563c) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/QuotaCounts.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDiskspaceQuotaUpdate.java Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.8.0 Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. 
|| threads || seconds ||
| 1 (existing) | 55 |
| 1 (fork-join) | 68 |
| 4 | 16 |
| 8 | 8 |
| 12 | 6 |
| 16 | 5 |
| 20 | 4 |
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
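The fork-join speedup shown in the table above can be illustrated with a small {{RecursiveTask}} that sums per-directory usage over a tree in parallel. This is an illustrative sketch only; the {{Node}}, {{QuotaTask}}, and {{initQuota}} names are hypothetical and are not taken from the HDFS-8865 patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical stand-in for an INode directory tree; each node carries a
// local "space consumed" value, and the task sums the whole subtree.
class QuotaDemo {
    static class Node {
        final long localSpace;
        final List<Node> children = new ArrayList<>();
        Node(long localSpace) { this.localSpace = localSpace; }
    }

    // One RecursiveTask is forked per child, mirroring the idea of
    // initializing quota counts for independent subtrees in parallel.
    static class QuotaTask extends RecursiveTask<Long> {
        private final Node node;
        QuotaTask(Node node) { this.node = node; }

        @Override
        protected Long compute() {
            long total = node.localSpace;
            List<QuotaTask> subtasks = new ArrayList<>();
            for (Node child : node.children) {
                QuotaTask t = new QuotaTask(child);
                t.fork();                 // schedule subtree asynchronously
                subtasks.add(t);
            }
            for (QuotaTask t : subtasks) {
                total += t.join();        // combine subtree totals
            }
            return total;
        }
    }

    static long initQuota(Node root, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new QuotaTask(root));
        } finally {
            pool.shutdown();
        }
    }
}
```

As in the table, a single fork-join thread carries extra task overhead versus the plain recursive scan; the benefit comes from running the subtree scans on several threads.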
[jira] [Updated] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8865: - Status: In Progress (was: Patch Available) Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For a big namespace, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have.
|| threads || seconds ||
| 1 (existing) | 55 |
| 1 (fork-join) | 68 |
| 4 | 16 |
| 8 | 8 |
| 12 | 6 |
| 16 | 5 |
| 20 | 4 |
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720370#comment-14720370 ] Kihwal Lee commented on HDFS-8879: -- Cherry-picked the fix to branch-2.7. Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 3.0.0, 2.7.2 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit tests testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FSDirectory instance. Once the highlighted line below is added, the issue can be reproduced.
{code}
fsdir = cluster.getNamesystem().getFSDirectory();
INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-8879: - Fix Version/s: (was: 2.8.0) 2.7.2 3.0.0 Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 3.0.0, 2.7.2 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit tests testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FSDirectory instance. Once the highlighted line below is added, the issue can be reproduced.
{code}
fsdir = cluster.getNamesystem().getFSDirectory();
INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8779) WebUI can't display randomly generated block ID
[ https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720375#comment-14720375 ] Haohui Mai commented on HDFS-8779: -- -1 on the 03 patch. For the 04 patch the javascript needs to be minimized. Or it might make sense to create a browserified version of our own. WebUI can't display randomly generated block ID --- Key: HDFS-8779 URL: https://issues.apache.org/jira/browse/HDFS-8779 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, HDFS-8779.03.patch, HDFS-8779.04.patch, after-02-patch.png, before.png, patch-to-json-parse.txt Old releases use randomly generated block IDs (HDFS-4645). The max value of a Long in Java is 2^63-1; the max value of a -number- (*integer*) in JavaScript is 2^53-1 (see [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER]). This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER. An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8779) WebUI can't display randomly generated block ID
[ https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720399#comment-14720399 ] Haohui Mai commented on HDFS-8779: -- I'll upload a patch to demonstrate the idea later today. WebUI can't display randomly generated block ID --- Key: HDFS-8779 URL: https://issues.apache.org/jira/browse/HDFS-8779 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, HDFS-8779.03.patch, HDFS-8779.04.patch, after-02-patch.png, before.png, patch-to-json-parse.txt Old releases use randomly generated block IDs (HDFS-4645). The max value of a Long in Java is 2^63-1; the max value of a -number- (*integer*) in JavaScript is 2^53-1 (see [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER]). This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER. An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
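The precision problem described in HDFS-8779 can be reproduced directly in Java, since JavaScript numbers are IEEE-754 doubles: a long above 2^53 does not survive a round trip through a double. A minimal sketch (the class and method names are hypothetical):

```java
// Demonstrates why a randomly generated 63-bit block ID cannot survive a
// round trip through an IEEE-754 double -- the representation JavaScript
// uses for all numbers. Values above 2^53 lose their low-order bits.
class BlockIdPrecision {
    static final long MAX_SAFE = (1L << 53) - 1;  // JS Number.MAX_SAFE_INTEGER

    // True iff the id is unchanged after being coerced through a double.
    static boolean roundTripsExactly(long id) {
        return (long) (double) id == id;
    }

    public static void main(String[] args) {
        long small = MAX_SAFE;           // still exactly representable
        long blockId = (1L << 60) + 1;   // typical magnitude of a random block ID
        System.out.println(roundTripsExactly(small));    // exact
        System.out.println(roundTripsExactly(blockId));  // low bit is lost
    }
}
```

This is why the WebUI must treat block IDs as strings (or use a JSON parser that preserves 64-bit integers) instead of plain JavaScript numbers.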
[jira] [Commented] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720750#comment-14720750 ] Hudson commented on HDFS-8938: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #318 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/318/]) HDFS-8938. Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager. Contributed by Mingliang Liu. (wheat9: rev 6d12cd8d609dec26d44cece9937c35b7d72a3cd1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ReplicationWork.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockToMarkCorrupt.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}} from {{BlockManager}} to standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications to corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720768#comment-14720768 ] Hudson commented on HDFS-8950: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1052 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1052/]) HDFS-8950. NameNode refresh doesn't remove DataNodes that are no longer in the allowed list (Daniel Templeton) (cmccabe: rev b94b56806d3d6e04984e229b479f7ac15b62bbfa) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java NameNode refresh doesn't remove DataNodes that are no longer in the allowed list Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from the NN's allowed host list (HDFS was HA) and then do an NN refresh, it doesn't actually remove it, and the NN UI keeps showing that node. It may also try to allocate some blocks to that DN during an MR job. This issue is independent from DN decommission. To reproduce:
1. Add a DN to dfs_hosts_allow
2. Refresh NN
3. Start DN. Now NN starts seeing DN.
4. Stop DN
5. Remove DN from dfs_hosts_allow
6. Refresh NN - NN is still reporting DN as being used by HDFS.
This is different from decommissioning because there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
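The intended refresh semantics can be sketched as plain set logic: on refresh, a registered node that is absent from a non-empty allowed list must be dropped, while an empty allowed list means no restriction. This is a simplified illustration with hypothetical names, not the actual {{DatanodeManager}}/{{HostFileManager}} code.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of include/exclude host-list semantics. On refresh, a
// node missing from a non-empty allowed list must be removed, not merely
// hidden; an empty allowed list admits every registered node.
class HostListRefresh {
    static Set<String> refresh(Set<String> registered,
                               Set<String> allowed,
                               Set<String> excluded) {
        Set<String> result = new HashSet<>();
        for (String dn : registered) {
            boolean included = allowed.isEmpty() || allowed.contains(dn);
            if (included && !excluded.contains(dn)) {
                result.add(dn);
            }
        }
        return result;
    }
}
```

With this rule, the repro above behaves as expected: after dn2 is removed from the allowed list and the NN is refreshed, dn2 no longer appears among live nodes even though it was never placed on the exclude list.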
[jira] [Commented] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720770#comment-14720770 ] Hudson commented on HDFS-8938: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1052 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1052/]) HDFS-8938. Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager. Contributed by Mingliang Liu. (wheat9: rev 6d12cd8d609dec26d44cece9937c35b7d72a3cd1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ReplicationWork.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockToMarkCorrupt.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}} from {{BlockManager}} to standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications to corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8925) Move BlockReaderLocal to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720769#comment-14720769 ] Hudson commented on HDFS-8925: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1052 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1052/]) HDFS-8925. Move BlockReaderLocal to hdfs-client. Contributed by Mingliang Liu. (wheat9: rev e2c9b288b223b9fd82dc12018936e13128413492) * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/KeyProviderCache.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ClientContext.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockLocalPathInfo.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderUtil.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolPB.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSelector.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ExternalBlockReader.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/InvalidEncryptionKeyException.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocalLegacy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/Receiver.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitLocalRead.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/InvalidEncryptionKeyException.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/BlockReportOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/PeerCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/BlockLocalPathInfo.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/BlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/KeyProviderCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReader.java *
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Attachment: HDFS-8990.000.patch The v0 patch moves the {{RemoteBlockReader}} and {{RemoteBlockReader2}} classes to the client module. Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-8990.000.patch This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720793#comment-14720793 ] Haohui Mai commented on HDFS-8990: --
{code}
public int available() throws IOException {
  // An optimistic estimate of how much data is available
  // to us without doing network I/O.
-  return DFSClient.TCP_WINDOW_SIZE;
+  return HdfsClientConfigKeys.DFS_CLIENT_CACHED_CONN_RETRY_DEFAULT;
}
{code}
This is the wrong constant.
{code}
--- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
+++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
@@ -46,6 +46,7 @@
   int DFS_NAMENODE_RPC_PORT_DEFAULT = 8020;
   String DFS_NAMENODE_KERBEROS_PRINCIPAL_KEY = "dfs.namenode.kerberos.principal";
+  int DFS_CLIENT_TCP_WINDOW_SIZE = 128 * 1024; // 128 KB
   String DFS_CLIENT_WRITE_PACKET_SIZE_KEY = "dfs.client-write-packet-size";
   int DFS_CLIENT_WRITE_PACKET_SIZE_DEFAULT = 64*1024;
   String DFS_CLIENT_SOCKET_TIMEOUT_KEY = "dfs.client.socket-timeout";
{code}
{{TCP_WINDOW_SIZE}} is a constant that is only used by {{RemoteBlockReader}} / {{RemoteBlockReader2}}. Let's put it into {{RemoteBlockReader2}} instead. Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-8990.000.patch This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979].
While we need to replace the _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720796#comment-14720796 ] Hudson commented on HDFS-8938: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2267 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2267/]) HDFS-8938. Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager. Contributed by Mingliang Liu. (wheat9: rev 6d12cd8d609dec26d44cece9937c35b7d72a3cd1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ReplicationWork.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockToMarkCorrupt.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}} from {{BlockManager}} to standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications to corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Attachment: HDFS-8990.001.patch Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-8990.000.patch, HDFS-8990.001.patch This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8983: Attachment: HDFS-8983.03.patch NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch, HDFS-8983.03.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
Mingliang Liu created HDFS-8990: --- Summary: Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Fix Version/s: (was: 2.8.0) Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Hadoop Flags: (was: Reviewed) Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8965) Harden edit log reading code against out of memory errors
[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720699#comment-14720699 ] Colin Patrick McCabe commented on HDFS-8965: bq. Do we also already have tests for invalid op lengths (e.g. greater than max op size)? I see testFuzzSequences but that's not explicit. {{TestNameNodeRecovery#testNonDefaultMaxOpSize}} tests maximum op sizes. The latest patch fixes the test failure in {{TestJournal}}. The issue was that we need to ensure that {{scanOp}} works when the edit log version is newer than the latest version. Harden edit log reading code against out of memory errors - Key: HDFS-8965 URL: https://issues.apache.org/jira/browse/HDFS-8965 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. This should avoid out of memory errors when trying to load garbage data as Op data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
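The hardening strategy in the description above, checking the length prefix against a cap and validating the checksum before decoding, can be sketched as follows. The record layout used here (int length, payload bytes, int CRC32) is a hypothetical stand-in, not the real {{FSEditLogOp}} on-disk format, and {{SafeOpReader}}/{{MAX_OP_SIZE}} are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

// Sketch of hardening an op reader against garbage input: reject the
// length prefix before allocating, and verify the checksum before the
// payload is ever interpreted as an op.
class SafeOpReader {
    static final int MAX_OP_SIZE = 50 * 1024 * 1024;  // illustrative cap

    static byte[] readOp(DataInputStream in) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > MAX_OP_SIZE) {
            // Reject before allocating: a garbage length would otherwise
            // trigger a huge array allocation and an OutOfMemoryError.
            throw new IOException("op length " + len + " out of range");
        }
        byte[] data = new byte[len];
        in.readFully(data);
        long expected = in.readInt() & 0xffffffffL;  // stored CRC, unsigned
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        if (crc.getValue() != expected) {
            throw new IOException("checksum mismatch; refusing to decode op");
        }
        return data;  // only now is it safe to parse as an op
    }
}
```

The ordering is the point: both checks run on fixed-size metadata before any variable-size buffer is trusted, so corrupt edit log bytes fail with a clean IOException instead of an OOM.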
[jira] [Updated] (HDFS-8965) Harden edit log reading code against out of memory errors
[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8965: --- Attachment: HDFS-8965.005.patch Harden edit log reading code against out of memory errors - Key: HDFS-8965 URL: https://issues.apache.org/jira/browse/HDFS-8965 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. This should avoid out of memory errors when trying to load garbage data as Op data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Description: This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} was moved to {{hadoop-hdfs-client}} module in [HDFS-8925|] (was: This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. ) Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} was moved to {{hadoop-hdfs-client}} module in [HDFS-8925|] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Description: This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971] was:This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} was moved to {{hadoop-hdfs-client}} module in [HDFS-8925|] Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8990) Move RemoteBlockReader to hdfs-client module
[ https://issues.apache.org/jira/browse/HDFS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8990: Description: This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. was: This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. While we need to replace the _log4j_ with _slf4j_, we track the effort of removing the guards when calling LOG.debug() and LOG.trace() in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971] Move RemoteBlockReader to hdfs-client module Key: HDFS-8990 URL: https://issues.apache.org/jira/browse/HDFS-8990 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{RemoteBlockReader}} class into the {{hdfs-client}} module. {{BlockReader}} interface and {{BlockReaderLocal}} class were moved to {{hadoop-hdfs-client}} module in jira [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. The extant checkstyle warnings can be fixed in [HDFS-8979|https://issues.apache.org/jira/browse/HDFS-8979]. 
While we need to replace _log4j_ with _slf4j_ in this patch, we track the effort of removing the guards around LOG.debug() and LOG.trace() calls in jira [HDFS-8971|https://issues.apache.org/jira/browse/HDFS-8971]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
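The guard removal tracked in HDFS-8971 rests on a property of slf4j: its parameterized messages defer argument formatting until the level is known to be enabled, so the explicit `isDebugEnabled()` guard becomes unnecessary for cheap arguments. A minimal self-contained sketch of the two idioms (the `Logger` here is a hand-rolled stand-in written for this example, not the real `org.slf4j.Logger` API):

```java
// Sketch of the logging idioms discussed around HDFS-8971. The Logger below
// is a tiny stand-in so the example compiles alone; real code uses slf4j.
public class LogGuardSketch {
    static class Logger {
        final boolean debugEnabled;
        Logger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }
        boolean isDebugEnabled() { return debugEnabled; }
        // slf4j-style: "{}" is substituted only when debug is enabled, so the
        // full message string is never built for a disabled level.
        String debug(String fmt, Object arg) {
            return debugEnabled ? fmt.replace("{}", String.valueOf(arg)) : null;
        }
    }

    public static void main(String[] args) {
        Logger log = new Logger(false);
        // log4j-era idiom: an explicit guard avoids concatenating the message
        // when debug logging is off.
        if (log.isDebugEnabled()) {
            log.debug("read block " + 42, null);
        }
        // slf4j idiom: no guard needed; formatting is deferred by the logger.
        log.debug("read block {}", 42);
        System.out.println("debug enabled? " + log.isDebugEnabled()); // prints: debug enabled? false
    }
}
```

A guard can still pay off when computing the *argument* itself is expensive, which is one reason guard removal is tracked as its own jira rather than done wholesale in this patch.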
[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720727#comment-14720727 ] Hudson commented on HDFS-8950: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #324 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/324/]) HDFS-8950. NameNode refresh doesn't remove DataNodes that are no longer in the allowed list (Daniel Templeton) (cmccabe: rev b94b56806d3d6e04984e229b479f7ac15b62bbfa) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NameNode refresh doesn't remove DataNodes that are no longer in the allowed list Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from NN's allowed host list (HDFS was HA) and then do NN refresh, it doesn't remove it actually and the NN UI keeps showing that node. It may try to allocate some blocks to that DN as well during an MR job. This issue is independent from DN decommission. To reproduce: 1. Add a DN to dfs_hosts_allow 2. Refresh NN 3. Start DN. Now NN starts seeing DN. 4. Stop DN 5. Remove DN from dfs_hosts_allow 6. Refresh NN - NN is still reporting DN as being used by HDFS. 
This differs from decommissioning: there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
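The behavior the fix restores can be sketched as a simple membership check on refresh (illustrative only; the names and signatures below are not the actual DatanodeManager/HostFileManager code):

```java
import java.util.*;

// Illustrative sketch of the refresh rule HDFS-8950 fixes: after
// -refreshNodes, a registered DataNode that no longer appears in a
// non-empty dfs_hosts_allow list must be dropped from the NameNode's view.
public class RefreshSketch {
    static List<String> refresh(List<String> registered, Set<String> includes) {
        // An empty include list means "allow everything" in HDFS semantics.
        if (includes.isEmpty()) return new ArrayList<>(registered);
        List<String> kept = new ArrayList<>();
        for (String dn : registered) {
            if (includes.contains(dn)) kept.add(dn);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> registered = Arrays.asList("dn1:50010", "dn2:50010");
        // dn2 was removed from dfs_hosts_allow; a refresh should drop it.
        Set<String> includes = new HashSet<>(Arrays.asList("dn1:50010"));
        System.out.println(refresh(registered, includes)); // prints: [dn1:50010]
    }
}
```

The bug described above is that the pre-patch refresh only consulted the exclude list, so a node removed from the include list (but never excluded) survived the refresh.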
[jira] [Commented] (HDFS-8938) Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720728#comment-14720728 ] Hudson commented on HDFS-8938: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #324 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/324/]) HDFS-8938. Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager. Contributed by Mingliang Liu. (wheat9: rev 6d12cd8d609dec26d44cece9937c35b7d72a3cd1) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockToMarkCorrupt.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/ReplicationWork.java Extract BlockToMarkCorrupt and ReplicationWork as standalone classes from BlockManager -- Key: HDFS-8938 URL: https://issues.apache.org/jira/browse/HDFS-8938 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8938.000.patch, HDFS-8938.001.patch, HDFS-8938.002.patch, HDFS-8938.003.patch, HDFS-8938.004.patch, HDFS-8938.005.patch, HDFS-8938.006.patch, HDFS-8938.007.patch, HDFS-8938.008.patch This jira proposes to refactor two inner static classes, {{BlockToMarkCorrupt}} and {{ReplicationWork}} from {{BlockManager}} to standalone classes. The refactor also improves readability by abstracting the complexity of scheduling and validating replications to corresponding helper methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
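The shape of this refactoring, promoting a static nested helper out of a large manager class into a standalone package-private class, can be sketched as follows (illustrative names only, not the actual BlockManager code):

```java
// Before (sketch): the helper is nested inside a large manager class, which
// bloats the enclosing file and makes the helper awkward to test alone.
class BlockManagerBefore {
    static class ReplicationWork {
        final String block;
        ReplicationWork(String block) { this.block = block; }
    }
    String schedule() { return new ReplicationWork("blk_1").block; }
}

// After (sketch): the helper is a standalone package-private class in the
// same package; the manager merely uses it and no longer owns it.
class ReplicationWorkStandalone {
    final String block;
    ReplicationWorkStandalone(String block) { this.block = block; }
}

public class ExtractSketch {
    public static void main(String[] args) {
        System.out.println(new BlockManagerBefore().schedule());          // prints: blk_1
        System.out.println(new ReplicationWorkStandalone("blk_2").block); // prints: blk_2
    }
}
```

Because the extracted classes stay in the same package, the manager can keep calling their package-private members, so behavior is unchanged while the manager class shrinks.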
[jira] [Commented] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list
[ https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720615#comment-14720615 ] Hudson commented on HDFS-8950: -- FAILURE: Integrated in Hadoop-trunk-Commit #8367 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8367/]) HDFS-8950. NameNode refresh doesn't remove DataNodes that are no longer in the allowed list (Daniel Templeton) (cmccabe: rev b94b56806d3d6e04984e229b479f7ac15b62bbfa) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHostFileManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt NameNode refresh doesn't remove DataNodes that are no longer in the allowed list Key: HDFS-8950 URL: https://issues.apache.org/jira/browse/HDFS-8950 Project: Hadoop HDFS Issue Type: Bug Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Daniel Templeton Assignee: Daniel Templeton Fix For: 2.8.0 Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch If you remove a DN from NN's allowed host list (HDFS was HA) and then do NN refresh, it doesn't remove it actually and the NN UI keeps showing that node. It may try to allocate some blocks to that DN as well during an MR job. This issue is independent from DN decommission. To reproduce: 1. Add a DN to dfs_hosts_allow 2. Refresh NN 3. Start DN. Now NN starts seeing DN. 4. Stop DN 5. Remove DN from dfs_hosts_allow 6. Refresh NN - NN is still reporting DN as being used by HDFS. 
This differs from decommissioning: there the DN is added to the exclude list in addition to being removed from the allowed list, and in that case everything works correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8925) Move BlockReaderLocal to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8925: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the contribution. Move BlockReaderLocal to hdfs-client Key: HDFS-8925 URL: https://issues.apache.org/jira/browse/HDFS-8925 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 Attachments: HDFS-8925.000.patch, HDFS-8925.001.patch, HDFS-8925.002.patch This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. We also move {{BlockReaderLocal}} class which implements the {{BlockReader}} interface to {{hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720620#comment-14720620 ] Hudson commented on HDFS-8865: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2247 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2247/]) HDFS-8865. Improve quota initialization performance. Contributed by Kihwal Lee. (kihwal: rev b6ceee9bf42eec15891f60a014bbfa47e03f563c) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/QuotaCounts.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageWithSnapshot.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDiskspaceQuotaUpdate.java Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.8.0 Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. 
|| threads || seconds ||
| 1 (existing) | 55 |
| 1 (fork-join) | 68 |
| 4 | 16 |
| 8 | 8 |
| 12 | 6 |
| 16 | 5 |
| 20 | 4 |
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
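The parallel recursive scan described above can be sketched with Java's Fork-Join framework. This is illustrative only, not the HDFS patch: the quota-counting walk is reduced to counting files in a toy directory tree, but the task structure (fork one task per subtree, combine on join) is the same idea:

```java
import java.util.*;
import java.util.concurrent.*;

// Toy model of a namespace directory for this sketch.
class Dir {
    int files;                        // files directly under this directory
    List<Dir> children = new ArrayList<>();
    Dir(int files) { this.files = files; }
}

// Each subtree becomes a fork-join task; subtrees are processed in parallel
// and their counts are combined on join, mirroring the parallel quota walk.
class CountTask extends RecursiveTask<Long> {
    final Dir dir;
    CountTask(Dir dir) { this.dir = dir; }
    @Override protected Long compute() {
        List<CountTask> forked = new ArrayList<>();
        for (Dir child : dir.children) {
            CountTask t = new CountTask(child);
            t.fork();                 // schedule the subtree for parallel work
            forked.add(t);
        }
        long total = dir.files;
        for (CountTask t : forked) total += t.join();
        return total;
    }
}

public class QuotaInitSketch {
    public static void main(String[] args) {
        Dir root = new Dir(2);
        Dir a = new Dir(3);
        Dir b = new Dir(5);
        root.children.add(a);
        root.children.add(b);
        a.children.add(new Dir(7));
        long total = new ForkJoinPool().invoke(new CountTask(root));
        System.out.println(total); // prints: 17
    }
}
```

Note the single-thread fork-join row in the table above is slower than the existing scan (68s vs 55s): task creation and scheduling have overhead, and the win only appears once several threads work on disjoint subtrees.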
[jira] [Commented] (HDFS-8983) NameNode support for protected directories
[ https://issues.apache.org/jira/browse/HDFS-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720631#comment-14720631 ] Hadoop QA commented on HDFS-8983: -
(x) *{color:red}-1 overall{color}*
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 19m 53s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | javac | 7m 53s | The applied patch generated 1 additional warning messages. |
| {color:green}+1{color} | javadoc | 10m 10s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 2m 6s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 23m 2s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 161m 55s | Tests failed in hadoop-hdfs. |
| | | | 232m 10s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
| | hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger |
| | hadoop.fs.permission.TestStickyBit |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12752913/HDFS-8393.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / beb65c9 |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/artifact/patchprocess/diffJavacWarnings.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/artifact/patchprocess/whitespace.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12202/console |
This message was automatically generated. NameNode support for protected directories -- Key: HDFS-8983 URL: https://issues.apache.org/jira/browse/HDFS-8983 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-8393.01.patch, HDFS-8393.02.patch To protect important system directories from inadvertent deletion (e.g. /Users) the NameNode can allow marking directories as _protected_. Such directories cannot be deleted unless they are empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
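The proposed rule, that a directory marked protected may only be deleted once it is empty, can be sketched as a small check (hypothetical helper names, not the actual FSDirectory/NameNode API):

```java
import java.util.*;

// Illustrative sketch of the HDFS-8983 rule: deletion of a protected
// directory is refused unless the directory has no children.
public class ProtectedDirsSketch {
    static boolean canDelete(String path, Set<String> protectedDirs,
                             Map<String, List<String>> tree) {
        List<String> children = tree.getOrDefault(path, Collections.<String>emptyList());
        // Unprotected paths are always deletable; protected ones only when empty.
        return !protectedDirs.contains(path) || children.isEmpty();
    }

    public static void main(String[] args) {
        Set<String> prot = new HashSet<>(Arrays.asList("/Users"));
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("/Users", Arrays.asList("/Users/alice"));
        System.out.println(canDelete("/Users", prot, tree)); // prints: false (protected, non-empty)
        System.out.println(canDelete("/tmp", prot, tree));   // prints: true (not protected)
        tree.put("/Users", Collections.<String>emptyList());
        System.out.println(canDelete("/Users", prot, tree)); // prints: true (now empty)
    }
}
```

The "unless empty" escape hatch is what distinguishes this from a blanket read-only flag: users can still clean out and remove a protected directory deliberately, but a recursive `rm` of a populated one is blocked.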