[jira] [Commented] (HDFS-9364) Unnecessary DNS resolution attempts when creating NameNodeProxies
[ https://issues.apache.org/jira/browse/HDFS-9364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990427#comment-14990427 ] Hadoop QA commented on HDFS-9364:
---------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 6s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 24s | trunk passed |
| +1 | compile | 0m 59s | trunk passed with JDK v1.8.0_60 |
| +1 | compile | 0m 56s | trunk passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 20s | trunk passed |
| +1 | mvneclipse | 0m 25s | trunk passed |
| -1 | findbugs | 1m 51s | hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs |
| +1 | javadoc | 1m 29s | trunk passed with JDK v1.8.0_60 |
| +1 | javadoc | 2m 7s | trunk passed with JDK v1.7.0_79 |
| +1 | mvninstall | 1m 7s | the patch passed |
| +1 | compile | 0m 55s | the patch passed with JDK v1.8.0_60 |
| -1 | javac | 7m 26s | hadoop-hdfs-project-jdk1.8.0_60 with JDK v1.8.0_60 generated 9 new issues (was 29, now 30). |
| +1 | javac | 0m 55s | the patch passed |
| +1 | compile | 0m 54s | the patch passed with JDK v1.7.0_79 |
| -1 | javac | 8m 21s | hadoop-hdfs-project-jdk1.7.0_79 with JDK v1.7.0_79 generated 9 new issues (was 29, now 30). |
| +1 | javac | 0m 54s | the patch passed |
| +1 | checkstyle | 0m 20s | the patch passed |
| +1 | mvneclipse | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 3m 58s | the patch passed |
| +1 | javadoc | 1m 24s | the patch passed with JDK v1.8.0_60 |
| +1 | javadoc | 2m 8s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 50m 49s | hadoop-hdfs in the patch failed with JDK v1.8.0_60. |
| +1 | unit | 0m 49s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_60. |
| -1 | unit | 50m 20s | hadoop-hdfs in the patch failed with JDK v1.7.0_79. |
| +1 | unit | 0m 54s | hadoop-hdfs-client in the patch passed with JDK v1.7.0_79. |
| -1 | asflicense | 0m 20s | Patch generated 56 ASF License warnings. |
| | | 130m 38s | |

|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
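For context on the issue title above: "unnecessary DNS resolution" means a proxy-creation path that triggers a fresh hostname lookup every time, even when the address has already been resolved or resolution can be deferred. A minimal sketch of the avoidance idea, with hypothetical names (this is not the actual Hadoop code):

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: build the NameNode address once and cache it,
// instead of triggering a DNS lookup on every proxy creation.
// All names here are hypothetical simplifications, not HDFS internals.
public class ResolvedAddressCache {
    private static final Map<String, InetSocketAddress> CACHE = new ConcurrentHashMap<>();

    // createUnresolved() performs no DNS lookup; resolution is deferred
    // until a connection actually needs it.
    public static InetSocketAddress get(String host, int port) {
        return CACHE.computeIfAbsent(host + ":" + port,
                k -> InetSocketAddress.createUnresolved(host, port));
    }
}
```

The key contrast is with `new InetSocketAddress(host, port)`, which resolves eagerly in its constructor; `createUnresolved` plus a cache avoids repeated lookups for the same target.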
[jira] [Commented] (HDFS-9007) Fix HDFS Balancer to honor upgrade domain policy
[ https://issues.apache.org/jira/browse/HDFS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990444#comment-14990444 ] Hudson commented on HDFS-9007:
------------------------------
SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #639 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/639/])
HDFS-9007. Fix HDFS Balancer to honor upgrade domain policy. (Ming Ma (lei: rev ec414600ede8e305c584818565b50e055ea5d2b5)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java

> Fix HDFS Balancer to honor upgrade domain policy
> ------------------------------------------------
>
> Key: HDFS-9007
> URL:
https://issues.apache.org/jira/browse/HDFS-9007
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ming Ma
> Assignee: Ming Ma
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9007-2.patch, HDFS-9007-branch-2.patch, HDFS-9007.patch
>
> In the current design of the HDFS Balancer, it doesn't use the BlockPlacementPolicy used by the namenode at runtime. Instead, it has somewhat redundant code to make sure block allocation conforms to the rack policy.
> When the namenode uses an upgrade-domain-based policy, we need to make sure that the HDFS Balancer doesn't move blocks in a way that could violate the upgrade domain block placement policy.
> In the longer term, we should consider how to make the Balancer independent of the actual BlockPlacementPolicy, as in HDFS-1431.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
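The description above can be summed up as: delegate the "is this move safe?" question to a pluggable placement policy instead of hard-coding the rack rule inside the Balancer. A simplified sketch of that idea follows; every name in it is a hypothetical stand-in, not the real BlockPlacementPolicy API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the idea behind HDFS-9007: a balancer asks a pluggable policy
// whether moving a replica keeps placement valid, rather than re-implementing
// the rule itself. Names are hypothetical simplifications.
interface PlacementPolicy {
    boolean isMovable(List<String> currentNodes, String source, String target);
}

// An upgrade-domain flavored policy: a move is allowed only if the target's
// upgrade domain is not already occupied by another remaining replica.
class UpgradeDomainPolicy implements PlacementPolicy {
    private String domainOf(String node) {
        return node.split("/")[0]; // e.g. "ud1/host-a" -> "ud1"
    }

    @Override
    public boolean isMovable(List<String> currentNodes, String source, String target) {
        List<String> after = new ArrayList<>(currentNodes);
        after.remove(source); // the moved replica leaves the source node
        String targetDomain = domainOf(target);
        for (String n : after) {
            if (domainOf(n).equals(targetDomain)) {
                return false; // would put two replicas in one upgrade domain
            }
        }
        return true;
    }
}
```

With this shape, swapping in a rack policy or a node-group policy changes only the `PlacementPolicy` implementation, which is the decoupling the last paragraph of the description asks for.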
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990446#comment-14990446 ] Hudson commented on HDFS-8855:
------------------------------
SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #639 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/639/])
Revert "HDFS-8855. Webhdfs client leaks active NameNode connections. (wheat9: rev 88beb46cf6e6fd3e51f73a411a2750de7595e326)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/DataNodeUGIProvider.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestDataNodeUGIProvider.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java

> Webhdfs client leaks active NameNode connections
> ------------------------------------------------
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Reporter: Bob Hansen
> Assignee: Xiaobing Zhou
> Fix For: 2.8.0
>
> Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, HDFS-8855.4.patch, HDFS_8855.prototype.patch
>
> The attached script simulates a process opening ~50 files via webhdfs and performing random reads. Note that there are at most 50 concurrent reads, and all webhdfs sessions are kept open. Each read is ~64k at a random position.
> The script periodically (once per second) shells into the NameNode and produces a summary of the socket states.
For my test cluster with 5 nodes, it took ~30 seconds for the NameNode to reach ~25000 active connections and fail.
> It appears that each request to the webhdfs client is opening a new connection to the NameNode and keeping it open after the request is complete.
> If the process continues to run, eventually (~30-60 seconds) all of the open connections are closed and the NameNode recovers.
> This smells like SoftReference reaping. Are we using SoftReferences in the webhdfs client to cache NameNode connections but never re-using them?
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
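The leak pattern hypothesized above, a cache that is written on every request but never consulted, can be shown in a few lines. This is a toy illustration of the failure mode and its fix, not the actual webhdfs code; `Conn` stands in for a real socket-backed connection:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the leak described in the report: the "leaky" path opens a
// connection on every call and only ever writes the cache, so each call
// orphans the previous entry's socket. The "reusing" path checks first.
public class ConnectionCache {
    static final AtomicInteger OPENED = new AtomicInteger();

    static class Conn {
        Conn() { OPENED.incrementAndGet(); } // stands in for opening a socket
    }

    private final Map<String, Conn> cache = new ConcurrentHashMap<>();

    // Buggy shape: opens unconditionally, then stores -- the old entry
    // (and the socket it holds) is simply dropped.
    Conn getLeaky(String nameNode) {
        Conn c = new Conn();
        cache.put(nameNode, c);
        return c;
    }

    // Fixed shape: open only on a cache miss, reuse otherwise.
    Conn getReusing(String nameNode) {
        return cache.computeIfAbsent(nameNode, k -> new Conn());
    }
}
```

Under the report's workload (50 readers hammering one NameNode), the leaky shape opens one connection per request while the fixed shape opens one total, which matches the ~25000-connection pile-up described above.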
[jira] [Commented] (HDFS-9357) NN UI renders icons of decommissioned DN incorrectly
[ https://issues.apache.org/jira/browse/HDFS-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990447#comment-14990447 ] Hudson commented on HDFS-9357:
------------------------------
SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #639 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/639/])
HDFS-9357. NN UI renders icons of decommissioned DN incorrectly. (wheat9: rev 0eed886a165f5a0850ddbfb1d5f98c7b5e379fb3)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/hadoop.css

> NN UI renders icons of decommissioned DN incorrectly
> ----------------------------------------------------
>
> Key: HDFS-9357
> URL: https://issues.apache.org/jira/browse/HDFS-9357
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Archana T
> Assignee: Surendra Singh Lilhore
> Priority: Critical
> Fix For: 2.8.0
>
> Attachments: Decommissioned_Dead_Fixed.PNG, Decommissioned_Fixed.PNG, HDFS-9357.001.patch, HDFS-9357.001.patch, decommisioned_n_dead_.png, decommissioned_.png
>
> The NN UI does not show which DN is "Decommissioned" and which is "Decommissioned & dead".
> Root cause: the "Decommissioned" and "Decommissioned & dead" icons are not rendered on the NN UI. When a DN is in "Decommissioned" or "Decommissioned & dead" status, that status is not reflected on the NN UI.
> DN status is as below:
> hdfs dfsadmin -report
> Name: 10.xx.xx.xx1:50076 (host-xx1)
> Hostname: host-xx
> Decommission Status : Decommissioned
> Configured Capacity: 230501634048 (214.67 GB)
> DFS Used: 36864 (36 KB)
> Dead datanodes (1):
> Name: 10.xx.xx.xx2:50076 (host-xx2)
> Hostname: host-xx
> Decommission Status : Decommissioned
> The same is not reflected on the NN UI. NN UI snapshots are attached.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9371) Code cleanup for DatanodeManager
[ https://issues.apache.org/jira/browse/HDFS-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-9371: Attachment: HDFS-9371.000.patch > Code cleanup for DatanodeManager > > > Key: HDFS-9371 > URL: https://issues.apache.org/jira/browse/HDFS-9371 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-9371.000.patch > > > Some code cleanup for DatanodeManager. The main changes include: > # make the synchronization of {{datanodeMap}} and > {{datanodesSoftwareVersions}} consistent > # remove unnecessary lock in {{handleHeartbeat}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
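The first cleanup point above, making the synchronization of two related maps consistent, is worth a small illustration: when two maps must stay in step, every update and every cross-map read should happen under one shared lock. The class below is a hypothetical simplification, not the actual DatanodeManager:

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of consistent synchronization across two maps (hypothetical
// names, loosely modeled on datanodeMap / datanodesSoftwareVersions): all
// access goes through methods synchronized on the same monitor, so a reader
// can never observe one map updated and the other not.
public class NodeRegistry {
    private final Map<String, String> nodes = new HashMap<>();          // uuid -> host
    private final Map<String, Integer> versionCounts = new HashMap<>(); // version -> node count

    public synchronized void register(String uuid, String host, String version) {
        nodes.put(uuid, host);
        versionCounts.merge(version, 1, Integer::sum);
    }

    public synchronized void unregister(String uuid, String version) {
        nodes.remove(uuid);
        // decrement the count, dropping the entry when it reaches zero
        versionCounts.computeIfPresent(version, (v, c) -> c > 1 ? c - 1 : null);
    }

    // Reads both maps under the same lock, so the two views always agree.
    public synchronized boolean consistent() {
        int total = versionCounts.values().stream().mapToInt(Integer::intValue).sum();
        return total == nodes.size();
    }
}
```

If the two maps were guarded by different locks (or one not guarded at all), `consistent()` could transiently observe a node that has been registered in one map but not yet counted in the other, which is the inconsistency the cleanup targets.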
[jira] [Updated] (HDFS-9377) Fix findbugs warnings in FSDirSnapshotOp
[ https://issues.apache.org/jira/browse/HDFS-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-9377:
Hadoop Flags: Reviewed
Target Version/s: 2.8.0
Component/s: namenode

+1 pending the Jenkins run. Thank you, Mingliang.

> Fix findbugs warnings in FSDirSnapshotOp
> ----------------------------------------
>
> Key: HDFS-9377
> URL: https://issues.apache.org/jira/browse/HDFS-9377
> Project: Hadoop HDFS
> Issue Type: Task
> Components: namenode
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-9377.000.patch
>
> I ran findbugs version 3.0.1 and found a findbugs warning in {{FSDirSnapshotOp}}, introduced by [HDFS-9231]. It's caused by an unused variable and the fix is simple.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9282) Make data directory count and storage raw capacity related tests FsDataset-agnostic
[ https://issues.apache.org/jira/browse/HDFS-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990521#comment-14990521 ] Tony Wu commented on HDFS-9282:
-------------------------------
Thanks [~eddyxu] for catching this. I have incorporated the comments in the latest patch.

> Make data directory count and storage raw capacity related tests FsDataset-agnostic
> -----------------------------------------------------------------------------------
>
> Key: HDFS-9282
> URL: https://issues.apache.org/jira/browse/HDFS-9282
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: HDFS, test
> Affects Versions: 2.7.1
> Reporter: Tony Wu
> Assignee: Tony Wu
> Priority: Minor
> Attachments: HDFS-9282.001.patch, HDFS-9282.002.patch, HDFS-9282.003.patch
>
> MiniDFSCluster and several tests have a hard-coded assumption that the underlying storage has 2 data directories (volumes). As HDFS-9188 pointed out, with new FsDataset implementations, these hard-coded assumptions about the number of data directories and the raw capacity of storage may no longer hold.
> We need to extend FsDatasetTestUtils to provide:
> * The number of data directories of the underlying storage per DataNode
> * The raw storage capacity of the underlying storage per DataNode
> * A way for MiniDFSCluster to automatically pick up the correct values
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
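The extension the description asks for amounts to an interface that lets each FsDataset implementation report its own layout, so tests stop hard-coding "2 volumes". A sketch under that assumption; the method names and the default constants here are hypothetical, not the actual FsDatasetTestUtils API:

```java
// Hypothetical sketch of the requested FsDatasetTestUtils extension: the
// dataset implementation, not the test, decides how many data directories
// it has and how much raw capacity each one provides.
interface DatasetTestUtils {
    int getDefaultNumOfDataDirs();  // data directories (volumes) per DataNode
    long getRawCapacityPerVolume(); // raw bytes per volume
}

// A default-backed implementation preserving today's hard-coded assumptions,
// so existing FsDatasetImpl-based tests keep their current behavior.
class DefaultDatasetTestUtils implements DatasetTestUtils {
    @Override
    public int getDefaultNumOfDataDirs() {
        return 2; // the assumption currently baked into the tests
    }

    @Override
    public long getRawCapacityPerVolume() {
        return 2L * 1024 * 1024 * 1024; // illustrative 2 GiB per volume
    }
}
```

A MiniDFSCluster-style harness would then query these methods instead of its own constants, which is the "automatically pick up the correct values" item in the list above.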
[jira] [Commented] (HDFS-9318) considerLoad factor can be improved
[ https://issues.apache.org/jira/browse/HDFS-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990641#comment-14990641 ] Hadoop QA commented on HDFS-9318:
---------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 5s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 15s | trunk passed |
| +1 | compile | 0m 34s | trunk passed with JDK v1.8.0_60 |
| +1 | compile | 0m 32s | trunk passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| -1 | findbugs | 2m 1s | hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs |
| +1 | javadoc | 1m 9s | trunk passed with JDK v1.8.0_60 |
| +1 | javadoc | 1m 52s | trunk passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 39s | the patch passed |
| +1 | compile | 0m 34s | the patch passed with JDK v1.8.0_60 |
| +1 | javac | 0m 34s | the patch passed |
| +1 | compile | 0m 34s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 34s | the patch passed |
| -1 | checkstyle | 0m 16s | Patch generated 1 new checkstyle issues in hadoop-hdfs-project/hadoop-hdfs (total was 423, now 424). |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 10s | the patch passed |
| +1 | javadoc | 1m 12s | the patch passed with JDK v1.8.0_60 |
| +1 | javadoc | 1m 53s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 51m 8s | hadoop-hdfs in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 50m 10s | hadoop-hdfs in the patch failed with JDK v1.7.0_79. |
| -1 | asflicense | 0m 19s | Patch generated 58 ASF License warnings. |
| | | 121m 50s | |

|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions |
| | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
| | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
| JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
| | hadoop.hdfs.TestDFSUpgradeFromImage |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |

|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-04 |
| JIRA Patch URL |
[jira] [Updated] (HDFS-9377) Fix findbugs warnings in FSDirSnapshotOp
[ https://issues.apache.org/jira/browse/HDFS-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9377:
Description:
I ran findbugs version 3.0.1 and found a findbugs warning in {{FSDirSnapshotOp}}, introduced by [HDFS-9231]. It's caused by an unused variable and the fix is simple.
{code:title=findbugsXml.xml}
Dead store to local variable
Dead store to sfi in org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.getSnapshotFiles(FSDirectory, List, String)
At FSDirSnapshotOp.java:[lines 41-279]
In class org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp
In method org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.getSnapshotFiles(FSDirectory, List, String)
Local variable named sfi
At FSDirSnapshotOp.java:[line 175]
{code}
was: I ran findbugs version 3.0.1 and find there is a findbugs warning in {{FSDirSnapshotOp}}, brought by [HDFS-9231]. It's caused by unused variable and the fix is simple.

> Fix findbugs warnings in FSDirSnapshotOp
> ----------------------------------------
>
> Key: HDFS-9377
> URL: https://issues.apache.org/jira/browse/HDFS-9377
> Project: Hadoop HDFS
> Issue Type: Task
> Components: namenode
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-9377.000.patch
>
> I ran findbugs version 3.0.1 and found a findbugs warning in {{FSDirSnapshotOp}}, introduced by [HDFS-9231]. It's caused by an unused variable and the fix is simple.
> {code:title=findbugsXml.xml}
> Dead store to local variable
> Dead store to sfi in org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.getSnapshotFiles(FSDirectory, List, String)
> At FSDirSnapshotOp.java:[lines 41-279]
> In class org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp
> In method org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.getSnapshotFiles(FSDirectory, List, String)
> Local variable named sfi
> At FSDirSnapshotOp.java:[line 175]
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
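As a note on the warning above: findbugs' `DLS_DEAD_LOCAL_STORE` fires when a local variable is assigned a value that is never read again, and the fix is to delete the assignment. A minimal illustration of the pattern (not the actual FSDirSnapshotOp code):

```java
import java.util.Iterator;
import java.util.List;

// Minimal illustration of findbugs DLS_DEAD_LOCAL_STORE: a local variable
// is assigned and then never read. This is a toy example, not the real
// getSnapshotFiles() code.
public class DeadStoreExample {
    // Before: 'first' is stored and never used -- a dead store. (It also
    // needlessly consumes the iterator, which the compiler cannot flag.)
    static int sumBuggy(List<Integer> values) {
        Iterator<Integer> it = values.iterator();
        Integer first = it.next(); // dead store: 'first' is never read
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // After: the unused variable is simply deleted.
    static int sumFixed(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }
}
```

Dead stores are usually harmless leftovers of refactoring, but they can also mask a real bug (a computed value the author meant to use), which is why the checker reports them.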
[jira] [Commented] (HDFS-9231) fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot
[ https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990582#comment-14990582 ] Yongjun Zhang commented on HDFS-9231: - Many thanks [~liuml07], I reviewed and commented in HDFS-9377. > fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot > -- > > Key: HDFS-9231 > URL: https://issues.apache.org/jira/browse/HDFS-9231 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Reporter: Xiao Chen >Assignee: Xiao Chen > Fix For: 2.8.0 > > Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, > HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, > HDFS-9231.006.patch, HDFS-9231.007.patch, HDFS-9231.008.patch, > HDFS-9231.009.patch > > > Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt > blocks with the original file dir instead of the snapshot dir, and {{fsck > -list-corruptfileblocks -includeSnapshots}} behave the same. > This can be confusing because even when the original file is deleted, fsck > will still show that deleted file as corrupted, although what's actually > corrupted is the snapshot. > As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9377) Fix findbugs warnings in FSDirSnapshotOp
[ https://issues.apache.org/jira/browse/HDFS-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990579#comment-14990579 ] Yongjun Zhang commented on HDFS-9377:
-------------------------------------
Thanks [~liuml07] a lot for the good find here! +1 pending jenkins. I wish HADOOP-12517 were fixed soon so we wouldn't miss this kind of issue. Sorry for missing it when I did the review.

> Fix findbugs warnings in FSDirSnapshotOp
> ----------------------------------------
>
> Key: HDFS-9377
> URL: https://issues.apache.org/jira/browse/HDFS-9377
> Project: Hadoop HDFS
> Issue Type: Task
> Components: namenode
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-9377.000.patch
>
> I ran findbugs version 3.0.1 and found a findbugs warning in {{FSDirSnapshotOp}}, introduced by [HDFS-9231]. It's caused by an unused variable and the fix is simple.
> {code:title=findbugsXml.xml}
> Dead store to local variable
> Dead store to sfi in org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.getSnapshotFiles(FSDirectory, List, String)
> At FSDirSnapshotOp.java:[lines 41-279]
> In class org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp
> In method org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.getSnapshotFiles(FSDirectory, List, String)
> Local variable named sfi
> At FSDirSnapshotOp.java:[line 175]
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9371) Code cleanup for DatanodeManager
[ https://issues.apache.org/jira/browse/HDFS-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-9371: Attachment: (was: HDFS-9371.000.patch) > Code cleanup for DatanodeManager > > > Key: HDFS-9371 > URL: https://issues.apache.org/jira/browse/HDFS-9371 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-9371.000.patch > > > Some code cleanup for DatanodeManager. The main changes include: > # make the synchronization of {{datanodeMap}} and > {{datanodesSoftwareVersions}} consistent > # remove unnecessary lock in {{handleHeartbeat}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9282) Make data directory count and storage raw capacity related tests FsDataset-agnostic
[ https://issues.apache.org/jira/browse/HDFS-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Wu updated HDFS-9282: -- Attachment: HDFS-9282.004.patch In v4 patch: * Addressed [~eddyxu]'s review comments (use {{try-with-resources}} to avoid leaking resources). > Make data directory count and storage raw capacity related tests > FsDataset-agnostic > --- > > Key: HDFS-9282 > URL: https://issues.apache.org/jira/browse/HDFS-9282 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Minor > Attachments: HDFS-9282.001.patch, HDFS-9282.002.patch, > HDFS-9282.003.patch, HDFS-9282.004.patch > > > DFSMiniCluster and several tests have hard coded assumption of the underlying > storage having 2 data directories (volumes). As HDFS-9188 pointed out, with > new FsDataset implementations, these hard coded assumption about number of > data directories and raw capacities of storage may change as well. > We need to extend FsDatasetTestUtils to provide: > * Number of data directories of underlying storage per DataNode > * Raw storage capacity of underlying storage per DataNode. > * Have MiniDFSCluster automatically pick up the correct values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
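The try-with-resources change the v4 patch note mentions is worth spelling out: a bare "open; use; close" sequence leaks the resource whenever the body throws before `close()`, while try-with-resources closes the resource on every exit path. A toy resource makes the guarantee visible (this is an illustration of the Java language feature, not the patch itself):

```java
import java.io.IOException;

// Demonstrates why the patch moved to try-with-resources: the resource is
// closed even when the body throws, where a manual close after the throwing
// statement would never run. 'Resource' is a toy stand-in.
public class TryWithResourcesDemo {
    static class Resource implements AutoCloseable {
        static int openCount;
        Resource() { openCount++; }
        @Override public void close() { openCount--; }
    }

    // The body always throws, yet 'r' is still closed on the way out.
    static void useAndFail() throws IOException {
        try (Resource r = new Resource()) {
            throw new IOException("simulated failure");
        }
    }
}
```

After `useAndFail()` propagates its exception, `Resource.openCount` is back to 0, which is exactly the leak-free behavior the review comment asked for.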
[jira] [Updated] (HDFS-9377) Fix findbugs warnings in FSDirSnapshotOp
[ https://issues.apache.org/jira/browse/HDFS-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9377: Status: Patch Available (was: Open) > Fix findbugs warnings in FSDirSnapshotOp > > > Key: HDFS-9377 > URL: https://issues.apache.org/jira/browse/HDFS-9377 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9377.000.patch > > > I ran findbugs version 3.0.1 and find there is a findbugs warning in > {{FSDirSnapshotOp}}, brought by [HDFS-9231]. > It's caused by unused variable and the fix is simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9231) fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot
[ https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990546#comment-14990546 ] Mingliang Liu commented on HDFS-9231: - I found a findbugs warning in {{FSDirSnapshotOp}} probably brought by this patch, tracked by [HDFS-9377]. Please close that jira if it's wrong, or review that otherwise. Thanks. > fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot > -- > > Key: HDFS-9231 > URL: https://issues.apache.org/jira/browse/HDFS-9231 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Reporter: Xiao Chen >Assignee: Xiao Chen > Fix For: 2.8.0 > > Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, > HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, > HDFS-9231.006.patch, HDFS-9231.007.patch, HDFS-9231.008.patch, > HDFS-9231.009.patch > > > Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt > blocks with the original file dir instead of the snapshot dir, and {{fsck > -list-corruptfileblocks -includeSnapshots}} behave the same. > This can be confusing because even when the original file is deleted, fsck > will still show that deleted file as corrupted, although what's actually > corrupted is the snapshot. > As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9231) fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot
[ https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990628#comment-14990628 ] Xiao Chen commented on HDFS-9231: - Thanks [~liuml07] and [~yzhangal], and sorry for the findbugs miss. > fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot > -- > > Key: HDFS-9231 > URL: https://issues.apache.org/jira/browse/HDFS-9231 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Reporter: Xiao Chen >Assignee: Xiao Chen > Fix For: 2.8.0 > > Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, > HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, > HDFS-9231.006.patch, HDFS-9231.007.patch, HDFS-9231.008.patch, > HDFS-9231.009.patch > > > Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt > blocks with the original file dir instead of the snapshot dir, and {{fsck > -list-corruptfileblocks -includeSnapshots}} behave the same. > This can be confusing because even when the original file is deleted, fsck > will still show that deleted file as corrupted, although what's actually > corrupted is the snapshot. > As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9103) Retry reads on DN failure
[ https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990464#comment-14990464 ] James Clampffer commented on HDFS-9103: --- The thumbs down was supposed to be a log base 2. One of these days I'll learn how to use jira markdown or at least what character sequences I need to escape. Also figured out that the DNs are identified by uuid, I forgot about that. That's a bummer because it won't fit in the typical std::string small string optimization space. > Retry reads on DN failure > - > > Key: HDFS-9103 > URL: https://issues.apache.org/jira/browse/HDFS-9103 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: James Clampffer > Fix For: HDFS-8707 > > Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch, > HDFS-9103.HDFS-8707.006.patch, HDFS-9103.HDFS-8707.3.patch, > HDFS-9103.HDFS-8707.4.patch, HDFS-9103.HDFS-8707.5.patch > > > When AsyncPreadSome fails, add the failed DataNode to the excluded list and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9377) Fix findbugs warnings in FSDirSnapshotOp
[ https://issues.apache.org/jira/browse/HDFS-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990634#comment-14990634 ] Xiao Chen commented on HDFS-9377: - Thanks very much for the work guys! I added a link to the original JIRA, sorry for breaking this. > Fix findbugs warnings in FSDirSnapshotOp > > > Key: HDFS-9377 > URL: https://issues.apache.org/jira/browse/HDFS-9377 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9377.000.patch > > > I ran findbugs version 3.0.1 and find there is a findbugs warning in > {{FSDirSnapshotOp}}, brought by [HDFS-9231]. > It's caused by unused variable and the fix is simple. > {code:title=findbugsXml.xml} > instanceHash="e553cd68a81bb1d8aaf6eba15aa9d322" instanceOccurrenceNum="0" > priority="2" abbrev="DLS" type="DLS_DEAD_LOCAL_STORE" cweid="563" > instanceOccurrenceMax="0"> > Dead store to local variable > > Dead store to sfi in > org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.getSnapshotFiles(FSDirectory, > List, String) > > primary="true"> > classname="org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp" > sourcepath="org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java" > sourcefile="FSDirSnapshotOp.java" end="279"> > At FSDirSnapshotOp.java:[lines 41-279] > > > In class org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp > > > classname="org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp" > name="getSnapshotFiles" primary="true" > signature="(Lorg/apache/hadoop/hdfs/server/namenode/FSDirectory;Ljava/util/List;Ljava/lang/String;)Ljava/util/Collection;"> > classname="org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp" > sourcepath="org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java" > sourcefile="FSDirSnapshotOp.java" end="197"/> > > In method > org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp.getSnapshotFiles(FSDirectory, > List, String) > > > > Local variable named sfi > > 
classname="org.apache.hadoop.hdfs.server.namenode.FSDirSnapshotOp" > primary="true" > sourcepath="org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java" > sourcefile="FSDirSnapshotOp.java" end="175"> > At FSDirSnapshotOp.java:[line 175] > > name="edu.umd.cs.findbugs.detect.DeadLocalStoreProperty.METHOD_RESULT" > value="true"/> > value="sfi"/> > value="true"/> > > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9103) Retry reads on DN failure
[ https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990381#comment-14990381 ] James Clampffer commented on HDFS-9103: --- "Every node in the set requires a separate heap allocation. They might scatter around the address space. vector is on the heap as well but it guarantees continuous amount of memory. Some implementation of string has inlined buffer which has much better cache locality results." I agree with both of those statements. Do datanode IDs have a max length, or is it just a uuid? SSO is usually limited to 23 chars on 64-bit machines. Intel processors have 8-way associative caches for now so I'm not terribly worried about address space fragmentation. The processor has to try a little harder because it's not a simple linear prefetch to scoop up the vector anymore, but superscalar pipelines have multiple load units :) I think I might try out a more architectural fix to side-step this whole problem (which is why I haven't put a patch up yet). How about passing a function "IsDead(const std::string& dn)" through the InputStream down to the block reader? My current approach of generating a new set or vector of bad nodes on every call is terribly inefficient. Even if SSO kicked in and it boiled down to a memcpy, there's still a smallish heap allocation for every GetNodesToExclude call. Passing down a function avoids keeping the redundant copies in cache to begin with. I'd change BadDataNodeTracker::bad_datanodes_ to a map (this is what it always should have been; not sure why I thought a set of pairs keyed by pair::first was a good idea...). The IsDead function would grab the update lock, which is usually implemented as a CAS in userspace, and do an O(log n) map lookup. In my experience, the log2(smallish number) pointer indirections of that std::map lookup shouldn't come close to bottlenecking anything. Do you see any obvious issues with this approach? 
"Checking with the code kResourceUnavailable is only for the NN cannot find any DNs to serve this data. I don't think we'll need to handle this case when excluding the DNs." Thanks for the info. I was hoping this was the case but wasn't sure if I was missing something that would be added soon. > Retry reads on DN failure > - > > Key: HDFS-9103 > URL: https://issues.apache.org/jira/browse/HDFS-9103 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: James Clampffer > Fix For: HDFS-8707 > > Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch, > HDFS-9103.HDFS-8707.006.patch, HDFS-9103.HDFS-8707.3.patch, > HDFS-9103.HDFS-8707.4.patch, HDFS-9103.HDFS-8707.5.patch > > > When AsyncPreadSome fails, add the failed DataNode to the excluded list and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9363) Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica.
[ https://issues.apache.org/jira/browse/HDFS-9363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990445#comment-14990445 ] Hudson commented on HDFS-9363: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #639 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/639/]) HDFS-9363. Add fetchReplica to FsDatasetTestUtils to return (lei: rev 5667129276c3123ecb0a96b78d5897431c47a9d5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica. > -- > > Key: HDFS-9363 > URL: https://issues.apache.org/jira/browse/HDFS-9363 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Minor > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9363.001.patch > > > {{FsDatasetTestUtils()}} abstracts away the details in {{FsDataset}} to allow > writing generic tests regardless of underlying {{FsDataset}} implementations. > We can add a {{fetchReplica()}} method to allow some HDFS tests to avoid > using {{FsDatasetTestUtil#fetchReplicaInfo()}}, which assumes FsDatasetImpl > is the only implementation of FsDataset. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9331) Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem entirely allocated for DFS use
[ https://issues.apache.org/jira/browse/HDFS-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990449#comment-14990449 ] Hudson commented on HDFS-9331: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #639 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/639/]) HDFS-9331. Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account (lei: rev e2a5441b062fd0758138079d24a2740fc5e5e350) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java > Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem > entirely allocated for DFS use > --- > > Key: HDFS-9331 > URL: https://issues.apache.org/jira/browse/HDFS-9331 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Trivial > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9331.001.patch > > > {{TestNameNodeMXBean#testNameNodeMXBeanInfo}} expects a none-zero nonDFS > size. The nonDFS size is defined as: > {quote} > The space that is not used by HDFS. For instance, once you format a new disk > to ext4, certain space is used for "lost-and-found" directory and ext4 > metadata. > {quote} > It will be possible to fully allocate all spaces in a filesystem for DFS use. > In which case the nonDFS size will be zero. We can relax the check in the > test to account for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9103) Retry reads on DN failure
[ https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990478#comment-14990478 ] Haohui Mai commented on HDFS-9103: -- Thanks for the investigation. I'm okay with either approach. Please feel free to pick the simpler approach to implement. :-D > Retry reads on DN failure > - > > Key: HDFS-9103 > URL: https://issues.apache.org/jira/browse/HDFS-9103 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: James Clampffer > Fix For: HDFS-8707 > > Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch, > HDFS-9103.HDFS-8707.006.patch, HDFS-9103.HDFS-8707.3.patch, > HDFS-9103.HDFS-8707.4.patch, HDFS-9103.HDFS-8707.5.patch > > > When AsyncPreadSome fails, add the failed DataNode to the excluded list and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9328) Formalize coding standards for libhdfs++ and put them in a README.txt
[ https://issues.apache.org/jira/browse/HDFS-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer reassigned HDFS-9328: - Assignee: James Clampffer (was: Haohui Mai) > Formalize coding standards for libhdfs++ and put them in a README.txt > - > > Key: HDFS-9328 > URL: https://issues.apache.org/jira/browse/HDFS-9328 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer >Priority: Blocker > > We have 2-3 people working on this project full time and hopefully more > people will start contributing. In order to efficiently scale we need a > single, easy to find, place where developers can check to make sure they are > following the coding standards of this project to both save their time and > save the time of people doing code reviews. > The most practical place to do this seems like a README file in libhdfspp/. > The foundation of the standards is google's C++ guide found here: > https://google-styleguide.googlecode.com/svn/trunk/cppguide.html > Any exceptions to google's standards or additional restrictions need to be > explicitly enumerated so there is one single point of reference for all > libhdfs++ code standards. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9377) Fix findbugs warnings in FSDirSnapshotOp
[ https://issues.apache.org/jira/browse/HDFS-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9377: Attachment: HDFS-9377.000.patch > Fix findbugs warnings in FSDirSnapshotOp > > > Key: HDFS-9377 > URL: https://issues.apache.org/jira/browse/HDFS-9377 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9377.000.patch > > > I ran findbugs version 3.0.1 and found there is a findbugs warning in > {{FSDirSnapshotOp}}, brought by [HDFS-9231]. > It's caused by an unused variable and the fix is simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9377) Fix findbugs warnings in FSDirSnapshotOp
Mingliang Liu created HDFS-9377: --- Summary: Fix findbugs warnings in FSDirSnapshotOp Key: HDFS-9377 URL: https://issues.apache.org/jira/browse/HDFS-9377 Project: Hadoop HDFS Issue Type: Task Reporter: Mingliang Liu Assignee: Mingliang Liu Attachments: HDFS-9377.000.patch I ran findbugs version 3.0.1 and found there is a findbugs warning in {{FSDirSnapshotOp}}, brought by [HDFS-9231]. It's caused by an unused variable and the fix is simple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9007) Fix HDFS Balancer to honor upgrade domain policy
[ https://issues.apache.org/jira/browse/HDFS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990593#comment-14990593 ] Hudson commented on HDFS-9007: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2569 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2569/]) HDFS-9007. Fix HDFS Balancer to honor upgrade domain policy. (Ming Ma (lei: rev ec414600ede8e305c584818565b50e055ea5d2b5) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java > Fix HDFS Balancer to honor upgrade domain policy > > > Key: HDFS-9007 > URL: 
https://issues.apache.org/jira/browse/HDFS-9007 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9007-2.patch, HDFS-9007-branch-2.patch, > HDFS-9007.patch > > > In the current design of HDFS Balancer, it doesn't use BlockPlacementPolicy > used by namenode runtime. Instead, it has somewhat redundant code to make > sure block allocation conforms with the rack policy. > When namenode uses upgrade domain based policy, we need to make sure that > HDFS balancer doesn't move blocks in a way that could violate upgrade domain > block placement policy. > In the longer term, we should consider how to make Balancer independent of > the actual BlockPlacementPolicy as in HDFS-1431. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9357) NN UI renders icons of decommissioned DN incorrectly
[ https://issues.apache.org/jira/browse/HDFS-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990596#comment-14990596 ] Hudson commented on HDFS-9357: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2569 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2569/]) HDFS-9357. NN UI renders icons of decommissioned DN incorrectly. (wheat9: rev 0eed886a165f5a0850ddbfb1d5f98c7b5e379fb3) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/hadoop.css > NN UI renders icons of decommissioned DN incorrectly > > > Key: HDFS-9357 > URL: https://issues.apache.org/jira/browse/HDFS-9357 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Critical > Fix For: 2.8.0 > > Attachments: Decommissioned_Dead_Fixed.PNG, Decommissioned_Fixed.PNG, > HDFS-9357.001.patch, HDFS-9357.001.patch, decommisioned_n_dead_.png, > decommissioned_.png > > > NN UI is not showing which DN is "Decommissioned "and "Decommissioned & dead" > Root Cause -- > "Decommissioned" and "Decommissioned & dead" icon not reflected on NN UI > When DN is in Decommissioned status or in "Decommissioned & dead" status, > same status is not reflected on NN UI > DN status is as below -- > hdfs dfsadmin -report > Name: 10.xx.xx.xx1:50076 (host-xx1) > Hostname: host-xx > Decommission Status : Decommissioned > Configured Capacity: 230501634048 (214.67 GB) > DFS Used: 36864 (36 KB) > Dead datanodes (1): > Name: 10.xx.xx.xx2:50076 (host-xx2) > Hostname: host-xx > Decommission Status : Decommissioned > Same is not reflected on NN UI. > Attached NN UI snapshots for the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9331) Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem entirely allocated for DFS use
[ https://issues.apache.org/jira/browse/HDFS-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990598#comment-14990598 ] Hudson commented on HDFS-9331: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2569 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2569/]) HDFS-9331. Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account (lei: rev e2a5441b062fd0758138079d24a2740fc5e5e350) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java > Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem > entirely allocated for DFS use > --- > > Key: HDFS-9331 > URL: https://issues.apache.org/jira/browse/HDFS-9331 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Trivial > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9331.001.patch > > > {{TestNameNodeMXBean#testNameNodeMXBeanInfo}} expects a none-zero nonDFS > size. The nonDFS size is defined as: > {quote} > The space that is not used by HDFS. For instance, once you format a new disk > to ext4, certain space is used for "lost-and-found" directory and ext4 > metadata. > {quote} > It will be possible to fully allocate all spaces in a filesystem for DFS use. > In which case the nonDFS size will be zero. We can relax the check in the > test to account for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9363) Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica.
[ https://issues.apache.org/jira/browse/HDFS-9363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990594#comment-14990594 ] Hudson commented on HDFS-9363: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2569 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2569/]) HDFS-9363. Add fetchReplica to FsDatasetTestUtils to return (lei: rev 5667129276c3123ecb0a96b78d5897431c47a9d5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java > Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica. > -- > > Key: HDFS-9363 > URL: https://issues.apache.org/jira/browse/HDFS-9363 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Minor > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9363.001.patch > > > {{FsDatasetTestUtils()}} abstracts away the details in {{FsDataset}} to allow > writing generic tests regardless of underlying {{FsDataset}} implementations. > We can add a {{fetchReplica()}} method to allow some HDFS tests to avoid > using {{FsDatasetTestUtil#fetchReplicaInfo()}}, which assumes FsDatasetImpl > is the only implementation of FsDataset. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990595#comment-14990595 ] Hudson commented on HDFS-8855: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2569 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2569/]) Revert "HDFS-8855. Webhdfs client leaks active NameNode connections. (wheat9: rev 88beb46cf6e6fd3e51f73a411a2750de7595e326) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestDataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/DataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Fix For: 2.8.0 > > Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, > HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. 
For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9363) Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica.
[ https://issues.apache.org/jira/browse/HDFS-9363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990651#comment-14990651 ] Hudson commented on HDFS-9363: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #1362 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1362/]) HDFS-9363. Add fetchReplica to FsDatasetTestUtils to return (lei: rev 5667129276c3123ecb0a96b78d5897431c47a9d5) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java > Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica. > -- > > Key: HDFS-9363 > URL: https://issues.apache.org/jira/browse/HDFS-9363 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Minor > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9363.001.patch > > > {{FsDatasetTestUtils()}} abstracts away the details in {{FsDataset}} to allow > writing generic tests regardless of underlying {{FsDataset}} implementations. > We can add a {{fetchReplica()}} method to allow some HDFS tests to avoid > using {{FsDatasetTestUtil#fetchReplicaInfo()}}, which assumes FsDatasetImpl > is the only implementation of FsDataset. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9331) Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem entirely allocated for DFS use
[ https://issues.apache.org/jira/browse/HDFS-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990654#comment-14990654 ] Hudson commented on HDFS-9331: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #1362 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1362/]) HDFS-9331. Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account (lei: rev e2a5441b062fd0758138079d24a2740fc5e5e350) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem > entirely allocated for DFS use > --- > > Key: HDFS-9331 > URL: https://issues.apache.org/jira/browse/HDFS-9331 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Trivial > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9331.001.patch > > > {{TestNameNodeMXBean#testNameNodeMXBeanInfo}} expects a none-zero nonDFS > size. The nonDFS size is defined as: > {quote} > The space that is not used by HDFS. For instance, once you format a new disk > to ext4, certain space is used for "lost-and-found" directory and ext4 > metadata. > {quote} > It will be possible to fully allocate all spaces in a filesystem for DFS use. > In which case the nonDFS size will be zero. We can relax the check in the > test to account for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9231) fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot
[ https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990661#comment-14990661 ] Mingliang Liu commented on HDFS-9231: - Thanks for your quick reply, [~yzhangal] and [~xiaochen]. Actually the Jenkins test-patch report is not showing this findbugs warning and it's understood to miss it. I also hope the [HADOOP-12517] be resolved soon, if there is any fundamental issue. > fsck doesn't list correct file path when Bad Replicas/Blocks are in a snapshot > -- > > Key: HDFS-9231 > URL: https://issues.apache.org/jira/browse/HDFS-9231 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Reporter: Xiao Chen >Assignee: Xiao Chen > Fix For: 2.8.0 > > Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, > HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, > HDFS-9231.006.patch, HDFS-9231.007.patch, HDFS-9231.008.patch, > HDFS-9231.009.patch > > > Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt > blocks with the original file dir instead of the snapshot dir, and {{fsck > -list-corruptfileblocks -includeSnapshots}} behave the same. > This can be confusing because even when the original file is deleted, fsck > will still show that deleted file as corrupted, although what's actually > corrupted is the snapshot. > As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9369) Use ctest to run tests for hadoop-hdfs-native-client
[ https://issues.apache.org/jira/browse/HDFS-9369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990736#comment-14990736 ] Haohui Mai commented on HDFS-9369: -- It looks like there is no need to change the patch. The information is already available at {{target/Testing/Temporary/LastTest.log}}. > Use ctest to run tests for hadoop-hdfs-native-client > > > Key: HDFS-9369 > URL: https://issues.apache.org/jira/browse/HDFS-9369 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Attachments: HDFS-9369.000.patch > > > Currently we write special rules in {{pom.xml}} to run tests in > {{hadoop-hdfs-native-client}}. This jira proposes to run these tests using > ctest to simplify {{pom.xml}} and improve portability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9103) Retry reads on DN failure
[ https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-9103: -- Attachment: HDFS-9103.HDFS-8707.007.patch New patch; there's a bit of extra noise due to clang-format hitting a few files that hadn't had it before. Addressing Haohui's batch of concerns in order: -That name_match function isn't needed after switching bad_datanodes_ to a map -Got rid of BadDataNodeTracker::GetNodesToExclude and added an IsBadNode method instead. The InputStream takes a shared_ptr to the BadDataNodeTracker and calls IsBadNode directly; this should get rid of any need for caching, as it gets rid of a lot of copies and other work making sets of strings. -Got rid of BadDataNodeTracker::Clear entirely and changed the tests so that BadDataNodeTracker is scoped by test function. This avoids issues with possibly carrying state between tests. -Added a datanode exclusion duration to the Option class with a default of 10 minutes. Switched time units to milliseconds to be consistent. Is there a standard name for this? I didn't see anything in the options used for hdfs-sites.xml. -Switched from system_clock to steady_clock to make sure time is always monotonically increasing. -I think the way I rearranged the code that this comment referred to simplified it. If it's not, please let me know what exactly needs to be simplified. -Made ShouldExclude a static method of InputStream and got rid of the duplicate used by the gmock test. 
> Retry reads on DN failure > - > > Key: HDFS-9103 > URL: https://issues.apache.org/jira/browse/HDFS-9103 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: James Clampffer > Fix For: HDFS-8707 > > Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch, > HDFS-9103.HDFS-8707.006.patch, HDFS-9103.HDFS-8707.007.patch, > HDFS-9103.HDFS-8707.3.patch, HDFS-9103.HDFS-8707.4.patch, > HDFS-9103.HDFS-8707.5.patch > > > When AsyncPreadSome fails, add the failed DataNode to the excluded list and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9360) Storage type usage isn't updated properly after file deletion
[ https://issues.apache.org/jira/browse/HDFS-9360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990731#comment-14990731 ] Xiaoyu Yao commented on HDFS-9360: -- Patch v002 looks good to me. The test failures and findbugs issues are unrelated to this patch. I will commit it shortly. > Storage type usage isn't updated properly after file deletion > - > > Key: HDFS-9360 > URL: https://issues.apache.org/jira/browse/HDFS-9360 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-9360-2.patch, HDFS-9360.patch > > > For a directory that doesn't have any storage policy defined, its storage > quota usage is deducted when a file is deleted (addBlock skips the storage quota > usage update in such a case). This results in a negative value for storage quota > usage. Later, after applications set the storage policy and storage type > quota, this allows the applications to use more than their storage type quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
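The asymmetry described in HDFS-9360 can be reduced to a toy counter (all names and numbers here are made up; the real accounting lives in the NN's quota code): addBlock skips the storage-type update when no policy is set, while delete always deducts, so the counter goes negative and a later quota check becomes too permissive.

```java
// Illustrative only: toy counter showing how asymmetric add/delete updates
// drive storage-type usage negative, as described in HDFS-9360.
public class StorageTypeUsageDemo {
    long ssdUsage = 0;          // tracked usage for one storage type
    boolean policySet = false;  // no storage policy defined yet

    void addBlock(long bytes) {
        if (policySet) {        // addBlock skips the update when no policy is set
            ssdUsage += bytes;
        }
    }

    void deleteBlock(long bytes) {
        ssdUsage -= bytes;      // delete always deducts -> usage can go negative
    }

    boolean withinQuota(long quota, long newBytes) {
        return ssdUsage + newBytes <= quota;
    }
}
```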
[jira] [Commented] (HDFS-9363) Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica.
[ https://issues.apache.org/jira/browse/HDFS-9363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990727#comment-14990727 ] Hudson commented on HDFS-9363: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #628 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/628/]) HDFS-9363. Add fetchReplica to FsDatasetTestUtils to return (lei: rev 5667129276c3123ecb0a96b78d5897431c47a9d5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java > Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica. > -- > > Key: HDFS-9363 > URL: https://issues.apache.org/jira/browse/HDFS-9363 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Minor > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9363.001.patch > > > {{FsDatasetTestUtils()}} abstracts away the details in {{FsDataset}} to allow > writing generic tests regardless of underlying {{FsDataset}} implementations. > We can add a {{fetchReplica()}} method to allow some HDFS tests to avoid > using {{FsDatasetTestUtil#fetchReplicaInfo()}}, which assumes FsDatasetImpl > is the only implementation of FsDataset. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990728#comment-14990728 ] Hudson commented on HDFS-8855: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #628 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/628/]) Revert "HDFS-8855. Webhdfs client leaks active NameNode connections. (wheat9: rev 88beb46cf6e6fd3e51f73a411a2750de7595e326) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestDataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/DataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Fix For: 2.8.0 > > Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, > HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. 
For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to reach ~25000 active connections and > fail. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
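The anti-pattern the reporter speculates about can be illustrated with purely hypothetical code (this is not the actual webhdfs client): a SoftReference cache that is written to but never consulted still opens one connection per request, while consulting it first reuses the live connection.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the suspected bug: a SoftReference cache
// that is populated but never checked, so every request opens a fresh
// connection anyway. All names here are made up.
public class LeakyConnectionCache {
    static int connectionsOpened = 0;

    static class Connection { Connection() { connectionsOpened++; } }

    private final Map<String, SoftReference<Connection>> cache = new HashMap<>();

    /** Buggy: caches the connection but never checks the cache first. */
    Connection getBuggy(String user) {
        Connection c = new Connection();
        cache.put(user, new SoftReference<>(c));
        return c;
    }

    /** Fixed: reuse a cached connection while the soft reference is live. */
    Connection getFixed(String user) {
        SoftReference<Connection> ref = cache.get(user);
        Connection c = (ref == null) ? null : ref.get();
        if (c == null) {
            c = new Connection();
            cache.put(user, new SoftReference<>(c));
        }
        return c;
    }
}
```

The "eventual recovery" behavior in the report is consistent with the soft references finally being reaped and the dangling connections being closed.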
[jira] [Commented] (HDFS-9331) Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem entirely allocated for DFS use
[ https://issues.apache.org/jira/browse/HDFS-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990730#comment-14990730 ] Hudson commented on HDFS-9331: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #628 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/628/]) HDFS-9331. Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account (lei: rev e2a5441b062fd0758138079d24a2740fc5e5e350) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java > Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem > entirely allocated for DFS use > --- > > Key: HDFS-9331 > URL: https://issues.apache.org/jira/browse/HDFS-9331 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Trivial > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9331.001.patch > > > {{TestNameNodeMXBean#testNameNodeMXBeanInfo}} expects a non-zero nonDFS > size. The nonDFS size is defined as: > {quote} > The space that is not used by HDFS. For instance, once you format a new disk > to ext4, certain space is used for "lost-and-found" directory and ext4 > metadata. > {quote} > It is possible to fully allocate all space in a filesystem for DFS use, > in which case the nonDFS size will be zero. We can relax the check in the > test to account for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
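The relaxation argued for above amounts to asserting {{>= 0}} instead of {{> 0}} on the derived nonDFS value. A tiny sketch with made-up numbers (the real computation lives in TestNameNodeMXBean):

```java
// Illustrative: nonDFS is capacity minus DFS-used minus remaining free space.
// When a filesystem is dedicated entirely to DFS, nonDFS can legitimately be
// zero, so the test should assert >= 0 rather than > 0. Numbers are made up.
public class NonDfsCheckDemo {
    static long nonDfs(long capacity, long dfsUsed, long remaining) {
        return capacity - dfsUsed - remaining;
    }
}
```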
[jira] [Commented] (HDFS-9364) Unnecessary DNS resolution attempts when creating NameNodeProxies
[ https://issues.apache.org/jira/browse/HDFS-9364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990800#comment-14990800 ] Zhe Zhang commented on HDFS-9364: - Thanks Xiao for the patch and [~andreina] for the review. Should we add the {{isLogicalUri}} logic to the existing {{getNNAddress(URI filesystemURI)}} method? Is there a downside of always doing this check? It's cleaner to have one util method to handle all cases, assuming the check doesn't harm non-HA cases. Another minor point in the 03 patch is that {{DFSUtilClient.}} is redundant: {code} retAddr = DFSUtilClient.getNNAddress(filesystemURI); {code} > Unnecessary DNS resolution attempts when creating NameNodeProxies > - > > Key: HDFS-9364 > URL: https://issues.apache.org/jira/browse/HDFS-9364 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Xiao Chen >Assignee: Xiao Chen > Attachments: HDFS-9364.001.patch, HDFS-9364.002.patch, > HDFS-9364.003.patch > > > When creating NameNodeProxies, we always try to DNS-resolve namenode URIs. > This is unnecessary if the URI is logical, and may be significantly slow if > the DNS is having problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
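The reviewer's suggestion above — one util method that handles both logical and physical URIs — can be sketched as follows. This is a standalone approximation, not the actual {{DFSUtilClient.getNNAddress}}: a plain Set of nameservice ids stands in for what Hadoop reads from configuration, and the logical case returns an unresolved address so no DNS lookup is ever attempted.

```java
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.Set;

// Standalone approximation of the HDFS-9364 idea: one helper that checks
// for a logical (HA nameservice) authority before attempting DNS resolution.
// The nameservice set is a stand-in for Hadoop configuration.
public class NNAddressDemo {
    static final int DEFAULT_PORT = 8020;

    static InetSocketAddress getNNAddress(URI fsUri, Set<String> nameservices) {
        String host = fsUri.getHost();
        if (nameservices.contains(host)) {
            // Logical URI: "host" is a nameservice id, not a DNS name.
            // Return it unresolved instead of attempting (and possibly
            // stalling on) DNS resolution.
            return InetSocketAddress.createUnresolved(host, DEFAULT_PORT);
        }
        int port = fsUri.getPort() == -1 ? DEFAULT_PORT : fsUri.getPort();
        return new InetSocketAddress(host, port); // triggers DNS resolution
    }
}
```

Folding the check into the single helper is harmless for non-HA URIs, since a physical authority never matches a configured nameservice id.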
[jira] [Commented] (HDFS-9378) hadoop-hdfs-client tests do not write logs.
[ https://issues.apache.org/jira/browse/HDFS-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990799#comment-14990799 ] Haohui Mai commented on HDFS-9378: -- +1. Thanks. > hadoop-hdfs-client tests do not write logs. > --- > > Key: HDFS-9378 > URL: https://issues.apache.org/jira/browse/HDFS-9378 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: HDFS-9378.001.patch > > > The tests that have been split into the hadoop-hdfs-client module are not > writing any log output, because there is no > src/test/resources/log4j.properties file in the module. This makes it more > difficult to troubleshoot test failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990758#comment-14990758 ] Staffan Friberg commented on HDFS-9260: --- Hi Daryn, Thanks for the comments and the additional data points. Interesting to learn more about the scale of HDFS instances. I wonder if the NN was running on older and slower hardware in my case compared to your setup; the cluster I was able to get my hands on for these runs has fairly old machines. Adds of new blocks are relatively fast since they will be at the far right of the tree, so the number of lookups will be minimal. However, the current implementation only needs around two writes to insert something at the head/end of the list; nothing with a more complicated data structure will be able to match that, so it is a question of trade-offs. Also, to clarify: the microbenchmarks only measure the actual remove and insert of random values, not the whole process of copying files etc. I would expect the other parts to far outweigh the time it takes to update the data structures, so while the 4x sounds scary it should be a minor part of the whole transaction. I think the patch you are referring to is HDFS-6658. I applied it to the 3.0.0 branch from March 11 2015, which was from when the patch was created, and ran it on the same microbenchmarks I built to test my patch. I will attach the source code for the benchmarks so you can check that I used the right APIs for it to be comparable. From what I can tell the benchmarks should do the same thing at a high level. The performance overhead for adding and removing is similar between our two implementations.
{noformat}
fbrAllExisting  - Do a Full Block Report with the same 2M entries that are already registered for the Storage in the NN.
addRemoveBulk   - Remove 32k random blocks from a StorageInfo that has 64k entries, then re-add them all.
addRemoveRandom - Remove and directly re-add a block from a Storage entry, repeat for 32k blocks from a StorageInfo with 64k blocks.
iterate         - Iterate and get blockID for 64k blocks associated with a particular StorageInfo.

==> benchmarks_trunkMarch11_intMapping.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  379.659 ± 5.463  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25   16.426 ± 0.380  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25   15.401 ± 0.196  ms/op
StorageInfoAccess.iterate          avgt   25    1.496 ± 0.004  ms/op

==> benchmarks_trunk_baseline.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  288.974 ± 3.970  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25    3.157 ± 0.046  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25    2.815 ± 0.012  ms/op
StorageInfoAccess.iterate          avgt   25    0.788 ± 0.006  ms/op

==> benchmarks_trunk_treeset.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  231.270 ± 3.450  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25   11.596 ± 0.521  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25   11.249 ± 0.101  ms/op
StorageInfoAccess.iterate          avgt   25    0.385 ± 0.010  ms/op
{noformat}
Do you have a good suggestion for some other perf test/stress test that would be good to try out? Any stress load you have on your end that would be possible to try it out on?
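The trade-off discussed above — O(1) list insertion versus a sorted structure that speeds up full-block-report processing — can be illustrated with a small self-contained sketch (the real measurements use JMH; names here are made up): when both the report and the stored replicas are sorted, finding the new blocks is a single linear merge pass rather than a lookup per block.

```java
import java.util.Iterator;
import java.util.TreeSet;

// Illustrative sketch: with stored replicas kept sorted (TreeSet here),
// diffing a sorted full block report against them is one linear merge
// pass instead of a per-block lookup. JMH drives the real benchmarks.
public class FbrDiffDemo {
    /** Count report entries missing from storage, assuming both are sorted. */
    static int countNew(long[] sortedReport, TreeSet<Long> storage) {
        int added = 0;
        Iterator<Long> it = storage.iterator();
        Long cur = it.hasNext() ? it.next() : null;
        for (long blockId : sortedReport) {
            // Advance the storage cursor past blocks smaller than blockId.
            while (cur != null && cur < blockId) {
                cur = it.hasNext() ? it.next() : null;
            }
            if (cur == null || cur != blockId) {
                added++; // blockId is in the report but not in storage
            }
        }
        return added;
    }
}
```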
> Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change and also some help > investigating/understanding a few outstanding issues if we are interested in > moving forward with this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9378) hadoop-hdfs-client tests do not write logs.
Chris Nauroth created HDFS-9378: --- Summary: hadoop-hdfs-client tests do not write logs. Key: HDFS-9378 URL: https://issues.apache.org/jira/browse/HDFS-9378 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Minor The tests that have been split into the hadoop-hdfs-client module are not writing any log output, because there is no src/test/resources/log4j.properties file in the module. This makes it more difficult to troubleshoot test failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery
[ https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990903#comment-14990903 ] Walter Su commented on HDFS-9236: - The logic looks good to me. Thanks [~twu] for updating and [~yzhangal] for the review. There are many {{isDebugEnabled()}} guards; we could consider moving to the slf4j style, though that's not related to this jira.
> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: HDFS
> Affects Versions: 2.7.1
> Reporter: Tony Wu
> Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running a test against faulty datanode code. Currently in DataNode.java:
> {code:java}
> /** Block synchronization */
> void syncBlock(RecoveringBlock rBlock,
>     List<BlockRecord> syncList) throws IOException {
>   …
>   // Calculate the best available replica state.
>   ReplicaState bestState = ReplicaState.RWR;
>   …
>   // Calculate list of nodes that will participate in the recovery
>   // and the new block size
>   List<BlockRecord> participatingList = new ArrayList<BlockRecord>();
>   final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
>       -1, recoveryId);
>   switch (bestState) {
>   …
>   case RBW:
>   case RWR:
>     long minLength = Long.MAX_VALUE;
>     for (BlockRecord r : syncList) {
>       ReplicaState rState = r.rInfo.getOriginalReplicaState();
>       if (rState == bestState) {
>         minLength = Math.min(minLength, r.rInfo.getNumBytes());
>         participatingList.add(r);
>       }
>     }
>     newBlock.setNumBytes(minLength);
>     break;
>   …
>   }
>   …
>   nn.commitBlockSynchronization(block,
>       newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
>       datanodes, storages);
> }
> {code}
> This code is called by the DN coordinating the block recovery. In the above case, it is possible for none of the rStates (reported by DNs with copies of the replica being recovered) to match the bestState. This can be caused either by faulty DN code or by stale/modified/corrupted files on the DN. When this happens the DN will end up reporting a minLength of Long.MAX_VALUE. Unfortunately there is no check on the NN for the replica length. See FSNamesystem.java:
> {code:java}
> void commitBlockSynchronization(ExtendedBlock oldBlock,
>     long newgenerationstamp, long newlength,
>     boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>     String[] newtargetstorages) throws IOException {
>   …
>   if (deleteblock) {
>     Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
>     boolean remove = iFile.removeLastBlock(blockToDel) != null;
>     if (remove) {
>       blockManager.removeBlock(storedBlock);
>     }
>   } else {
>     // update last block
>     if (!copyTruncate) {
>       storedBlock.setGenerationStamp(newgenerationstamp);
>       // XXX block length is updated without any check <<<
>       storedBlock.setNumBytes(newlength);
>     }
>     …
>     if (closeFile) {
>       LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>           + ", file=" + src
>           + (copyTruncate ? ", newBlock=" + truncatedBlock
>               : ", newgenerationstamp=" + newgenerationstamp)
>           + ", newlength=" + newlength
>           + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
>     } else {
>       LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
>     }
>   }
> }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent block report (even with the correct length) will cause the block to be marked as corrupted. Since this block could be the last block of the file, if this happens and the client goes away, the NN won't be able to recover the lease and close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both the DN and NN to prevent such a case from happening. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
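The kind of NN-side guard the issue argues for can be sketched in a few lines. This is only a hedged illustration of the idea, not the actual HDFS-9236 patch: a recovery that found no replica in the best state reports Long.MAX_VALUE, which should be rejected before it is committed.

```java
import java.io.IOException;

// Illustrative only: the shape of length sanity check HDFS-9236 argues for.
// The sentinel Long.MAX_VALUE (and any negative value) should never be
// committed as a recovered block length.
public class BlockLengthCheckDemo {
    static void checkNewLength(long newLength) throws IOException {
        if (newLength < 0 || newLength == Long.MAX_VALUE) {
            throw new IOException(
                "Rejecting bogus recovered block length: " + newLength);
        }
    }
}
```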
[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery
[ https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990945#comment-14990945 ] Yongjun Zhang commented on HDFS-9236: - Thanks [~twu] for the new rev and [~walter.k.su] for the review. I'm +1 on 007 pending Jenkins.
> Missing sanity check for block size during block recovery
> -
>
> Key: HDFS-9236
> URL: https://issues.apache.org/jira/browse/HDFS-9236
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: HDFS
> Affects Versions: 2.7.1
> Reporter: Tony Wu
> Assignee: Tony Wu
> Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, HDFS-9236.006.patch, HDFS-9236.007.patch
>
>
> Ran into an issue while running a test against faulty datanode code. Currently in DataNode.java:
> {code:java}
> /** Block synchronization */
> void syncBlock(RecoveringBlock rBlock,
>     List<BlockRecord> syncList) throws IOException {
>   …
>   // Calculate the best available replica state.
>   ReplicaState bestState = ReplicaState.RWR;
>   …
>   // Calculate list of nodes that will participate in the recovery
>   // and the new block size
>   List<BlockRecord> participatingList = new ArrayList<BlockRecord>();
>   final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId,
>       -1, recoveryId);
>   switch (bestState) {
>   …
>   case RBW:
>   case RWR:
>     long minLength = Long.MAX_VALUE;
>     for (BlockRecord r : syncList) {
>       ReplicaState rState = r.rInfo.getOriginalReplicaState();
>       if (rState == bestState) {
>         minLength = Math.min(minLength, r.rInfo.getNumBytes());
>         participatingList.add(r);
>       }
>     }
>     newBlock.setNumBytes(minLength);
>     break;
>   …
>   }
>   …
>   nn.commitBlockSynchronization(block,
>       newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
>       datanodes, storages);
> }
> {code}
> This code is called by the DN coordinating the block recovery. In the above case, it is possible for none of the rStates (reported by DNs with copies of the replica being recovered) to match the bestState. This can be caused either by faulty DN code or by stale/modified/corrupted files on the DN. When this happens the DN will end up reporting a minLength of Long.MAX_VALUE. Unfortunately there is no check on the NN for the replica length. See FSNamesystem.java:
> {code:java}
> void commitBlockSynchronization(ExtendedBlock oldBlock,
>     long newgenerationstamp, long newlength,
>     boolean closeFile, boolean deleteblock, DatanodeID[] newtargets,
>     String[] newtargetstorages) throws IOException {
>   …
>   if (deleteblock) {
>     Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock);
>     boolean remove = iFile.removeLastBlock(blockToDel) != null;
>     if (remove) {
>       blockManager.removeBlock(storedBlock);
>     }
>   } else {
>     // update last block
>     if (!copyTruncate) {
>       storedBlock.setGenerationStamp(newgenerationstamp);
>       // XXX block length is updated without any check <<<
>       storedBlock.setNumBytes(newlength);
>     }
>     …
>     if (closeFile) {
>       LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock
>           + ", file=" + src
>           + (copyTruncate ? ", newBlock=" + truncatedBlock
>               : ", newgenerationstamp=" + newgenerationstamp)
>           + ", newlength=" + newlength
>           + ", newtargets=" + Arrays.asList(newtargets) + ") successful");
>     } else {
>       LOG.info("commitBlockSynchronization(" + oldBlock + ") successful");
>     }
>   }
> }
> {code}
> After this point the block length becomes Long.MAX_VALUE. Any subsequent block report (even with the correct length) will cause the block to be marked as corrupted. Since this block could be the last block of the file, if this happens and the client goes away, the NN won't be able to recover the lease and close the file because the last block is under-replicated.
> I believe we need to have a sanity check for block size on both the DN and NN to prevent such a case from happening. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9007) Fix HDFS Balancer to honor upgrade domain policy
[ https://issues.apache.org/jira/browse/HDFS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990726#comment-14990726 ] Hudson commented on HDFS-9007: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #628 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/628/]) HDFS-9007. Fix HDFS Balancer to honor upgrade domain policy. (Ming Ma (lei: rev ec414600ede8e305c584818565b50e055ea5d2b5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java > Fix HDFS Balancer to honor upgrade domain policy > > > Key: HDFS-9007 > URL: 
https://issues.apache.org/jira/browse/HDFS-9007 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9007-2.patch, HDFS-9007-branch-2.patch, > HDFS-9007.patch > > > In the current design of HDFS Balancer, it doesn't use BlockPlacementPolicy > used by namenode runtime. Instead, it has somewhat redundant code to make > sure block allocation conforms with the rack policy. > When namenode uses upgrade domain based policy, we need to make sure that > HDFS balancer doesn't move blocks in a way that could violate upgrade domain > block placement policy. > In the longer term, we should consider how to make Balancer independent of > the actual BlockPlacementPolicy as in HDFS-1431. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Attachment: HDFSBenchmarks.zip Microbenchmarks > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, > HDFSBenchmarks.zip > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC-friendly handling of full > block reports. > Would like to hear people's feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9378) hadoop-hdfs-client tests do not write logs.
[ https://issues.apache.org/jira/browse/HDFS-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990857#comment-14990857 ] Mingliang Liu commented on HDFS-9378: - +1 (non-binding). I also think it's OK to commit without Jenkins report. > hadoop-hdfs-client tests do not write logs. > --- > > Key: HDFS-9378 > URL: https://issues.apache.org/jira/browse/HDFS-9378 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: HDFS-9378.001.patch > > > The tests that have been split into the hadoop-hdfs-client module are not > writing any log output, because there is no > src/test/resources/log4j.properties file in the module. This makes it more > difficult to troubleshoot test failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9129) Move the safemode block count into BlockManager
[ https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9129: Attachment: HDFS-9129.021.patch The failing tests can pass locally. The v21 patch is to address the findbugs warnings. > Move the safemode block count into BlockManager > --- > > Key: HDFS-9129 > URL: https://issues.apache.org/jira/browse/HDFS-9129 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Mingliang Liu > Attachments: HDFS-9129.000.patch, HDFS-9129.001.patch, > HDFS-9129.002.patch, HDFS-9129.003.patch, HDFS-9129.004.patch, > HDFS-9129.005.patch, HDFS-9129.006.patch, HDFS-9129.007.patch, > HDFS-9129.008.patch, HDFS-9129.009.patch, HDFS-9129.010.patch, > HDFS-9129.011.patch, HDFS-9129.012.patch, HDFS-9129.013.patch, > HDFS-9129.014.patch, HDFS-9129.015.patch, HDFS-9129.016.patch, > HDFS-9129.017.patch, HDFS-9129.018.patch, HDFS-9129.019.patch, > HDFS-9129.020.patch, HDFS-9129.021.patch > > > The {{SafeMode}} needs to track whether there are enough blocks so that the > NN can get out of the safemode. These fields can moved to the > {{BlockManager}} class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9103) Retry reads on DN failure
[ https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990983#comment-14990983 ] Hadoop QA commented on HDFS-9103: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 57s {color} | {color:green} HDFS-8707 passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 20s {color} | {color:red} hadoop-hdfs-native-client in HDFS-8707 failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 13s {color} | {color:red} hadoop-hdfs-native-client in HDFS-8707 failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 12s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 12s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 12s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 14s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 14s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 14s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 16s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 16s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 25s {color} | {color:red} Patch generated 425 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 6s {color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-05 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12770724/HDFS-9103.HDFS-8707.007.patch |
| JIRA Issue | HDFS-9103 |
| Optional Tests | asflicense cc unit javac compile |
| uname | Linux 052d088edb72 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-e8bd3ad/precommit/personality/hadoop.sh |
| git revision | HDFS-8707 / 3ce4230 |
| Default Java | 1.7.0_79 |
| Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_60 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79 |
| compile | https://builds.apache.org/job/PreCommit-HDFS-Build/13392/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_60.txt |
| compile | https://builds.apache.org/job/PreCommit-HDFS-Build/13392/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_79.txt |
| compile | https://builds.apache.org/job/PreCommit-HDFS-Build/13392/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_60.txt |
| cc |
[jira] [Commented] (HDFS-9378) hadoop-hdfs-client tests do not write logs.
[ https://issues.apache.org/jira/browse/HDFS-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990743#comment-14990743 ] Arpit Agarwal commented on HDFS-9378: - +1, probably okay to commit this without Jenkins +1. Thanks for fixing this [~cnauroth]. > hadoop-hdfs-client tests do not write logs. > --- > > Key: HDFS-9378 > URL: https://issues.apache.org/jira/browse/HDFS-9378 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: HDFS-9378.001.patch > > > The tests that have been split into the hadoop-hdfs-client module are not > writing any log output, because there is no > src/test/resources/log4j.properties file in the module. This makes it more > difficult to troubleshoot test failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990652#comment-14990652 ] Hudson commented on HDFS-8855: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #1362 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1362/]) Revert "HDFS-8855. Webhdfs client leaks active NameNode connections. (wheat9: rev 88beb46cf6e6fd3e51f73a411a2750de7595e326) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestDataNodeUGIProvider.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/DataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Fix For: 2.8.0 > > Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, > HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. 
For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to reach ~25000 active connections and > fail. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9007) Fix HDFS Balancer to honor upgrade domain policy
[ https://issues.apache.org/jira/browse/HDFS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990650#comment-14990650 ] Hudson commented on HDFS-9007: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #1362 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1362/]) HDFS-9007. Fix HDFS Balancer to honor upgrade domain policy. (Ming Ma (lei: rev ec414600ede8e305c584818565b50e055ea5d2b5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java > Fix HDFS Balancer to honor upgrade domain policy > > > Key: HDFS-9007 > URL: 
https://issues.apache.org/jira/browse/HDFS-9007 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9007-2.patch, HDFS-9007-branch-2.patch, > HDFS-9007.patch > > > In the current design of HDFS Balancer, it doesn't use BlockPlacementPolicy > used by namenode runtime. Instead, it has somewhat redundant code to make > sure block allocation conforms with the rack policy. > When namenode uses upgrade domain based policy, we need to make sure that > HDFS balancer doesn't move blocks in a way that could violate upgrade domain > block placement policy. > In the longer term, we should consider how to make Balancer independent of > the actual BlockPlacementPolicy as in HDFS-1431. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9168) Move client side unit test to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990719#comment-14990719 ] Chris Nauroth commented on HDFS-9168: - I noticed that the tests that have been moved into hadoop-hdfs-client are not generating any log output, which makes it more difficult to troubleshoot test failures. I filed a fix on HDFS-9378 to drop a log4j.properties file at hadoop-hdfs-project/hadoop-hdfs-client/src/test/resources. > Move client side unit test to hadoop-hdfs-client > > > Key: HDFS-9168 > URL: https://issues.apache.org/jira/browse/HDFS-9168 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.8.0 > > Attachments: HDFS-9168.000.patch, HDFS-9168.001.patch, > HDFS-9168.002.patch, HDFS-9168.003.patch, HDFS-9168.004.patch > > > We need to identify and move the unit tests on the client of hdfs to the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs
[ https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Staffan Friberg updated HDFS-9260: -- Description: This patch changes the data structures used for BlockInfos and Replicas to keep them sorted. This allows faster and more GC friendly handling of full block reports. Would like to hear people's feedback on this change. was: This patch changes the data structures used for BlockInfos and Replicas to keep them sorted. This allows faster and more GC friendly handling of full block reports. Would like to hear people's feedback on this change and also some help investigating/understanding a few outstanding issues if we are interested in moving forward with this. > Improve performance and GC friendliness of startup and FBRs > --- > > Key: HDFS-9260 > URL: https://issues.apache.org/jira/browse/HDFS-9260 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namenode, performance >Affects Versions: 2.7.1 >Reporter: Staffan Friberg >Assignee: Staffan Friberg > Attachments: FBR processing.png, HDFS Block and Replica Management > 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, > HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, > HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch > > > This patch changes the data structures used for BlockInfos and Replicas to > keep them sorted. This allows faster and more GC friendly handling of full > block reports. > Would like to hear people's feedback on this change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
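The motivation behind keeping replicas sorted can be illustrated with a small sketch: when both the in-memory block list and an incoming full block report (FBR) are sorted by block ID, the report can be reconciled in a single linear merge pass instead of per-block hash lookups. This is only an illustration of the idea; the class and method names below are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: why sorted block lists make FBR processing cheaper.
// Names are illustrative, not from the actual HDFS-9260 patch.
public class SortedFbrMergeSketch {

    // Walk two sorted ID arrays in one linear pass, collecting blocks
    // that appear in the report but are not yet known to the NameNode.
    static List<Long> findNewBlocks(long[] knownSorted, long[] reportSorted) {
        List<Long> added = new ArrayList<>();
        int i = 0;
        for (long reported : reportSorted) {
            while (i < knownSorted.length && knownSorted[i] < reported) {
                i++; // known but not reported -> candidate for removal
            }
            if (i >= knownSorted.length || knownSorted[i] != reported) {
                added.add(reported); // reported but unknown -> new replica
            }
        }
        return added;
    }

    public static void main(String[] args) {
        long[] known = {10L, 20L, 30L};
        long[] report = {10L, 25L, 30L, 40L};
        System.out.println(findNewBlocks(known, report)); // [25, 40]
    }
}
```

Besides the asymptotic win, a merge over two sorted arrays touches memory sequentially and allocates nothing per block, which is where the claimed GC friendliness would come from.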
[jira] [Commented] (HDFS-9282) Make data directory count and storage raw capacity related tests FsDataset-agnostic
[ https://issues.apache.org/jira/browse/HDFS-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990924#comment-14990924 ] Hadoop QA commented on HDFS-9282: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 6 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 8s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 53s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 50m 17s {color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 38s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s {color} | {color:red} Patch generated 56 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 120m 18s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.server.namenode.TestDeleteRace | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-04 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12770693/HDFS-9282.004.patch | | JIRA Issue | HDFS-9282 | | Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile | | uname | Linux 2d1774b7fdc4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-e8bd3ad/precommit/personality/hadoop.sh | | git revision | trunk / 5667129 | | Default Java | 1.7.0_79 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_60
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990939#comment-14990939 ] Hudson commented on HDFS-8855: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #572 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/572/]) Revert "HDFS-8855. Webhdfs client leaks active NameNode connections. (wheat9: rev 88beb46cf6e6fd3e51f73a411a2750de7595e326) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestDataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/DataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Fix For: 2.8.0 > > Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, > HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. 
For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to reach ~25000 active connections and > fail. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9363) Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica.
[ https://issues.apache.org/jira/browse/HDFS-9363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990938#comment-14990938 ] Hudson commented on HDFS-9363: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #572 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/572/]) HDFS-9363. Add fetchReplica to FsDatasetTestUtils to return (lei: rev 5667129276c3123ecb0a96b78d5897431c47a9d5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica. > -- > > Key: HDFS-9363 > URL: https://issues.apache.org/jira/browse/HDFS-9363 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Minor > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9363.001.patch > > > {{FsDatasetTestUtils()}} abstracts away the details in {{FsDataset}} to allow > writing generic tests regardless of underlying {{FsDataset}} implementations. > We can add a {{fetchReplica()}} method to allow some HDFS tests to avoid > using {{FsDatasetTestUtil#fetchReplicaInfo()}}, which assumes FsDatasetImpl > is the only implementation of FsDataset. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9331) Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem entirely allocated for DFS use
[ https://issues.apache.org/jira/browse/HDFS-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990942#comment-14990942 ] Hudson commented on HDFS-9331: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #572 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/572/]) HDFS-9331. Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account (lei: rev e2a5441b062fd0758138079d24a2740fc5e5e350) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java > Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem > entirely allocated for DFS use > --- > > Key: HDFS-9331 > URL: https://issues.apache.org/jira/browse/HDFS-9331 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Trivial > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9331.001.patch > > > {{TestNameNodeMXBean#testNameNodeMXBeanInfo}} expects a non-zero nonDFS > size. The nonDFS size is defined as: > {quote} > The space that is not used by HDFS. For instance, once you format a new disk > to ext4, certain space is used for "lost-and-found" directory and ext4 > metadata. > {quote} > It is possible to allocate all space in a filesystem for DFS use, > in which case the nonDFS size will be zero. We can relax the check in the > test to account for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
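The relaxation described above can be sketched in a few lines. Non-DFS space is commonly derived as capacity minus DFS-used minus remaining; on a disk fully dedicated to DFS that difference can legitimately be zero, so a test assertion should allow zero rather than demand a strictly positive value. The class below is an illustration with made-up numbers, not the actual TestNameNodeMXBean code.

```java
// Sketch: why a nonDFS-size check should be ">= 0" rather than "> 0".
// Class name and numbers are illustrative only.
public class NonDfsCheckSketch {

    // Non-DFS space as it is commonly derived from capacity figures.
    static long nonDfs(long capacity, long dfsUsed, long remaining) {
        return capacity - dfsUsed - remaining;
    }

    public static void main(String[] args) {
        long capacity = 1_000_000L;
        long dfsUsed = 400_000L;
        long remaining = 600_000L; // entire disk allocated to DFS
        long nonDfs = nonDfs(capacity, dfsUsed, remaining);
        // Relaxed check: zero is a valid value on a DFS-only filesystem.
        System.out.println(nonDfs >= 0); // true (nonDfs is exactly 0 here)
    }
}
```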
[jira] [Commented] (HDFS-9007) Fix HDFS Balancer to honor upgrade domain policy
[ https://issues.apache.org/jira/browse/HDFS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990937#comment-14990937 ] Hudson commented on HDFS-9007: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #572 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/572/]) HDFS-9007. Fix HDFS Balancer to honor upgrade domain policy. (Ming Ma (lei: rev ec414600ede8e305c584818565b50e055ea5d2b5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java > Fix HDFS Balancer to honor upgrade domain policy > > > Key: HDFS-9007 > URL: 
https://issues.apache.org/jira/browse/HDFS-9007 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9007-2.patch, HDFS-9007-branch-2.patch, > HDFS-9007.patch > > > In the current design of HDFS Balancer, it doesn't use BlockPlacementPolicy > used by namenode runtime. Instead, it has somewhat redundant code to make > sure block allocation conforms with the rack policy. > When namenode uses upgrade domain based policy, we need to make sure that > HDFS balancer doesn't move blocks in a way that could violate upgrade domain > block placement policy. > In the longer term, we should consider how to make Balancer independent of > the actual BlockPlacementPolicy as in HDFS-1431. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9357) NN UI renders icons of decommissioned DN incorrectly
[ https://issues.apache.org/jira/browse/HDFS-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990940#comment-14990940 ] Hudson commented on HDFS-9357: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #572 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/572/]) HDFS-9357. NN UI renders icons of decommissioned DN incorrectly. (wheat9: rev 0eed886a165f5a0850ddbfb1d5f98c7b5e379fb3) * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/hadoop.css > NN UI renders icons of decommissioned DN incorrectly > > > Key: HDFS-9357 > URL: https://issues.apache.org/jira/browse/HDFS-9357 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Critical > Fix For: 2.8.0 > > Attachments: Decommissioned_Dead_Fixed.PNG, Decommissioned_Fixed.PNG, > HDFS-9357.001.patch, HDFS-9357.001.patch, decommisioned_n_dead_.png, > decommissioned_.png > > > NN UI is not showing which DN is "Decommissioned "and "Decommissioned & dead" > Root Cause -- > "Decommissioned" and "Decommissioned & dead" icon not reflected on NN UI > When DN is in Decommissioned status or in "Decommissioned & dead" status, > same status is not reflected on NN UI > DN status is as below -- > hdfs dfsadmin -report > Name: 10.xx.xx.xx1:50076 (host-xx1) > Hostname: host-xx > Decommission Status : Decommissioned > Configured Capacity: 230501634048 (214.67 GB) > DFS Used: 36864 (36 KB) > Dead datanodes (1): > Name: 10.xx.xx.xx2:50076 (host-xx2) > Hostname: host-xx > Decommission Status : Decommissioned > Same is not reflected on NN UI. > Attached NN UI snapshots for the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8968) New benchmark throughput tool for striping erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990964#comment-14990964 ] Rui Li commented on HDFS-8968: -- Thanks [~rakeshr] for the detailed and helpful comments. I'll update the patch accordingly. Besides, I don't understand the findbugs and asflicense errors in the test report. The findbugs link returns a 404, and I do have the license header in my patch. Any idea what went wrong? Thanks! > New benchmark throughput tool for striping erasure coding > - > > Key: HDFS-8968 > URL: https://issues.apache.org/jira/browse/HDFS-8968 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rui Li > Attachments: HDFS-8968-HDFS-7285.1.patch, > HDFS-8968-HDFS-7285.2.patch, HDFS-8968.3.patch > > > We need a new benchmark tool to measure the throughput of client writing and > reading considering cases or factors: > * 3-replica or striping; > * write or read, stateful read or positional read; > * which erasure coder; > * striping cell size; > * concurrent readers/writers using processes or threads. > The tool should be easy to use and should avoid unnecessary local > environment impact, like local disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9378) hadoop-hdfs-client tests do not write logs.
[ https://issues.apache.org/jira/browse/HDFS-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-9378: Attachment: HDFS-9378.001.patch I'm attaching a trivial patch that simply copies the same log4j.properties configuration that we've always used in hadoop-hdfs. > hadoop-hdfs-client tests do not write logs. > --- > > Key: HDFS-9378 > URL: https://issues.apache.org/jira/browse/HDFS-9378 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: HDFS-9378.001.patch > > > The tests that have been split into the hadoop-hdfs-client module are not > writing any log output, because there is no > src/test/resources/log4j.properties file in the module. This makes it more > difficult to troubleshoot test failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
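For reference, a minimal test `log4j.properties` of the kind described (routing test logging to stdout) looks roughly like the sketch below. The actual patch copies the existing hadoop-hdfs file, which may differ in detail; this fragment only illustrates the shape of such a file.

```properties
# Minimal sketch of a src/test/resources/log4j.properties for a test module.
# The real file copied by the patch may differ.
log4j.rootLogger=info,stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} [%t] %-5p %c{2} (%F:%M(%L)) - %m%n
```

Without such a file on the test classpath, log4j falls back to a no-appender warning and the tests emit no log output, which matches the symptom reported in this issue.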
[jira] [Updated] (HDFS-9378) hadoop-hdfs-client tests do not write logs.
[ https://issues.apache.org/jira/browse/HDFS-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-9378: Status: Patch Available (was: Open) > hadoop-hdfs-client tests do not write logs. > --- > > Key: HDFS-9378 > URL: https://issues.apache.org/jira/browse/HDFS-9378 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Chris Nauroth >Assignee: Chris Nauroth >Priority: Minor > Attachments: HDFS-9378.001.patch > > > The tests that have been split into the hadoop-hdfs-client module are not > writing any log output, because there is no > src/test/resources/log4j.properties file in the module. This makes it more > difficult to troubleshoot test failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9363) Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica.
[ https://issues.apache.org/jira/browse/HDFS-9363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990995#comment-14990995 ] Hudson commented on HDFS-9363: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2510 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2510/]) HDFS-9363. Add fetchReplica to FsDatasetTestUtils to return (lei: rev 5667129276c3123ecb0a96b78d5897431c47a9d5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java > Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica. > -- > > Key: HDFS-9363 > URL: https://issues.apache.org/jira/browse/HDFS-9363 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Minor > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9363.001.patch > > > {{FsDatasetTestUtils()}} abstracts away the details in {{FsDataset}} to allow > writing generic tests regardless of underlying {{FsDataset}} implementations. > We can add a {{fetchReplica()}} method to allow some HDFS tests to avoid > using {{FsDatasetTestUtil#fetchReplicaInfo()}}, which assumes FsDatasetImpl > is the only implementation of FsDataset. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9331) Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem entirely allocated for DFS use
[ https://issues.apache.org/jira/browse/HDFS-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990999#comment-14990999 ] Hudson commented on HDFS-9331: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2510 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2510/]) HDFS-9331. Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account (lei: rev e2a5441b062fd0758138079d24a2740fc5e5e350) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem > entirely allocated for DFS use > --- > > Key: HDFS-9331 > URL: https://issues.apache.org/jira/browse/HDFS-9331 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Trivial > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9331.001.patch > > > {{TestNameNodeMXBean#testNameNodeMXBeanInfo}} expects a none-zero nonDFS > size. The nonDFS size is defined as: > {quote} > The space that is not used by HDFS. For instance, once you format a new disk > to ext4, certain space is used for "lost-and-found" directory and ext4 > metadata. > {quote} > It will be possible to fully allocate all spaces in a filesystem for DFS use. > In which case the nonDFS size will be zero. We can relax the check in the > test to account for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9372) Typo in DataStorage.recoverTransitionRead
[ https://issues.apache.org/jira/browse/HDFS-9372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991074#comment-14991074 ] Mingliang Liu commented on HDFS-9372: - +1 (non-binding) pending on Jenkins. > Typo in DataStorage.recoverTransitionRead > - > > Key: HDFS-9372 > URL: https://issues.apache.org/jira/browse/HDFS-9372 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Duo Zhang >Assignee: Duo Zhang > Attachments: HDFS-9372-v0.patch, HDFS-9372-v1.patch > > > {code:title=DataStorage.java} > if (this.initialized) { > LOG.info("DataNode version: " + > HdfsServerConstants.DATANODE_LAYOUT_VERSION > + " and NameNode layout version: " + nsInfo.getLayoutVersion()); > this.storageDirs = new ArrayList(dataDirs.size()); > // mark DN storage is initialized > this.initialized = true; > } > {code} > The first if should be {{!this.initialized}} I think? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
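The reported fix is a one-character change: negate the guard so initialization runs only when the flag is still false. The sketch below is a hypothetical minimal class (not the actual DataStorage) illustrating why the corrected {{!this.initialized}} check makes the method idempotent:

```java
import java.util.ArrayList;
import java.util.List;

class InitGuardSketch {
    private boolean initialized = false;
    private int initCount = 0;
    private List<String> storageDirs;

    void recoverTransitionRead(List<String> dataDirs) {
        // Corrected guard: the original buggy code checked "this.initialized",
        // so the body could never run on a fresh (uninitialized) instance.
        if (!this.initialized) {
            this.storageDirs = new ArrayList<>(dataDirs.size());
            this.initCount++;
            // mark DN storage as initialized so repeat calls are no-ops
            this.initialized = true;
        }
    }

    int getInitCount() {
        return initCount;
    }
}
```

With the negated check, calling the method twice initializes exactly once.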
[jira] [Created] (HDFS-9380) Builds are failing with protobuf directories as undef
Bob Hansen created HDFS-9380: Summary: Builds are failing with protobuf directories as undef Key: HDFS-9380 URL: https://issues.apache.org/jira/browse/HDFS-9380 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Bob Hansen Assignee: Haohui Mai See recent builds in HDFS-9320 and HDFS-9103. {code} [exec] CMake Error: The following variables are used in this project, but they are set to NOTFOUND. [exec] Please set them or make sure they are set and tested correctly in the CMake files: [exec] PROTOBUF_LIBRARY (ADVANCED) [exec] linked by target "protoc-gen-hrpc" in directory /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/proto [exec] linked by target "inputstream_test" in directory /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tests [exec] linked by target "remote_block_reader_test" in directory /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tests [exec] linked by target "rpc_engine_test" in directory /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tests [exec] PROTOBUF_PROTOC_LIBRARY (ADVANCED) [exec] linked by target "protoc-gen-hrpc" in directory /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/proto {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
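When CMake's FindProtobuf module leaves these variables NOTFOUND, one workaround is to pass the locations explicitly on the configure line. The paths below are examples only; they depend on where protobuf is installed on the build host:

```shell
# Hypothetical invocation: point CMake at an explicit protobuf install so
# PROTOBUF_LIBRARY and PROTOBUF_PROTOC_LIBRARY resolve instead of NOTFOUND.
cmake /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src \
  -DPROTOBUF_INCLUDE_DIR=/usr/local/include \
  -DPROTOBUF_LIBRARY=/usr/local/lib/libprotobuf.so \
  -DPROTOBUF_PROTOC_LIBRARY=/usr/local/lib/libprotoc.so \
  -DPROTOBUF_PROTOC_EXECUTABLE=/usr/local/bin/protoc
```

The longer-term fix is to make sure the build image installs protobuf where FindProtobuf searches by default.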
[jira] [Commented] (HDFS-9038) Reserved space is erroneously counted towards non-DFS used.
[ https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991236#comment-14991236 ] Brahma Reddy Battula commented on HDFS-9038: Uploaded the patch to fix the checkstyle issues. Testcase failures are unrelated. Compilation errors are strange. [~cnauroth]/[~vinayrpet], kindly review. > Reserved space is erroneously counted towards non-DFS used. > --- > > Key: HDFS-9038 > URL: https://issues.apache.org/jira/browse/HDFS-9038 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.7.1 >Reporter: Chris Nauroth >Assignee: Brahma Reddy Battula > Attachments: HDFS-9038-002.patch, HDFS-9038.patch > > > HDFS-5215 changed the DataNode volume available space calculation to consider > the reserved space held by the {{dfs.datanode.du.reserved}} configuration > property. As a side effect, reserved space is now counted towards non-DFS > used. I don't believe it was intentional to change the definition of non-DFS > used. This issue proposes restoring the prior behavior: do not count > reserved space towards non-DFS used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer
[ https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991047#comment-14991047 ] ASF GitHub Bot commented on HDFS-8562: -- GitHub user hash-X opened a pull request: https://github.com/apache/hadoop/pull/42 AltFileInputStream.java replaces FileInputStream.java in apache/hadoop/HDFS A brief description: long Stop-The-World GC pauses due to Final Reference processing are observed. Where do those Final References come from? 1: `Finalizer` 2: `FileInputStream` How to solve this problem? The detailed description and a proposed solution are here: https://issues.apache.org/jira/browse/HDFS-8562 FileInputStream has a finalize method, which can cause long GC pauses; in our tests, G1 was the collector. So AltFileInputStream has no finalize, with a new design for an input stream usable on both Windows and non-Windows. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hash-X/hadoop AltFileInputStream Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hadoop/pull/42.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #42 commit 8d64ef0feb8c8d8f5d5823ccaa428a1b58f6fd04 Author: zhangminglei Date: 2015-07-19T09:50:19Z Add some code. commit 3ccf4c70c40cf1ba921d76b949317b5fd6752e3c Author: zhangminglei Date: 2015-07-19T09:56:49Z I cannot replace FileInputStream with NewFileInputStream casually, because the change can break other parts of HDFS. For example, when I tested my code using a single-node (pseudo-distributed) cluster, "Failed to load an FSImage file." happened when I started the HDFS daemons. At first, I replaced many FileInputStream uses that appeared as an arg or constructor with NewFileInputStream, but that seems wrong. So I have to do this another way. 
commit 4da55130586ee9803a09162f7e2482b533aa12d9 Author: zhangminglei Date: 2015-07-19T10:30:11Z Replacing FIS with NFIS (NewFileInputStream) is not recommended, I think, although Alan Bateman suggested it in https://bugs.openjdk.java.net/browse/JDK-8080225. But testing shows it is not good: some problems may happen, and these tests take a long time. Every time I change the source code I need to build the whole project (maybe it is not needed), but since I install the new version of Hadoop on my computer, building the whole project is needed. There should be a better way to do it, I think. commit 06b1509e0ad6dd74cf7c903e6ed6f2ec74d9b341 Author: zhangminglei Date: 2015-07-19T11:06:37Z Replace FIS with NFIS. If tests succeed, just do these first; it is not as simple as that. commit 2a79cd9c3b012556af7db5bdbf96663a1c30dcc4 Author: zhangminglei Date: 2015-07-20T02:36:55Z Add a LOG info in DataXceiver for test. commit 436c998ae21b3fe843b2d5ba6506e37ff2a34ab2 Author: zhangminglei Date: 2015-07-20T06:01:41Z Rename NewFileInputStream to AltFileInputStream. commit 14de2788ea2407c6ee252a69cfd3b4f6132c6faa Author: zhangminglei Date: 2015-07-20T06:16:32Z Replace License header with Apache. commit 387f7624a96716abef2062986f05523199e1927e Author: zhangminglei Date: 2015-07-20T07:16:25Z Remove open method in AltFileInputStream.java. commit 52b029fac56bc054add1eac836e6cf71a0735304 Author: zhangminglei Date: 2015-07-20T10:14:09Z Performance comparison between AltFileInputStream and FileInputStream is not done in this commit. An important question, I think, is whether AltFileInputStream can be converted to FileInputStream safely. I define a draft plan to do it, but I don't know whether this is correct for the problem. In the HDFS code, forced conversion to FileInputStream happens everywhere. 
commit e76d5eb4bf0145a4b28c581ecec07dcee7bae4e5 Author: zhangminglei Date: 2015-07-20T13:11:24Z I think the forced conversion is safe, because AltFileInputStream is a subclass of InputStream. In the previous version of HDFS, the conversions to FileInputStream I saw were safe because those methods return InputStream, which is the superclass of FileInputStream. In my version of HDFS, InputStream is also the superclass of AltFileInputStream. So AltFileInputStream is an InputStream just as FileInputStream is. I think it is safe. Does everyone agree? If not, please give your opinion and tell me what's wrong with it. Thank you. commit
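The idea behind the pull request can be sketched as follows. This is a hedged, hypothetical sketch, not the actual HDFS-8562 patch: an InputStream backed by a FileChannel, which (unlike FileInputStream) does not register a finalize() method, so closed streams do not accumulate on the JVM's Final Reference queue and lengthen GC pauses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical AltFileInputStream-style wrapper: delegates all reads to a
// FileChannel and relies on explicit close() instead of finalization.
class AltFileInputStreamSketch extends InputStream {
    private final FileChannel channel;

    AltFileInputStreamSketch(Path path) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    @Override
    public int read() throws IOException {
        ByteBuffer one = ByteBuffer.allocate(1);
        return channel.read(one) == -1 ? -1 : (one.get(0) & 0xff);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return channel.read(ByteBuffer.wrap(b, off, len));
    }

    @Override
    public void close() throws IOException {
        channel.close(); // explicit close; no reliance on a finalizer
    }
}
```

The hard part the commit log wrestles with is elsewhere: HDFS code that force-casts the returned InputStream back to FileInputStream would break with such a wrapper, which is why the author could not substitute it mechanically.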
[jira] [Commented] (HDFS-9129) Move the safemode block count into BlockManager
[ https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991106#comment-14991106 ] Hadoop QA commented on HDFS-9129: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 5s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 8 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 50s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 51s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s {color} | {color:red} Patch generated 6 new checkstyle issues in hadoop-hdfs-project/hadoop-hdfs (total was 808, now 756). {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 25s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 51m 6s {color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s {color} | {color:red} Patch generated 56 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 121m 33s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-05 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12770727/HDFS-9129.021.patch | | JIRA Issue | HDFS-9129 | | Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile | | uname | Linux 1477f0a694a5 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-e8bd3ad/precommit/personality/hadoop.sh | | git revision | trunk / 5667129 | | Default Java |
[jira] [Created] (HDFS-9382) Track the acks for the packets which are sent from ErasureCodingWorker as part of reconstruction work
Uma Maheswara Rao G created HDFS-9382: - Summary: Track the acks for the packets which are sent from ErasureCodingWorker as part of reconstruction work Key: HDFS-9382 URL: https://issues.apache.org/jira/browse/HDFS-9382 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Currently we are not tracking the acks for the packets which are sent from the DN ECWorker as part of reconstruction work. This jira proposes tracking those acks: reconstruction work is really expensive, so we should know if any packets failed to write at the target DN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9103) Retry reads on DN failure
[ https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991104#comment-14991104 ] Bob Hansen commented on HDFS-9103: -- I think the design is sound. We could create an interface and implementation with mock objects for testing, but I don't think that's necessary in this case. Include an abstraction (either by template or base class) for the clock to make time testable. Can we re-use the mock objects in our tests rather than re-declaring them? {{bad_datanode_test.cc}} defines FileHandleTest. We should have the test name match the filename. The recoverable-error test should call ASSERT_FALSE for clarity. I would like to see the {{FileHandle::Pread}} method implement the retry logic internally so we have a simple "read all this data or completely fail" method rather than forcing partial read and retry onto our consumer. Understanding the logic that _these_ errors mean you should retry, but _this_ error means that you shouldn't retry could be abstracted away as a kindness to the consumer. Need tests for: * BadDataNodeTracker as a testable unit. Hit all of its cases; it's pretty small and it should be easy. We especially don't currently have any test coverage for {{RemoveAllExpired}}. * Doing a retry-and-recover While I'm all about cleaning up formatting as we go, this is going to cause a lot of conflicts with HDFS-9144 with no semantic difference. It does risk reviewers missing something that changed because it looked like a formatting fix. Can we punt some of these until HDFS-9144 lands? If it will be onerous to pull out (e.g. not the last change you made in your local log), I can deal with it on merge. *Minor nits* that don't have to be corrected, but you might want to peek at: In IsBadNode: * Reset the counter so there is never a chance for overflow * remove_counter_ seems like the wrong name; perhaps check_counter? 
* We're currently traversing the map up to 3 times in count(), get(), and erase(); perhaps one lookup and keep the iterator? In RemoveAllExpired: {code}datanodes_.erase(it++);{code} can be a bit more idiomatic c++11 with {code}it = datanodes_.erase(it);{code} > Retry reads on DN failure > - > > Key: HDFS-9103 > URL: https://issues.apache.org/jira/browse/HDFS-9103 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: James Clampffer > Fix For: HDFS-8707 > > Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch, > HDFS-9103.HDFS-8707.006.patch, HDFS-9103.HDFS-8707.007.patch, > HDFS-9103.HDFS-8707.3.patch, HDFS-9103.HDFS-8707.4.patch, > HDFS-9103.HDFS-8707.5.patch > > > When AsyncPreadSome fails, add the failed DataNode to the excluded list and > try again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9381) When same block came for replication for Striped mode, we can move that block to PendingReplications
Uma Maheswara Rao G created HDFS-9381: - Summary: When same block came for replication for Striped mode, we can move that block to PendingReplications Key: HDFS-9381 URL: https://issues.apache.org/jira/browse/HDFS-9381 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Currently I noticed that we just return null if the block already exists in pendingReplications in the replication flow for striped blocks. {code} if (block.isStriped()) { if (pendingNum > 0) { // Wait the previous recovery to finish. return null; } {code} Here, if neededReplications contains only a few blocks (by default, fewer than numLiveNodes*2), the same blocks can be picked again from neededReplications if we just return null, since we are not removing the element from neededReplications. Since this replication process needs to take the fsnamesystem lock, we may spend some time unnecessarily in every loop. So my suggestion/improvement is: instead of just returning null, how about incrementing pendingReplications for this block and removing it from neededReplications? Another point to consider here: to add into pendingReplications we generally need a target, which is nothing but the node to which we issued the replication command. Later, after replication succeeds and the DN reports it, the block will be removed from pendingReplications in NN addBlock. Since this is a newly picked block from neededReplications, we would not have selected a target yet. So which target should be passed to pendingReplications if we add this block? One option I am thinking of is: how about just passing srcNode itself as the target for this special condition? If the block is really missed, srcNode will not report it, so this block will not be removed from pendingReplications; when it times out, it will be considered for replication again and at that time it will find an actual target to replicate to. 
So -- This message was sent by Atlassian JIRA (v6.3.4#6332)
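The proposed bookkeeping can be sketched in miniature. This is a hedged simulation with hypothetical names, not BlockManager's real API: instead of returning null and leaving the striped block in neededReplications (where the next scan can pick it again under the namesystem lock), the block is moved to pendingReplications with the source node as a placeholder target.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the proposal: neededReplications is the scan queue,
// pendingReplications maps a block to its (placeholder) target.
class StripedReplicationSketch {
    final Set<String> neededReplications = new HashSet<>();
    final Map<String, String> pendingReplications = new HashMap<>();

    /** Returns the chosen target, or null when a recovery is already pending. */
    String scheduleRecovery(String block, String srcNode) {
        if (pendingReplications.containsKey(block)) {
            return null; // wait for the previous recovery to finish
        }
        // Proposed change: remove from needed and track as pending, so the
        // next scan cannot re-pick the same block and waste lock time.
        neededReplications.remove(block);
        pendingReplications.put(block, srcNode); // srcNode as placeholder target
        return srcNode;
    }
}
```

If the recovery never completes, the pending entry would time out (not modeled here) and the block would re-enter neededReplications with a real target.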
[jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark support more than 10 numThreads
[ https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991191#comment-14991191 ] Hadoop QA commented on HDFS-9379: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 5s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 54s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 37s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 49m 50s {color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s {color} | {color:red} Patch generated 56 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 120m 14s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-05 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12770741/HDFS-9379.000.patch | | JIRA Issue | HDFS-9379 | | Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile | | uname | Linux 9f2461e5ff8e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build@2/patchprocess/apache-yetus-e8bd3ad/precommit/personality/hadoop.sh | | git revision | trunk / 5667129 | | Default
[jira] [Commented] (HDFS-9007) Fix HDFS Balancer to honor upgrade domain policy
[ https://issues.apache.org/jira/browse/HDFS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990994#comment-14990994 ] Hudson commented on HDFS-9007: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2510 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2510/]) HDFS-9007. Fix HDFS Balancer to honor upgrade domain policy. (Ming Ma (lei: rev ec414600ede8e305c584818565b50e055ea5d2b5) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java > Fix HDFS Balancer to honor upgrade domain policy > > > Key: HDFS-9007 > URL: 
https://issues.apache.org/jira/browse/HDFS-9007 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9007-2.patch, HDFS-9007-branch-2.patch, > HDFS-9007.patch > > > In the current design of HDFS Balancer, it doesn't use BlockPlacementPolicy > used by namenode runtime. Instead, it has somewhat redundant code to make > sure block allocation conforms with the rack policy. > When namenode uses upgrade domain based policy, we need to make sure that > HDFS balancer doesn't move blocks in a way that could violate upgrade domain > block placement policy. > In the longer term, we should consider how to make Balancer independent of > the actual BlockPlacementPolicy as in HDFS-1431. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9357) NN UI renders icons of decommissioned DN incorrectly
[ https://issues.apache.org/jira/browse/HDFS-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990997#comment-14990997 ] Hudson commented on HDFS-9357: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2510 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2510/]) HDFS-9357. NN UI renders icons of decommissioned DN incorrectly. (wheat9: rev 0eed886a165f5a0850ddbfb1d5f98c7b5e379fb3) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/hadoop.css * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html > NN UI renders icons of decommissioned DN incorrectly > > > Key: HDFS-9357 > URL: https://issues.apache.org/jira/browse/HDFS-9357 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Critical > Fix For: 2.8.0 > > Attachments: Decommissioned_Dead_Fixed.PNG, Decommissioned_Fixed.PNG, > HDFS-9357.001.patch, HDFS-9357.001.patch, decommisioned_n_dead_.png, > decommissioned_.png > > > NN UI is not showing which DN is "Decommissioned" and which is "Decommissioned & dead" > Root Cause -- > "Decommissioned" and "Decommissioned & dead" icon not reflected on NN UI > When DN is in Decommissioned status or in "Decommissioned & dead" status, > same status is not reflected on NN UI > DN status is as below -- > hdfs dfsadmin -report > Name: 10.xx.xx.xx1:50076 (host-xx1) > Hostname: host-xx > Decommission Status : Decommissioned > Configured Capacity: 230501634048 (214.67 GB) > DFS Used: 36864 (36 KB) > Dead datanodes (1): > Name: 10.xx.xx.xx2:50076 (host-xx2) > Hostname: host-xx > Decommission Status : Decommissioned > Same is not reflected on NN UI. > Attached NN UI snapshots for the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990996#comment-14990996 ] Hudson commented on HDFS-8855: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2510 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2510/]) Revert "HDFS-8855. Webhdfs client leaks active NameNode connections. (wheat9: rev 88beb46cf6e6fd3e51f73a411a2750de7595e326) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/DataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestDataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Fix For: 2.8.0 > > Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, > HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. 
For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
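The SoftReference theory above matches a well-known cache pattern. The sketch below is illustrative only (the actual webhdfs client internals are speculation in the comment): a SoftReference-backed cache keeps entries alive until the JVM comes under memory pressure, so cached connections can linger for tens of seconds before being reclaimed, exactly the observed recovery delay.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical soft-reference cache: entries survive until the GC decides
// to clear them, typically only when the heap is under pressure.
class SoftRefCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> cache = new HashMap<>();

    void put(K key, V value) {
        cache.put(key, new SoftReference<>(value));
    }

    /** Returns the cached value, or null if never cached or already reclaimed. */
    V get(K key) {
        SoftReference<V> ref = cache.get(key);
        return ref == null ? null : ref.get(); // null once the GC clears it
    }
}
```

If such a cache stores connections keyed per-request (so entries are never re-used), each request leaks a live connection until the GC clears the references, consistent with the ~25000-connection build-up described above.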
[jira] [Commented] (HDFS-9103) Retry reads on DN failure
[ https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991030#comment-14991030 ] Hadoop QA commented on HDFS-9103: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 16s {color} | {color:green} HDFS-8707 passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 11s {color} | {color:red} hadoop-hdfs-native-client in HDFS-8707 failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 12s {color} | {color:red} hadoop-hdfs-native-client in HDFS-8707 failed with JDK v1.7.0_79. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} HDFS-8707 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 11s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 11s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. 
{color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 11s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 13s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 13s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 13s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 11s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 12s {color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s {color} | {color:red} Patch generated 425 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 13m 20s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-05 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12770724/HDFS-9103.HDFS-8707.007.patch | | JIRA Issue | HDFS-9103 | | Optional Tests | asflicense cc unit javac compile | | uname | Linux b159d5a96904 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build@2/patchprocess/apache-yetus-e8bd3ad/precommit/personality/hadoop.sh | | git revision | HDFS-8707 / 3ce4230 | | Default Java | 1.7.0_79 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_60 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79 | | compile | https://builds.apache.org/job/PreCommit-HDFS-Build/13394/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_60.txt | | compile | https://builds.apache.org/job/PreCommit-HDFS-Build/13394/artifact/patchprocess/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.7.0_79.txt | | compile | https://builds.apache.org/job/PreCommit-HDFS-Build/13394/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_60.txt | | cc |
[jira] [Updated] (HDFS-9372) Typo in DataStorage.recoverTransitionRead
[ https://issues.apache.org/jira/browse/HDFS-9372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HDFS-9372: Attachment: HDFS-9372-v1.patch Just remove the dead code. > Typo in DataStorage.recoverTransitionRead > - > > Key: HDFS-9372 > URL: https://issues.apache.org/jira/browse/HDFS-9372 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Duo Zhang >Assignee: Duo Zhang > Attachments: HDFS-9372-v0.patch, HDFS-9372-v1.patch > > > {code:title=DataStorage.java} > if (this.initialized) { > LOG.info("DataNode version: " + > HdfsServerConstants.DATANODE_LAYOUT_VERSION > + " and NameNode layout version: " + nsInfo.getLayoutVersion()); > this.storageDirs = new ArrayList(dataDirs.size()); > // mark DN storage is initialized > this.initialized = true; > } > {code} > The first if should be {{!this.initialized}} I think? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
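The quoted guard can be sketched in isolation to show why its body is dead code and what the corrected lazy-init pattern looks like (a hypothetical stand-in class, not the real {{DataStorage}} code):

```java
// Hypothetical stand-in for DataStorage's initialization guard.
class LazyInit {
    private boolean initialized = false;
    int initCount = 0; // stands in for building storageDirs

    void recoverTransitionRead() {
        // Correct guard: run setup only when NOT yet initialized.
        // The snippet above used `if (this.initialized)`, which can never
        // be true on the first call, so its body could never execute.
        if (!initialized) {
            initCount++;
            initialized = true; // mark storage as initialized
        }
    }
}
```

Calling the method repeatedly then initializes exactly once, which is the behavior the original (inverted) guard silently prevented.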
[jira] [Created] (HDFS-9379) Make NNThroughputBenchmark support more than 10 numThreads
Mingliang Liu created HDFS-9379: --- Summary: Make NNThroughputBenchmark support more than 10 numThreads Key: HDFS-9379 URL: https://issues.apache.org/jira/browse/HDFS-9379 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Mingliang Liu Assignee: Mingliang Liu Currently, {{NNThroughputBenchmark}} relies on the {{datanodes}} array being sorted in the lexicographical order of each datanode's {{xferAddr}}. * There is an assertion of datanode's {{xferAddr}} lexicographical order when filling the {{datanodes}}, see [the code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152]. * When searching the datanode by {{DatanodeInfo}}, it uses binary search against the {{datanodes}} array, see [the code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187] In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In {{NNThroughputBenchmark}}, the port is simply _the index of the tiny datanode_ plus one. The problem here is that, when there are more than 9 tiny datanodes ({{numThreads}}), the lexicographical order of the {{xferAddr}} values breaks, because the string form of the datanode index no longer sorts in numeric order. For example, {code} ... 192.168.54.40:8 192.168.54.40:9 192.168.54.40:10 192.168.54.40:11 ... {code} {{192.168.54.40:9}} sorts after {{192.168.54.40:10}} lexicographically. The assertion will fail and the binary search won't work. The simple fix is to calculate the datanode index from the port directly, instead of using binary search. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
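The ordering problem described above can be checked directly: {{host:port}} strings stop sorting in numeric port order once the port reaches double digits, while the index can be recovered from the port with no search at all. A small sketch under the issue's stated convention that port = datanode index + 1 (illustrative code, not the benchmark itself):

```java
import java.util.Arrays;

public class XferAddrOrder {
    public static void main(String[] args) {
        String[] addrs = {
            "192.168.54.40:8", "192.168.54.40:9",
            "192.168.54.40:10", "192.168.54.40:11"
        };
        String[] sorted = addrs.clone();
        // Lexicographical sort: ":10" lands before ":9" because the
        // characters '1' < '9', breaking the numeric order the
        // benchmark's assertion and binary search both assume.
        Arrays.sort(sorted);
        System.out.println(Arrays.toString(sorted));

        // Proposed fix: derive the datanode index from the port directly,
        // instead of binary-searching the (no longer sorted) array.
        int port = Integer.parseInt("192.168.54.40:10".split(":")[1]);
        int index = port - 1; // convention from the issue: port = index + 1
        System.out.println(index); // 9
    }
}
```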
[jira] [Updated] (HDFS-9379) Make NNThroughputBenchmark support more than 10 numThreads
[ https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9379: Status: Patch Available (was: Open) > Make NNThroughputBenchmark support more than 10 numThreads > -- > > Key: HDFS-9379 > URL: https://issues.apache.org/jira/browse/HDFS-9379 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9379.000.patch > > > Currently, the {{NNThroughputBenchmark}} relies on sorted {{datanodes}} array > in the lexicographical order of datanode's {{xferAddr}}. > * There is an assertion of datanode's {{xferAddr}} lexicographical order when > filling the {{datanodes}}, see [the > code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152]. > * When searching the datanode by {{DatanodeInfo}}, it uses binary search > against the {{datanodes}} array, see [the > code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187] > In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In > {{NNThroughputBenchmark}}, the port is simply _the index of the tiny > datanode_ plus one. > The problem here is that, when there are more than 9 tiny datanodes > ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will > be invalid as the string value of datanode index is not in lexicographical > order any more. For example, > {code} > ... > 192.168.54.40:8 > 192.168.54.40:9 > 192.168.54.40:10 > 192.168.54.40:11 > ... > {code} > {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will > fail and the binary search won't work. > The simple fix is to calculate the datanode index by port directly, instead > of using binary search. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9379) Make NNThroughputBenchmark support more than 10 numThreads
[ https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9379: Attachment: HDFS-9379.000.patch > Make NNThroughputBenchmark support more than 10 numThreads > -- > > Key: HDFS-9379 > URL: https://issues.apache.org/jira/browse/HDFS-9379 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9379.000.patch > > > Currently, the {{NNThroughputBenchmark}} relies on sorted {{datanodes}} array > in the lexicographical order of datanode's {{xferAddr}}. > * There is an assertion of datanode's {{xferAddr}} lexicographical order when > filling the {{datanodes}}, see [the > code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152]. > * When searching the datanode by {{DatanodeInfo}}, it uses binary search > against the {{datanodes}} array, see [the > code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187] > In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In > {{NNThroughputBenchmark}}, the port is simply _the index of the tiny > datanode_ plus one. > The problem here is that, when there are more than 9 tiny datanodes > ({{numThreads}}), the lexicographical order of datanode's {{xferAddr}} will > be invalid as the string value of datanode index is not in lexicographical > order any more. For example, > {code} > ... > 192.168.54.40:8 > 192.168.54.40:9 > 192.168.54.40:10 > 192.168.54.40:11 > ... > {code} > {{192.168.54.40:9}} is greater than {{192.168.54.40:10}}. The assertion will > fail and the binary search won't work. > The simple fix is to calculate the datanode index by port directly, instead > of using binary search. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9372) Typo in DataStorage.recoverTransitionRead
[ https://issues.apache.org/jira/browse/HDFS-9372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991156#comment-14991156 ] Hadoop QA commented on HDFS-9372: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 52s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s {color} | {color:green} 
trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 35s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 57s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. 
{color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s {color} | {color:red} Patch generated 58 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 121m 4s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.shortcircuit.TestShortCircuitCache | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 | | | hadoop.hdfs.TestDecommission | | | hadoop.hdfs.TestFileAppend | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-05 | | JIRA Patch URL |
[jira] [Updated] (HDFS-9038) Reserved space is erroneously counted towards non-DFS used.
[ https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HDFS-9038: --- Attachment: HDFS-9038-002.patch > Reserved space is erroneously counted towards non-DFS used. > --- > > Key: HDFS-9038 > URL: https://issues.apache.org/jira/browse/HDFS-9038 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.7.1 >Reporter: Chris Nauroth >Assignee: Brahma Reddy Battula > Attachments: HDFS-9038-002.patch, HDFS-9038.patch > > > HDFS-5215 changed the DataNode volume available space calculation to consider > the reserved space held by the {{dfs.datanode.du.reserved}} configuration > property. As a side effect, reserved space is now counted towards non-DFS > used. I don't believe it was intentional to change the definition of non-DFS > used. This issue proposes restoring the prior behavior: do not count > reserved space towards non-DFS used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
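The accounting side effect described above can be made concrete with made-up numbers (the method names and figures are illustrative, not the actual DataNode code): deriving non-DFS used as capacity minus DFS used minus remaining silently folds the {{dfs.datanode.du.reserved}} reservation into non-DFS used, inflating it by exactly that amount.

```java
public class NonDfsUsed {
    // Non-DFS used as it falls out after HDFS-5215: remaining already
    // honors the reservation, so the reservation ends up counted here.
    static long nonDfsUsedWithReserved(long capacity, long dfsUsed,
                                       long remaining) {
        return capacity - dfsUsed - remaining;
    }

    // Proposed restore of the prior definition: subtract the reserved
    // space back out, leaving only genuinely non-HDFS data.
    static long nonDfsUsedFixed(long capacity, long dfsUsed,
                                long remaining, long reserved) {
        return capacity - dfsUsed - remaining - reserved;
    }

    public static void main(String[] args) {
        // Illustrative numbers (GiB): 100 disk, 30 of blocks,
        // 10 reserved, 5 of non-HDFS files.
        long capacity = 100, dfsUsed = 30, reserved = 10, otherData = 5;
        long remaining = capacity - dfsUsed - otherData - reserved; // 55

        System.out.println(nonDfsUsedWithReserved(capacity, dfsUsed, remaining));
        System.out.println(nonDfsUsedFixed(capacity, dfsUsed, remaining, reserved));
    }
}
```

With these numbers the current formula reports 15 GiB of non-DFS used even though only 5 GiB of non-HDFS data exists; the fixed form reports 5.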
[jira] [Commented] (HDFS-9234) WebHdfs : getContentSummary() should give quota for storage types
[ https://issues.apache.org/jira/browse/HDFS-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14991250#comment-14991250 ] Surendra Singh Lilhore commented on HDFS-9234: -- The test failures and findbugs issues are unrelated to this patch > WebHdfs : getContentSummary() should give quota for storage types > - > > Key: HDFS-9234 > URL: https://issues.apache.org/jira/browse/HDFS-9234 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 2.7.1 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > Attachments: HDFS-9234-001.patch, HDFS-9234-002.patch, > HDFS-9234-003.patch, HDFS-9234-004.patch, HDFS-9234-005.patch, > HDFS-9234-006.patch > > > Currently webhdfs API for ContentSummary give only namequota and spacequota > but it will not give storage types quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9357) NN UI renders icons of decommissioned DN incorrectly
[ https://issues.apache.org/jira/browse/HDFS-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990291#comment-14990291 ] Hudson commented on HDFS-9357: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #627 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/627/]) HDFS-9357. NN UI renders icons of decommissioned DN incorrectly. (wheat9: rev 0eed886a165f5a0850ddbfb1d5f98c7b5e379fb3) * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/hadoop.css * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > NN UI renders icons of decommissioned DN incorrectly > > > Key: HDFS-9357 > URL: https://issues.apache.org/jira/browse/HDFS-9357 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Critical > Fix For: 2.8.0 > > Attachments: Decommissioned_Dead_Fixed.PNG, Decommissioned_Fixed.PNG, > HDFS-9357.001.patch, HDFS-9357.001.patch, decommisioned_n_dead_.png, > decommissioned_.png > > > NN UI is not showing which DN is "Decommissioned "and "Decommissioned & dead" > Root Cause -- > "Decommissioned" and "Decommissioned & dead" icon not reflected on NN UI > When DN is in Decommissioned status or in "Decommissioned & dead" status, > same status is not reflected on NN UI > DN status is as below -- > hdfs dfsadmin -report > Name: 10.xx.xx.xx1:50076 (host-xx1) > Hostname: host-xx > Decommission Status : Decommissioned > Configured Capacity: 230501634048 (214.67 GB) > DFS Used: 36864 (36 KB) > Dead datanodes (1): > Name: 10.xx.xx.xx2:50076 (host-xx2) > Hostname: host-xx > Decommission Status : Decommissioned > Same is not reflected on NN UI. > Attached NN UI snapshots for the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9007) Fix HDFS Balancer to honor upgrade domain policy
[ https://issues.apache.org/jira/browse/HDFS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990185#comment-14990185 ] Ming Ma commented on HDFS-9007: --- Thanks [~eddyxu]! > Fix HDFS Balancer to honor upgrade domain policy > > > Key: HDFS-9007 > URL: https://issues.apache.org/jira/browse/HDFS-9007 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9007-2.patch, HDFS-9007-branch-2.patch, > HDFS-9007.patch > > > In the current design of HDFS Balancer, it doesn't use BlockPlacementPolicy > used by namenode runtime. Instead, it has somewhat redundant code to make > sure block allocation conforms with the rack policy. > When namenode uses upgrade domain based policy, we need to make sure that > HDFS balancer doesn't move blocks in a way that could violate upgrade domain > block placement policy. > In the longer term, we should consider how to make Balancer independent of > the actual BlockPlacementPolicy as in HDFS-1431. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery
[ https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990215#comment-14990215 ] Tony Wu commented on HDFS-9236: --- Thanks a lot [~yzhangal] for your comments. I incorporated them into the new patch. I added the debug logs but kept the positive logic for determining which replica info to add to syncList in existing code/patch. IMO the positive logic is easier to read/understand. > Missing sanity check for block size during block recovery > - > > Key: HDFS-9236 > URL: https://issues.apache.org/jira/browse/HDFS-9236 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu > Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, > HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, > HDFS-9236.006.patch > > > Ran into an issue while running test against faulty data-node code. > Currently in DataNode.java: > {code:java} > /** Block synchronization */ > void syncBlock(RecoveringBlock rBlock, > List syncList) throws IOException { > … > // Calculate the best available replica state. > ReplicaState bestState = ReplicaState.RWR; > … > // Calculate list of nodes that will participate in the recovery > // and the new block size > List participatingList = new ArrayList(); > final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId, > -1, recoveryId); > switch(bestState) { > … > case RBW: > case RWR: > long minLength = Long.MAX_VALUE; > for(BlockRecord r : syncList) { > ReplicaState rState = r.rInfo.getOriginalReplicaState(); > if(rState == bestState) { > minLength = Math.min(minLength, r.rInfo.getNumBytes()); > participatingList.add(r); > } > } > newBlock.setNumBytes(minLength); > break; > … > } > … > nn.commitBlockSynchronization(block, > newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false, > datanodes, storages); > } > {code} > This code is called by the DN coordinating the block recovery. 
In the above > case, it is possible for none of the rState (reported by DNs with copies of > the replica being recovered) to match the bestState. This can either be > caused by faulty DN code or stale/modified/corrupted files on DN. When this > happens the DN will end up reporting a minLength of Long.MAX_VALUE. > Unfortunately there is no check on the NN for replica length. See > FSNamesystem.java: > {code:java} > void commitBlockSynchronization(ExtendedBlock oldBlock, > long newgenerationstamp, long newlength, > boolean closeFile, boolean deleteblock, DatanodeID[] newtargets, > String[] newtargetstorages) throws IOException { > … > if (deleteblock) { > Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock); > boolean remove = iFile.removeLastBlock(blockToDel) != null; > if (remove) { > blockManager.removeBlock(storedBlock); > } > } else { > // update last block > if(!copyTruncate) { > storedBlock.setGenerationStamp(newgenerationstamp); > > // XXX block length is updated without any check <<< storedBlock.setNumBytes(newlength); > } > … > if (closeFile) { > LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock > + ", file=" + src > + (copyTruncate ? ", newBlock=" + truncatedBlock > : ", newgenerationstamp=" + newgenerationstamp) > + ", newlength=" + newlength > + ", newtargets=" + Arrays.asList(newtargets) + ") successful"); > } else { > LOG.info("commitBlockSynchronization(" + oldBlock + ") successful"); > } > } > {code} > After this point the block length becomes Long.MAX_VALUE. Any subsequent > block report (even with correct length) will cause the block to be marked as > corrupted. Since this block could be the last block of the file, if this > happens and the client goes away, the NN won’t be able to recover the lease and > close the file because the last block is under-replicated. > I believe we need to have a sanity check for block size on both DN and NN to > prevent such a case from happening. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
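The failure mode above is easy to reproduce in isolation: if no replica's state matches {{bestState}}, the min-length fold never updates and the {{Long.MAX_VALUE}} sentinel leaks out as the "recovered" length. A hedged sketch of the kind of sanity check being proposed, with simplified types standing in for the real DataNode structures:

```java
import java.util.List;

public class SyncLengthCheck {
    // Simplified stand-in for computing the recovered block length:
    // each replica is modeled as {state, numBytes}.
    static long minMatchingLength(List<long[]> replicas, int bestState) {
        long minLength = Long.MAX_VALUE;
        for (long[] r : replicas) {
            if (r[0] == bestState) {
                minLength = Math.min(minLength, r[1]);
            }
        }
        // Sanity check: if nothing matched bestState, minLength is still
        // the sentinel and must not be committed as the block length.
        if (minLength == Long.MAX_VALUE) {
            throw new IllegalStateException(
                "no replica matched best state; refusing bogus length");
        }
        return minLength;
    }
}
```

Without the guard, the sentinel would flow into {{commitBlockSynchronization}} exactly as the issue describes.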
[jira] [Commented] (HDFS-8968) New benchmark throughput tool for striping erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990244#comment-14990244 ] Rakesh R commented on HDFS-8968: Good work [~lirui]. I have a few comments; please take a look. # Can we make this configurable like {{System.getProperty("test.benchmark.data","/tmp/benchmark/data"));}} {code} private static final String DFS_TMP_DIR = "/tmp/benchmark"; {code} # {{printUsage}} can be highlighted using {{System.err.println}}. Also, we can say {{"Usage: ErasureCodeBenchmarkThroughput}} {code} System.out.println("ErasureCodeBenchmarkThroughput" + " [num clients] [stf|pos]\n" + "Stateful and positional option is only available for read."); {code} # It would be good to use the hadoop utility {{StopWatch}} for the elapsed time computations. Presently it's using {{System.currentTimeMillis() - start) / 1000.0}}. Sample usage: {code} org.apache.hadoop.util.StopWatch sw = new StopWatch().start(); // do the operation sw.stop(); long elapsedtime = sw.now(TimeUnit.SECONDS); {code} # Just a suggestion to use {{java.util.concurrent.ExecutorCompletionService}} here rather than trying to find out which task has completed. {code} +for (Future future : futures) { + results.add(future.get()); +} {code} bq. As to unit test, maybe I can add a test where the tool runs against a MiniDFSCluster. How about running both a real cluster and a MiniDFSCluster inside the ErasureCodeBenchmarkThroughput tool, similar to {{org.apache.hadoop.hdfs.BenchmarkThroughput}}? 
> New benchmark throughput tool for striping erasure coding > - > > Key: HDFS-8968 > URL: https://issues.apache.org/jira/browse/HDFS-8968 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rui Li > Attachments: HDFS-8968-HDFS-7285.1.patch, > HDFS-8968-HDFS-7285.2.patch, HDFS-8968.3.patch > > > We need a new benchmark tool to measure the throughput of client writing and > reading considering cases or factors: > * 3-replica or striping; > * write or read, stateful read or positional read; > * which erasure coder; > * striping cell size; > * concurrent readers/writers using processes or threads. > The tool should be easy to use and better to avoid unnecessary local > environment impact, like local disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
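The {{ExecutorCompletionService}} suggestion in the review above can be sketched as follows (a generic example, not the benchmark's actual code): results are consumed in completion order rather than submission order, so one slow task does not block collecting the results that are already done.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionOrderDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CompletionService<String> ecs = new ExecutorCompletionService<>(pool);

        // The slow task is submitted first but finishes last.
        ecs.submit(() -> { Thread.sleep(200); return "slow"; });
        ecs.submit(() -> "fast");

        // take() hands back futures as they complete, not as submitted,
        // so there is no need to poll to find out which task is done.
        System.out.println(ecs.take().get()); // completes first: "fast"
        System.out.println(ecs.take().get()); // completes last: "slow"

        pool.shutdown();
    }
}
```

Compared with looping over the submitted {{futures}} and calling {{get()}} in order, this lets the aggregation keep pace with whichever client thread finishes first.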
[jira] [Commented] (HDFS-9363) Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica.
[ https://issues.apache.org/jira/browse/HDFS-9363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990257#comment-14990257 ] Hudson commented on HDFS-9363: -- FAILURE: Integrated in Hadoop-trunk-Commit #8757 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8757/]) HDFS-9363. Add fetchReplica to FsDatasetTestUtils to return (lei: rev 5667129276c3123ecb0a96b78d5897431c47a9d5) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/FsDatasetTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImplTestUtils.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java > Add fetchReplica() to FsDatasetTestUtils to return FsDataset-agnostic Replica. > -- > > Key: HDFS-9363 > URL: https://issues.apache.org/jira/browse/HDFS-9363 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Minor > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9363.001.patch > > > {{FsDatasetTestUtils()}} abstracts away the details in {{FsDataset}} to allow > writing generic tests regardless of underlying {{FsDataset}} implementations. > We can add a {{fetchReplica()}} method to allow some HDFS tests to avoid > using {{FsDatasetTestUtil#fetchReplicaInfo()}}, which assumes FsDatasetImpl > is the only implementation of FsDataset. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9236) Missing sanity check for block size during block recovery
[ https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Wu updated HDFS-9236: -- Attachment: HDFS-9236.007.patch In v7 patch: * Addressed [~yzhangal]'s review comments. * Update the test case. * Add a {{toString()}} method to pretty print {{ReplicaRecoveryInfo}}. > Missing sanity check for block size during block recovery > - > > Key: HDFS-9236 > URL: https://issues.apache.org/jira/browse/HDFS-9236 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu > Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, > HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch, > HDFS-9236.006.patch, HDFS-9236.007.patch > > > Ran into an issue while running test against faulty data-node code. > Currently in DataNode.java: > {code:java} > /** Block synchronization */ > void syncBlock(RecoveringBlock rBlock, > List syncList) throws IOException { > … > // Calculate the best available replica state. > ReplicaState bestState = ReplicaState.RWR; > … > // Calculate list of nodes that will participate in the recovery > // and the new block size > List participatingList = new ArrayList(); > final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId, > -1, recoveryId); > switch(bestState) { > … > case RBW: > case RWR: > long minLength = Long.MAX_VALUE; > for(BlockRecord r : syncList) { > ReplicaState rState = r.rInfo.getOriginalReplicaState(); > if(rState == bestState) { > minLength = Math.min(minLength, r.rInfo.getNumBytes()); > participatingList.add(r); > } > } > newBlock.setNumBytes(minLength); > break; > … > } > … > nn.commitBlockSynchronization(block, > newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false, > datanodes, storages); > } > {code} > This code is called by the DN coordinating the block recovery. 
In the above > case, it is possible for none of the rState (reported by DNs with copies of > the replica being recovered) to match the bestState. This can either be > caused by faulty DN code or stale/modified/corrupted files on DN. When this > happens the DN will end up reporting a minLength of Long.MAX_VALUE. > Unfortunately there is no check on the NN for replica length. See > FSNamesystem.java: > {code:java} > void commitBlockSynchronization(ExtendedBlock oldBlock, > long newgenerationstamp, long newlength, > boolean closeFile, boolean deleteblock, DatanodeID[] newtargets, > String[] newtargetstorages) throws IOException { > … > if (deleteblock) { > Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock); > boolean remove = iFile.removeLastBlock(blockToDel) != null; > if (remove) { > blockManager.removeBlock(storedBlock); > } > } else { > // update last block > if(!copyTruncate) { > storedBlock.setGenerationStamp(newgenerationstamp); > > // XXX block length is updated without any check <<< storedBlock.setNumBytes(newlength); > } > … > if (closeFile) { > LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock > + ", file=" + src > + (copyTruncate ? ", newBlock=" + truncatedBlock > : ", newgenerationstamp=" + newgenerationstamp) > + ", newlength=" + newlength > + ", newtargets=" + Arrays.asList(newtargets) + ") successful"); > } else { > LOG.info("commitBlockSynchronization(" + oldBlock + ") successful"); > } > } > {code} > After this point the block length becomes Long.MAX_VALUE. Any subsequent > block report (even with correct length) will cause the block to be marked as > corrupted. Since this block could be the last block of the file, if this > happens and the client goes away, the NN won’t be able to recover the lease and > close the file because the last block is under-replicated. > I believe we need to have a sanity check for block size on both DN and NN to > prevent such a case from happening. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9357) NN UI renders icons of decommissioned DN incorrectly
[ https://issues.apache.org/jira/browse/HDFS-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990248#comment-14990248 ] Hudson commented on HDFS-9357: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #1361 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1361/]) HDFS-9357. NN UI renders icons of decommissioned DN incorrectly. (wheat9: rev 0eed886a165f5a0850ddbfb1d5f98c7b5e379fb3) * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/hadoop.css > NN UI renders icons of decommissioned DN incorrectly > > > Key: HDFS-9357 > URL: https://issues.apache.org/jira/browse/HDFS-9357 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Critical > Fix For: 2.8.0 > > Attachments: Decommissioned_Dead_Fixed.PNG, Decommissioned_Fixed.PNG, > HDFS-9357.001.patch, HDFS-9357.001.patch, decommisioned_n_dead_.png, > decommissioned_.png > > > NN UI is not showing which DN is "Decommissioned "and "Decommissioned & dead" > Root Cause -- > "Decommissioned" and "Decommissioned & dead" icon not reflected on NN UI > When DN is in Decommissioned status or in "Decommissioned & dead" status, > same status is not reflected on NN UI > DN status is as below -- > hdfs dfsadmin -report > Name: 10.xx.xx.xx1:50076 (host-xx1) > Hostname: host-xx > Decommission Status : Decommissioned > Configured Capacity: 230501634048 (214.67 GB) > DFS Used: 36864 (36 KB) > Dead datanodes (1): > Name: 10.xx.xx.xx2:50076 (host-xx2) > Hostname: host-xx > Decommission Status : Decommissioned > Same is not reflected on NN UI. > Attached NN UI snapshots for the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9007) Fix HDFS Balancer to honor upgrade domain policy
[ https://issues.apache.org/jira/browse/HDFS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990256#comment-14990256 ] Hudson commented on HDFS-9007: -- FAILURE: Integrated in Hadoop-trunk-Commit #8757 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8757/]) HDFS-9007. Fix HDFS Balancer to honor upgrade domain policy. (Ming Ma (lei: rev ec414600ede8e305c584818565b50e055ea5d2b5) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java > Fix HDFS Balancer to honor upgrade domain policy > > > Key: HDFS-9007 > URL: 
https://issues.apache.org/jira/browse/HDFS-9007 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9007-2.patch, HDFS-9007-branch-2.patch, > HDFS-9007.patch > > > In the current design of the HDFS Balancer, it doesn't use the BlockPlacementPolicy > used by the namenode at runtime. Instead, it has somewhat redundant code to make > sure block allocation conforms with the rack policy. > When the namenode uses the upgrade-domain based policy, we need to make sure that > the HDFS Balancer doesn't move blocks in a way that could violate the upgrade domain > block placement policy. > In the longer term, we should consider how to make the Balancer independent of > the actual BlockPlacementPolicy, as in HDFS-1431. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
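For illustration, the constraint the Balancer must preserve can be sketched as a simple pre-move check: a replica move is acceptable only if the block's replicas still span enough distinct upgrade domains afterwards. This is a hedged sketch under assumed semantics; the names below are hypothetical and do not reflect the BlockPlacementPolicyWithUpgradeDomain or Dispatcher APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical sketch of an upgrade-domain-aware move check, not HDFS code.
public class UpgradeDomainMoveCheck {
    // Would moving one replica from 'sourceDomain' to 'targetDomain' keep
    // the block's replicas spread across enough distinct upgrade domains?
    static boolean isMoveAllowed(List<String> replicaDomains,
                                 String sourceDomain, String targetDomain,
                                 int requiredDistinctDomains) {
        List<String> after = new ArrayList<>(replicaDomains);
        after.remove(sourceDomain); // replica leaves the source's domain
        after.add(targetDomain);    // and lands in the target's domain
        return new HashSet<>(after).size() >= requiredDistinctDomains;
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList("ud1", "ud2", "ud3");
        // Moving the ud3 replica onto a ud1 node collapses two domains: reject.
        System.out.println(isMoveAllowed(domains, "ud3", "ud1", 3)); // false
        // Moving it to a node in a fresh domain keeps the spread: allow.
        System.out.println(isMoveAllowed(domains, "ud3", "ud4", 3)); // true
    }
}
```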
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990258#comment-14990258 ] Hudson commented on HDFS-8855: -- FAILURE: Integrated in Hadoop-trunk-Commit #8757 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8757/]) Revert "HDFS-8855. Webhdfs client leaks active NameNode connections. (wheat9: rev 88beb46cf6e6fd3e51f73a411a2750de7595e326) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/DataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/TestDataNodeUGIProvider.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/Token.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Fix For: 2.8.0 > > Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, > HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. 
For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to reach ~25000 active connections and > fail. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds) all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
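A common remedy for the pattern suspected above is to replace SoftReference-based caching (whose entries are reaped under memory pressure and so are rarely re-used) with an explicitly bounded cache that holds strong references. The sketch below uses a plain JDK LRU map; it is illustrative only, the UGICache name is made up, and the actual fix in DataNodeUGIProvider may use a different (e.g. time-bounded) cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a size-bounded LRU cache with strong references,
// in contrast to SoftReference caching. Not the DataNodeUGIProvider code.
public class UGICache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public UGICache(int capacity) {
        super(16, 0.75f, true); // access-order iteration, so eviction is LRU
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once we exceed capacity.
        return size() > capacity;
    }

    public static void main(String[] args) {
        UGICache<String, String> cache = new UGICache<>(2);
        cache.put("alice", "ugi-alice");
        cache.put("bob", "ugi-bob");
        cache.get("alice");              // touch alice; bob is now eldest
        cache.put("carol", "ugi-carol"); // evicts bob
        System.out.println(cache.containsKey("alice")); // true
        System.out.println(cache.containsKey("bob"));   // false
    }
}
```

The point of the bound is that entries are released deterministically by eviction, not opportunistically by the garbage collector, so cached connections actually get re-used.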
[jira] [Commented] (HDFS-9331) Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem entirely allocated for DFS use
[ https://issues.apache.org/jira/browse/HDFS-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990260#comment-14990260 ] Hudson commented on HDFS-9331: -- FAILURE: Integrated in Hadoop-trunk-Commit #8757 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8757/]) HDFS-9331. Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account (lei: rev e2a5441b062fd0758138079d24a2740fc5e5e350) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java > Modify TestNameNodeMXBean#testNameNodeMXBeanInfo() to account for filesystem > entirely allocated for DFS use > --- > > Key: HDFS-9331 > URL: https://issues.apache.org/jira/browse/HDFS-9331 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS, test >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu >Priority: Trivial > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9331.001.patch > > > {{TestNameNodeMXBean#testNameNodeMXBeanInfo}} expects a non-zero nonDFS > size. The nonDFS size is defined as: > {quote} > The space that is not used by HDFS. For instance, once you format a new disk > to ext4, certain space is used for "lost-and-found" directory and ext4 > metadata. > {quote} > It is possible to allocate all space in a filesystem for DFS use, > in which case the nonDFS size will be zero. We can relax the check in the > test to account for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
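The relaxation described above amounts to accepting zero as a legitimate nonDFS size while still rejecting negative values. The sketch below is illustrative only, not the actual TestNameNodeMXBean code.

```java
// Hypothetical sketch of the relaxed check, not the actual test code.
public class NonDfsCheck {
    // A volume fully allocated for DFS use legitimately reports zero
    // non-DFS space, so only negative values should be rejected.
    static boolean isValidNonDfsSize(long nonDfsBytes) {
        return nonDfsBytes >= 0; // relaxed from: nonDfsBytes > 0
    }

    public static void main(String[] args) {
        System.out.println(isValidNonDfsSize(0L));  // true (was rejected before)
        System.out.println(isValidNonDfsSize(-1L)); // false
    }
}
```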
[jira] [Commented] (HDFS-7764) DirectoryScanner shouldn't abort the scan if one directory had an error
[ https://issues.apache.org/jira/browse/HDFS-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14989523#comment-14989523 ] Rakesh R commented on HDFS-7764: Thanks [~cmccabe] for the suggestion. I've attached another patch addressing the same. Could you please review it again when you get a chance? > DirectoryScanner shouldn't abort the scan if one directory had an error > --- > > Key: HDFS-7764 > URL: https://issues.apache.org/jira/browse/HDFS-7764 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.0 >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-7764-01.patch, HDFS-7764.patch > > > If there is an exception while preparing the ScanInfo for the blocks in the > directory, DirectoryScanner immediately throws the exception and exits the > current scan cycle. It would be good if it could signal #cancel() to the > other pending tasks. > DirectoryScanner.java
> {code}
> for (Entry<Integer, Future<ScanInfoPerBlockPool>> report :
>     compilersInProgress.entrySet()) {
>   try {
>     dirReports[report.getKey()] = report.getValue().get();
>   } catch (Exception ex) {
>     LOG.error("Error compiling report", ex);
>     // Propagate ex to DataBlockScanner to deal with
>     throw new RuntimeException(ex);
>   }
> }
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
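The suggestion above (signal #cancel() to the pending tasks on the first failure instead of only throwing) can be sketched with plain java.util.concurrent primitives. This is a hedged sketch, not the patch: the task bodies are synthetic stand-ins for the per-volume report compilers, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of cancel-on-failure collection, not DirectoryScanner code.
public class CancelOnFailure {
    // Collects task results in order; on the first failure, cancels every
    // later future instead of leaving it running. Returns true if a
    // failure was seen (and cancellation was signalled).
    static boolean collectOrCancel(List<Future<String>> reports) {
        boolean failed = false;
        for (Future<String> report : reports) {
            if (failed) {
                report.cancel(true); // signal #cancel() to pending tasks
                continue;
            }
            try {
                report.get();
            } catch (InterruptedException | ExecutionException ex) {
                failed = true; // record the failure; real code would log/rethrow
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Callable<String> failing = () -> { throw new RuntimeException("bad volume"); };
        Callable<String> slow = () -> { Thread.sleep(5000); return "ok"; };
        List<Future<String>> reports = new ArrayList<>();
        reports.add(pool.submit(failing));
        reports.add(pool.submit(slow));
        System.out.println(collectOrCancel(reports));     // true
        System.out.println(reports.get(1).isCancelled()); // true
        pool.shutdownNow();
    }
}
```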
[jira] [Updated] (HDFS-9375) Set balancer bandwidth for specific node
[ https://issues.apache.org/jira/browse/HDFS-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Yiqun updated HDFS-9375: Status: Patch Available (was: Open) > Set balancer bandwidth for specific node > > > Key: HDFS-9375 > URL: https://issues.apache.org/jira/browse/HDFS-9375 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9375.001.patch > > > Even though the balancer has improved a lot, in some cases it is still > slow. For example, when the cluster is extended, the new nodes all need to > balance data from existing nodes. To improve balancer > speed, we generally use the {{setBalancerBandwidth}} {{dfsadmin}} > command. But this sets the bandwidth for every node; obviously, we could allow more > bandwidth for the new nodes because they lack data. When the new > nodes have balanced enough data, we can let them work normally. So we can define a > new clientDatanode interface to set a specific node's bandwidth. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9375) Set balancer bandwidth for specific node
[ https://issues.apache.org/jira/browse/HDFS-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Yiqun updated HDFS-9375: Attachment: HDFS-9375.001.patch I attached the patch, adding a new dfsadmin command {{setBalancerBandwidthForSpecificNode}}; the usage is {{[-setBalancerBandwidthForSpecificNode }} > Set balancer bandwidth for specific node > > > Key: HDFS-9375 > URL: https://issues.apache.org/jira/browse/HDFS-9375 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9375.001.patch > > > Even though the balancer has improved a lot, in some cases it is still > slow. For example, when the cluster is extended, the new nodes all need to > balance data from existing nodes. To improve balancer > speed, we generally use the {{setBalancerBandwidth}} {{dfsadmin}} > command. But this sets the bandwidth for every node; obviously, we could allow more > bandwidth for the new nodes because they lack data. When the new > nodes have balanced enough data, we can let them work normally. So we can define a > new clientDatanode interface to set a specific node's bandwidth. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
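Conceptually, the proposal layers a per-node override on top of the cluster-wide balancer bandwidth. The sketch below models only that idea; it is not the ClientDatanodeProtocol change in the patch, and every name in it is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of per-node bandwidth overrides, not HDFS code.
public class BalancerBandwidth {
    private long clusterDefault; // bytes/sec applied to every datanode
    private final Map<String, Long> perNode = new HashMap<>();

    BalancerBandwidth(long clusterDefault) {
        this.clusterDefault = clusterDefault;
    }

    // Existing behavior: -setBalancerBandwidth applies to all datanodes.
    void setClusterBandwidth(long bytesPerSec) {
        clusterDefault = bytesPerSec;
    }

    // Proposed behavior: boost a specific (e.g. newly added, empty) datanode.
    void setNodeBandwidth(String datanode, long bytesPerSec) {
        perNode.put(datanode, bytesPerSec);
    }

    long bandwidthFor(String datanode) {
        return perNode.getOrDefault(datanode, clusterDefault);
    }

    public static void main(String[] args) {
        BalancerBandwidth bw = new BalancerBandwidth(10L * 1024 * 1024);
        bw.setNodeBandwidth("new-dn-1:50010", 100L * 1024 * 1024);
        System.out.println(bw.bandwidthFor("new-dn-1:50010")); // 104857600
        System.out.println(bw.bandwidthFor("old-dn-7:50010")); // 10485760
    }
}
```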
[jira] [Commented] (HDFS-9038) Reserved space is erroneously counted towards non-DFS used.
[ https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14989659#comment-14989659 ] Hadoop QA commented on HDFS-9038: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 6 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 14s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 35s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 29s {color} 
| {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 9s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s {color} | {color:red} Patch generated 17 new checkstyle issues in hadoop-hdfs-project (total was 293, now 308). {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 38s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 52s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 5s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 59s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s {color} | {color:red} Patch generated 56 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 174m 47s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.shortcircuit.TestShortCircuitCache | | | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.TestAclsEndToEnd | | JDK v1.7.0_79 Failed junit tests |