[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338554#comment-14338554 ]
Hudson commented on HDFS-7537:
FAILURE: Integrated in Hadoop-Mapreduce-trunk #2066 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2066/])
HDFS-7537. Add UNDER MIN REPL'D BLOCKS count to fsck. Contributed by GAO Rui (szetszwo: rev 725cc499f00abeeab9f58cbc778e65522eec9d98)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java

fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
Key: HDFS-7537
URL: https://issues.apache.org/jira/browse/HDFS-7537
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: Allen Wittenauer
Assignee: GAO Rui
Fix For: 2.7.0
Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png

If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338515#comment-14338515 ]
Hudson commented on HDFS-7537:
FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #116 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/116/])
HDFS-7537. Add UNDER MIN REPL'D BLOCKS count to fsck. Contributed by GAO Rui (szetszwo: rev 725cc499f00abeeab9f58cbc778e65522eec9d98)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338434#comment-14338434 ]
Hudson commented on HDFS-7537:
FAILURE: Integrated in Hadoop-Hdfs-trunk #2048 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2048/])
HDFS-7537. Add UNDER MIN REPL'D BLOCKS count to fsck. Contributed by GAO Rui (szetszwo: rev 725cc499f00abeeab9f58cbc778e65522eec9d98)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338451#comment-14338451 ]
Hudson commented on HDFS-7537:
FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #107 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/107/])
HDFS-7537. Add UNDER MIN REPL'D BLOCKS count to fsck. Contributed by GAO Rui (szetszwo: rev 725cc499f00abeeab9f58cbc778e65522eec9d98)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338262#comment-14338262 ]
Hudson commented on HDFS-7537:
FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #116 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/116/])
HDFS-7537. Add UNDER MIN REPL'D BLOCKS count to fsck. Contributed by GAO Rui (szetszwo: rev 725cc499f00abeeab9f58cbc778e65522eec9d98)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338288#comment-14338288 ]
Hudson commented on HDFS-7537:
SUCCESS: Integrated in Hadoop-Yarn-trunk #850 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/850/])
HDFS-7537. Add UNDER MIN REPL'D BLOCKS count to fsck. Contributed by GAO Rui (szetszwo: rev 725cc499f00abeeab9f58cbc778e65522eec9d98)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337685#comment-14337685 ]
Hadoop QA commented on HDFS-7537:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12700935/Screen%20Shot%202015-02-26%20at%2010.30.35.png
against trunk revision d140d76.
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9670//console
This message is automatically generated.
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337799#comment-14337799 ]
Tsz Wo Nicholas Sze commented on HDFS-7537:
bq. ..., I think changes in fsck should not influence TestLeaseRecovery2.java tests, right?
Agree. The failure of TestLeaseRecovery2 has nothing to do with the patch.
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337806#comment-14337806 ]
GAO Rui commented on HDFS-7537:
So, what should I do to apply this patch to branch trunk?
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337831#comment-14337831 ]
GAO Rui commented on HDFS-7537:
I see. Thank you very much for your kind help. Our team will try our best to contribute to HADOOP and HDFS. Thank you again.
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337818#comment-14337818 ]
Hudson commented on HDFS-7537:
FAILURE: Integrated in Hadoop-trunk-Commit #7204 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7204/])
HDFS-7537. Add UNDER MIN REPL'D BLOCKS count to fsck. Contributed by GAO Rui (szetszwo: rev 725cc499f00abeeab9f58cbc778e65522eec9d98)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337811#comment-14337811 ]
GAO Rui commented on HDFS-7537:
Do you mean the patch HDFS-7537.1.patch? Or are you suggesting that I attach a new patch with the same content as HDFS-7537.2.patch but named HDFS-7537.3.patch?
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337827#comment-14337827 ]
Tsz Wo Nicholas Sze commented on HDFS-7537:
bq. you mean this patch HDFS-7537.1.patch?
By +1, we mean that the patch is perfect and can be committed. Thanks again for working on this.
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337669#comment-14337669 ]
GAO Rui commented on HDFS-7537:
I have attached two screenshots: Screen Shot 2015-02-26 at 10.27.46 and Screen Shot 2015-02-26 at 10.30.35. The first shows that Hadoop QA ran the unit tests and encountered test failures in TestLeaseRecovery2.java. However, I ran the same unit tests in IntelliJ IDEA, and the second image shows that all of them passed. I have no idea how that could happen. I think changes in fsck should not influence the TestLeaseRecovery2.java tests, right?
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336321#comment-14336321 ]
Hadoop QA commented on HDFS-7537:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12700691/HDFS-7537.2.patch
against trunk revision 6cbd9f1.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.TestLeaseRecovery2
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9663//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9663//console
This message is automatically generated.
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335523#comment-14335523 ]
Allen Wittenauer commented on HDFS-7537:
bq. When numUnderMinimalRelicatedBlocks > 0 and there is no missing/corrupted block, all under minimal replicated blocks have at least one good replica so that they can be replicated and there is no data loss. It makes sense to consider the file system as healthy.
Exactly this. I made a prototype to play with. One of the things I did was surround the number of blocks that didn't meet the replication minimum with the same asterisks that the corrupted output uses. This made it absolutely crystal clear why the NN wasn't coming out of safemode.
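A minimal Java sketch of the asterisk-delimited section Allen describes, for readers who want to see the idea concretely. The label UNDER MIN REPL'D BLOCKS comes from the commit message in this thread; the class name, method name, banner width, and exact layout are hypothetical and are not taken from NamenodeFsck.java.

```java
import java.util.Locale;

/** Hypothetical helper; illustrates the idea only, not the NamenodeFsck code. */
final class UnderMinReplBanner {
    // Banner width is an assumption made for this sketch.
    private static final String STARS = "********************************";

    /**
     * Returns an asterisk-delimited section when some blocks are below the
     * configured minimum replication, or an empty string otherwise.
     */
    static String render(long underMinReplBlocks, long totalBlocks, int minReplication) {
        if (underMinReplBlocks <= 0) {
            return "";
        }
        // Guard against division by zero on an empty file system.
        double percent = totalBlocks > 0 ? 100.0 * underMinReplBlocks / totalBlocks : 0.0;
        return STARS + "\n"
            + String.format(Locale.ROOT, "UNDER MIN REPL'D BLOCKS:\t%d (%.1f %%)\n",
                            underMinReplBlocks, percent)
            + "dfs.namenode.replication.min:\t" + minReplication + "\n"
            + STARS + "\n";
    }

    public static void main(String[] args) {
        // Example: 3 of 12 blocks are below a minimum replication of 2.
        System.out.print(render(3, 12, 2));
    }
}
```

Returning an empty string when the count is zero keeps the normal report unchanged, so the banner only appears when it explains something.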
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334686#comment-14334686 ]
Hadoop QA commented on HDFS-7537:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12700346/HDFS-7537.1.patch
against trunk revision 1dba572.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9653//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9653//console
This message is automatically generated.
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334742#comment-14334742 ]
GAO Rui commented on HDFS-7537:
Thank you very much for your review and comments.
1. I think minReplication may get its value from DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY in the first place. I’ll try to figure this out and add it to the output.
2. In Allen’s comment, the mock-up output shows the status as HEALTHY when numUnderMinimalRelicatedBlocks > 0. Is that a careless mistake, or does he have a reason to keep the status as HEALTHY while showing numUnderMinimalRelicatedBlocks at the same time?
3. I haven’t added a unit test before, but I’ll try to do that.
4. Sorry, I’ll fix it and avoid this kind of mistake in future code.
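Point 1 above is about where the minimum replication value comes from. A minimal sketch follows, using java.util.Properties as a stand-in for Hadoop's Configuration class so that it runs without Hadoop on the classpath. The key string is the one in this issue's title; the class name, the method name, and the fallback of 1 (which matches my understanding of the stock default) are assumptions of this sketch, not the patch's code.

```java
import java.util.Properties;

/** Hypothetical lookup; java.util.Properties stands in for Hadoop's Configuration. */
final class MinReplicationLookup {
    // Key string as it appears in this issue's title.
    static final String MIN_KEY = "dfs.namenode.replication.min";
    // Assumed fallback for this sketch when the key is not set.
    static final int MIN_DEFAULT = 1;

    /** Returns the configured minimum replication, or the fallback if unset or blank. */
    static int minReplication(Properties conf) {
        String raw = conf.getProperty(MIN_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return MIN_DEFAULT;
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(MIN_KEY, "2");
        System.out.println(MIN_KEY + " = " + minReplication(conf));
    }
}
```

Printing the resolved value next to the block count lets an operator see at a glance which threshold the count was measured against.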
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335054#comment-14335054 ]
Tsz Wo Nicholas Sze commented on HDFS-7537:
bq. In Allen’s comment, the Mock-up output shows status as HEALTHY when numUnderMinimalRelicatedBlocks > 0. ...
I see. Let's keep showing HEALTHY for the moment. When numUnderMinimalRelicatedBlocks > 0 and there is no missing/corrupted block, every block below the minimum replication still has at least one good replica, so it can be re-replicated and there is no data loss. It makes sense to consider the file system as healthy. Currently, we only have two statuses, HEALTHY and CORRUPT. In the future, we may want to add one more status for this case.
BTW, there is a typo: numUnderMinimalRelicatedBlocks should be numUnderMinimalReplicatedBlocks.
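The status rule discussed above can be sketched as follows. This is only an illustration of the reasoning in the comment: the class and method names are hypothetical, and the real NamenodeFsck logic may be organized differently.

```java
/** Hypothetical status rule; illustrates the discussion, not the NamenodeFsck code. */
final class FsckStatusSketch {
    static final String HEALTHY = "HEALTHY";
    static final String CORRUPT = "CORRUPT";

    /**
     * Reports CORRUPT only when blocks are missing or corrupt. The count of
     * blocks below the minimum replication is deliberately ignored here,
     * because each such block still has at least one good replica.
     */
    static String status(long missingBlocks, long corruptBlocks, long underMinReplBlocks) {
        if (missingBlocks > 0 || corruptBlocks > 0) {
            return CORRUPT;
        }
        return HEALTHY;
    }

    public static void main(String[] args) {
        // Five blocks below the minimum, nothing missing or corrupt.
        System.out.println(status(0, 0, 5));
        // One missing block.
        System.out.println(status(1, 0, 5));
    }
}
```

If a third status is added later, as the comment suggests, the ignored parameter is where that decision would hook in.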
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336111#comment-14336111 ]

GAO Rui commented on HDFS-7537:

I have attached a new patch (HDFS-7537.2.patch) that adds DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY to the output of fsck and includes a unit test to confirm the change. Please review it when you are free. Thanks a lot.
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335528#comment-14335528 ]

Allen Wittenauer commented on HDFS-7537:

Also: I'm not sure what to do about the web UI component. It may not be necessary; the better practice is to run fsck in situations like these.
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14334555#comment-14334555 ]

Tsz Wo Nicholas Sze commented on HDFS-7537:

Thanks Gao. Some comments on the patch:

- When numUnderMinimalRelicatedBlocks > 0, we should also print the value of minReplication, e.g.
{code}
if (numUnderMinimalRelicatedBlocks > 0) {
  res.append("\n UNDER MIN REPL'D BLOCKS:\t")
      .append(numUnderMinimalRelicatedBlocks).append(" (")
      .append(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY)
      .append(" = ").append(minReplication).append(")");
  // need to pass minReplication to Result.
}
{code}
- We should change Result.isHealthy() to return false when numUnderMinimalRelicatedBlocks > 0.
- I wonder if it is easy to add a unit test? See TestFsck to get some idea.
- We do not use the tab character in hadoop. Please replace it with spaces (indentation is two spaces).
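For readers without the Hadoop source at hand, the suggested output line can be tried in isolation with a sketch like the one below. It makes several assumptions: `FsckSummarySketch` and `underMinReplLine` are hypothetical names, the corrected spelling `numUnderMinimalReplicatedBlocks` is used, and the configuration key is written as the literal string `dfs.namenode.replication.min` rather than read from `DFSConfigKeys`, so that the example compiles without Hadoop on the classpath.

```java
// Hypothetical, self-contained stand-in for the fsck Result summary code.
// It is not the committed NamenodeFsck implementation.
final class FsckSummarySketch {

  // Literal key name; in Hadoop this string is referenced through
  // DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY.
  static final String REPLICATION_MIN_KEY = "dfs.namenode.replication.min";

  // Returns the extra summary line, or an empty string when no block is
  // below the configured minimum replication.
  static String underMinReplLine(long numUnderMinimalReplicatedBlocks,
      int minReplication) {
    StringBuilder res = new StringBuilder();
    if (numUnderMinimalReplicatedBlocks > 0) {
      res.append("\n UNDER MIN REPL'D BLOCKS:\t")
          .append(numUnderMinimalReplicatedBlocks).append(" (")
          .append(REPLICATION_MIN_KEY)
          .append(" = ").append(minReplication).append(")");
    }
    return res.toString();
  }

  public static void main(String[] args) {
    // Example: 3 blocks below a minimum replication of 2.
    System.out.println(underMinReplLine(3, 2));
  }
}
```

Passing `minReplication` as a parameter mirrors the note in the snippet above that the value needs to be handed to Result; how the real patch wires it through is not shown in this thread.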
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14324028#comment-14324028 ]

GAO Rui commented on HDFS-7537:

bq. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing.

[~aw], I am not sure what you mean by the description above. If you were to change the output, what would you propose we show?
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14322278#comment-14322278 ]

GAO Rui commented on HDFS-7537:

I want to try to do this. Please assign it to me, thank you!