[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14622014#comment-14622014 ]

Walter Su commented on HDFS-8729:

Thanks Jing for review and helpful analysis!

Fix testTruncateWithDataNodesRestartImmediately occasionally failed
-------------------------------------------------------------------

Key: HDFS-8729
URL: https://issues.apache.org/jira/browse/HDFS-8729
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor
Fix For: 2.8.0
Attachments: HDFS-8729.01.patch, HDFS-8729.02.patch

https://builds.apache.org/job/PreCommit-HDFS-Build/11449/testReport/
https://builds.apache.org/job/PreCommit-HDFS-Build/11593/testReport/
https://builds.apache.org/job/PreCommit-HDFS-Build/11596/testReport/
https://builds.apache.org/job/PreCommit-HDFS-Build/11599/testReport/
{noformat}
java.util.concurrent.TimeoutException: Timed out waiting for /test/testTruncateWithDataNodesRestartImmediately to reach 3 replicas
    at org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:761)
    at org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.testTruncateWithDataNodesRestartImmediately(TestFileTruncate.java:814)
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14622128#comment-14622128 ] Hudson commented on HDFS-8729: -- FAILURE: Integrated in Hadoop-Yarn-trunk #982 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/982/]) HDFS-8729. Fix TestFileTruncate#testTruncateWithDataNodesRestartImmediately which occasionally failed. Contributed by Walter Su. (jing9: rev f4ca530c1cc9ece25c5ef01f99a94eb9e678e890) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14622119#comment-14622119 ] Hudson commented on HDFS-8729: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #252 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/252/]) HDFS-8729. Fix TestFileTruncate#testTruncateWithDataNodesRestartImmediately which occasionally failed. Contributed by Walter Su. (jing9: rev f4ca530c1cc9ece25c5ef01f99a94eb9e678e890) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14622369#comment-14622369 ] Hudson commented on HDFS-8729: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #240 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/240/]) HDFS-8729. Fix TestFileTruncate#testTruncateWithDataNodesRestartImmediately which occasionally failed. Contributed by Walter Su. (jing9: rev f4ca530c1cc9ece25c5ef01f99a94eb9e678e890) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14622287#comment-14622287 ] Hudson commented on HDFS-8729: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2179 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2179/]) HDFS-8729. Fix TestFileTruncate#testTruncateWithDataNodesRestartImmediately which occasionally failed. Contributed by Walter Su. (jing9: rev f4ca530c1cc9ece25c5ef01f99a94eb9e678e890) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14622297#comment-14622297 ] Hudson commented on HDFS-8729: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2198 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2198/]) HDFS-8729. Fix TestFileTruncate#testTruncateWithDataNodesRestartImmediately which occasionally failed. Contributed by Walter Su. (jing9: rev f4ca530c1cc9ece25c5ef01f99a94eb9e678e890) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14622432#comment-14622432 ] Hudson commented on HDFS-8729: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #250 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/250/]) HDFS-8729. Fix TestFileTruncate#testTruncateWithDataNodesRestartImmediately which occasionally failed. Contributed by Walter Su. (jing9: rev f4ca530c1cc9ece25c5ef01f99a94eb9e678e890) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620219#comment-14620219 ] Hadoop QA commented on HDFS-8729: -
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 6m 0s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 59s | There were no new javac warning messages. |
| {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 36s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 2m 36s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 1m 6s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 159m 20s | Tests failed in hadoop-hdfs. |
| | | 180m 4s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12744426/HDFS-8729.02.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 63d0365 |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11643/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11643/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11643/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11643/console |

This message was automatically generated.
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14620257#comment-14620257 ] Walter Su commented on HDFS-8729: - Findbugs has been broken recently. The timed-out test is not related.
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621250#comment-14621250 ] Hudson commented on HDFS-8729: -- FAILURE: Integrated in Hadoop-trunk-Commit #8142 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8142/]) HDFS-8729. Fix TestFileTruncate#testTruncateWithDataNodesRestartImmediately which occasionally failed. Contributed by Walter Su. (jing9: rev f4ca530c1cc9ece25c5ef01f99a94eb9e678e890) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14621182#comment-14621182 ] Jing Zhao commented on HDFS-8729: -
bq. That's not what I do. In the 01 patch, checkBlockRecovery(p) will make sure truncation is completed. triggerBlockReports() is for second time blockReport.
Ahh, I missed the location of the new triggerBlockReports call. +1 for the 02 patch. I will commit it shortly.
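The ordering discussed in the quoted reply (confirm that truncate recovery has completed, then trigger the second round of block reports, then wait for the replica count) can be sketched with a generic polling helper. This is a minimal, self-contained sketch under assumptions, not the Hadoop test code: `WaitSketch`, `waitFor`, and the two flags are hypothetical stand-ins for `checkBlockRecovery(p)`, `triggerBlockReports()` and `DFSTestUtil.waitReplication`, whose real implementations are not shown in this thread.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names are Hadoop APIs.
final class WaitSketch {

  /** Polls the condition until it holds, or throws TimeoutException once timeoutMs has elapsed. */
  static void waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs, String what)
      throws TimeoutException, InterruptedException {
    final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("Timed out waiting for " + what);
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-ins for cluster state (assumptions for illustration only).
    AtomicBoolean recoveryDone = new AtomicBoolean(false);
    AtomicBoolean fullyReplicated = new AtomicBoolean(false);

    // Simulate truncate recovery finishing in the background.
    Thread recovery = new Thread(() -> recoveryDone.set(true));
    recovery.start();

    // Step 1: block until recovery has completed.
    waitFor(recoveryDone::get, 10, 2_000, "block recovery");
    // Step 2: only now "trigger" the second block report; in this toy it just flips a flag.
    fullyReplicated.set(true);
    // Step 3: wait for the target replica count.
    waitFor(fullyReplicated::get, 10, 2_000, "3 replicas");
    System.out.println("replication reached");
  }
}
```

The point of the sketch is only the sequencing: the second report is issued after the recovery check returns, so it cannot race with an unfinished truncate.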
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14619035#comment-14619035 ] Jing Zhao commented on HDFS-8729: - Whether or not we trigger block reports before restarting the DataNodes may exercise different code paths: if the DNs send reports to the NN before restarting, it is very possible that the truncate completes before the restart. Otherwise the recovery process may happen after the DNs restart. In these two scenarios the block replicas reported by the DNs, and the block info stored in the NN, can be in different states when the restarted DNs send their first block reports to the NN. In my test it looks like the cause of the timeout is a race in the block recovery process: the second DN sends its block report after the block truncation has finished, so its replica is marked as corrupt. However, the replication monitor cannot schedule an extra replica because there are only 3 DataNodes in the test. So maybe a quick fix is to change the total number of DNs from 3 to 4. What do you think, Walter?
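The arithmetic behind this analysis can be illustrated with a toy model. It is a sketch under stated assumptions, not NameNode logic: `ReplicaRaceModel` and its method are hypothetical, and the model assumes that a DataNode already holding a replica of the block (live or corrupt) is not an eligible target for a new one, which is how the comment above explains the stuck replication.

```java
// Toy model only; it does not use or reproduce any Hadoop class.
final class ReplicaRaceModel {

  /**
   * Returns the number of live replicas after one re-replication pass.
   *
   * @param totalDataNodes    DataNodes in the (mini) cluster
   * @param targetReplication desired replica count for the block
   * @param staleReports      replicas marked corrupt because their report arrived
   *                          after the truncate had already finished
   */
  static int liveReplicasAfterRecovery(int totalDataNodes, int targetReplication, int staleReports) {
    if (totalDataNodes < targetReplication || staleReports < 0 || staleReports > targetReplication) {
      throw new IllegalArgumentException("inconsistent model parameters");
    }
    int holders = targetReplication;            // nodes that already store the block
    int live = holders - staleReports;          // replicas the NameNode still trusts
    int freeNodes = totalDataNodes - holders;   // assumed to be the only eligible targets
    int scheduled = Math.min(freeNodes, targetReplication - live);
    return live + scheduled;
  }

  public static void main(String[] args) {
    // 3 DataNodes, one stale report: no free node, so the block stays at 2 live replicas.
    System.out.println("3 DNs: " + liveReplicasAfterRecovery(3, 3, 1));
    // 4 DataNodes, one stale report: the spare node can take the missing replica.
    System.out.println("4 DNs: " + liveReplicasAfterRecovery(4, 3, 1));
  }
}
```

Under these assumptions the 3-node case never reaches 3 live replicas, which matches the reported timeout, while the 4-node case does. This models only the quick fix suggested in the comment.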
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618088#comment-14618088 ] Hadoop QA commented on HDFS-8729: -
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 8m 30s | Pre-patch trunk has 1 extant Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 45s | There were no new javac warning messages. |
| {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 2m 20s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 22s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 1m 20s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 158m 22s | Tests failed in hadoop-hdfs. |
| | | 184m 6s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12744123/HDFS-8729.01.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / c9dd2ca |
| Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/11616/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11616/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11616/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11616/console |

This message was automatically generated.
[jira] [Commented] (HDFS-8729) Fix testTruncateWithDataNodesRestartImmediately occasionally failed
[ https://issues.apache.org/jira/browse/HDFS-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618114#comment-14618114 ] Walter Su commented on HDFS-8729: - The patch passes many times locally. The tests failed because of HDFS-8642. I'll trigger Jenkins again after HDFS-8642 is fixed.