[ https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296157#comment-15296157 ]
Rakesh R commented on HDFS-10434:
---------------------------------
Thanks [~libo-intel]. We update the DN metrics in the finally block of the
{{StripedReconstructor}} thread, as shown below. The failure occurs because
{{StripedFileTestUtil.waitForReconstructionFinished()}} waits only for the
block recovery to complete; it does not wait for the
{{StripedReconstructor#run()}} finally block to finish executing. To see this,
you can debug the failed test case
{{TestDataNodeErasureCodingMetrics#testEcTasks}} with a breakpoint in the
finally block: {{StripedFileTestUtil.waitForReconstructionFinished(file, fs, GROUPSIZE);}}
returns while the thread is still paused there, and the test case then fails.
To fix this, I added a grace period so that the thread gets a chance to execute
the finally block and update the metrics data.
StripedReconstructor.java
{code}
} finally {
  datanode.decrementXmitsInProgress();
  datanode.getMetrics().incrECReconstructionTasks();
  stripedReader.close();
  stripedWriter.close();
}
{code}
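The race described above can be reproduced outside HDFS with a small, self-contained sketch. This is not the attached patch: the class {{GracePeriodSketch}}, the helper {{waitFor}}, the latch, and the 50 ms delay are all hypothetical stand-ins chosen for illustration. The worker signals "recovery finished" before its finally block bumps the counter, and the test side polls the counter with a bounded grace period instead of asserting immediately.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

public class GracePeriodSketch {

  /**
   * Hypothetical helper: polls {@code check} until it returns true or the
   * grace period expires. Returns whether the condition was met.
   */
  static boolean waitFor(BooleanSupplier check, long pollMillis, long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // grace period used up
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  /** Simulates the worker/test interaction; returns true if the metric reached 1. */
  static boolean runScenario() {
    AtomicLong ecReconstructionTasks = new AtomicLong(); // stand-in for the DN metric
    CountDownLatch blocksRecovered = new CountDownLatch(1); // stand-in for "recovery done"

    Thread worker = new Thread(() -> {
      try {
        blocksRecovered.countDown(); // what the test-side wait observes
      } finally {
        try {
          Thread.sleep(50); // assumed delay that widens the race window
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        ecReconstructionTasks.incrementAndGet(); // metric updated only here
      }
    });
    worker.start();

    try {
      blocksRecovered.await(); // returns before the finally block has completed
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    // Asserting on the counter right here could still see 0.
    // Polling gives the worker a bounded grace period to finish.
    return waitFor(() -> ecReconstructionTasks.get() == 1, 10, 5000);
  }

  public static void main(String[] args) {
    System.out.println("metric updated within grace period: " + runScenario());
  }
}
```

Polling with a deadline is only one way to implement a grace period; a fixed sleep before the assertion is simpler but either slows every run or stays flaky when the delay is too short.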
> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -----------------------------------------------------------------
>
> Key: HDFS-10434
> URL: https://issues.apache.org/jira/browse/HDFS-10434
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Rakesh R
> Assignee: Rakesh R
> Attachments: HDFS-10434-00.patch
>
>
> This jira is to fix the test case failure.
> Reference :
> [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
> at org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)