[
https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407242#comment-15407242
]
Rakesh R edited comment on HDFS-10434 at 8/4/16 6:14 AM:
---------------------------------------------------------
I think I got the cause of the failure. The test is wrongly picking the datanode
to corrupt. Instead of finding a datanode that is used in the block locations,
it simply indexes into the cluster's datanode list, which may return a datanode
that is not present in the block locations.
{code}
byte[] indices = lastBlock.getBlockIndices();
// corrupt the first block
// BUG: indices[0] is an index within the block group, not an index into
// cluster.getDataNodes(), so this may pick an unrelated datanode
DataNode toCorruptDn = cluster.getDataNodes().get(indices[0]);
{code}
For example, the datanodes in the {{cluster.getDataNodes()}} list are indexed like:
0->Dn1, 1->Dn2, 2->Dn3, 3->Dn4, 4->Dn5, 5->Dn6, 6->Dn7, 7->Dn8, 8->Dn9, 9->Dn10
Assume the datanodes which are part of the block locations are => Dn2, Dn3, Dn4,
Dn5, Dn6, Dn7, Dn8, Dn9, Dn10. In the failing scenario, the test picks
{{cluster.getDataNodes().get(0)}} as the datanode to corrupt, which is Dn1, and
corrupting Dn1 will not trigger any ECWork, so the assertion fails.
Ideally, the test should pick the datanode to corrupt from the block locations,
as sketched below.
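A minimal sketch of that direction, assuming the standard
{{LocatedBlock#getLocations()}} and {{DataNode#getDatanodeId()}} APIs; this is
only an illustration, not the actual HDFS-10720 patch, which may differ:
{code}
// Illustrative sketch only; lastBlock and cluster come from the test setup.
// Pick a datanode that actually stores an internal block of the group.
DatanodeInfo target = lastBlock.getLocations()[0];
DataNode toCorruptDn = null;
for (DataNode dn : cluster.getDataNodes()) {
  // match by datanode ID rather than by position in the cluster list
  if (dn.getDatanodeId().equals(target)) {
    toCorruptDn = dn;
    break;
  }
}
Assert.assertNotNull("Datanode from the block locations not found", toCorruptDn);
{code}
Corrupting a datanode found this way guarantees it holds part of the striped
block, so the namenode will schedule EC reconstruction work.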
Basically, there are two problems in this test case. The first one was fixed as
part of this jira. Since the second is unrelated to the first, I have raised a
separate jira, HDFS-10720, to fix it. Please review the HDFS-10720 fix. Thanks!
> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -----------------------------------------------------------------
>
> Key: HDFS-10434
> URL: https://issues.apache.org/jira/browse/HDFS-10434
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Rakesh R
> Assignee: Rakesh R
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10434-00.patch, HDFS-10434-01.patch
>
>
> This jira is to fix the test case failure.
> Reference :
> [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
> at org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}