[
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024741#comment-17024741
]
Íñigo Goiri edited comment on HDFS-15144 at 1/27/20 11:20 PM:
--------------------------------------------------------------
I have to say this is a very counterintuitive piece of code.
I think it might be worth clarifying this a little, either by adding comments or
by making the restart method not move nodes back into the list in a different order.
Maybe even adding a method to restart all datanodes.
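
To make the ordering concern concrete, here is a minimal, self-contained sketch. It does not use the real MiniDFSCluster; it models the datanodes as a plain list and assumes, as the comment above suggests, that a restart removes the node at the given index and appends it back at the end of the list. The class name, the "DN-n" labels, and the restartAll helper are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RestartOrderSketch {

    /** Builds a toy list of node labels; a stand-in for the cluster's datanode list. */
    static List<String> newCluster(int size) {
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            nodes.add("DN-" + i);
        }
        return nodes;
    }

    /**
     * ASSUMED behaviour: the node at 'index' is removed from the list and
     * re-appended at the end. Returns the label of the node that was restarted.
     */
    static String restart(List<String> nodes, int index) {
        String node = nodes.remove(index); // List.remove(int) removes by position
        nodes.add(node);
        return node;
    }

    /** Restarts index 0 repeatedly, like the loop in the test under discussion. */
    static List<String> restartSameIndex(int clusterSize, int times) {
        List<String> nodes = newCluster(clusterSize);
        List<String> restarted = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            restarted.add(restart(nodes, 0));
        }
        return restarted;
    }

    /** Restarts index i on iteration i, the seemingly obvious alternative. */
    static List<String> restartLoopIndex(int clusterSize, int times) {
        List<String> nodes = newCluster(clusterSize);
        List<String> restarted = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            restarted.add(restart(nodes, i));
        }
        return restarted;
    }

    /**
     * Hypothetical "restart all" helper: restarting index 0 once per node
     * touches every node exactly once and leaves the list in its original order.
     */
    static List<String> restartAll(List<String> nodes) {
        List<String> restarted = new ArrayList<>();
        int size = nodes.size();
        for (int i = 0; i < size; i++) {
            restarted.add(restart(nodes, 0));
        }
        return restarted;
    }

    public static void main(String[] args) {
        System.out.println(restartSameIndex(6, 3)); // [DN-0, DN-1, DN-2]
        System.out.println(restartLoopIndex(6, 3)); // [DN-0, DN-2, DN-4]
    }
}
```

Under that assumption, restarting index 0 three times reaches three distinct nodes, while restarting index i skips every other node because the list shifts after each removal. Whether the real restart method behaves exactly this way should be checked against its source before relying on this model.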
was (Author: elgoiri):
I have to say this is a very counterintuitive piece of code.
I think it might be worth clarifying this a little, either by adding comments or
by making the restart method not move nodes back into the list in a different order.
> TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
> -----------------------------------------------------------------------
>
> Key: HDFS-15144
> URL: https://issues.apache.org/jira/browse/HDFS-15144
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Minor
> Attachments: 2020-01-24-09-30-TestBlockStatsMXBean-output.txt,
> HDFS-15144.001.patch
>
>
> {{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three
> times to restart Datanodes. However, the code restarts DN-0 three times.
> As a result, the JUnit test does not really execute the scenario it was
> supposed to.
> {code:java}
> DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
> DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
> for (int i = 0; i < 3; i++) {
>   cluster.restartDataNode(0, true);
> }
> // wait for heartbeat
> Thread.sleep(6000);
> storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
>     .getStorageTypeStats();
> storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
> assertEquals(6, storageTypeStats.getNodesInService());
> {code}
> When I changed the loop body to {{cluster.restartDataNode(i, true)}},
> the test failed with the stack trace below. I suspect that one of the
> datanodes does not start properly after calling restart.
> {code:bash}
> [INFO] Running org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 28.805 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] testStorageTypeStatsWhenStorageFailed(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean) Time elapsed: 17.682 s <<< FAILURE!
> java.lang.AssertionError: expected:<6> but was:<5>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:834)
> at org.junit.Assert.assertEquals(Assert.java:645)
> at org.junit.Assert.assertEquals(Assert.java:631)
> at org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean.testStorageTypeStatsWhenStorageFailed(TestBlockStatsMXBean.java:213)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.lang.Thread.run(Thread.java:748)
> {code}