[ 
https://issues.apache.org/jira/browse/HDFS-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024741#comment-17024741
 ] 

Íñigo Goiri edited comment on HDFS-15144 at 1/27/20 11:20 PM:
--------------------------------------------------------------

I have to say this is a very counterintuitive piece of code.
I think it might be worth clarifying this a little, either by adding comments or 
by making the restart method not move nodes back into the list in a different order.
Maybe even add a method that restarts all the datanodes.
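
To make the ordering concern concrete, here is a minimal toy model. It is not the real MiniDFSCluster code: {{ToyCluster}}, its fields, and the {{restartDataNodes()}} helper are hypothetical names, and the only behavior assumed is the one described above, namely that restarting the node at index i removes it from the list and adds it back at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for a test cluster; NOT the real MiniDFSCluster. */
class ToyCluster {
  private final List<String> dataNodes;
  private final List<String> restartLog = new ArrayList<>();

  ToyCluster(String... names) {
    this.dataNodes = new ArrayList<>(Arrays.asList(names));
  }

  /**
   * Assumed behavior: the node at index i is removed from the list,
   * "restarted", and appended at the end, so the list order changes.
   */
  void restartDataNode(int i) {
    String dn = dataNodes.remove(i);
    restartLog.add(dn);
    dataNodes.add(dn);
  }

  /** Hypothetical helper: restart every node exactly once. */
  void restartDataNodes() {
    int n = dataNodes.size();
    for (int k = 0; k < n; k++) {
      // Index 0 always holds a node that has not been restarted yet,
      // because restarted nodes are moved to the end of the list.
      restartDataNode(0);
    }
  }

  List<String> getRestartLog() {
    return restartLog;
  }

  List<String> getDataNodes() {
    return dataNodes;
  }

  public static void main(String[] args) {
    ToyCluster byZero = new ToyCluster("DN-A", "DN-B", "DN-C");
    for (int i = 0; i < 3; i++) {
      byZero.restartDataNode(0);
    }
    System.out.println("index 0, three times: " + byZero.getRestartLog());

    ToyCluster byIndex = new ToyCluster("DN-A", "DN-B", "DN-C");
    for (int i = 0; i < 3; i++) {
      byIndex.restartDataNode(i);
    }
    System.out.println("index i, i = 0..2:    " + byIndex.getRestartLog());
  }
}
```

Under that assumption, restarting index 0 three times touches DN-A, DN-B, and DN-C once each, while looping over i restarts DN-A, then DN-C twice, and never DN-B. Whether the real restart method reorders its list this way should be checked against the actual code before relying on this sketch.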


was (Author: elgoiri):
I have to say this is a very counterintuitive piece of code.
I think it might be worth clarifying this a little, either by adding comments or 
by making the restart method not move nodes back into the list in a different order.

> TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed is incorrect
> -----------------------------------------------------------------------
>
>                 Key: HDFS-15144
>                 URL: https://issues.apache.org/jira/browse/HDFS-15144
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Ahmed Hussein
>            Assignee: Ahmed Hussein
>            Priority: Minor
>         Attachments: 2020-01-24-09-30-TestBlockStatsMXBean-output.txt, 
> HDFS-15144.001.patch
>
>
> {{TestBlockStatsMXBean#testStorageTypeStatsWhenStorageFailed}} loops three 
> times to restart Datanodes. However, the code restarts DN-0 three times.
> As a result, the JUnit test does not really execute the scenario it was supposed 
> to.
> {code:java}
>     DataNodeTestUtils.restoreDataDirFromFailure(dn1ArcVol1);
>     DataNodeTestUtils.restoreDataDirFromFailure(dn2ArcVol1);
>     DataNodeTestUtils.restoreDataDirFromFailure(dn3ArcVol1);
>     for (int i = 0; i < 3; i++) {
>       cluster.restartDataNode(0, true);
>     }
>     // wait for heartbeat
>     Thread.sleep(6000);
>     storageTypeStatsMap = cluster.getNamesystem().getBlockManager()
>         .getStorageTypeStats();
>     storageTypeStats = storageTypeStatsMap.get(StorageType.RAM_DISK);
>     assertEquals(6, storageTypeStats.getNodesInService());
> {code}
> When I changed the loop body to {{cluster.restartDataNode(i, true)}}, 
> the test failed with the stack trace below. I suspect that one of the 
> datanodes does not start properly after calling restart.
> {code:bash}
> [INFO] Running org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 28.805 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
> [ERROR] testStorageTypeStatsWhenStorageFailed(org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean)  Time elapsed: 17.682 s  <<< FAILURE!
> java.lang.AssertionError: expected:<6> but was:<5>
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.failNotEquals(Assert.java:834)
>       at org.junit.Assert.assertEquals(Assert.java:645)
>       at org.junit.Assert.assertEquals(Assert.java:631)
>       at org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean.testStorageTypeStatsWhenStorageFailed(TestBlockStatsMXBean.java:213)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>       at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.lang.Thread.run(Thread.java:748)
> {code}


