[
https://issues.apache.org/jira/browse/HDFS-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546654#comment-15546654
]
Manoj Govindassamy commented on HDFS-10960:
-------------------------------------------
Looking at the code, removing a volume on the DataNode can interrupt a
{{BlockReceiver}}. If the {{BlockReceiver}} is in the middle of an IO operation,
such as flushing or setting the channel position for the new checksum, that
operation can throw an IOException. On getting an IOException,
{{BlockReceiver}} starts a thread to check for disk errors.
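
As a standalone illustration of that mechanism (plain JDK code, not the Hadoop code path; the class and method names are made up for this sketch), a write on an interruptible channel fails with an IOException subclass when the writing thread has been interrupted:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptedWriteDemo {

  /**
   * Writes to a temp file from a thread whose interrupt flag is set.
   * Returns the IOException raised by the write, or null if none was raised.
   */
  static IOException writeWhileInterrupted() throws IOException {
    Path tmp = Files.createTempFile("interrupt-demo", ".bin");
    try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
      // Stand-in for the interrupt delivered to the writer thread
      // (an assumption of this sketch, not the DataNode's actual call).
      Thread.currentThread().interrupt();
      try {
        ch.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
        return null;
      } catch (ClosedByInterruptException e) {
        // ClosedByInterruptException is a subclass of IOException.
        return e;
      } finally {
        // Clear the interrupt flag so cleanup below is unaffected.
        Thread.interrupted();
      }
    } finally {
      Files.deleteIfExists(tmp);
    }
  }

  public static void main(String[] args) throws IOException {
    IOException e = writeWhileInterrupted();
    System.out.println(e == null ? "no exception" : e.getClass().getName());
  }
}
```

A caller that treats every IOException as a possible disk failure cannot tell this case apart from a genuinely bad disk without inspecting the exception type.
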
TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails its verification
if the DataNode has ever started a disk error check thread. This verification
adds little value, since the test already verifies the block replication
factor. So the proposal is to replace it with a verification that the volume
removal succeeded and that the replication factor of the block caught up
after the volume removal.
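
A rough sketch of the shape such a replacement verification could take. It is self-contained and uses hypothetical names throughout (VolumeRemovalCheck, waitFor, verifyAfterRemoval, and the volume and replica suppliers are not Hadoop APIs); the real test would query the DataNode and NameNode instead:

```java
import java.util.Collection;
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

public class VolumeRemovalCheck {

  /** Polls the condition until it holds or the timeout elapses. */
  static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (check.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      Thread.sleep(intervalMs);
    }
  }

  /**
   * Hypothetical verification: the removed volume is no longer active, and
   * the live replica count reaches the expected factor within the timeout.
   */
  static boolean verifyAfterRemoval(Collection<String> activeVolumes,
      String removedVolume, IntSupplier liveReplicas, int expectedReplication,
      long timeoutMs) throws InterruptedException {
    if (activeVolumes.contains(removedVolume)) {
      return false; // the volume removal did not take effect
    }
    // Replication may lag behind the removal, so poll instead of asserting once.
    return waitFor(() -> liveReplicas.getAsInt() >= expectedReplication,
        10, timeoutMs);
  }
}
```

Unlike the disk error check timestamp, both conditions hold whether or not an interrupted write triggered a disk check along the way.
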
> TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails at disk error
> verification after volume remove
> ------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-10960
> URL: https://issues.apache.org/jira/browse/HDFS-10960
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.0.0-alpha2
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Priority: Minor
>
> TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails occasionally in
> the following verification.
> {code}
> 700 // If an IOException thrown from BlockReceiver#run, it triggers
> 701 // DataNode#checkDiskError(). So we can test whether checkDiskError() is called,
> 702 // to see whether there is IOException in BlockReceiver#run().
> 703 assertEquals(lastTimeDiskErrorCheck, dn.getLastDiskErrorCheck());
> 704
> {code}
> {noformat}
> Error Message
> expected:<0> but was:<6498109>
> Stacktrace
> java.lang.AssertionError: expected:<0> but was:<6498109>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWrittenForDatanode(TestDataNodeHotSwapVolumes.java:703)
> at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten(TestDataNodeHotSwapVolumes.java:620)
> {noformat}