[ https://issues.apache.org/jira/browse/HDFS-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15380107#comment-15380107 ]
Yongjun Zhang commented on HDFS-10627:
--------------------------------------
Thanks guys for the work here!
When block corruption (a checksum error) is detected during a pipeline write,
or during the block transfer of pipeline recovery, I hope a checksum exception
can be thrown and delivered back to the sender instead of just disconnecting.
Is that totally infeasible here?
> Volume Scanner mark a block as "suspect" even if the block sender encounters
> 'Broken pipe' or 'Connection reset by peer' exception
> ----------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-10627
> URL: https://issues.apache.org/jira/browse/HDFS-10627
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 2.7.0
> Reporter: Rushabh S Shah
> Assignee: Rushabh S Shah
> Attachments: HDFS-10627.patch
>
>
> In the BlockSender code,
> {code:title=BlockSender.java|borderStyle=solid}
> if (!ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection reset")) {
>   LOG.error("BlockSender.sendChunks() exception: ", e);
> }
> datanode.getBlockScanner().markSuspectBlock(
>     volumeRef.getVolume().getStorageID(),
>     block);
> {code}
> Before HDFS-7686, the block was marked as suspect only if the exception
> message did not start with "Broken pipe" or "Connection reset".
> After HDFS-7686, the block is marked as suspect irrespective of the
> exception message.
> On one of our datanodes, it took approximately a whole day (22 hours) to go
> through all the suspect blocks before the one corrupt block was scanned.
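
For illustration, the behavior described as preceding HDFS-7686 can be
sketched roughly as below. This is a standalone sketch, not the BlockSender
code or the attached patch: the class SuspectBlockFilter and the helpers
isPeerDisconnect and shouldMarkSuspect are hypothetical names, and the real
code would call markSuspectBlock on the block scanner where this sketch only
returns a boolean. Treating a null exception message as "still suspect" is
also an assumption.

```java
import java.io.IOException;

public class SuspectBlockFilter {

    // Hypothetical helper: true when the exception message suggests the
    // reading peer went away, rather than a problem with the stored block.
    static boolean isPeerDisconnect(IOException e) {
        String ioem = e.getMessage();
        if (ioem == null) {
            // Assumption: with no message we cannot rule out a disk problem.
            return false;
        }
        return ioem.startsWith("Broken pipe")
            || ioem.startsWith("Connection reset");
    }

    // Hypothetical helper: decide whether the block should be handed to the
    // scanner as suspect. Only non-disconnect errors qualify.
    static boolean shouldMarkSuspect(IOException e) {
        return !isPeerDisconnect(e);
    }

    public static void main(String[] args) {
        String[] messages = {
            "Broken pipe",
            "Connection reset by peer",
            "Input/output error"
        };
        for (String m : messages) {
            System.out.println(m + " -> markSuspect="
                + shouldMarkSuspect(new IOException(m)));
        }
    }
}
```

Keeping the check in one predicate means the logging decision and the
suspect-marking decision cannot drift apart, which is the divergence the
description above points at.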