[
https://issues.apache.org/jira/browse/HDFS-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895550#comment-16895550
]
Wei-Chiu Chuang commented on HDFS-13709:
----------------------------------------
[~zhangchen] what is the version of Hadoop you're using?
When a client reads a block, it verifies the data against the checksum. If the
verification fails, it reports the corrupt replica to the NameNode, and the
NameNode schedules a replacement.
If a block is never read by a client, however, this kind of corruption can go
undetected.
I thought we already verified checksums during block transfer, but I was wrong.
Here's the code in {{DataNode#transferBlock}}:
{code:java}
if (replicaNotExist || replicaStateNotFinalized) {
  String errStr = "Can't send invalid block " + block;
  LOG.info(errStr);
  bpos.trySendErrorReport(DatanodeProtocol.INVALID_BLOCK, errStr);
  return;
}
if (blockFileNotExist) {
  // Report back to NN bad block caused by non-existent block file.
  reportBadBlock(bpos, block, "Can't replicate block " + block
      + " because the block file doesn't exist, or is not accessible");
  return;
}
if (lengthTooShort) {
  // Check if NN recorded length matches on-disk length
  // Shorter on-disk len indicates corruption so report NN the corrupt block
  reportBadBlock(bpos, block, "Can't replicate block " + block
      + " because on-disk length " + data.getLength(block)
      + " is shorter than NameNode recorded length " + block.getNumBytes());
  return;
}
{code}
We only report bad blocks when the replica is missing or its on-disk length is
shorter than the NameNode-recorded length; we never verify the checksum. I'm
not sure why. Perhaps there is a concern about the computational overhead?
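For context, verifying a checksum here would amount to recomputing each data chunk's checksum from the on-disk bytes and comparing it with the value stored in the block's .meta file. A minimal standalone sketch of that comparison, using {{java.util.zip.CRC32}} (HDFS itself uses its own checksum machinery; the class and method names below are hypothetical, not actual DataNode code):

```java
import java.util.zip.CRC32;

public class ChecksumSketch {
    // Hypothetical helper: recompute the CRC of a data chunk and compare it
    // with the checksum that was stored for that chunk (e.g. in a .meta file).
    static boolean chunkMatches(byte[] chunk, long storedChecksum) {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, chunk.length);
        return crc.getValue() == storedChecksum;
    }

    public static void main(String[] args) {
        byte[] chunk = "block data".getBytes();

        // Compute the "stored" checksum over the original, healthy data.
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, chunk.length);
        long stored = crc.getValue();

        System.out.println(chunkMatches(chunk, stored)); // true

        // Simulate a flipped bit, e.g. from a bad disk track. CRC32 detects
        // all single-bit errors, so the comparison must now fail.
        chunk[0] ^= 1;
        System.out.println(chunkMatches(chunk, stored)); // false
    }
}
```

A per-chunk comparison like this is cheap relative to disk I/O, which is why the overhead concern above may not be decisive.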
> Report bad block to NN when transfer block encounter EIO exception
> ------------------------------------------------------------------
>
> Key: HDFS-13709
> URL: https://issues.apache.org/jira/browse/HDFS-13709
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Chen Zhang
> Assignee: Chen Zhang
> Priority: Major
> Attachments: HDFS-13709.patch
>
>
> In our online cluster, the BlockPoolSliceScanner is turned off, and sometimes
> a bad disk track may cause data loss.
> For example, suppose there are 3 replicas on 3 machines A/B/C. If a bad track
> corrupts A's replica data, and someday B and C crash at the same time, the NN
> will try to replicate the data from A but fail. The block is effectively
> corrupt now, but nobody knows, because the NN thinks there is still at least
> 1 healthy replica and keeps trying to replicate it.
> When reading a replica whose data lies on a bad track, the OS returns an EIO
> error. If the DN reported the block as bad as soon as it got an EIO, we could
> detect this case ASAP and avoid data loss.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)