[ https://issues.apache.org/jira/browse/HDFS-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273764#comment-14273764 ]
Nathan Roberts commented on HDFS-7548:
--------------------------------------
- I think we need to prioritize a scan for that block.
- Also, some comments on addBlockToFirstLocation().
- IMO, the WARN message should be logged at INFO instead.
- If this block has been scanned in the last 5 minutes (or some other reasonable
time frame), then maybe we shouldn't add it back to the list of blocks to be
scanned. If every IOException is going to re-prioritize the scan of a block,
a minimum delay between scans would avoid corner cases where a network
glitch or badly behaved clients cause IOExceptions that don't really
warrant a rescan.
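
To make the minimum-delay suggestion concrete, here is a rough sketch of the check I have in mind. RescanThrottle, recordScan() and shouldRequeue() are hypothetical names, not part of the attached patch or of the existing scanner code, and the 5-minute value is only the example figure from above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical helper; not part of the HDFS-7548 patch. */
class RescanThrottle {
    private final long minDelayMillis;

    // block id -> time (ms) at which the block was last scanned
    private final ConcurrentMap<Long, Long> lastScanMillis =
        new ConcurrentHashMap<>();

    RescanThrottle(long minDelayMillis) {
        this.minDelayMillis = minDelayMillis;
    }

    /** Remember when a block was last scanned. */
    void recordScan(long blockId, long nowMillis) {
        lastScanMillis.put(blockId, nowMillis);
    }

    /**
     * Returns true if an IOException on this block should put it back
     * at the front of the scan list, i.e. the block has never been
     * scanned or the minimum delay has already elapsed.
     */
    boolean shouldRequeue(long blockId, long nowMillis) {
        Long last = lastScanMillis.get(blockId);
        return last == null || nowMillis - last >= minDelayMillis;
    }
}
```

The caller would consult shouldRequeue() before re-prioritizing the block, so a burst of IOExceptions from a flaky client would trigger at most one rescan per interval.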
> Corrupt block reporting delayed until datablock scanner thread detects it
> -------------------------------------------------------------------------
>
> Key: HDFS-7548
> URL: https://issues.apache.org/jira/browse/HDFS-7548
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.5.0
> Reporter: Rushabh S Shah
> Assignee: Rushabh S Shah
> Attachments: HDFS-7548.patch
>
>
> When only one datanode holds a block and that block happens to be
> corrupt, the namenode keeps trying to replicate the block, but the block is
> reported as corrupt only when the datanode's data block scanner
> thread picks up the bad block.
> Requesting an improvement in namenode reporting so that a corrupt replica is
> reported when there is only 1 replica and replication of that replica
> keeps failing with a checksum error.