[ 
https://issues.apache.org/jira/browse/HDFS-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273764#comment-14273764
 ] 

Nathan Roberts commented on HDFS-7548:
--------------------------------------

- I think we need to prioritize a scan for that block.

- Also, some comments on addBlockToFirstLocation().
  - imo, WARN should be INFO. 
  - If this block has been scanned within the last 5 minutes (or some other 
reasonable time frame), then maybe we shouldn't add it back to the list of 
blocks to be scanned. If every IOException is going to re-prioritize the scan 
of a block, a minimum delay between scans would avoid corner cases where a 
network glitch or badly behaving clients cause IOExceptions that don't really 
warrant a rescan.
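
A rough sketch of the minimum-delay idea, to make the suggestion concrete. 
This is not taken from the attached patch: the class RescanThrottle and its 
methods recordScan() and shouldRequeue() are hypothetical names, and the 
5-minute window is only the example figure from above. The caller is assumed 
to pass in the current time in milliseconds.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of HDFS: skips re-queueing recently scanned blocks. */
class RescanThrottle {
    private final long minDelayMs;
    // Block id -> time (ms) at which the block was last scanned.
    private final Map<Long, Long> lastScanMs = new HashMap<>();

    RescanThrottle(long minDelayMs) {
        this.minDelayMs = minDelayMs;
    }

    /** Record that a scan of the given block completed at nowMs. */
    synchronized void recordScan(long blockId, long nowMs) {
        lastScanMs.put(blockId, nowMs);
    }

    /**
     * Decide whether an IOException on this block should move it back to the
     * front of the scan list. Blocks never scanned, or last scanned at least
     * minDelayMs ago, qualify; recently scanned blocks do not.
     */
    synchronized boolean shouldRequeue(long blockId, long nowMs) {
        Long last = lastScanMs.get(blockId);
        return last == null || nowMs - last >= minDelayMs;
    }
}
```

With something like this, addBlockToFirstLocation() would only need to 
consult shouldRequeue() before re-prioritizing, so a burst of IOExceptions 
from one flaky client would trigger at most one rescan per window.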

> Corrupt block reporting delayed until datablock scanner thread detects it
> -------------------------------------------------------------------------
>
>                 Key: HDFS-7548
>                 URL: https://issues.apache.org/jira/browse/HDFS-7548
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HDFS-7548.patch
>
>
> When only one datanode holds a block and that block happens to be corrupt, 
> the namenode keeps trying to replicate the block, but the block is reported 
> as corrupt only when the data block scanner thread of the datanode picks up 
> this bad block.
> Requesting an improvement in namenode reporting so that a corrupt replica is 
> reported when there is only 1 replica and the replication of that replica 
> keeps failing with a checksum error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
