[ https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949752#comment-14949752 ]
Tsz Wo Nicholas Sze commented on HDFS-9205:
-------------------------------------------
Thanks Zhe for the comments.
> ... those blocks won't be re-replicated, even though
> chooseUnderReplicatedBlocks returns them? Or they are re-replicated in the
> current logic, but they should not be (IIUC that's the case)?
Those blocks have zero replicas, so it is impossible to replicate them.
(Let's ignore read-only storage here since it is an incomplete feature.)
> ... But is there a use case for an admin to list corrupt blocks and reason
> about them by accessing the local blk_ (and metadata) files? ...
This patch does not prevent that.
> If we do want to save the replication work for corrupt blocks, should we get
> rid of QUEUE_WITH_CORRUPT_BLOCKS altogether?
The block priority could possibly be updated.
> Do not schedule corrupt blocks for replication
> ----------------------------------------------
>
> Key: HDFS-9205
> URL: https://issues.apache.org/jira/browse/HDFS-9205
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Tsz Wo Nicholas Sze
> Priority: Minor
> Attachments: h9205_20151007.patch, h9205_20151007b.patch,
> h9205_20151008.patch
>
>
> Corrupted blocks are, by definition, blocks that cannot be read. As a
> consequence, they cannot be replicated. In UnderReplicatedBlocks, there is a
> queue for QUEUE_WITH_CORRUPT_BLOCKS, and chooseUnderReplicatedBlocks may choose
> blocks from it. Scheduling corrupted blocks for replication seems to waste
> resources and may slow down replication of the higher-priority blocks.
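
The idea in the description above can be illustrated with a minimal Java sketch. This is not the actual UnderReplicatedBlocks code or the attached patch: the class name ReplicationQueueSketch, the method chooseBlocks, the use of String block IDs, and the assumption that the corrupt queue is the last of five priority levels are all hypothetical simplifications made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of priority queues of blocks awaiting replication. */
class ReplicationQueueSketch {
    // Assumed layout: five priority levels, with the corrupt queue last (lowest priority).
    static final int LEVEL = 5;
    static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

    private final List<List<String>> queues = new ArrayList<>();

    ReplicationQueueSketch() {
        for (int i = 0; i < LEVEL; i++) {
            queues.add(new ArrayList<>());
        }
    }

    /** Adds a block ID to the queue for the given priority (0 is the highest). */
    void add(String blockId, int priority) {
        queues.get(priority).add(blockId);
    }

    /**
     * Picks up to max blocks in priority order, skipping the corrupt queue,
     * since a block with no readable replica has nothing to copy from.
     */
    List<String> chooseBlocks(int max) {
        List<String> chosen = new ArrayList<>();
        for (int p = 0; p < LEVEL && chosen.size() < max; p++) {
            if (p == QUEUE_WITH_CORRUPT_BLOCKS) {
                continue; // do not spend replication work on corrupt blocks
            }
            for (String blockId : queues.get(p)) {
                if (chosen.size() >= max) {
                    break;
                }
                chosen.add(blockId);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        ReplicationQueueSketch q = new ReplicationQueueSketch();
        q.add("blk_1", 2);
        q.add("blk_2", QUEUE_WITH_CORRUPT_BLOCKS);
        q.add("blk_3", 0);
        System.out.println(q.chooseBlocks(10)); // prints [blk_3, blk_1]
    }
}
```

In this sketch the corrupt queue is still kept, so corrupt blocks remain tracked and listable; they are simply never handed to the scheduler.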