[ https://issues.apache.org/jira/browse/HDFS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742916#action_12742916 ]
Jitendra Nath Pandey commented on HDFS-15:
------------------------------------------
The proposal is as follows. The first three points are the same as the approach
suggested in the first comment, with one small change: blocks that are not
sufficiently replicated take higher priority than blocks that have the required
number of replicas but violate the rack requirement.
1. Both under-replicated blocks and blocks that do not satisfy the rack
requirement should be included in the neededReplication queue.
2. The neededReplication queue should have 4 priorities:
Priority 0: Blocks that have only one replica.
Priority 1: Blocks whose number of replicas is no greater than 1/3 of their
replication factor.
Priority 2: All other blocks that do not have the required number of
replicas.
Priority 3: Blocks that have the required number of replicas or more, but
with all of them on the same rack.
3. In the methods addStoredBlock, removeStoredBlock, startDecommission, and
markBlockAsCorrupt in FSNamesystem, put both under-replicated blocks and
single-rack blocks into the neededReplication queue. In addition, the
replicator will create one more replica for blocks that are on only one rack
but are not under-replicated.
4. If a block is in priority 3 of the neededReplication queue and
ReplicationTargetChooser is unable to find a location for a replica that meets
the rack requirement, do not schedule a replication; instead, keep the block in
the same queue.
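
To make points 2 and 4 concrete, here is a rough sketch of how the priority
could be computed. This is not the actual patch: the class
ReplicationPrioritySketch, the methods getPriority and shouldSchedule, and the
NOT_QUEUED constant are hypothetical names used only for illustration. It
assumes the rack requirement applies only when the replication factor is
greater than one, as stated in the issue description, and it leaves blocks with
no live replicas out of the queue because the proposal does not cover them.

```java
/** Illustrative sketch only; all names here are hypothetical, not Hadoop APIs. */
final class ReplicationPrioritySketch {
    /** Returned when a block does not belong in the neededReplication queue. */
    static final int NOT_QUEUED = -1;

    /**
     * Maps a block's state to one of the four proposed priorities.
     *
     * @param liveReplicas     number of usable replicas of the block
     * @param expectedReplicas the block's replication factor
     * @param racks            number of distinct racks holding those replicas
     */
    static int getPriority(int liveReplicas, int expectedReplicas, int racks) {
        if (liveReplicas <= 0) {
            // Assumption: the proposal does not say how to rank such blocks.
            return NOT_QUEUED;
        }
        if (liveReplicas >= expectedReplicas) {
            // Enough replicas: queue only if the rack requirement is violated.
            boolean rackViolation = expectedReplicas > 1 && racks <= 1;
            return rackViolation ? 3 : NOT_QUEUED;
        }
        // Under-replicated blocks always rank ahead of rack violations.
        if (liveReplicas == 1) {
            return 0;
        }
        if (liveReplicas * 3 <= expectedReplicas) {
            return 1; // no greater than 1/3 of the replication factor
        }
        return 2;
    }

    /**
     * Point 4: a priority 3 block is replicated only when a target on a
     * different rack was found; otherwise it stays in the queue.
     */
    static boolean shouldSchedule(int priority, boolean otherRackTargetFound) {
        return priority != 3 || otherRackTargetFound;
    }

    public static void main(String[] args) {
        // Three replicas on one rack with replication factor 3.
        System.out.println(getPriority(3, 3, 1));
        // No target on another rack, so the block is not scheduled.
        System.out.println(shouldSchedule(3, false));
    }
}
```

Checking for under-replication before the rack check is what gives
under-replicated blocks precedence: a block with 2 of 3 replicas on a single
rack lands in priority 2, not priority 3.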
> All replicas of a block end up on only 1 rack
> ---------------------------------------------
>
> Key: HDFS-15
> URL: https://issues.apache.org/jira/browse/HDFS-15
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Hairong Kuang
> Assignee: Jitendra Nath Pandey
> Priority: Critical
>
> The HDFS replica placement strategy guarantees that the replicas of a block
> exist on at least two racks when its replication factor is greater than one.
> But fsck still reports that the replicas of some blocks end up on one rack.
> The cause of the problem is that decommission and corruption handling only
> check the block's replication factor but not the rack requirement. When an
> over-replicated block loses a replica due to decommission, corruption, or
> a lost heartbeat, the namenode does not take any action to guarantee that the
> remaining replicas are on different racks.
>