[ 
https://issues.apache.org/jira/browse/HDFS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742916#action_12742916
 ] 

Jitendra Nath Pandey commented on HDFS-15:
------------------------------------------

The proposal is as follows. The first 3 points are the same as the approach
suggested in the first comment, with one slight change: blocks that are not
sufficiently replicated take priority over blocks that have the required
number of replicas but violate the rack requirement.
  1. Both under-replicated blocks and blocks that do not satisfy the rack 
requirement should be included in the neededReplication queue.
  2. The neededReplication queue should have 4 priorities:
      Priority 0: Blocks that have only one replica.
      Priority 1: Blocks whose number of replicas is no greater than 1/3 of 
their replication factor.
      Priority 2: All other blocks that do not have the required number of 
replicas.
      Priority 3: Blocks that have the required number of replicas or more, 
but with all of them on the same rack.
  3. In the methods addStoredBlock, removeStoredBlock, startDecomission, and 
markBlockAsCorrupt in FSNamesystem, put both under-replicated blocks and 
single-rack blocks into the neededReplication queue. In addition, the 
replicator will create one more replica for blocks that are on only one rack 
but are not under-replicated.
  4. If a block is in priority 3 of the neededReplication queue and 
ReplicationTargetChooser is unable to find a location for a replica that meets 
the rack requirement, do not schedule a replication. Instead, keep the block 
in the same queue.
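
To make points 2 and 4 concrete, here is a minimal sketch of how the priority
could be computed. It is not the actual FSNamesystem code: the class name, the
method names, and the NOT_QUEUED value are hypothetical. It also assumes that
blocks with zero live replicas are handled elsewhere (the proposal does not
mention them) and that the rack requirement only applies when the replication
factor is greater than one, as stated in the issue description.

```java
// Hypothetical sketch of the proposed priority scheme, not the HDFS code.
class ReplicationPriority {
    // Assumption: returned when the block does not belong in the queue.
    static final int NOT_QUEUED = -1;

    /**
     * @param liveReplicas     number of live replicas of the block
     * @param expectedReplicas replication factor of the block
     * @param racks            number of distinct racks holding live replicas
     */
    static int getPriority(int liveReplicas, int expectedReplicas, int racks) {
        if (liveReplicas <= 0) {
            // Assumption: zero-replica blocks are outside this proposal.
            return NOT_QUEUED;
        }
        if (liveReplicas < expectedReplicas) {
            if (liveReplicas == 1) {
                return 0; // only one replica
            }
            if (liveReplicas * 3 <= expectedReplicas) {
                return 1; // no more than 1/3 of the replication factor
            }
            return 2;     // any other under-replicated block
        }
        if (expectedReplicas > 1 && racks == 1) {
            return 3;     // enough replicas, but all on the same rack
        }
        return NOT_QUEUED;
    }

    /**
     * Point 4: a priority 3 block is only scheduled when a target on
     * another rack was found; otherwise it stays in the queue.
     */
    static boolean shouldSchedule(int priority, boolean foundOffRackTarget) {
        if (priority == NOT_QUEUED) {
            return false;
        }
        return priority != 3 || foundOffRackTarget;
    }
}
```

Checking under-replication first is what gives under-replicated blocks
precedence over single-rack blocks, which is the change from the first
comment. The sketch ignores the case where the cluster itself has only one
rack.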

> All replicas of a block end up on only 1 rack
> ---------------------------------------------
>
>                 Key: HDFS-15
>                 URL: https://issues.apache.org/jira/browse/HDFS-15
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Hairong Kuang
>            Assignee: Jitendra Nath Pandey
>            Priority: Critical
>
> The HDFS replica placement strategy guarantees that the replicas of a block 
> exist on at least two racks when its replication factor is greater than one. 
> But fsck still reports that the replicas of some blocks end up on one rack.
> The cause of the problem is that decommission and corruption handling only 
> check the block's replication factor but not the rack requirement. When an 
> over-replicated block loses a replica due to decommission, corruption, or 
> a lost heartbeat, the namenode does not take any action to guarantee that 
> the remaining replicas are on different racks.
>  

