[ 
https://issues.apache.org/jira/browse/HDFS-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486614#comment-14486614
 ] 

Walter Su commented on HDFS-7613:
---------------------------------

As discussed in HDFS-7891, we think the best-rack-failure-tolerance policy may 
not be suitable for EC. Here we discuss an EC-specific policy.
As suggested by [~szetszwo], tolerating the failure of one rack is enough for 
both EC and non-EC data. Normally only a network or power failure brings a rack 
down; the rack comes back up soon, and the failure does not cause data loss. In 
short, fault tolerance for one rack is enough.
I propose a policy for EC:
||rack_0||rack_1||rack_2||rack_3||rack_4||rack_5||
|data_0|data_1|data_2|data_3|data_4|data_5|
|parity_0|parity_1|parity_2|
Data blocks spread across as many racks as they can. Each parity block is 
colocated with a data block. As [~szetszwo] said, we should allow at most 2 
replicas per rack. One difference is that we should not place data_3 and 
data_4 together on rack_3: client-side reconstruction should not slow down 
client reads. If rack_0 goes down, the client only needs to reconstruct 
data_0, while parity_0 is left for the NN to reconstruct later. But if data_3 
and data_4 shared rack_3 and that rack went down, the client would need to 
reconstruct 2 blocks before it could read them. In short, we should keep 
data_3 and data_4 on separate racks.
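To make the proposal concrete, here is a minimal sketch (not the actual HDFS BlockPlacementPolicy implementation; class and method names are hypothetical) of the layout rule for a (6, 3) EC group: each data block lands on its own rack, each parity block is colocated on a rack that already holds one data block, and no rack ever holds more than 2 blocks of the group.

```java
import java.util.*;

public class EcPlacementSketch {
    /**
     * Assign numData data blocks and numParity parity blocks to racks:
     *  - each data block on its own rack, so a single rack failure
     *    costs at most one data block;
     *  - each parity block colocated with exactly one data block;
     *  - at most 2 blocks of the group per rack.
     */
    public static Map<String, List<String>> place(List<String> racks,
                                                  int numData, int numParity) {
        if (racks.size() < numData || numParity > numData) {
            throw new IllegalArgumentException(
                "need one rack per data block and numParity <= numData");
        }
        Map<String, List<String>> layout = new LinkedHashMap<>();
        for (String rack : racks) {
            layout.put(rack, new ArrayList<>());
        }
        // Data blocks: one per rack, spread as widely as possible.
        for (int i = 0; i < numData; i++) {
            layout.get(racks.get(i)).add("data_" + i);
        }
        // Parity blocks: reuse the first numParity data racks, which
        // caps every rack at 2 blocks and never pairs two data blocks.
        for (int i = 0; i < numParity; i++) {
            layout.get(racks.get(i)).add("parity_" + i);
        }
        return layout;
    }

    public static void main(String[] args) {
        List<String> racks = Arrays.asList("rack_0", "rack_1", "rack_2",
                                           "rack_3", "rack_4", "rack_5");
        place(racks, 6, 3).forEach(
            (rack, blocks) -> System.out.println(rack + ": " + blocks));
    }
}
```

With this rule, losing any one rack costs the client at most one data-block reconstruction (plus possibly a parity block that the NN can regenerate later), which is exactly the property the comment argues for.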



> Block placement policy for erasure coding groups
> ------------------------------------------------
>
>                 Key: HDFS-7613
>                 URL: https://issues.apache.org/jira/browse/HDFS-7613
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Walter Su
>
> Blocks in an erasure coding group should be placed in different failure 
> domains -- different DataNodes at the minimum, and different racks ideally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
