[ 
https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370205#comment-14370205
 ] 

Zhe Zhang commented on HDFS-7891:
---------------------------------

Great analysis [~walter.k.su]! 

It seems the 003 patch has removed {{rack2hosts}}. So the {{sortedRack}} 
results were obtained with the 002 patch, right?

Conceptually, I think the {{sortedRack}} method makes sense for EC. It 
essentially introduces two levels of random selection: first choosing a rack, 
and then choosing a node in the selected rack. This is much more efficient than 
selecting a random node from the entire cluster subject to the rack constraint. 
In particular, in a typical setup, the number of racks in the cluster should be 
close to the EC width, so the choice of racks should be easy (e.g., choosing 14 
from 15). The performance benefit should be even larger with more nodes per 
rack, e.g., 20. 
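
To make the two-level idea concrete, here is a minimal standalone sketch. It is 
not the code from any of the attached patches: the class name 
{{TwoLevelChooser}}, the method {{chooseTargets}}, and the plain 
{{Map<String, List<String>>}} standing in for {{rack2hosts}} are all 
hypothetical, and it ignores everything a real placement policy must check 
(excluded nodes, storage types, load, the writer's location). It also caps the 
result at one node per rack instead of wrapping around when the width exceeds 
the number of racks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class TwoLevelChooser {
    /**
     * Picks at most {@code width} hosts, no two from the same rack.
     * Level 1: shuffle the non-empty racks and take the first ones.
     * Level 2: pick one host uniformly at random inside each chosen rack.
     * Simplifying assumption: if width exceeds the number of non-empty
     * racks, fewer than width hosts are returned.
     */
    public static List<String> chooseTargets(
            Map<String, List<String>> rack2hosts, int width, Random rng) {
        // Collect racks that actually have hosts to choose from.
        List<String> racks = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : rack2hosts.entrySet()) {
            if (!e.getValue().isEmpty()) {
                racks.add(e.getKey());
            }
        }
        // Level 1: random rack order; cost depends on the rack count only.
        Collections.shuffle(racks, rng);
        int n = Math.min(width, racks.size());

        // Level 2: one random host per selected rack.
        List<String> targets = new ArrayList<>(n);
        for (String rack : racks.subList(0, n)) {
            List<String> hosts = rack2hosts.get(rack);
            targets.add(hosts.get(rng.nextInt(hosts.size())));
        }
        return targets;
    }
}
```

With 15 racks of 20 nodes each and a width of 14, this does one shuffle of 15 
rack names plus 14 constant-time picks. Drawing random nodes from all 300 and 
rejecting those whose rack is already used would need more and more retries as 
the free racks run out.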

So I think the question is whether we can have a simpler implementation of the 
{{sortedRack}} method, without duplicating a lot of code.

> A block placement policy with best fault tolerance
> --------------------------------------------------
>
>                 Key: HDFS-7891
>                 URL: https://issues.apache.org/jira/browse/HDFS-7891
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Walter Su
>            Assignee: Walter Su
>         Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, 
> HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt
>
>
> A block placement policy that tries its best to spread replicas across as many racks as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
