[ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762553#comment-13762553
 ] 

Junping Du commented on HDFS-5168:
----------------------------------

IMO, this is a more generic scope effort as proposing storage group as an 
orthogonal failure group that across previous network location. I would 
recommend you to start from a proposal for everyone to discuss. Also, 
delivering a demo patch would help others understanding better. Thoughts? 
Thanks again for working on this. Nikola!
                
> BlockPlacementPolicy does not work for cross rack/node group dependencies
> -------------------------------------------------------------------------
>
>                 Key: HDFS-5168
>                 URL: https://issues.apache.org/jira/browse/HDFS-5168
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Nikola Vujic
>            Assignee: Nikola Vujic
>            Priority: Critical
>
> Block placement policies do not work for cross rack/node group dependencies. 
> In reality this is needed when compute servers and storage fall in two 
> independent fault domains, then both BlockPlacementPolicyDefault and 
> BlockPlacementPolicyWithNodeGroup are not able to provide proper block 
> placement.
> Let's suppose that we have Hadoop cluster with one rack with two servers, and 
> we run 2 VMs per server. Node group topology for this cluster would be:
>  server1-vm1 -> /d1/r1/n1
>  server1-vm2 -> /d1/r1/n1
>  server2-vm1 -> /d1/r1/n2
>  server2-vm2 -> /d1/r1/n2
> This is working fine as long as server and storage fall into the same fault 
> domain but if storage is in a different fault domain from the server, we will 
> not be able to handle that. For example, if storage of server1-vm1 is in the 
> same fault domain as storage of server2-vm1, then we must not place two 
> replicas on these two nodes although they are in different node groups.
> Two possible approaches:
> - One approach would be to define cross rack/node group dependencies and to 
> use them when excluding nodes from the search space. This looks as the 
> cleanest way to fix this as it requires minor changes in the 
> BlockPlacementPolicy classes.
> - Other approach would be to allow nodes to fall in more than one node group. 
> When we chose a node to hold a replica we have to exclude from the search 
> space all nodes from the node groups where the chosen node belongs. This 
> approach may require major changes in the NetworkTopology.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to