[ 
https://issues.apache.org/jira/browse/HDFS-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712854#comment-14712854
 ] 

Masatake Iwasaki commented on HDFS-8945:
----------------------------------------

Thanks for the comments, [~andrew.wang]!


bq. For 4+ replicas, since we've already guaranteed multi-rack with the first 
3, I thought the 4th+ are just pure random.

{{BlockPlacementPolicyDefault#isGoodDatanode}} checks that the number of 
replicas in the same rack is under the limit given by 
{{BlockPlacementPolicyDefault#getMaxNodesPerRack}} (which was added by 
HDFS-2576).

{code}
int maxNodesPerRack = (totalNumOfReplicas-1)/numOfRacks + 2;
{code}

The limit prevents the remaining replicas from all being allocated in the same rack.
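As a rough illustration of how the limit behaves, here is a minimal standalone sketch that evaluates only the formula quoted above. The class name {{RackLimit}} and the sample inputs are hypothetical, and the real {{getMaxNodesPerRack}} may apply further adjustments that are not shown here.

```java
// Hypothetical helper, not part of Hadoop: evaluates only the quoted formula.
class RackLimit {
    // Integer division truncates, so the limit grows in steps as replicas increase.
    static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
        return (totalNumOfReplicas - 1) / numOfRacks + 2;
    }

    public static void main(String[] args) {
        System.out.println(maxNodesPerRack(3, 3));   // prints 2
        System.out.println(maxNodesPerRack(4, 2));   // prints 3
        System.out.println(maxNodesPerRack(10, 5));  // prints 3
    }
}
```

With 10 replicas over 5 racks, for example, no rack may hold more than 3 replicas under this formula, so placement of the 4th and later replicas is still constrained by rack.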


In addition, an experiment using the code of {{TestDefaultBlockPlacementPolicy}} 
showed me that setting the replication factor to the total number of nodes in the 
cluster does not always result in replicas being located on all nodes. 

I changed the number of nodes in the mini cluster to 9. /RACK0 has 6 nodes, /RACK2 
has 2, and /RACK3 has 1.

{code}
    final String[] racks = { "/RACK0", "/RACK0", "/RACK2", "/RACK3", "/RACK2",
        "/RACK0", "/RACK0", "/RACK0", "/RACK0" };
    final String[] hosts = { "/host0", "/host1", "/host2", "/host3", "/host4",
        "/host5", "/host6", "/host7", "/host8" };
{code}

When I added code to create a file with replication factor 9, I always got 
only 7 replicas, located as shown below, because maxNodesPerRack is 4 in this case 
((9-1)/3 + 2 = 4) and /RACK0 can therefore hold at most 4 of them. Admittedly, 
this is an unusual case in which nodes are not evenly distributed among racks.

{noformat}
/RACK0
/RACK0
/RACK0
/RACK0
/RACK2
/RACK2
/RACK3
{noformat}
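The count of 7 can be reproduced without a mini cluster. The sketch below is not the block placement code itself; it only computes an upper bound, assuming each rack contributes at most min(nodes in rack, maxNodesPerRack) replicas and using the formula quoted earlier. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Hadoop code: upper bound on replicas under the per-rack limit.
class PlacementBound {
    static int maxPlaceable(String[] racks, int replication) {
        // Count how many nodes sit in each rack.
        Map<String, Integer> nodesPerRack = new LinkedHashMap<>();
        for (String rack : racks) {
            nodesPerRack.merge(rack, 1, Integer::sum);
        }
        // Per-rack limit, using the formula quoted from getMaxNodesPerRack.
        int maxNodesPerRack = (replication - 1) / nodesPerRack.size() + 2;
        // Each rack can hold at most one replica per node, capped by the limit.
        int total = 0;
        for (int nodes : nodesPerRack.values()) {
            total += Math.min(nodes, maxNodesPerRack);
        }
        return Math.min(total, replication);
    }

    public static void main(String[] args) {
        String[] racks = { "/RACK0", "/RACK0", "/RACK2", "/RACK3", "/RACK2",
            "/RACK0", "/RACK0", "/RACK0", "/RACK0" };
        System.out.println(maxPlaceable(racks, 9)); // prints 7 (4 + 2 + 1)
    }
}
```

/RACK0 is capped at 4 of its 6 nodes, while /RACK2 and /RACK3 are limited by their node counts (2 and 1), which matches the 7 locations listed above.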


> Update the description about replica placement in HDFS Architecture 
> documentation
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-8945
>                 URL: https://issues.apache.org/jira/browse/HDFS-8945
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Minor
>         Attachments: HDFS-8945.001.patch
>
>
> The description about replica placement should have
> * Explanation about storage types and storage policies should be added
> * placement policy for replication factor greater than 4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
