[ https://issues.apache.org/jira/browse/HDFS-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712854#comment-14712854 ]
Masatake Iwasaki commented on HDFS-8945:
----------------------------------------

Thanks for the comments, [~andrew.wang]!

bq. For 4+ replicas, since we've already guaranteed multi-rack with the first 3, I thought the 4th+ are just pure random.

{{BlockPlacementPolicyDefault#isGoodDatanode}} checks that the number of replicas in the same rack stays under the limit given by {{BlockPlacementPolicyDefault#getMaxNodesPerRack}} (which was added by HDFS-2576).

{code}
int maxNodesPerRack = (totalNumOfReplicas-1)/numOfRacks + 2;
{code}

This limit prevents the remaining replicas from all being allocated in the same rack.

In addition, an experiment using the code of {{TestDefaultBlockPlacementPolicy}} showed me that setting the replication factor to the total number of nodes in the cluster does not always result in replicas being placed on all nodes. I changed the number of nodes in the mini cluster to 9: /RACK0 has 6 nodes, /RACK2 has 2, and /RACK3 has 1.

{code}
final String[] racks = { "/RACK0", "/RACK0", "/RACK2", "/RACK3", "/RACK2", "/RACK0", "/RACK0", "/RACK0", "/RACK0" };
final String[] hosts = { "/host0", "/host1", "/host2", "/host3", "/host4", "/host5", "/host6", "/host7", "/host8" };
{code}

When I added code to create a file with replication factor 9, I always got 7 replicas located as below, because maxNodesPerRack is 4 in this case ((9-1)/3 + 2 = 4 with integer division). This is an unusual case, though, in which the nodes are not evenly distributed among racks.

{noformat}
/RACK0 /RACK0 /RACK0 /RACK0 /RACK2 /RACK2 /RACK3
{noformat}

> Update the description about replica placement in HDFS Architecture documentation
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-8945
>                 URL: https://issues.apache.org/jira/browse/HDFS-8945
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Minor
>         Attachments: HDFS-8945.001.patch
>
>
> The description about replica placement should have
> * Explanation about storage types and storage policies should be added
> * placement policy for replication factor greater than 4

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
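
The 7-replica result reported in the comment can be reproduced with simple arithmetic. The sketch below is a simplified model, not the Hadoop implementation: it applies only the {{maxNodesPerRack}} formula quoted in the comment and assumes the per-rack limit and the number of nodes in each rack are the only constraints. The class name {{MaxNodesPerRackSketch}} and the method {{placeableReplicas}} are hypothetical names introduced for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model (assumption): ignores everything in BlockPlacementPolicyDefault
// except the per-rack limit formula quoted in the comment.
public class MaxNodesPerRackSketch {

    // Formula quoted in the comment; relies on Java integer division.
    static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
        return (totalNumOfReplicas - 1) / numOfRacks + 2;
    }

    // Upper bound on placeable replicas: each rack contributes at most
    // min(nodes in rack, maxNodesPerRack), capped by the requested replication.
    static int placeableReplicas(Map<String, Integer> nodesPerRack, int replication) {
        int limit = maxNodesPerRack(replication, nodesPerRack.size());
        int total = 0;
        for (int nodes : nodesPerRack.values()) {
            total += Math.min(nodes, limit);
        }
        return Math.min(total, replication);
    }

    public static void main(String[] args) {
        // Rack layout from the experiment: 6 + 2 + 1 = 9 nodes.
        Map<String, Integer> nodesPerRack = new LinkedHashMap<>();
        nodesPerRack.put("/RACK0", 6);
        nodesPerRack.put("/RACK2", 2);
        nodesPerRack.put("/RACK3", 1);

        System.out.println("maxNodesPerRack = " + maxNodesPerRack(9, nodesPerRack.size()));
        System.out.println("placeable replicas = " + placeableReplicas(nodesPerRack, 9));
    }
}
```

Under these assumptions /RACK0 is capped at 4 replicas while /RACK2 and /RACK3 are limited by their node counts (2 and 1), which matches the placement shown in the {noformat} output.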