[ https://issues.apache.org/jira/browse/HADOOP-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Boudnik updated HADOOP-5734:
---------------------------------------

    Assignee: Konstantin Boudnik

> HDFS architecture documentation describes outdated placement policy
> -------------------------------------------------------------------
>
>                 Key: HADOOP-5734
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5734
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 0.20.0
>            Reporter: Konstantin Boudnik
>            Assignee: Konstantin Boudnik
>            Priority: Minor
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-5734.patch
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> The "Replica Placement: The First Baby Steps" section of the HDFS architecture
> document states:
> "...
> For the common case, when the replication factor is three, HDFS's placement
> policy is to put one replica on one node in the local rack, another on a
> different node in the local rack, and the last on a different node in a
> different rack. This policy cuts the inter-rack write traffic which generally
> improves write performance.
> ..."
> However, according to the code of ReplicationTargetChooser.chooseTarget(), the
> actual logic is to place both the second and the third replicas on a different
> (remote) rack. So two replicas end up on different nodes of the remote rack,
> and one (the initial replica) on a node in the local rack. Thus, the sentence
> should say something like this:
> "For the common case, when the replication factor is three, HDFS's placement
> policy is to put one replica on one node in the local rack, another on a node
> in a different (remote) rack, and the last on a different node in the same
> remote rack. This policy cuts the inter-rack write traffic which generally
> improves write performance."

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
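For illustration, a minimal Java sketch of the corrected policy described above. This is not the actual ReplicationTargetChooser.chooseTarget() implementation; the chooseTargets() helper and the rack/node names are hypothetical, and the real chooser additionally handles concerns such as node availability, excluded nodes, and fallback cases.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only -- not the Hadoop ReplicationTargetChooser code.
// Models the corrected policy for replication factor 3: first replica on a
// node in the local rack, second on a node in a different (remote) rack,
// third on another node in that same remote rack.
public class PlacementSketch {

    static List<String> chooseTargets(String localRack,
                                      Map<String, List<String>> rackToNodes) {
        List<String> targets = new ArrayList<>();

        // Replica 1: a node in the writer's local rack.
        targets.add(rackToNodes.get(localRack).get(0));

        // Replica 2: a node in some other (remote) rack.
        String remoteRack = rackToNodes.keySet().stream()
                .filter(r -> !r.equals(localRack))
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("need a remote rack"));
        List<String> remoteNodes = rackToNodes.get(remoteRack);
        targets.add(remoteNodes.get(0));

        // Replica 3: a different node in that same remote rack.
        targets.add(remoteNodes.get(1));

        return targets;
    }

    public static void main(String[] args) {
        // Hypothetical two-rack topology.
        Map<String, List<String>> topology = Map.of(
                "/rackA", List.of("nodeA1", "nodeA2"),
                "/rackB", List.of("nodeB1", "nodeB2"));
        // Expected shape: one local-rack node plus two distinct nodes on one
        // remote rack, e.g. [nodeA1, nodeB1, nodeB2].
        System.out.println(chooseTargets("/rackA", topology));
    }
}
{code}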