[ https://issues.apache.org/jira/browse/HADOOP-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473569 ]
Milind Bhandarkar commented on HADOOP-972:
------------------------------------------

A few comments:

The Replicator class needs to move out of FSNamesystem, with the clusterMap supplied to it at construction. It adds about 450 lines of code to FSNamesystem.java, which is already quite big, and that hurts readability.

We should rename that class ReplicaChooser or something similar. The name Replicator creates a false impression.

> Improve the rack-aware replica placement performance
> ----------------------------------------------------
>
>                 Key: HADOOP-972
>                 URL: https://issues.apache.org/jira/browse/HADOOP-972
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.11.0
>            Reporter: Hairong Kuang
>         Assigned To: Hairong Kuang
>             Fix For: 0.12.0
>
>         Attachments: rack_performance.patch
>
>
> This issue aims to improve the rack-aware replica placement performance. A
> major idea is to avoid constructing lists of possible targets for random
> selection in chooseTarget, which currently requires iterating over all
> DatanodeDescriptors. I plan to change the NetworkTopology data structure as
> follows:
> 1. each InnerNode stores its children as a list;
> 2. each InnerNode adds a new field, numberOfLeaves, the total number of leaves
> (i.e. data nodes) in its subtree.
> NetworkTopology will support two new methods:
> 1. DatanodeDescriptor chooseRandom(String scope): randomly chooses one
> leaf from scope.
> 2. DatanodeDescriptor chooseRandomExclude(String excludedScope): randomly
> chooses one leaf outside excludedScope (i.e. from ~excludedScope).
> In addition, Issue 971 will also help improve the performance of the
> rack-aware DFS patch.
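
The counting idea in the quoted description can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in rack_performance.patch: TopologySketch, Node, contains and getLeaf are hypothetical names, scopes are passed as node references rather than the String paths in the proposal, and leaves are plain nodes rather than DatanodeDescriptor objects. The point it tries to show is that a cached numberOfLeaves lets a random leaf be picked by index, so no candidate list has to be built.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Simplified, hypothetical stand-in for the proposed topology tree. */
class TopologySketch {

    /** Any node in the tree; a plain Node is a leaf (stand-in for a data node). */
    static class Node {
        final String name;
        InnerNode parent;

        Node(String name) { this.name = name; }

        int numberOfLeaves() { return 1; }

        /** True if this node is {@code other} or one of its ancestors. */
        boolean contains(Node other) {
            for (Node n = other; n != null; n = n.parent) {
                if (n == this) return true;
            }
            return false;
        }
    }

    /** A rack or data center: children kept in a list, plus a cached leaf count. */
    static class InnerNode extends Node {
        final List<Node> children = new ArrayList<>();
        private int numberOfLeaves = 0;

        InnerNode(String name) { super(name); }

        @Override
        int numberOfLeaves() { return numberOfLeaves; }

        /** Adds a child and updates the cached count on this node and every ancestor. */
        void add(Node child) {
            children.add(child);
            child.parent = this;
            int added = child.numberOfLeaves();
            for (InnerNode n = this; n != null; n = n.parent) {
                n.numberOfLeaves += added;
            }
        }

        /** Returns the index-th leaf of this subtree, skipping leaves under excluded (may be null). */
        Node getLeaf(int index, Node excluded) {
            if (index < 0) throw new IndexOutOfBoundsException("negative leaf index");
            for (Node child : children) {
                int n = child.numberOfLeaves();
                if (excluded != null && child.contains(excluded)) {
                    n -= excluded.numberOfLeaves(); // leaves under excluded do not count
                }
                if (index < n) {
                    return (child instanceof InnerNode)
                        ? ((InnerNode) child).getLeaf(index, excluded)
                        : child;
                }
                index -= n; // skip this child's leaves and move on
            }
            throw new IndexOutOfBoundsException("leaf index out of range");
        }
    }

    /** Picks a uniformly random leaf under scope, or null if scope has no leaves. */
    static Node chooseRandom(InnerNode scope, Random rnd) {
        return chooseRandomExclude(scope, null, rnd);
    }

    /** Picks a uniformly random leaf under root that is not under excluded, or null if none. */
    static Node chooseRandomExclude(InnerNode root, Node excluded, Random rnd) {
        int candidates = root.numberOfLeaves();
        if (excluded != null && root.contains(excluded)) {
            candidates -= excluded.numberOfLeaves();
        }
        if (candidates <= 0) return null;
        return root.getLeaf(rnd.nextInt(candidates), excluded);
    }
}
```

In this sketch each selection costs one walk from the scope down to a leaf, scanning the child list at each level, instead of a pass over every data node. Keeping numberOfLeaves correct on removal would need the mirror image of add, which is omitted here.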