[ https://issues.apache.org/jira/browse/HADOOP-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784467#comment-16784467 ]
He Xiaoqiao commented on HADOOP-16161:
--------------------------------------

I would like to add some more detail on this issue. Below is the complete #getWeightUsingNetworkLocation code (used only for clients that are not datanodes), based on branch trunk. It works as follows:

a. Normalize the network locations of both the reader and the datanode. The results, readerPath and nodePath, are *rack locations*, i.e. the parent of the reader and the parent of the datanode. Both are calculated by the rack-awareness script if one is configured.
b. Split both network locations by slash and take the smaller of the two depths as the maximum level to compare.
c. Find the deepest node that is a common ancestor of the two network locations from step a.
d. Based on step c, calculate the topology distance between readerPath and nodePath.

All of the above steps are correct, but the result is the distance between the parent of the reader and the parent of the node, rather than the distance from the reader to the node. So I think adding +2 can fix this issue. Discussion is welcome, and please correct me if anything here is wrong.

{code:java}
private static int getWeightUsingNetworkLocation(Node reader, Node node) {
  //Start off by initializing to Integer.MAX_VALUE
  int weight = Integer.MAX_VALUE;
  if(reader != null && node != null) {
    String readerPath = normalizeNetworkLocationPath(
        reader.getNetworkLocation());
    String nodePath = normalizeNetworkLocationPath(
        node.getNetworkLocation());

    //same rack
    if(readerPath.equals(nodePath)) {
      if(reader.getName().equals(node.getName())) {
        weight = 0;
      } else {
        weight = 2;
      }
    } else {
      String[] readerPathToken = readerPath.split(PATH_SEPARATOR_STR);
      String[] nodePathToken = nodePath.split(PATH_SEPARATOR_STR);
      int maxLevelToCompare = readerPathToken.length > nodePathToken.length ?
          nodePathToken.length : readerPathToken.length;
      int currentLevel = 1;
      //traverse through the path and calculate the distance
      while(currentLevel < maxLevelToCompare) {
        if(!readerPathToken[currentLevel]
            .equals(nodePathToken[currentLevel])){
          break;
        }
        currentLevel++;
      }
      weight = (readerPathToken.length - currentLevel) +
          (nodePathToken.length - currentLevel);
    }
  }
  return weight;
}
{code}

> NetworkTopology#getWeightUsingNetworkLocation return unexpected result
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-16161
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16161
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: net
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
>         Attachments: HADOOP-16161.001.patch
>
>
> Consider the following scenario:
> 1. There are 4 slaves, with a topology like:
> Rack: /IDC/RACK1
>    hostname1
>    hostname2
> Rack: /IDC/RACK2
>    hostname3
>    hostname4
> 2. The reader is on hostname1. The weights between the reader and
> [hostname1, hostname3, hostname4], calculated by #getWeight, are [0,4,4].
> 3. The reader is a client that is not in the topology; it is in the same IDC
> but in none of the topology's racks. The weights between the reader and
> [hostname1, hostname3, hostname4], calculated by
> #getWeightUsingNetworkLocation, are [2,2,2].
> 4. Other readers get similar results.
> The weight result for case #3 is clearly not the expected value; the correct
> result is [4,4,4]. This issue may cause the reader not to follow the intended
> order: local -> local rack -> remote rack.
> After digging into the implementation, the root cause is that
> #getWeightUsingNetworkLocation only calculates the distance between racks
> rather than between hosts.
> I think we should add the constant 2 to correct the weight returned by
> #getWeightUsingNetworkLocation.
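
To make case #3 concrete, here is a minimal stand-alone sketch of the calculation. It is not the Hadoop class and not the attached patch: the class name TopologyWeightSketch, the method names, and the client rack /IDC/RACK-CLIENT are all hypothetical, and the paths are assumed to be already normalized. It compares the different-rack weight as computed on trunk with the same weight plus the proposed constant 2.

```java
// Hypothetical stand-alone sketch; this is NOT org.apache.hadoop.net.NetworkTopology.
final class TopologyWeightSketch {
  private static final String PATH_SEPARATOR_STR = "/";

  // Distance between two rack paths (steps b to d of the comment above).
  // Paths are assumed to be normalized already, e.g. "/IDC/RACK1".
  static int rackDistance(String readerRack, String nodeRack) {
    String[] readerTokens = readerRack.split(PATH_SEPARATOR_STR);
    String[] nodeTokens = nodeRack.split(PATH_SEPARATOR_STR);
    int maxLevel = Math.min(readerTokens.length, nodeTokens.length);
    int level = 1; // index 0 is the empty token before the leading slash
    while (level < maxLevel && readerTokens[level].equals(nodeTokens[level])) {
      level++;
    }
    return (readerTokens.length - level) + (nodeTokens.length - level);
  }

  // Weight as computed by the trunk code quoted above.
  static int currentWeight(String readerName, String readerRack,
                           String nodeName, String nodeRack) {
    if (readerRack.equals(nodeRack)) {
      return readerName.equals(nodeName) ? 0 : 2;
    }
    return rackDistance(readerRack, nodeRack);
  }

  // Weight with the proposed correction: add 2 for the two host-to-rack hops.
  static int proposedWeight(String readerName, String readerRack,
                            String nodeName, String nodeRack) {
    if (readerRack.equals(nodeRack)) {
      return readerName.equals(nodeName) ? 0 : 2;
    }
    return rackDistance(readerRack, nodeRack) + 2;
  }

  public static void main(String[] args) {
    String[] nodes = {"hostname1", "hostname3", "hostname4"};
    String[] racks = {"/IDC/RACK1", "/IDC/RACK2", "/IDC/RACK2"};
    // Hypothetical off-cluster client in the same IDC but in no known rack.
    String clientRack = "/IDC/RACK-CLIENT";
    for (int i = 0; i < nodes.length; i++) {
      System.out.println(nodes[i]
          + " current=" + currentWeight("client", clientRack, nodes[i], racks[i])
          + " proposed=" + proposedWeight("client", clientRack, nodes[i], racks[i]));
    }
  }
}
```

For the topology in the issue description, the sketch gives 2 for every node with the trunk logic and 4 with the added constant, which matches the [2,2,2] and [4,4,4] values discussed above. The same-rack branch is left unchanged, so a reader in a known rack still sees 0 for itself and 2 for a rack-local node.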