[ 
https://issues.apache.org/jira/browse/HADOOP-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784467#comment-16784467
 ] 

He Xiaoqiao commented on HADOOP-16161:
--------------------------------------

I would like to add more detail on this issue. Below is the complete 
#getWeightUsingNetworkLocation code from trunk (used only when the client is 
not a datanode). It works as follows:
a. Normalize the network location of both the reader and the datanode. The 
results, readerPath and nodePath, are *rack locations*, i.e. the parent of the 
reader and the parent of the datanode. Both are resolved by the rack awareness 
script if one is configured.
b. Split both network locations on the slash and take the smaller of the two 
depths.
c. Find the deepest node that is a common ancestor of the two network 
locations from step a.
d. Based on step c, calculate the topology distance between readerPath and 
nodePath.

All of the above steps are correct, but the result is the distance between the 
parent of the reader and the parent of the node, rather than between the reader 
and the node. I think adding 2 to the result avoids this issue. Discussion is 
welcome, and please correct me if anything here is wrong. 

{code:java}
  private static int getWeightUsingNetworkLocation(Node reader, Node node) {
    //Start off by initializing to Integer.MAX_VALUE
    int weight = Integer.MAX_VALUE;
    if(reader != null && node != null) {
      String readerPath = normalizeNetworkLocationPath(
          reader.getNetworkLocation());
      String nodePath = normalizeNetworkLocationPath(
          node.getNetworkLocation());

      //same rack
      if(readerPath.equals(nodePath)) {
        if(reader.getName().equals(node.getName())) {
          weight = 0;
        } else {
          weight = 2;
        }
      } else {
        String[] readerPathToken = readerPath.split(PATH_SEPARATOR_STR);
        String[] nodePathToken = nodePath.split(PATH_SEPARATOR_STR);
        int maxLevelToCompare = readerPathToken.length > nodePathToken.length ?
            nodePathToken.length : readerPathToken.length;
        int currentLevel = 1;
        //traverse through the path and calculate the distance
        while(currentLevel < maxLevelToCompare) {
          if(!readerPathToken[currentLevel]
              .equals(nodePathToken[currentLevel])){
            break;
          }
          currentLevel++;
        }
        weight = (readerPathToken.length - currentLevel) +
            (nodePathToken.length - currentLevel);
      }
    }
    return weight;
  }
{code}
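
To make the arithmetic concrete, here is a minimal standalone sketch (not the attached patch) that mirrors the logic of the method above on plain strings and compares it with the proposed +2 adjustment. The class name, the method names, and the /IDC/RACK3 location used for the off-topology client are assumptions for illustration only; where exactly the +2 is applied in the real patch may differ.

```java
final class NetworkLocationWeightSketch {
  private static final String PATH_SEPARATOR_STR = "/";

  // Distance between two rack paths, mirroring the else-branch of the
  // method quoted above. Token 0 is the empty string before the leading "/".
  static int rackDistance(String readerRack, String nodeRack) {
    String[] readerTokens = readerRack.split(PATH_SEPARATOR_STR);
    String[] nodeTokens = nodeRack.split(PATH_SEPARATOR_STR);
    int maxLevel = Math.min(readerTokens.length, nodeTokens.length);
    int level = 1;
    while (level < maxLevel
        && readerTokens[level].equals(nodeTokens[level])) {
      level++;
    }
    return (readerTokens.length - level) + (nodeTokens.length - level);
  }

  // Behaviour of the current code: for different racks the result is the
  // rack-to-rack distance only.
  static int currentWeight(String readerRack, String readerName,
      String nodeRack, String nodeName) {
    if (readerRack.equals(nodeRack)) {
      return readerName.equals(nodeName) ? 0 : 2;
    }
    return rackDistance(readerRack, nodeRack);
  }

  // Assumed form of the proposal: add 2 in the different-rack case to
  // account for the host-to-rack hop on each side.
  static int proposedWeight(String readerRack, String readerName,
      String nodeRack, String nodeName) {
    if (readerRack.equals(nodeRack)) {
      return readerName.equals(nodeName) ? 0 : 2;
    }
    return rackDistance(readerRack, nodeRack) + 2;
  }

  public static void main(String[] args) {
    // Hypothetical client in the same IDC but in a rack outside the topology.
    String rack = "/IDC/RACK3";
    System.out.println(
        currentWeight(rack, "client", "/IDC/RACK1", "hostname1"));
    System.out.println(
        proposedWeight(rack, "client", "/IDC/RACK1", "hostname1"));
  }
}
```

Under these assumptions the current logic gives 2 for a client and a datanode in different racks of the same IDC, while the adjusted version gives 4, which matches what #getWeight returns for two hosts in different racks.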

> NetworkTopology#getWeightUsingNetworkLocation return unexpected result
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-16161
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16161
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: net
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
>         Attachments: HADOOP-16161.001.patch
>
>
> Consider the following scenario:
> 1. There are 4 slaves with a topology like:
> Rack: /IDC/RACK1
>    hostname1
>    hostname2
> Rack: /IDC/RACK2
>    hostname3
>    hostname4
> 2. The reader is on hostname1. The weights between the reader and 
> [hostname1, hostname3, hostname4] computed by #getWeight are [0,4,4].
> 3. The reader is a client that is not in the topology; it is in the same IDC 
> but in none of the topology's racks. The weights between the reader and 
> [hostname1, hostname3, hostname4] computed by #getWeightUsingNetworkLocation 
> are [2,2,2].
> 4. Other readers get similar results.
> The weights in case #3 are not the expected values; the correct result is 
> [4,4,4]. This issue may cause the reader not to follow the intended order: 
> local -> local rack -> remote rack. 
> After digging into the implementation, the root cause is that 
> #getWeightUsingNetworkLocation only calculates the distance between racks 
> rather than between hosts.
> I think we should add the constant 2 to correct the weight returned by 
> #getWeightUsingNetworkLocation.


