[ https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926819#comment-15926819 ]

Hudson commented on HDFS-11419:
-------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11410 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11410/])
HDFS-11419. DFSTopologyNodeImpl#chooseRandom optimizations. Contributed (arp: 
rev 615ac09499dc0b391cbb99bb0e9877959a9173a6)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/net/TestDFSNetworkTopology.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/net/DFSTopologyNodeImpl.java


> BlockPlacementPolicyDefault is choosing datanode in an inefficient way
> ----------------------------------------------------------------------
>
>                 Key: HDFS-11419
>                 URL: https://issues.apache.org/jira/browse/HDFS-11419
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>
> Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
> calling into {{chooseRandom}}, which first finds a random datanode by calling
> {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, 
> excludedNodes);{code}
> and then checks whether the returned datanode satisfies the storage type 
> requirement:
> {code}storage = chooseStorage4Block(
>               chosenNode, blocksize, results, entry.getKey());{code}
> If it does, {{numOfReplicas--;}}; otherwise, the node is added to the 
> excluded nodes, and the loop runs again until {{numOfReplicas}} is down to 0.
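[Editor's note] The loop quoted above can be sketched as follows. This is a minimal, self-contained illustration, not the actual HDFS source: `chooseDataNode` and `chooseStorage4Block` are hypothetical stand-ins, and the cluster sizes are made up to show how few iterations actually make progress.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class ChooseRandomSketch {
    static final Random RAND = new Random(42);
    static final int CLUSTER_SIZE = 1000;
    static final int NODES_WITH_TYPE = 5; // only node ids 0..4 satisfy the storage type

    // Stand-in for chooseDataNode(scope, excludedNodes): a uniformly
    // random node id that is not yet excluded.
    static int chooseDataNode(Set<Integer> excluded) {
        int node;
        do {
            node = RAND.nextInt(CLUSTER_SIZE);
        } while (excluded.contains(node));
        return node;
    }

    // Stand-in for chooseStorage4Block: true iff the node has the
    // required storage type.
    static boolean chooseStorage4Block(int node) {
        return node < NODES_WITH_TYPE;
    }

    public static void main(String[] args) {
        Set<Integer> excluded = new HashSet<>();
        int numOfReplicas = 3;
        int iterations = 0;
        while (numOfReplicas > 0) {
            iterations++;
            int chosenNode = chooseDataNode(excluded);
            if (chooseStorage4Block(chosenNode)) {
                numOfReplicas--;      // a suitable storage was found
            }
            excluded.add(chosenNode); // never retry the same node
        }
        // Most iterations land on nodes of the wrong storage type and are wasted.
        System.out.println("iterations = " + iterations);
    }
}
```

With 5 suitable nodes out of 1000, finding 3 replicas takes on the order of hundreds of iterations, which is the inefficiency the issue describes.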
> A problem here is that the storage type is not considered until after a 
> random node has already been returned. We've seen a case where a cluster 
> has a large number of datanodes, but only a few satisfy the storage type 
> condition. So, for the most part, this code blindly picks random datanodes 
> that do not satisfy the storage type requirement.
> To make matters worse, the way {{NetworkTopology#chooseRandom}} works is 
> that, given a set of excluded nodes, it first finds a random datanode, and 
> if that node is in the excluded set, it tries another random node. So the 
> more excluded nodes there are, the more likely a random node will be in the 
> excluded set, in which case we have basically wasted one iteration.
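[Editor's note] The retry behavior described above is rejection sampling against a growing exclusion set. A small sketch of the assumed behavior (not the HDFS source) that counts how many attempts one pick costs when most of the cluster is already excluded:

```java
import java.util.BitSet;
import java.util.Random;

public class RejectionSamplingSketch {
    // Pick one node not in the excluded set by rejection sampling, the
    // strategy we assume NetworkTopology#chooseRandom uses; returns
    // {chosenNode, attempts}.
    static int[] pickNode(BitSet excluded, int clusterSize, Random rand) {
        int attempts = 0;
        int node;
        do {
            attempts++;
            node = rand.nextInt(clusterSize);
        } while (excluded.get(node));
        return new int[] {node, attempts};
    }

    public static void main(String[] args) {
        int clusterSize = 1000;
        BitSet excluded = new BitSet(clusterSize);
        excluded.set(0, 990); // 99% of the cluster already excluded

        int[] result = pickNode(excluded, clusterSize, new Random(0));
        // With a fraction p of nodes excluded, the expected number of
        // attempts is 1 / (1 - p); here p = 0.99, so about 100 attempts.
        System.out.println("node = " + result[0] + ", attempts = " + result[1]);
    }
}
```

The cost of one pick therefore grows as the exclusion set grows, compounding the wasted iterations from the wrong-storage-type rejections.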
> Therefore, this JIRA proposes to augment/modify the relevant classes so 
> that datanodes can be found more efficiently. There are currently two 
> different high-level solutions we are considering:
> 1. Add field(s) to the Node base types describing the storage type info; 
> when searching for a node, take those field(s) into account and do not 
> return a node that does not meet the storage type requirement.
> 2. Change the {{NetworkTopology}} class to be aware of storage types, e.g. 
> for each storage type there is a tree subset connecting all the nodes with 
> that type, and a search happens on only that subset, so unsuitable storage 
> types are simply not in the search space.
> Thanks [~szetszwo] for the offline discussion, and thanks [~linyiqun] for 
> pointing out a wrong statement (corrected now) in the description. Any 
> further comments are more than welcome.
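[Editor's note] Option 2 in the quoted description can be sketched as follows. This is an assumed design, not the committed code: the class, the enum values, and the flat per-type lists (standing in for per-type tree subsets) are all hypothetical, chosen to show why a storage-type-aware search wastes no iterations.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class StorageTypeAwareTopologySketch {
    enum StorageType { DISK, SSD, ARCHIVE }

    // One candidate collection per storage type, standing in for the
    // per-type tree subsets described in the issue.
    private final Map<StorageType, List<String>> nodesByType =
        new EnumMap<>(StorageType.class);
    private final Random rand = new Random();

    void addNode(String node, StorageType type) {
        nodesByType.computeIfAbsent(type, t -> new ArrayList<>()).add(node);
    }

    // Every candidate already satisfies the storage type requirement,
    // so no iteration is wasted on unsuitable nodes.
    String chooseRandom(StorageType type) {
        List<String> candidates = nodesByType.get(type);
        if (candidates == null || candidates.isEmpty()) {
            return null;
        }
        return candidates.get(rand.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        StorageTypeAwareTopologySketch topo = new StorageTypeAwareTopologySketch();
        topo.addNode("dn1", StorageType.DISK);
        topo.addNode("dn2", StorageType.SSD);
        topo.addNode("dn3", StorageType.SSD);
        System.out.println(topo.chooseRandom(StorageType.SSD)); // dn2 or dn3
    }
}
```

In this shape a pick is O(1) regardless of how many nodes of other storage types exist, at the cost of maintaining the per-type structures as nodes join, leave, or change storage.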



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
