[ 
https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-11419:
------------------------------
    Description: 
Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
calling into {{chooseRandom}}, which first finds a random datanode by calling
{code}DatanodeDescriptor chosenNode = chooseDataNode(scope, 
excludedNodes);{code}
and then checks whether the returned datanode satisfies the storage type 
requirement:
{code}storage = chooseStorage4Block(
              chosenNode, blocksize, results, entry.getKey());{code}
If it does, {{numOfReplicas--;}}; otherwise, the node is added to the excluded 
nodes and the loop runs again, until {{numOfReplicas}} is down to 0.
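The loop above can be sketched as follows. This is a minimal, deterministic stand-in (the random pick is replaced by "first non-excluded node" so the cost is reproducible); the {{Node}}, {{pickNode}} and {{chooseTargets}} names are hypothetical, not the actual HDFS code. It shows how every node with the wrong storage type still costs a full pick-check-exclude iteration:

```java
import java.util.*;

// Sketch of the selection loop described above (simplified, hypothetical
// names; NOT the actual HDFS implementation). The storage type is only
// checked AFTER a node has already been picked, so every mismatching node
// costs one full iteration before it is excluded.
public class ChooseRandomSketch {

    // In this sketch a node is just an id plus the storage type it offers.
    public record Node(int id, String storageType) {}

    // Deterministic stand-in for the random pick: return the first node
    // that is not yet excluded (order stands in for randomness here).
    static Node pickNode(List<Node> nodes, Set<Node> excluded) {
        for (Node n : nodes) {
            if (!excluded.contains(n)) return n;
        }
        return null;
    }

    // Mirrors the loop structure: pick a node first, check its storage
    // type second, then either accept it or exclude it and retry.
    static int chooseTargets(List<Node> nodes, String wantedType,
                             int numOfReplicas) {
        Set<Node> excluded = new HashSet<>();
        int iterations = 0;
        while (numOfReplicas > 0) {
            Node chosen = pickNode(nodes, excluded);
            if (chosen == null) break;      // search space exhausted
            iterations++;
            excluded.add(chosen);           // never pick the same node twice
            if (chosen.storageType().equals(wantedType)) {
                numOfReplicas--;            // node satisfies the storage type
            }
            // otherwise this iteration was wasted on a wrong-type node
        }
        return iterations;
    }

    public static void main(String[] args) {
        // 8 DISK nodes followed by 2 SSD nodes: placing 2 SSD replicas takes
        // 10 iterations, because every DISK node is tried and excluded first.
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < 8; i++) nodes.add(new Node(i, "DISK"));
        for (int i = 8; i < 10; i++) nodes.add(new Node(i, "SSD"));
        System.out.println(chooseTargets(nodes, "SSD", 2)); // prints 10
    }
}
```

In the degenerate case where only {{m}} of {{n}} nodes match, the loop can touch all {{n}} nodes to place {{m}} replicas.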

A problem here is that the storage type is not considered until after a random 
node has already been returned. We've seen a case where a cluster has a large 
number of datanodes, but only a few satisfy the storage type condition. So, 
for the most part, this code blindly picks random datanodes that do not 
satisfy the storage type requirement.

To make matters worse, the way {{NetworkTopology#chooseRandom}} works is that, 
given a set of excluded nodes, it first finds a random datanode, and if that 
node is in the excluded set, it tries to find another random node. So the more 
excluded nodes there are, the more likely a randomly chosen node is already 
excluded, in which case one iteration is simply wasted.
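The cost of this retry-on-excluded behavior can be quantified: drawing uniformly from all {{n}} nodes when {{k}} are excluded is a geometric trial with success probability {{(n - k) / n}}, so each accepted pick costs {{n / (n - k)}} draws on average. A tiny sketch (hypothetical names, not HDFS code):

```java
// Expected cost of "draw uniformly, retry if excluded" (a sketch with
// hypothetical names, not the actual NetworkTopology code). With k of n
// nodes excluded, the chance a draw succeeds is (n - k) / n, so the
// expected number of draws per accepted node is n / (n - k).
public class RetryCostSketch {

    static double expectedDraws(int n, int k) {
        return (double) n / (n - k);
    }

    public static void main(String[] args) {
        int n = 1000;
        // With nothing excluded, one draw suffices on average; once 900 of
        // 1000 nodes are excluded, each accepted pick costs 10 draws.
        System.out.println(expectedDraws(n, 0));    // prints 1.0
        System.out.println(expectedDraws(n, 900));  // prints 10.0
    }
}
```

This is why blindly excluding wrong-type nodes compounds: each exclusion both wastes its own iteration and makes every later draw more likely to be retried.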

Therefore, this JIRA proposes to augment/modify the relevant classes so that 
datanodes can be found more efficiently. There are currently two high-level 
solutions we are considering:

1. Add a field to the Node base types describing the storage type info; when 
searching for a node, take such field(s) into account and do not return a 
node that does not meet the storage type requirement.

2. Make the {{NetworkTopology}} class aware of storage types: for each 
storage type, there is one tree subset that connects all the nodes with that 
type, and a search happens on only one such subset. So nodes with unexpected 
storage types are simply not in the search space.
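A sketch of option 1 (hypothetical names, not a proposed patch): the storage-type check moves into the search itself, so a node that cannot satisfy the requirement is never returned and never triggers an exclude-and-retry round trip:

```java
import java.util.*;

// Sketch of option 1: the node itself carries its storage types, and the
// pick filters on them DURING the search. Hypothetical names; not the
// actual Node/NetworkTopology API.
public class TypeAwarePickSketch {

    public record Node(int id, Set<String> storageTypes) {}

    // Type-aware pick: a node with the wrong storage type is skipped
    // inside the search instead of being returned and then excluded.
    static Node pickNode(List<Node> nodes, Set<Node> excluded,
                         String wantedType) {
        for (Node n : nodes) {
            if (!excluded.contains(n) && n.storageTypes().contains(wantedType)) {
                return n;
            }
        }
        return null; // no remaining candidate offers this storage type
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node(0, Set.of("DISK")),
            new Node(1, Set.of("DISK", "SSD")),
            new Node(2, Set.of("SSD")));
        // The DISK-only node 0 is skipped inside the search, not excluded
        // after the fact.
        System.out.println(pickNode(nodes, new HashSet<>(), "SSD").id()); // prints 1
    }
}
```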
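And a sketch of option 2 (again hypothetical names, with the tree subsets simplified to flat per-type lists): the topology keeps one node subset per storage type, so a search for a given type only ever sees nodes that can satisfy it:

```java
import java.util.*;

// Sketch of option 2: one node subset per storage type, so the search
// space for a type contains only nodes of that type. Hypothetical names;
// the real proposal would keep per-type tree subsets, not flat lists.
public class TypedTopologySketch {

    static class Topology {
        private final Map<String, List<Integer>> nodesByType = new HashMap<>();

        void add(int nodeId, String storageType) {
            nodesByType.computeIfAbsent(storageType, t -> new ArrayList<>())
                       .add(nodeId);
        }

        // The search space for one storage type: nodes of other types are
        // simply not in it, so no iteration can be wasted on them.
        List<Integer> searchSpace(String storageType) {
            return nodesByType.getOrDefault(storageType, List.of());
        }
    }

    public static void main(String[] args) {
        Topology topo = new Topology();
        topo.add(0, "DISK");
        topo.add(1, "DISK");
        topo.add(2, "SSD");
        // Only node 2 is ever considered when searching for SSD.
        System.out.println(topo.searchSpace("SSD")); // prints [2]
    }
}
```

The trade-off between the two options is where the filtering lives: option 1 keeps one tree and filters per pick, while option 2 pays at registration time to keep per-type subsets and makes each pick cheap.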

Thanks [~szetszwo] for the offline discussion, and thanks [~linyiqun] for 
pointing out a wrong statement in the description. Any comments are more than 
welcome.


> BlockPlacementPolicyDefault is choosing datanode in an inefficient way
> ----------------------------------------------------------------------
>
>                 Key: HDFS-11419
>                 URL: https://issues.apache.org/jira/browse/HDFS-11419
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
