sodonnel commented on a change in pull request #668: HDDS-3139 Pipeline
placement should max out pipeline usage
URL: https://github.com/apache/hadoop-ozone/pull/668#discussion_r400184622
##########
File path:
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelinePlacementPolicy.java
##########
@@ -316,36 +315,75 @@ DatanodeDetails fallBackPickNodes(
}
/**
- * Find a node from the healthy list and return it after removing it from the
- * list that we are operating on.
- *
- * @param healthyNodes - Set of healthy nodes we can choose from.
- * @return chosen datanodDetails
+ * Random pick two nodes and compare with the pipeline load.
+ * Return the node with lower pipeline load.
+ * @param healthyNodes healthy nodes
+ * @return node
*/
- @Override
- public DatanodeDetails chooseNode(
- List<DatanodeDetails> healthyNodes) {
- if (healthyNodes == null || healthyNodes.isEmpty()) {
- return null;
- }
+ private DatanodeDetails randomPick(List<DatanodeDetails> healthyNodes) {
+ DatanodeDetails datanodeDetails;
int firstNodeNdx = getRand().nextInt(healthyNodes.size());
int secondNodeNdx = getRand().nextInt(healthyNodes.size());
- DatanodeDetails datanodeDetails;
// There is a possibility that both numbers will be same.
// if that is so, we just return the node.
if (firstNodeNdx == secondNodeNdx) {
datanodeDetails = healthyNodes.get(firstNodeNdx);
} else {
DatanodeDetails firstNodeDetails = healthyNodes.get(firstNodeNdx);
DatanodeDetails secondNodeDetails = healthyNodes.get(secondNodeNdx);
- SCMNodeMetric firstNodeMetric =
- nodeManager.getNodeStat(firstNodeDetails);
- SCMNodeMetric secondNodeMetric =
- nodeManager.getNodeStat(secondNodeDetails);
- datanodeDetails = firstNodeMetric.isGreater(secondNodeMetric.get())
- ? firstNodeDetails : secondNodeDetails;
+ datanodeDetails = nodeManager.getPipelinesCount(firstNodeDetails)
+ >= nodeManager.getPipelinesCount(secondNodeDetails)
+ ? secondNodeDetails : firstNodeDetails;
}
+ return datanodeDetails;
+ }
+
+ /**
+ * Get a list of nodes with lower load than max pipeline number.
+ */
+ private List<DatanodeDetails> getLowerLoadNodes(
+ List<DatanodeDetails> nodes, int num) {
+ int maxPipelineUsage = nodes.size() * heavyNodeCriteria /
Review comment:
I am not sure if this calculation is correct. The reason is that the healthy
node list is already filtered to include only nodes with fewer than
`heavyNodeCritera` pipelines in `filterViableNodes()`. Therefore the size of
the list passed into this method gets smaller as the nodes are used up and
eventually it stops returning any nodes, even though there are nodes valid to
return. From some debug messages I added:
```
2020-03-30 14:18:31,682 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:chooseDatanodes(202)) - There is no topology
2020-03-30 14:18:31,682 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:chooseNode(391)) - In chooseNode
2020-03-30 14:18:31,682 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:getLowerLoadNodes(351)) - Max pipeline usage is: 8
2020-03-30 14:18:31,682 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:lowerLoadPick(368)) - getLowerLoadNodes()
returned empty list
2020-03-30 14:18:31,682 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:chooseNode(391)) - In chooseNode
2020-03-30 14:18:31,682 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:getLowerLoadNodes(351)) - Max pipeline usage is: 6
2020-03-30 14:18:31,682 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:lowerLoadPick(368)) - getLowerLoadNodes()
returned empty list
2020-03-30 14:18:31,682 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:chooseNode(391)) - In chooseNode
2020-03-30 14:18:31,683 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:getLowerLoadNodes(351)) - Max pipeline usage is: 5
2020-03-30 14:18:31,683 INFO pipeline.PipelinePlacementPolicy
(PipelinePlacementPolicy.java:lowerLoadPick(368)) - getLowerLoadNodes()
returned empty list
2020-03-30 14:18:31,683 INFO pipeline.PipelineStateManager
(PipelineStateManager.java:addPipeline(54)) - Created pipeline Pipeline[ Id:
3da2deda-7839-4e5c-88e0-a613729b99fd, Nodes:
0cbe69da-7ef6-43ff-a11a-7ba716ec7c9c{ip: 242.96.90.116, host:
localhost-242.96.90.116, networkLocation: /default-rack, certSerialId:
null}b4d70058-fb84-410a-8ca8-cc4e8e944fae{ip: 28.158.147.87, host:
localhost-28.158.147.87, networkLocation: /default-rack, certSerialId:
null}03846f48-f6e5-4dba-97b2-5ab8a7be1564{ip: 128.200.5.118, host:
localhost-128.200.5.118, networkLocation: /default-rack, certSerialId: null},
Type:RATIS, Factor:THREE, State:ALLOCATED, leaderId:null,
CreationTimestamp2020-03-30T13:18:31.683Z]
2020-03-30 14:18:31,683 INFO pipeline.TestPipelinePlacementPolicy
(TestPipelinePlacementPolicy.java:testPickLowestLoadAnchor(112)) - Pipeline
count for this node is 5
2020-03-30 14:18:31,683 INFO pipeline.TestPipelinePlacementPolicy
(TestPipelinePlacementPolicy.java:testPickLowestLoadAnchor(112)) - Pipeline
count for this node is 5
2020-03-30 14:18:31,683 INFO pipeline.TestPipelinePlacementPolicy
(TestPipelinePlacementPolicy.java:testPickLowestLoadAnchor(112)) - Pipeline
count for this node is 5
2020-03-30 14:18:31,683 INFO pipeline.TestPipelinePlacementPolicy
(TestPipelinePlacementPolicy.java:testPickLowestLoadAnchor(112)) - Pipeline
count for this node is 3
2020-03-30 14:18:31,683 INFO pipeline.TestPipelinePlacementPolicy
(TestPipelinePlacementPolicy.java:testPickLowestLoadAnchor(112)) - Pipeline
count for this node is 2
2020-03-30 14:18:31,683 INFO pipeline.TestPipelinePlacementPolicy
(TestPipelinePlacementPolicy.java:testPickLowestLoadAnchor(112)) - Pipeline
count for this node is 2
2020-03-30 14:18:31,683 INFO pipeline.TestPipelinePlacementPolicy
(TestPipelinePlacementPolicy.java:testPickLowestLoadAnchor(112)) - Pipeline
count for this node is 3
2020-03-30 14:18:31,684 INFO pipeline.TestPipelinePlacementPolicy
(TestPipelinePlacementPolicy.java:testPickLowestLoadAnchor(112)) - Pipeline
count for this node is 2
```
If the nodes are filtered by load count in filterViableNodes, do we actually
need this getLowerLoadNodes method?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]