BrightK7 opened a new pull request #4043:
URL: https://github.com/apache/hadoop/pull/4043
### Description of PR
AbstractCSQueue#canAssignToThisQueue checks the current queue usage against
its limit. DRF uses the cluster resource as the denominator to decide which
resource is dominant and then compares the ratios. However, if the resources of
the cluster's nodes are not balanced, for example when there is a larger
proportion of memory or vCores, DRF will choose the wrong dominant resource.
For example, our cluster's total resource is <memory:175117312, vCores:40222>,
a ratio of 1 vCore : 4.25 GB, and the ratio changes to 1 : 4.8 under node
label x.
```text
2021-12-09 10:24:37,069 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator:
assignedContainer application attempt=appattempt_xxx_xxx container=xxx
queue=root.a.a1.a2 clusterResource=<memory:175117312, vCores:40222>
type=RACK_LOCAL requestedPartition=x
2021-12-09 10:24:37,069 DEBUG
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue:
Used resource=<memory:3381248, vCores:687> exceeded maxResourceLimit of the
queue =<memory:3420315, vCores:687>
2021-12-09 10:24:37,069 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
Failed to accept allocation proposal
```
- clusterResource = <memory:175117312, vCores:40222>
- usedExceptKillable = <memory:3381248, vCores:687>
- currentLimitResource = <memory:3420315, vCores:687>

currentLimitResource:
- memory: 3381248/175117312 = 0.01930847362
- vCores: 687/40222 = 0.01708020486

usedExceptKillable:
- memory: 3384320/175117312 = 0.01932601615
- vCores: 688/40222 = 0.01710506687

DRF treats memory as the dominant resource and compares the memory ratio in
this scenario, even though vCores usage (687) has already reached the queue's
vCores limit (687).
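
To make the comparison concrete, here is a minimal, self-contained sketch of a dominant-share check next to a per-resource check. It is not the actual `DominantResourceCalculator` code: the class and method names are hypothetical, and it only models the "largest ratio against the cluster total" idea described above. It uses the cluster total and queue limit from the log, and the <memory:3384320, vCores:688> usage figures from the ratios above.

```java
public class DominantShareSketch {

    // Dominant share: the largest per-resource ratio against the cluster total.
    public static double dominantShare(long memory, long vCores,
                                       long clusterMemory, long clusterVCores) {
        double memoryShare = (double) memory / clusterMemory;
        double vCoresShare = (double) vCores / clusterVCores;
        return Math.max(memoryShare, vCoresShare);
    }

    // Simplified DRF-style check (assumption): only the dominant shares are compared.
    public static boolean dominantShareExceeds(long usedMemory, long usedVCores,
                                               long limitMemory, long limitVCores,
                                               long clusterMemory, long clusterVCores) {
        return dominantShare(usedMemory, usedVCores, clusterMemory, clusterVCores)
                > dominantShare(limitMemory, limitVCores, clusterMemory, clusterVCores);
    }

    // Per-resource check: usage exceeds the limit if any single resource does.
    public static boolean anyComponentExceeds(long usedMemory, long usedVCores,
                                              long limitMemory, long limitVCores) {
        return usedMemory > limitMemory || usedVCores > limitVCores;
    }

    public static void main(String[] args) {
        long clusterMemory = 175117312L, clusterVCores = 40222L; // clusterResource
        long limitMemory = 3420315L, limitVCores = 687L;         // currentLimitResource
        long usedMemory = 3384320L, usedVCores = 688L;           // usage from the ratios above

        // Memory is the dominant resource on both sides, so vCores is never compared.
        System.out.println("dominant share exceeds limit: "
                + dominantShareExceeds(usedMemory, usedVCores, limitMemory, limitVCores,
                        clusterMemory, clusterVCores)); // false
        System.out.println("any component exceeds limit:  "
                + anyComponentExceeds(usedMemory, usedVCores, limitMemory, limitVCores)); // true
    }
}
```

With these numbers the dominant-share comparison reports that the queue is still under its limit, while the per-resource comparison shows that vCores (688) is already above the limit (687).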
### How was this patch tested?
### For code changes:
- [ ] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?