craigcondit commented on pull request #323:
URL:
https://github.com/apache/incubator-yunikorn-core/pull/323#issuecomment-920986918
@yangwwei, thanks for the review.
> There is another thing, with this algorithm, we calculate ratios only,
this could lead to some issues when the size of the node is very different, for
example, if I have 2 nodes (just use cpu as an example)
>
> nodeA:
>
> * cpuTotal: 2
> * cpuUsed: 1
> * usedRation: 0.5
>
> nodeB
>
> * cpuTotal: 10
> * cpuUsed: 6
> * usedRatio: 0.6
>
> with the current algorithm, under FAIR we will prefer A over B; however, A
has only 1 cpu available but B has 4... so it might be better to allocate B
first. I guess this is fine... but need to keep in mind when the size of the
nodes is very different, this algorithm may tend to fill the smaller node
first...
In your example, yes we would place on nodeA first. However, if we consider
a more realistic example, something with larger CPU counts and where the size
of an allocation is not 50% of one of our nodes, it's far less dramatic, even
if nodes vary in size. Consider:
nodeC: cpuTotal=8, cpuUsed=4, usedRatio=0.5
nodeD: cpuTotal=40, cpuUsed=24, usedRatio=0.6
The ratios between the node sizes are equal, and the current usedRatios
match your example. Let's consider where subsequent allocations go and how this
affects things. I will assume 1 CPU as allocation size for simplicity:
Request 1: nodeC, as 0.5 < 0.6. Total usage on nodeC: 0.625. Total node
allocations for C/D: { 5, 24 }
Request 2: nodeD, as 0.6 < 0.625. Total usage on nodeD: 0.625. Total
allocations: { 5, 25 }
Request 3: nodeC as scores are both 0.625. Total usage on nodeC: 0.75 Total
allocations: { 6, 25 }
Request 4: nodeD as 0.625 < 0.75. Total usage on nodeD: 0.65. Total node
allocations: { 6, 26 }
Request 4: nodeD as 0.65 < 0.75. Total usage on nodeD: 0.675. Total node
allocations: { 6, 27 }
Request 5: nodeD as 0.675 < 0.75. Total usage on nodeD: 0.7. Total node
allocations: { 6, 28 }
Request 6: nodeD as 0.7 < 0.75. Total usage on nodeD: 0.725. Total node
allocations: { 6, 29 }
Request 7: nodeD as 0.725 < 0.75. Total usage on nodeD: 0.75. Total node
allocations: { 6, 30 }
Request 8: nodeC as scores are both 0.75. Total usage on nodeC: 0.875. Total
node allocations: { 7, 30 }
... and so on.
We don't actually end up filling nodeC until both reach 87.5% capacity, and
at that point, nodeC happens to fill first just because it hit the tie-breaker
scenario (using node ID) chosen. However, you can see that all the way along,
the allocations favor nodeB in a 5:1 ratio, which exactly matches their node
size ratio.
> Another thing is when we find 2 scores are the same, we compare nodeID,
instead of doing that, can we compare the node available resources?
The node ID is not a part of the scoring algorithms. It is a technical
requirement that the keys we use for the B-Tree implementation be unique, and
node ID is the one thing we know will not change and will be unique across all
nodes. It will ONLY be used in the case where two nodes score identically.
However, leaving it out, or using some other value that may not be unique,
could result in collisions, causing a node to vanish from the node collection.
We could generate something like a UUID for each node on addition to the node
collection, but this isn't really any better; it just makes the tie-breaker
algorithm dependent on something that isn't externally visible (and in fact
makes testing more difficult). It doesn't improve the behavior at all. If we
want different behavior for different nodes, they should produce different
scores.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]