[PR] Add new balancer implementation called EnumeratedDistribution Balancer (druid)

via GitHub Thu, 25 Apr 2024 12:15:16 -0700


capistrant opened a new pull request, #16335:
URL: https://github.com/apache/druid/pull/16335

<!-- If you are a committer, follow the PR action item checklist for
committers:

https://github.com/apache/druid/blob/master/dev/committer-instructions.md#pr-and-issue-action-item-checklist-for-committers.
-->

### Description

#### EnumeratedDistribution Balancer

I have implemented a new BalancerStrategy implementation called
EnumeratedDistribution. This Balancer Uses
[org.apache.commons.math3.distribution.EnumeratedDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/EnumeratedDistribution.html#sample(int,%20T[]))
to decide what servers to load segments to and what server to move a segment
to. For choosing which server(s) to drop a segment from, it does not use this
probabilistic sampling, but instead just orders the servers by % utilized in
order to drop from the most "full" candidate server first.

The overarching goal of the balancer is to drive a cluster's storage
distribution towards uniform disk utilization % across historical servers.

It differs from the diskNormalized balancer in that this balancer is
completely disconnected from the cost balancer. It is strictly assigning
segment destinations by probabilistic sampling that EnumeratedDistribution uses.

##### Replicating a segment

Assign probabilities to every candidate server. The probability input for
server s is a float between 0 and 1. The probability for a server correlates to
it's utilization %. The most utilized server in the cluster has the lowest
probability assigned and the least utilized server has the highest probability
assigned.

This is where the impl gets a little funky compared to the other Balancers.
The other balancers return an Iterator over all of the candidate servers after
they have been sorted according to the balancer algorithm. For this new
balancer, it is less straightforward. We sample the set of candidates (with
probability) a number of times that is equal to the size of the candidate list.
We then de-dup the list of sampled servers since it will likely - by design -
have picked the higher probability servers multiple times. This means the
return list to the caller is not going to have every candidate. The feelin is
that this shouldn't be a problem when you have a meaningful number of
historicals since the number of replicas is likely fairly low. However, the
method signature could be changed to define a minimum number of servers
returned if we needed to guarantee that N servers were in the iterable that the
caller receives.

##### Moving a segment

Assign probabilities to every candidate server. The probability input for an
individual server is a float between 0 and 1 that correlates to it's disk
utilization %. The most utilized server in the cluster has the lowest
probability assigned and the least utilized server has the highest probability
assigned.

Sample from the set of candidate servers with the aforementioned
probabilities in mind (EnumeratedDistribution library handles all this for us).
Return the sampled server.

##### Dropping a segment

Order the candidate servers in descending order by segment cache space
utilization % and return an iterable over them. This pushes the cluster to drop
from the "most full" servers first, working towards the ultimate goal of
converging on a uniform utilization % across historical servers.

#### Release note
<!-- Give your best effort to summarize your changes in a couple of
sentences aimed toward Druid users.

If your change doesn't have end user impact, you can skip this section.

For tips about how to write a good release note, see [Release
notes](https://github.com/apache/druid/blob/master/CONTRIBUTING.md#release-notes).

-->

Add a new BalancerStrategy called `EnumeratedDistributionBalancerStrategy`.
This balancer uses
[org.apache.commons.math3.distribution.EnumeratedDistribution](https://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/org/apache/commons/math3/distribution/EnumeratedDistribution.html#sample(int,%20T%5B%5D))
to make segment assignment, move, and drop decisions using probabilities that
correlate to how much available segment storage a server has remaining. This
balancer will converge historical servers to a uniform level of storage
consumption (expressed as a % of available storage utilized on each server). To
enable this balancer, set `druid.coordinator.balancer.strategy` to
`enumeratedDistribution`.

<hr>

##### Key changed/added classes in this PR
* `EnumeratedDistributionBalancerStrategy`

<hr>

This PR has:

- [ ] been self-reviewed.
- [X] added documentation for new or modified features or behaviors.
- [X] a release note entry in the PR description.
- [X] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [X] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [X] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] added integration tests.
- [X] been tested in a test Druid cluster.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] Add new balancer implementation called EnumeratedDistribution Balancer (druid)

Reply via email to