hangc0276 opened a new pull request #6772:
URL: https://github.com/apache/pulsar/pull/6772
### Motivation
Currently the only overload shedder strategy is `OverloadShedder`, which
collects each broker's max resource usage and compares it with a threshold
(default 85%). When a broker's max resource usage reaches the threshold, bundle
unloading is triggered and some of its bundles are migrated to other brokers.
This strategy has the following drawbacks:
- Other overload shedder strategies cannot be configured.
- The threshold value is hard to choose. The default is 85%, but a broker's
max resource usage rarely reaches 85%, which leads to unbalanced traffic
between brokers. The read cache hit rate of the heavily loaded brokers then
decreases.
- When most brokers of a Pulsar cluster are restarted at the same time, all
traffic in the cluster goes to the remaining brokers. The restarted brokers
then receive no traffic for a long time, because the max resource usage of the
remaining brokers does not reach the threshold.
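
The restart drawback can be illustrated with a minimal sketch of a fixed max-usage check. This is not the `OverloadShedder` code itself: the class name, the method names and the sample usage numbers are hypothetical, and only the "max resource usage reaches the threshold" rule and the 85% default come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a fixed-threshold check; not Pulsar source code.
final class MaxUsageThresholdSketch {

    // Largest value among a broker's resource usages (all in percent).
    static double maxUsage(double[] usages) {
        double max = 0.0;
        for (double u : usages) {
            max = Math.max(max, u);
        }
        return max;
    }

    // Brokers whose max resource usage has reached the fixed threshold.
    static List<String> overloadedBrokers(Map<String, double[]> usages, double thresholdPercent) {
        List<String> overloaded = new ArrayList<>();
        for (Map.Entry<String, double[]> e : usages.entrySet()) {
            if (maxUsage(e.getValue()) >= thresholdPercent) {
                overloaded.add(e.getKey());
            }
        }
        return overloaded;
    }

    public static void main(String[] args) {
        Map<String, double[]> usages = new TreeMap<>();
        // Made-up numbers: broker-a kept all traffic, broker-b was just restarted.
        usages.put("broker-a", new double[] {80, 60, 40, 70, 75});
        usages.put("broker-b", new double[] {5, 10, 3, 0, 0});
        // No broker reaches 85%, so nothing is unloaded despite the imbalance.
        System.out.println(overloadedBrokers(usages, 85.0)); // prints []
    }
}
```

With these sample values the cluster is clearly unbalanced, yet no unloading is triggered until broker-a itself reaches 85%.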
### Changes
1. Support multiple overload shedder strategies; the strategy to use only
needs to be configured in `broker.conf`.
2. Add a `ThresholdShedder` strategy. The main idea is as follows:
   - Calculate the average resource usage of the brokers and compare each
   broker's resource usage with it. If a broker's usage is greater than the
   average plus the threshold, overload shedding is triggered:
   `broker resource usage > average resource usage + threshold`
   - Each kind of resource (i.e. bandwidthIn, bandwidthOut, CPU, memory,
   direct memory) has a weight (default 1.0) that is applied when calculating
   a broker's resource usage.
   - Record the historical average resource usage of the Pulsar broker
   cluster. The new average resource usage is calculated as follows:
   `new_avg = old_avg * factor + (1 - factor) * avg`
     - `new_avg`: the newest average resource usage
     - `old_avg`: the average resource usage calculated in the previous round
     - `factor`: the decay factor, default value `0.9`
     - `avg`: the current average resource usage of the brokers
3. Expose load balance metrics to Prometheus.
4. Fix a bug in `OverloadShedder` where the bundle to unload could be assigned
to the overloaded broker itself.
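
The `ThresholdShedder` idea from item 2 can be sketched roughly as follows. This is a simplified illustration, not the code in this PR. The class and method names are hypothetical, the threshold of 10 is a made-up value, and two details are assumptions because the description does not specify them: per-resource weighted usages are combined by taking their maximum, and the first round uses the plain average since there is no `old_avg` yet.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the threshold-over-average idea; not Pulsar source code.
final class ThresholdShedderSketch {
    private final double threshold; // allowed margin above the average, in percent
    private final double factor;    // weight of the historical average (0.9 in the description)
    private Double smoothedAvg;     // null until the first round has run

    ThresholdShedderSketch(double threshold, double factor) {
        this.threshold = threshold;
        this.factor = factor;
    }

    // Assumption: a broker's usage is the maximum of its weighted resource usages.
    static double weightedUsage(double[] usages, double[] weights) {
        double max = 0.0;
        for (int i = 0; i < usages.length; i++) {
            max = Math.max(max, usages[i] * weights[i]);
        }
        return max;
    }

    // Runs one round: updates the historical average, then applies
    // "broker resource usage > average resource usage + threshold".
    List<String> findOverloaded(Map<String, Double> brokerUsage) {
        List<String> overloaded = new ArrayList<>();
        if (brokerUsage.isEmpty()) {
            return overloaded;
        }
        double sum = 0.0;
        for (double usage : brokerUsage.values()) {
            sum += usage;
        }
        double avg = sum / brokerUsage.size();
        // new_avg = old_avg * factor + (1 - factor) * avg
        smoothedAvg = (smoothedAvg == null) ? avg : smoothedAvg * factor + (1 - factor) * avg;
        for (Map.Entry<String, Double> e : brokerUsage.entrySet()) {
            if (e.getValue() > smoothedAvg + threshold) {
                overloaded.add(e.getKey());
            }
        }
        return overloaded;
    }

    // Historical average after the latest round (0.0 before any round).
    double smoothedAverage() {
        return smoothedAvg == null ? 0.0 : smoothedAvg;
    }

    public static void main(String[] args) {
        ThresholdShedderSketch shedder = new ThresholdShedderSketch(10.0, 0.9);
        Map<String, Double> usage = new LinkedHashMap<>();
        usage.put("broker-a", 80.0);
        usage.put("broker-b", 5.0);
        usage.put("broker-c", 5.0);
        // The average is 30, so only broker-a exceeds 30 + 10.
        System.out.println(shedder.findOverloaded(usage)); // prints [broker-a]
    }
}
```

Unlike a fixed 85% limit, this check flags broker-a at 80% because it is far above the cluster average, which is the restart scenario described in the motivation.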
Please help review this implementation. If it looks OK, I will add test cases.