hangc0276 opened a new pull request #6772:
URL: https://github.com/apache/pulsar/pull/6772


   ### Motivation
   Currently the only overload shedder strategy is `OverloadShedder`, which 
collects each broker's max resource usage and compares it with a threshold 
(default value is 85%). When the max resource usage reaches the threshold, it 
triggers bundle unloading, which migrates some of the bundles to other 
brokers. This strategy has the following drawbacks:
   - It does not support configuring other overload shedder strategies.
   - It is hard to choose the threshold value. The default threshold is 
85%, but a broker's max resource usage rarely reaches 85%, which 
leads to unbalanced traffic between brokers. The read cache hit rate of a 
broker with heavy traffic then decreases.
   - When most brokers of a Pulsar cluster are restarted at the same time, 
all traffic in the cluster goes to the remaining brokers. The restarted 
brokers then have no traffic for a long time, because the max resource usage 
of the remaining brokers does not reach the threshold.
   
   ### Changes
   1. Support multiple overload shedder strategies. The strategy only needs 
to be configured in `broker.conf`.
   2. Add a `ThresholdShedder` strategy. The main idea is as follows:
       - Calculate the average resource usage of the brokers and compare each 
individual broker's resource usage with that average. If it is greater than 
the average value plus the threshold, overload shedding is triggered:
       `broker resource usage > average resource usage + threshold`
       - Each kind of resource (i.e. bandwidthIn, bandwidthOut, CPU, Memory, 
Direct Memory) has a weight (default is 1.0) that is applied when calculating 
the broker's resource usage.
       - Record the historical average resource usage of the Pulsar broker 
cluster. The new average resource usage is calculated as follows:
       `new_avg = old_avg * factor + (1-factor) * avg`
       `new_avg`: the newest average resource usage
       `old_avg`: the old average resource usage, calculated in the last round
       `factor`: the decrease factor, default value is `0.9`
       `avg`: the current average resource usage of the brokers
   3. Expose load balance metrics to Prometheus.
   4. Fix a bug in `OverloadShedder` where the unloaded bundle was assigned 
to the overloaded broker itself.
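
   To make the `ThresholdShedder` idea in item 2 concrete, here is a minimal 
Java sketch of the two formulas. It is not the code in this pull request: the 
class and method names are hypothetical, the threshold value of 0.10 is an 
assumed example (only the `0.9` factor and the `1.0` weights come from the 
description above), and combining the weighted resources by taking their 
maximum is an assumption as well.

```java
// Illustrative sketch only; names and the threshold value are hypothetical.
final class ThresholdShedderSketch {

    // Weighted usage of one broker. Assumption: the broker's usage is the
    // maximum of (usage * weight) over all resource kinds.
    static double weightedUsage(double[] usages, double[] weights) {
        double max = 0.0;
        for (int i = 0; i < usages.length; i++) {
            max = Math.max(max, usages[i] * weights[i]);
        }
        return max;
    }

    // new_avg = old_avg * factor + (1 - factor) * avg
    static double smooth(double oldAvg, double avg, double factor) {
        return oldAvg * factor + (1 - factor) * avg;
    }

    // broker resource usage > average resource usage + threshold
    static boolean shouldShed(double brokerUsage, double avgUsage, double threshold) {
        return brokerUsage > avgUsage + threshold;
    }

    public static void main(String[] args) {
        // Order: bandwidthIn, bandwidthOut, CPU, Memory, Direct Memory.
        double[] weights = {1.0, 1.0, 1.0, 1.0, 1.0}; // default weight from the description
        double[] busyBroker = {0.20, 0.30, 0.75, 0.40, 0.10};
        double[] idleBroker = {0.05, 0.05, 0.29, 0.20, 0.10};

        double busy = weightedUsage(busyBroker, weights);
        double idle = weightedUsage(idleBroker, weights);
        double avg = (busy + idle) / 2.0;

        double factor = 0.9;      // default factor from the description
        double threshold = 0.10;  // assumed example value, not from the description
        double oldAvg = 0.50;     // assumed average from the previous round
        double newAvg = smooth(oldAvg, avg, factor);

        System.out.println("busy broker sheds: " + shouldShed(busy, newAvg, threshold));
        System.out.println("idle broker sheds: " + shouldShed(idle, newAvg, threshold));
    }
}
```

   Because the comparison is relative to the cluster average instead of a 
fixed 85%, a broker that carries much more load than its peers can shed 
bundles even when its absolute usage is moderate, which is the restart 
scenario described in the motivation.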
   
   Please help review this implementation. If it is OK, I will add test cases.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
