laddcn opened a new pull request #8916: URL: https://github.com/apache/dubbo/pull/8916
## What is the purpose of the change Improve consistent hashing load balancing with a new algorithm which can resolve problems mentioned in https://github.com/apache/dubbo/issues/4103 ## Brief changelog A new algorithm "[Consistent Hashing with Bounded Loads](https://cn.bing.com/academic/profile?id=444fbd87cfcfb664f18caac119986420&encoded=0&v=paper_preview&mkt=zh-cn#)"introduced by Vahab Mirrokni (work at Google Research) in 2018 can resolve this problem. Brief introduction quoted from a blog which use this new algorithm and works well (https://medium.com/vimeo-engineering-blog/improving-load-balancing-with-a-new-consistent-hashing-algorithm-9f1bd75709ed): Here is a simplified sketch of the algorithm. Some details are left out, and if you intend to implement it yourself, you should definitely go to the original paper for information. First, define a balancing factor, c, which is greater than 1. c controls how much imbalance is allowed between the servers. For example, if c = 1.25, no server should get more than 125% of the average load. In the limit as c increases to ∞, the algorithm becomes equivalent to plain consistent hashing, without balancing; as c decreases to near 1 it becomes more like a least-connection policy and the hash becomes less important. In my experience, values between 1.25 and 2 are good for practical use. When a request arrives, compute the average load (the number of outstanding requests, m, including the one that just arrived, divided by the number of available servers, n). Multiply the average load by c to get a “target load”, t. In the original paper, capacities are assigned to servers so that each server gets a capacity of either ⌊t⌋ or ⌈t⌉, and the total capacity is ⌈cm⌉. Therefore the maximum capacity of a server is ⌈cm/n⌉, which is greater than c times the average load by less than 1 request. To support giving servers different “weights”, as HAProxy does, the algorithm has to change slightly, but the spirit is the same — no server can exceed its fair share of the load by more than 1 request. To dispatch a request, compute its hash and the nearest server, as usual. If that server is below its capacity, then assign the request to that server. Otherwise, go to the next server in the hash ring and check its capacity, continuing until you find a server that has capacity remaining. There has to be one, since the highest capacity is above the average load, and it’s impossible for every server’s load to be above average. This guarantees some nice things: No server is allowed to get overloaded by more than a factor of c plus 1 request. The distribution of requests is the same as consistent hashing as long as servers aren’t overloaded. If a server is overloaded, the list of fallback servers chosen will be the same for the same request hash — i.e. the same server will consistently be the “second choice” for a popular piece of content. This is good for caching. If a server is overloaded, the list of fallback servers will usually be different for different request hashes — i.e. the overloaded server’s spillover load will be distributed among the available servers, instead of all landing on a single server. This depends on each server being assigned multiple points in the consistent hash ring. ## Verifying this change Test cases work well Follow this checklist to help us incorporate your contribution quickly and easily: - [x] Make sure there is a [GITHUB_issue](https://github.com/apache/dubbo/issues) field for the change (usually before you start working on it). Trivial changes like typos do not require a GITHUB issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue. - [x] Format the pull request title like `[Dubbo-XXX] Fix UnknownException when host config not exist #XXX`. Each commit in the pull request should have a meaningful subject line and body. - [x] Write a pull request description that is detailed enough to understand what the pull request does, how, and why. - [x] Write necessary unit-test to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add sample in [dubbo samples](https://github.com/apache/dubbo-samples) project. - [x] Run `mvn clean install -DskipTests=false` & `mvn clean test-compile failsafe:integration-test` to make sure unit-test and integration-test pass. - [ ] If this contribution is large, please follow the [Software Donation Guide](https://github.com/apache/dubbo/wiki/Software-donation-guide). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
