[
https://issues.apache.org/jira/browse/DUBBO-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16805609#comment-16805609
]
Daniela Morais commented on DUBBO-34:
-------------------------------------
[~chicken] [~huxing]
Hey,
Here are my ideas for this task:
The load balancer picks two servers at random and chooses between them. If the
chosen server is unavailable or not a good option (for example, it has a lot of
inflight requests), it retries up to X times until it finds an available server.
If no retry succeeds, the load balancer sends the request to the last server it
picked.
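To make the pick-and-retry idea more concrete, here is a rough sketch in plain
Java (16+ because of the record). Everything in it is an assumption made for
illustration: P2cPicker, Server, maxRetries and maxInflight are hypothetical
names, not Dubbo APIs, and "not a good option" is reduced to a simple inflight
threshold.

```java
import java.util.List;
import java.util.Random;

/** Hypothetical sketch; none of these names are Dubbo APIs. */
final class P2cPicker {
    /** Minimal stand-in for a provider instance and its reported state. */
    record Server(String name, boolean available, int inflight) {}

    private final Random random;
    private final int maxRetries;   // the "X" from the idea above
    private final int maxInflight;  // assumed cut-off for "not a good option"

    P2cPicker(Random random, int maxRetries, int maxInflight) {
        this.random = random;
        this.maxRetries = maxRetries;
        this.maxInflight = maxInflight;
    }

    /** A server is acceptable if it is up and not overloaded. */
    boolean acceptable(Server s) {
        return s.available() && s.inflight() <= maxInflight;
    }

    /** Of two candidates, prefer the available one, then the less loaded one. */
    private static Server better(Server a, Server b) {
        if (a.available() != b.available()) {
            return a.available() ? a : b;
        }
        return a.inflight() <= b.inflight() ? a : b;
    }

    Server pick(List<Server> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no servers to choose from");
        }
        Server last = null;
        // One initial attempt plus up to maxRetries retries.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            // Sampled with replacement, so both candidates may be the same server.
            Server a = servers.get(random.nextInt(servers.size()));
            Server b = servers.get(random.nextInt(servers.size()));
            last = better(a, b);
            if (acceptable(last)) {
                return last;
            }
        }
        // No attempt found an acceptable server: fall back to the last pick.
        return last;
    }
}
```

A caller would build it once, e.g. new P2cPicker(new Random(), 3, 100), and call
pick(servers) per request. A real implementation would probably sample two
distinct servers and read the state from live statistics.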
It also seems like a good idea to have some settings that specify the normal
conditions for each server (maximum CPU utilization percentage, response time,
region/zone, server age, etc.). These can be optional.
How to choose the server:
* the inflight request count and CPU % of both servers (servers will send their
inflight request count and CPU utilization)
* the current number of inflight requests from this load balancer
* which server has the lower error percentage and how much longer each server's
response time is
* whether the server is in the same region as the client
* server age: newly launched servers will receive less traffic in the first x
seconds after launch
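One possible way to combine these criteria is a single score per server, where
lower is better. This is only a sketch (Java 16+): ServerScore and Stats are
hypothetical names, and the weights, the 60-second warm-up window and the
multiplicative formula are placeholder assumptions, not something I have tuned
or taken from Dubbo.

```java
import java.time.Duration;

/** Hypothetical sketch; names, weights and formula are assumptions. */
final class ServerScore {
    /** Snapshot of what the load balancer knows about one server. */
    record Stats(int serverInflight,       // reported by the server
                 double cpuUtilization,    // 0.0 .. 1.0, reported by the server
                 int localInflight,        // inflight requests from this load balancer
                 double errorRate,         // 0.0 .. 1.0
                 double avgResponseMillis,
                 boolean sameRegion,
                 Duration age) {}

    /** The "first x seconds after launch"; 60 s is an arbitrary choice. */
    static final Duration WARM_UP = Duration.ofSeconds(60);

    /** Lower is better. Every factor is >= 1, so each criterion can only add cost. */
    static double score(Stats s) {
        double score = 1.0 + s.serverInflight() + s.localInflight();
        score *= 1.0 + s.cpuUtilization();
        score *= 1.0 + 10.0 * s.errorRate();
        score *= 1.0 + s.avgResponseMillis() / 100.0;
        if (!s.sameRegion()) {
            score *= 2.0; // penalty for crossing regions
        }
        if (s.age().compareTo(WARM_UP) < 0) {
            // Penalty shrinks linearly from 2x at launch to 1x at the end of warm-up.
            double progress = (double) s.age().toMillis() / WARM_UP.toMillis();
            score *= 2.0 - progress;
        }
        return score;
    }

    /** Picks the candidate with the lower score; ties go to the first one. */
    static Stats choose(Stats a, Stats b) {
        return score(a) <= score(b) ? a : b;
    }
}
```

With two randomly picked candidates, ServerScore.choose(a, b) would replace the
plain inflight comparison. The optional per-server settings could then be
thresholds on the same fields.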
I'm not sure whether it is possible to do everything in GSoC, or whether
something is already developed (I'm working on improving the Javadoc of
dubbo-cluster so I can study more). What do you think?
> GSoC 2019: New Load Balancer for higher availability and resilience.
> --------------------------------------------------------------------
>
> Key: DUBBO-34
> URL: https://issues.apache.org/jira/browse/DUBBO-34
> Project: Apache Dubbo
> Issue Type: Task
> Reporter: Jun Liu
> Priority: Major
> Labels: GSoC2019
>
> This is an idea for Google Summer of Code (GSoC). Get to know about Dubbo[0].
> As an RPC framework, LoadBalance is a key part of Dubbo for distributing
> traffics among servers. Below are the built-in strategies already supported:
> * Round Robin
> * Least Active
> * Consistent Hash
> * Random
> Now, we are considering some more intelligent and adaptive strategies that
> can learn the healthy status of servers at runtime and automatically adjust
> traffic distributions, something like P2C for Finagle[1] and JSQ for
> Netflix[2].
> 0. https://issues.apache.org/jira/browse/DUBBO-33.
> 1. https://twitter.github.io/finagle/guide/Clients.html.
> 2.
> https://medium.com/netflix-techblog/netflix-edge-load-balancing-695308b5548c.
> How to achieve it, guidance for your reference:
> The new load-balancing strategy should be able to automatically isolate
> abnormal instances based on the statistics of the load or health status of
> the back-end Provider instance. This ensures that traffic is forwarded to the
> processing-capable instance. The load balancer should also know when to
> recover, periodically checks the health status of the isolated instances,
> put back the instance into the normal instance pool to be scheduled once it's
> recovered.
> A quite similar project is [Circuit Breaker|http://example.com], except that
> circuit breaker treats the downstream cluster as a whole while this Load
> Balancer needs to distinguish the state of each instance.
> This topic can be achieved by extending the
> [LoadBalance|https://github.com/apache/incubator-dubbo/blob/master/dubbo-cluster/src/main/java/org/apache/dubbo/rpc/cluster/LoadBalance.java]
> SPI.
> To provide the basic statistics for LB to make a decision, you may need to
> count the data of each RPC request, such as QPS, RT, Active Request, etc.
> This can be achieved by extending the
> [Filter|https://github.com/apache/incubator-dubbo/blob/master/dubbo-rpc/dubbo-rpc-api/src/main/java/org/apache/dubbo/rpc/Filter.java]
> SPI. For more details, see How [MetricsFilter|http://example.com] does it.