[ 
https://issues.apache.org/jira/browse/DUBBO-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810449#comment-16810449
 ] 

Huxing Zhang commented on DUBBO-34:
-----------------------------------

> I was thinking about filtering those servers that are available (less than 
> x% of errors happened, CPU <= the minimum of CPU utilization, server response 
> time <= y, etc.) and returning one of them. Note that these can be properties; 
> for example, we allow a server response time of up to Y ms.

I would suggest starting from the simplest case and considering just one of 
the three factors. Among them, I think server response time is a good 
candidate. In some cases the error rate (x%) may not reflect the actual 
situation: a server may have a low error rate but still be slow to respond. 
CPU utilization can run into the same problem. For example, suppose server A 
calls server B, and server B calls server C. If B -> C is slow, the error rate 
and CPU utilization for B may be low, but the response time for B can be very 
high.
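
To make the response-time idea concrete, here is a minimal sketch that keeps a 
smoothed response time per server and picks the lowest one. It is only an 
illustration: the class name, the smoothing factor and the method names are 
assumptions of mine, not existing Dubbo API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not part of Dubbo. Tracks an exponentially weighted
// moving average (EWMA) of response time per server address.
final class ResponseTimeSelector {
    private static final double ALPHA = 0.3; // assumed smoothing factor
    private final Map<String, Double> ewmaMillis = new ConcurrentHashMap<>();

    // Record one observed response time (in ms) for a server.
    void record(String server, double rtMillis) {
        ewmaMillis.merge(server, rtMillis,
                (old, rt) -> ALPHA * rt + (1 - ALPHA) * old);
    }

    // Servers with no samples yet are treated as 0 ms so that they get tried.
    double estimate(String server) {
        return ewmaMillis.getOrDefault(server, 0.0);
    }

    // Return the candidate with the lowest smoothed response time.
    String select(List<String> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates");
        }
        String best = candidates.get(0);
        for (String s : candidates) {
            if (estimate(s) < estimate(best)) {
                best = s;
            }
        }
        return best;
    }
}
```

In the A -> B -> C example, an instance of B stuck behind a slow C would 
accumulate a high estimate and be avoided, even though its error rate and CPU 
utilization look healthy.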

> Another option is some equation that put a score in every server and returns 
> the best. But what should be the priority of this equation? CPU, number of 
> requests? What do you think?

The number of in-flight requests is another candidate. I feel it is difficult 
to decide the priority: both are very intuitive metrics, and they may even be 
orthogonal. Maybe you can experiment and adjust. I also think the priority 
should be configurable.
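
As a rough sketch of what "configurable priority" could mean, the following 
combines response time and in-flight requests into one score using two 
weights. The class, the Stats holder and the normalization are hypothetical 
choices for illustration, not something Dubbo provides.

```java
import java.util.List;

// Hypothetical sketch: names and weights are illustrative, not Dubbo API.
final class WeightedScore {
    // Snapshot of one server's metrics.
    static final class Stats {
        final String server;
        final double rtMillis;
        final int inFlight;

        Stats(String server, double rtMillis, int inFlight) {
            this.server = server;
            this.rtMillis = rtMillis;
            this.inFlight = inFlight;
        }
    }

    private final double rtWeight;
    private final double inFlightWeight;

    // The two weights are the configurable "priority" of each metric.
    WeightedScore(double rtWeight, double inFlightWeight) {
        this.rtWeight = rtWeight;
        this.inFlightWeight = inFlightWeight;
    }

    // Lower score is better. Each metric is divided by the maximum among the
    // candidates so that the two weights act on comparable 0..1 values.
    String select(List<Stats> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates");
        }
        double maxRt = 0;
        double maxInFlight = 0;
        for (Stats s : candidates) {
            maxRt = Math.max(maxRt, s.rtMillis);
            maxInFlight = Math.max(maxInFlight, s.inFlight);
        }
        Stats best = null;
        double bestScore = Double.MAX_VALUE;
        for (Stats s : candidates) {
            double score = rtWeight * ratio(s.rtMillis, maxRt)
                    + inFlightWeight * ratio(s.inFlight, maxInFlight);
            if (score < bestScore) {
                bestScore = score;
                best = s;
            }
        }
        return best.server;
    }

    private static double ratio(double value, double max) {
        return max == 0 ? 0 : value / max;
    }
}
```

With weights (1.0, 0.0) this degenerates to "lowest response time", and with 
(0.0, 1.0) to "fewest in-flight requests", so the two pure strategies can be 
compared before trying a mix.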


> I've some questions: I asked in the mailing list about using "active" in 
> query param in the LeastActiveLoadBalance, is there any server data that is 
> possible to send in query param?

Not quite sure about that; I will check and get back to you.

> Can we reuse anything from dubbo-metrics? I saw that MetricsFilter already 
> saves the server response time and whether any error happened.

Absolutely yes.


> GSoC 2019: New Load Balancer for higher availability and resilience.
> --------------------------------------------------------------------
>
>                 Key: DUBBO-34
>                 URL: https://issues.apache.org/jira/browse/DUBBO-34
>             Project: Apache Dubbo
>          Issue Type: Task
>            Reporter: Jun Liu
>            Priority: Major
>              Labels: GSoC2019
>
> This is an idea for Google Summer of Code (GSoC). Get to know about Dubbo[0].
> As an RPC framework, LoadBalance is a key part of Dubbo for distributing 
> traffic among servers. Below are the built-in strategies already supported:
> * Round Robin
> * Least Active
> * Consistent Hash
> * Random
> Now, we are considering some more intelligent and adaptive strategies that 
> can learn the healthy status of servers at runtime and automatically adjust 
> traffic distributions, something like P2C for Finagle[1] and JSQ for 
> Netflix[2].
> 0. https://issues.apache.org/jira/browse/DUBBO-33.
> 1. https://twitter.github.io/finagle/guide/Clients.html.
> 2. https://medium.com/netflix-techblog/netflix-edge-load-balancing-695308b5548c.
>
> How to achieve it, guidance for your reference:
> The new load-balancing strategy should be able to automatically isolate 
> abnormal instances based on statistics of the load or health status of the 
> back-end Provider instances. This ensures that traffic is forwarded to 
> instances that are able to process it. The load balancer should also know 
> when to recover: it periodically checks the health status of the isolated 
> instances and puts an instance back into the normal instance pool to be 
> scheduled once it has recovered.
> A quite similar project is [Circuit Breaker|http://example.com], except that 
> circuit breaker treats the downstream cluster as a whole while this Load 
> Balancer needs to distinguish the state of each instance.
> This topic can be achieved by extending the 
> [LoadBalance|https://github.com/apache/incubator-dubbo/blob/master/dubbo-cluster/src/main/java/org/apache/dubbo/rpc/cluster/LoadBalance.java]
>  SPI.
> To provide the basic statistics for LB to make a decision, you may need to 
> count the data of each RPC request, such as QPS, RT, Active Request, etc. 
> This can be achieved by extending the 
> [Filter|https://github.com/apache/incubator-dubbo/blob/master/dubbo-rpc/dubbo-rpc-api/src/main/java/org/apache/dubbo/rpc/Filter.java]
>  SPI. For more details, see how [MetricsFilter|http://example.com] does it.
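
For reference, the isolate-and-recover behaviour described in the issue could 
be sketched per instance roughly as below. This is only an assumption-laden 
illustration: the class, the consecutive-failure threshold and the cooldown 
are hypothetical, and the sketch does not use the LoadBalance or Filter SPI.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not Dubbo API. Isolates an instance after a number of
// consecutive failures and lets it be retried after a cooldown period.
final class InstanceIsolation {
    private final int failureThreshold;
    private final long cooldownMillis;
    private final Map<String, Integer> consecutiveFailures = new ConcurrentHashMap<>();
    private final Map<String, Long> isolatedAt = new ConcurrentHashMap<>();

    InstanceIsolation(int failureThreshold, long cooldownMillis) {
        this.failureThreshold = failureThreshold;
        this.cooldownMillis = cooldownMillis;
    }

    // Feed the outcome of one call; the caller supplies the current time.
    void onResult(String instance, boolean success, long nowMillis) {
        if (success) {
            // A success puts the instance back into the normal pool.
            consecutiveFailures.remove(instance);
            isolatedAt.remove(instance);
            return;
        }
        int failures = consecutiveFailures.merge(instance, 1, Integer::sum);
        if (failures >= failureThreshold) {
            // Start (or restart) the isolation window.
            isolatedAt.put(instance, nowMillis);
        }
    }

    // An isolated instance becomes schedulable again once the cooldown has
    // elapsed, which acts as the periodic health probe.
    boolean isSchedulable(String instance, long nowMillis) {
        Long since = isolatedAt.get(instance);
        return since == null || nowMillis - since >= cooldownMillis;
    }
}
```

Unlike a circuit breaker around the whole downstream cluster, the state here 
is keyed by instance, so one bad instance does not affect the others.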



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
