Hey folks,

I'm trying to solve the problem of distributing load (or at least 
connections) evenly between gRPC clients and our backend servers.

First of all, let me describe our setup:
We are using network load balancing (L4) in front of our gRPC servers.
Clients see a single endpoint (the LB) and connect to it. This means that 
standard client-side load balancing features like round robin won't 
work, as there will only be one sub-channel for client-server communication.

One issue with this approach can be demonstrated by the following example:
Let's say we have 2 servers running and 20 clients connected to them. At 
the beginning, since we go through the network load balancer, connections 
will be distributed evenly (or close to that), so each server will hold 
roughly 50% of the connections. Now let's assume these servers reboot one 
after another, as in a deployment. The server that comes up first would get 
all 20 connections and the server that comes up later would have zero. This 
situation won't change unless the client or the server drops connections 
periodically, or more clients connect.
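
To make the example concrete, here is a toy simulation in Go. It is only a 
sketch under assumptions: the L4 balancer is assumed to send each new 
connection to the live server with the fewest connections, and clients are 
assumed to reconnect immediately when their server goes away. The `cluster` 
type and its methods are made up for illustration and are not part of any 
gRPC API.

```go
package main

import "fmt"

// cluster is a toy model of clients behind an L4 balancer (hypothetical type).
type cluster struct {
	up     []bool // up[s] reports whether server s accepts connections
	assign []int  // assign[c] is the server client c is connected to, -1 if none
}

// newCluster starts all servers and connects the clients one by one.
func newCluster(servers, clients int) *cluster {
	c := &cluster{up: make([]bool, servers), assign: make([]int, clients)}
	for s := range c.up {
		c.up[s] = true
	}
	for cl := range c.assign {
		c.assign[cl] = -1
	}
	for cl := range c.assign {
		c.connect(cl)
	}
	return c
}

// counts returns the number of connections held by each server.
func (c *cluster) counts() []int {
	n := make([]int, len(c.up))
	for _, s := range c.assign {
		if s >= 0 {
			n[s]++
		}
	}
	return n
}

// connect drops client cl's connection (if any) and routes it to the live
// server with the fewest connections. This balancing policy is an assumption.
func (c *cluster) connect(cl int) {
	c.assign[cl] = -1
	n := c.counts()
	best := -1
	for s, ok := range c.up {
		if ok && (best == -1 || n[s] < n[best]) {
			best = s
		}
	}
	c.assign[cl] = best
}

// restart takes server s down, lets its clients reconnect elsewhere,
// then brings it back up with no connections.
func (c *cluster) restart(s int) {
	c.up[s] = false
	for cl, cur := range c.assign {
		if cur == s {
			c.connect(cl)
		}
	}
	c.up[s] = true
}

// recycleAll models every client re-establishing its connection once,
// e.g. after a client-side reset or a server-side connection age limit.
func (c *cluster) recycleAll() {
	for cl := range c.assign {
		c.connect(cl)
	}
}

func main() {
	c := newCluster(2, 20)
	fmt.Println("initial:         ", c.counts()) // [10 10]
	c.restart(0)
	c.restart(1)
	fmt.Println("after deployment:", c.counts()) // [20 0]
	c.recycleAll()
	fmt.Println("after reconnects:", c.counts()) // [10 10]
}
```

In this model the skew after the rolling restart only goes away once 
connections are re-established, which is what options 1 and 2 below try to 
achieve.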

I've considered a few options for solving this:
1. Connection management on the client side - do something to reset the 
channel (like 
[enterIdle](https://grpc.github.io/grpc-java/javadoc/io/grpc/ManagedChannel.html#enterIdle) 
in grpc-java). Downside - it seems that this feature was developed for 
Android and I can't find similar functionality in grpc-go.
2. Connection management on the server side - drop connections periodically 
on the server. Downside - this approach looks less graceful than the client 
side one and may impact request latency and result in request failures on 
the client side.
3. Use a request-based, gRPC-aware L7 LB. This way the client would connect 
to the LB, which would fan out requests to the servers. Downside - I've been 
told by our infra guys that this is hard to implement in our setup due to 
the way we use TLS and manage certificates.
4. Expose our servers externally and use grpc-lb or client-side load 
balancing. Downside - it seems less secure and would make it harder to 
protect against DDoS attacks. I think this downside makes this approach 
unviable.

My bias is towards option 3 and request-based load balancing, because it 
allows much more fine-grained control based on load. But since our infra 
cannot support it at the moment, I might be forced to use option 1 or 2 in 
the short to mid term. I like option 2 the least, as it might result in 
latency spikes and errors on the client side.

My questions are:
1. Which approach is generally preferable? 
2. Are there other options to consider?
3. Is it possible to influence the gRPC channel state in grpc-go in a way 
that would trigger the resolver and balancer to establish a new connection, 
similar to what enterIdle does in Java? From what I see in 
[clientconn.go](https://github.com/grpc/grpc-go/blob/master/clientconn.go) 
there is no option to change the channel state to idle or trigger a 
reconnect in some other way.
4. Is there a way to implement server-side connection management cleanly, 
without severely impacting the client side?

Here are some links that I found useful for context:
grpc/load-balancing.md at master · grpc/grpc 
<https://github.com/grpc/grpc/blob/master/doc/load-balancing.md>
proposal/A9-server-side-conn-mgt.md at master · grpc/proposal 
<https://github.com/grpc/proposal/blob/master/A9-server-side-conn-mgt.md>
proposal/A8-client-side-keepalive.md at master · grpc/proposal 
<https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md>
grpc/keepalive.md at master · grpc/grpc 
<https://github.com/grpc/grpc/blob/master/doc/keepalive.md>


Sorry for the long read,
Vitaly
