Hey all,

We're considering implementing some patches to the Go gRPC
implementation. These are things we think would fit better inside of grpc
than trying to achieve them from outside. Before we go through the
effort, we'd like to gauge whether these features would be welcome
(assuming we'll work with owners to get a quality implementation). Some of
these ideas are not fully fleshed out or may not be the best solution to
the problem they aim to solve. We also try to state each problem, so if you
have ideas on better ways to address these problems, please share :)

*Add DialOption MaxConnectionLifetime*
Currently, once a connection is established, it lives until there is a 
transport error or the client proactively closes the connection. These 
long-lived connections are problematic when using a TCP load balancer, such 
as the one provided by Google Container Engine and Google Compute Engine. 
At a clean start, clients will be somewhat distributed among the servers
behind the load balancer, but if the servers go through a rolling restart,
the servers will become unbalanced: clients have a higher likelihood of
being connected to the first server that restarts, while the most recently
restarted server has close to zero clients.

We propose fixing this by adding a MaxConnectionLifetime, which will force
clients to disconnect after some period of time. We'll use the same
mechanism as when an address is removed from a balancer (i.e. drain the
connection, rather than abruptly throw errors).
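
To make the draining idea concrete, here is a minimal sketch using only the
standard library. It is not grpc code: drainingConn, tryStart and finish are
hypothetical names, and the real transport would look quite different. It
only shows the intended behavior, assuming that an expired connection stops
accepting new RPCs and is closed once in-flight RPCs have finished.

```go
package main

import (
	"sync"
	"time"
)

// drainingConn is a hypothetical stand-in for a client connection.
// It is NOT part of grpc; it only illustrates the proposed behavior.
type drainingConn struct {
	mu          sync.Mutex
	established time.Time
	maxLifetime time.Duration // zero or negative means "never expire"
	inFlight    int
	closed      bool
}

// expired reports whether the connection has outlived maxLifetime.
func (c *drainingConn) expired(now time.Time) bool {
	return c.maxLifetime > 0 && now.Sub(c.established) >= c.maxLifetime
}

// tryStart registers a new RPC. It returns false once the connection is
// draining or closed, so the caller can pick (or dial) another connection
// instead of seeing an error.
func (c *drainingConn) tryStart(now time.Time) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed || c.expired(now) {
		return false
	}
	c.inFlight++
	return true
}

// finish marks one RPC as done. An expired connection is closed only when
// its last in-flight RPC completes. It reports whether the connection is
// now closed.
func (c *drainingConn) finish(now time.Time) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.inFlight--
	if c.inFlight == 0 && c.expired(now) {
		c.closed = true
	}
	return c.closed
}
```

An RPC that started before the deadline is allowed to complete; only new
RPCs are steered away. A real implementation would probably also want some
jitter on the lifetime so that clients don't all reconnect at once.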

*Add DialOption NumConnectionsPerServer*
This is related to the problem above. When a client is provided with a
single address that points to a TCP load balancer, it's sometimes
beneficial for the client to hold multiple connections, since the
underlying performance of each might vary.
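
As a rough illustration (again not grpc code), the client side could keep a
small pool of connections to the same address and spread RPCs across them.
The names conn, connPool, newConnPool and pick are made up for this sketch,
and round-robin is just one assumed policy.

```go
package main

import "sync/atomic"

// conn is a placeholder for a real connection; only the id matters here.
type conn struct{ id int }

// connPool is a hypothetical pool of connections that all point at the
// same address (for example, a TCP load balancer).
type connPool struct {
	next  uint64 // kept first so 64-bit atomics stay aligned on 32-bit platforms
	conns []*conn
}

// newConnPool dials n connections using the supplied dial function.
func newConnPool(n int, dial func(i int) *conn) *connPool {
	if n < 1 {
		n = 1
	}
	p := &connPool{conns: make([]*conn, 0, n)}
	for i := 0; i < n; i++ {
		p.conns = append(p.conns, dial(i))
	}
	return p
}

// pick returns the next connection in round-robin order. It is safe for
// concurrent use.
func (p *connPool) pick() *conn {
	n := atomic.AddUint64(&p.next, 1) - 1
	return p.conns[n%uint64(len(p.conns))]
}
```

With a pool of three connections, successive calls to pick cycle through
connections 0, 1, 2 and then wrap around to 0.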

*Add ServerOption MaxConcurrentGlobalStreams*
Currently there is only a way to limit the number of streams per client,
but it'd be useful to do this globally. This could be achieved via an
interceptor that returns StreamRefused, but we thought it might be useful
to have in grpc itself.
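
For reference, the interceptor-style workaround could look roughly like the
sketch below. It uses only the standard library; streamLimiter, intercept
and errStreamRefused are hypothetical names, and the handler signature is
simplified compared to a real grpc interceptor.

```go
package main

import "errors"

// errStreamRefused is a hypothetical error standing in for whatever
// "stream refused" status the server would actually send.
var errStreamRefused = errors.New("stream refused: global stream limit reached")

// streamLimiter caps the number of concurrently running streams across
// all clients, using a buffered channel as a counting semaphore.
type streamLimiter struct {
	slots chan struct{}
}

func newStreamLimiter(max int) *streamLimiter {
	return &streamLimiter{slots: make(chan struct{}, max)}
}

// intercept runs handler if a slot is free and refuses immediately
// otherwise. It never blocks waiting for a slot.
func (l *streamLimiter) intercept(handler func() error) error {
	select {
	case l.slots <- struct{}{}:
		// Slot acquired; release it when the handler returns.
		defer func() { <-l.slots }()
		return handler()
	default:
		return errStreamRefused
	}
}
```

The limiter refuses rather than queues, which matches the idea of telling
the client quickly so it can try elsewhere.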

*Add facility for retries*
Currently, retries must happen in user-level code, but it'd be beneficial
for performance and robustness to have a way to do this within gRPC.
Today, if the server refuses a request with StreamRefused, the client
doesn't have a way to retry on a different server; it can only reissue
the request and hope it lands on a different server. It also forces the
client to reserialize the request, which is unnecessary, and given the cost
of serialization with proto, it'd be nice to avoid this.
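
A minimal sketch of the behavior we have in mind, assuming the request is
serialized once and the same bytes are offered to each server in turn. The
names sendWithRetry and errRefused are hypothetical, and the backends are
plain functions rather than real grpc transports.

```go
package main

import "errors"

// errRefused is a hypothetical error meaning "the server refused the
// stream before processing it", so retrying elsewhere is safe.
var errRefused = errors.New("stream refused")

// sendWithRetry serializes the request exactly once and then tries each
// backend in order, moving on only when a backend refuses the request.
// It returns the number of send attempts and the final error.
func sendWithRetry(marshal func() ([]byte, error), backends []func(payload []byte) error) (int, error) {
	if len(backends) == 0 {
		return 0, errors.New("no backends available")
	}
	payload, err := marshal()
	if err != nil {
		return 0, err
	}
	attempts := 0
	for _, send := range backends {
		attempts++
		err = send(payload) // the same bytes are reused on every attempt
		if !errors.Is(err, errRefused) {
			// Success, or an error that is not safe to retry.
			return attempts, err
		}
	}
	return attempts, err
}
```

Only refusals are retried here, since a refused request is known not to
have been processed. Whether other errors should be retried is a separate
question.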

*Change behavior of Dial to not block on the balancer's initial list*
Currently, when you construct a *grpc.ClientConn with a balancer, the call 
to Dial blocks until the initial set of servers is returned from the 
balancer and errors if the balancer returns an empty list. This is 
inconsistent with the behavior of the client when the balancer produces an 
empty list later in the life of the client.

We propose changing the behavior such that Dial does not wait for the
response of the balancer and thus also can't return an error when the list
is empty. This not only makes the behavior consistent, it has the added
benefit that callers don't need to implement their own retries around Dial.



To reiterate, these are just rough ideas and we're also in search of other 
solutions to these problems if you have ideas.

Thanks!
