*Summary:* In my opinion, the gRPC server libraries should include a 
mechanism for limiting the number of concurrent requests and connections by 
default. This is necessary to ensure that servers stay within memory 
limits during overload. The core libraries currently do not support this, 
which requires everyone using gRPC in production to reinvent the 
wheel (see below).


   - Is this something the core gRPC team agrees with? There is a 
   discussion on grpc-java that suggests maybe it is: 
   https://github.com/grpc/grpc-java/issues/1886
   - If so, I have a very bad Go implementation I would be happy to 
   contribute.
   - Maybe we should create a "best practices" document that suggests 
   people implement limits themselves, if we are waiting for the "right" 
   implementation?


Without these limits, every gRPC server is a few hundred extra connections 
away from a cascading failure. I personally would like the core library to 
set "reasonable", finite default limits for both requests and connections, 
and allow me to configure them. This is similar to how the 
library sets a maximum message size by default. However, I can understand 
that might be controversial, and potentially a significant behaviour 
change. At the very least, the libraries should include the right pieces to 
make it easy for me to configure them, without needing to implement my own 
semaphore interceptor and listener.


*Counter-argument: This is an application policy, not a gRPC policy*

An argument against this is that the "right" settings are going to be 
different for each application, so gRPC should not set something that will 
be wrong. Additionally, some applications will want sophisticated policies, 
such as allowing a large number of "cheap" requests, but only a small 
number of "expensive" requests. Today, gRPC provides the necessary hooks to 
implement these limits yourself (e.g. I implemented a version at 
https://github.com/evanj/concurrentlimit).

Personally: This is true, but gRPC should at least include a "simple" 
implementation that would cover the "basic" use case. Advanced users could 
still override the default, if necessary.



*Context*

At work, we recently had a classic cascading failure of a gRPC service 
because the service ran out of memory. We believe one of the root causes is 
that during overload, our Go gRPC server accepted an unlimited amount of 
work, causing it to exceed its memory limits and get OOM killed. In 
testing, I can easily cause the server to run out of memory either by 
sending too many concurrent requests, or by establishing a very large 
number of idle gRPC connections. To be able to stay within memory limits, 
even in overload scenarios, I need this server to limit both the number of 
connections and the number of requests.

The good news is that gRPC provides the right hooks for me to implement 
this myself (see https://github.com/evanj/concurrentlimit). However, it is 



*Related Issues*

This issue on the Java implementation has the largest relevant discussion: 
https://github.com/grpc/grpc-java/issues/1886

Searching in the gRPC closed issues finds a large number of issues that may 
have been caused by a lack of limits.

It turns out that none of the gRPC implementations I looked at support, out 
of the box, the limits necessary for a server to survive this type of 
overload. Our service is in Go, but after a quick review, it appears to me 
that the same limitations exist in the Java and C++ implementations.



*People who have solved the same problem*


Lyft uses Envoy to limit concurrent requests to their services: 
https://www.infoq.com/articles/envoy-service-mesh-cascading-failure/

Dropbox created their own service proxy with similar goals: 
https://blogs.dropbox.com/tech/2018/03/meet-bandaid-the-dropbox-service-proxy/
