Limiting open connections is not the same as rate limiting. The open-connection
count is the number of requests currently being processed by a node. When the
load balancer gets a new request and all current connections are waiting for a
response, a new connection is opened.
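
To make the distinction concrete, here is a minimal Java sketch. It is not Solr or load balancer code, and the class names and parameters are hypothetical. ConcurrencyLimiter caps how many requests are in flight at once (the open-connection view), while TokenBucket caps how many requests are admitted per second (the rate-limiting view).

```java
import java.util.concurrent.Semaphore;

public class LimiterSketch {
    /** Caps the number of requests in flight, whatever the arrival rate. */
    static final class ConcurrencyLimiter {
        private final Semaphore slots;

        ConcurrencyLimiter(int maxInFlight) {
            this.slots = new Semaphore(maxInFlight);
        }

        /** Returns true if a slot was free; the caller must call finish() later. */
        boolean tryStart() {
            return slots.tryAcquire();
        }

        void finish() {
            slots.release();
        }
    }

    /** Caps admitted requests per second, however long each request runs. */
    static final class TokenBucket {
        private final double capacity;
        private final double refillPerSecond;
        private double tokens;
        private long lastNanos;

        TokenBucket(double capacity, double refillPerSecond, long nowNanos) {
            this.capacity = capacity;
            this.refillPerSecond = refillPerSecond;
            this.tokens = capacity;   // start with a full bucket
            this.lastNanos = nowNanos;
        }

        synchronized boolean tryAcquire(long nowNanos) {
            // Refill in proportion to elapsed time, never beyond capacity.
            double elapsedSeconds = (nowNanos - lastNanos) / 1e9;
            tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
            lastNanos = nowNanos;
            if (tokens >= 1.0) {
                tokens -= 1.0;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        ConcurrencyLimiter inFlight = new ConcurrencyLimiter(2);
        System.out.println(inFlight.tryStart()); // slot 1 taken
        System.out.println(inFlight.tryStart()); // slot 2 taken
        System.out.println(inFlight.tryStart()); // refused until a request finishes
        inFlight.finish();
        System.out.println(inFlight.tryStart()); // admitted again
    }
}
```

With slow requests the concurrency limit starts refusing work even at a low arrival rate, because slots stay occupied longer. A token bucket with a fixed rate keeps admitting requests at the same pace no matter how slow they have become.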
If the requests are all the same
Walter, it sounds like you were doing rate limiting, just in a different
way that is more dynamic than a simple (yet fiddly) constant?
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
On Sun, Feb 14, 2021 at 2:54 PM Walter Underwood
wrote:
> Rate
Rate limiting is a good idea. It requires a lot of ongoing engineering to
adjust the rates to the current cluster behavior. It doesn’t help with some
kinds of overload. The ROI just doesn’t work out. It is too much work for not
enough benefit.
Rate limiting works if the collection size doesn’t
This is a debate better suited for a different forum -- but I would
disagree with your assertion that rate limiting is a bad idea.
Solr allows you to specify node level request quotas which also follow the
principle of not limiting internal requests. I find that to be pretty
useful in two
We’ve looked at and rejected rate limiters as high-maintenance and not
sufficient protection.
We would have run nginx on each node, sent external traffic to nginx on a
different port and let internal traffic stay on the default Solr port. This has
other advantages (monitoring), but the rate
The way I look at it is that for cluster level stability, rate limiters
should be used which allow rate limiting of only external requests. They
are "circuit breakers" in the sense of defending against cluster level
instability, which is what you describe.
Circuit breakers, in the Solr world, are
Ideally, it would only affect a few queries. In reality, with a sharded system,
the impact will be large.
I disagree that the goal is to protect a node. The goal is to make the entire
cluster avoid congestion failure when overloaded, while providing good service
for the load that it can
This can still lead to node outages if the fanout for a
query is high.
Circuit breakers follow a simple rule -- defend the node at the cost of
degraded responses.
Ideally, only a few requests will be completely rejected -- some will see
partial results. Due to this non
This got zero responses on the solr-user list, so I’ll raise the issue here.
Should circuit breakers only kill external search requests and not
cluster-internal requests to shards?
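
As a rough illustration of what "external only" could mean, here is a minimal Java sketch. It is not the actual Solr circuit breaker code. It assumes that cluster-internal shard requests can be recognized by an isShard=true request parameter, and the class and method names are hypothetical.

```java
import java.util.Map;

public class ExternalOnlyBreaker {
    /** Assumption: internal shard requests carry isShard=true. */
    static boolean isInternal(Map<String, String> params) {
        return Boolean.parseBoolean(params.getOrDefault("isShard", "false"));
    }

    /** Reject only when the breaker has tripped and the request is external. */
    static boolean shouldReject(boolean breakerTripped, Map<String, String> params) {
        return breakerTripped && !isInternal(params);
    }

    public static void main(String[] args) {
        // External client request while the breaker is tripped: rejected.
        System.out.println(shouldReject(true, Map.of("q", "*:*")));
        // Internal shard request while the breaker is tripped: let through.
        System.out.println(shouldReject(true, Map.of("q", "*:*", "isShard", "true")));
    }
}
```

A parameter check like this is only as trustworthy as the network boundary, since an outside client could set the same parameter.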
Circuit breakers can kill any request, whether it is a client request from
outside the cluster or an internal