Re: Circuit Breakers interaction with Shards

2021-02-16 Thread Walter Underwood
Limiting open connections is not the same as rate limiting. The open-connection count is the number of requests currently being processed by a node. When the load balancer gets a new request and all current connections are waiting for a response, a new connection is opened. If the requests are all the same
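
To make the distinction above concrete, here is a minimal sketch, not taken from the thread. The class names and the fixed one-second window are illustrative assumptions; a real load balancer would implement both limits differently. The first limiter caps requests in flight, so admitted throughput follows how quickly requests finish. The second caps requests started per second, whatever their duration.

```python
class ConcurrencyLimiter:
    """Hypothetical limiter that caps the number of requests in flight."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def try_acquire(self):
        # Admit a request only if a slot is free.
        if self.in_flight >= self.max_in_flight:
            return False
        self.in_flight += 1
        return True

    def release(self):
        # Call when a request finishes; this frees a slot.
        self.in_flight -= 1


class RateLimiter:
    """Hypothetical fixed-window limiter that caps requests started per second."""

    def __init__(self, max_per_second):
        self.max_per_second = max_per_second
        self.window = None
        self.count = 0

    def try_acquire(self, now):
        # `now` is a timestamp in seconds; a new whole second resets the count.
        window = int(now)
        if window != self.window:
            self.window = window
            self.count = 0
        if self.count >= self.max_per_second:
            return False
        self.count += 1
        return True
```

With the concurrency limiter, slow requests hold their slots longer, so fewer new requests are admitted without any constant being retuned. The rate limiter admits the same number per second whether requests are fast or slow.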

Re: Circuit Breakers interaction with Shards

2021-02-16 Thread David Smiley
Walter, it sounds like you were doing rate limiting, just in a different way that is more dynamic than a simple (yet fiddly) constant?

Re: Circuit Breakers interaction with Shards

2021-02-14 Thread Walter Underwood
Rate limiting is a good idea. It requires a lot of ongoing engineering to adjust the rates to the current cluster behavior. It doesn’t help with some kinds of overload. The ROI just doesn’t work out. It is too much work for not enough benefit. Rate limiting works if the collection size doesn’t

Re: Circuit Breakers interaction with Shards

2021-02-14 Thread Atri Sharma
This is a debate better suited for a different forum -- but I would disagree with your assertion that rate limiting is a bad idea. Solr allows you to specify node-level request quotas, which also follow the principle of not limiting internal requests. I find that to be pretty useful in two

Re: Circuit Breakers interaction with Shards

2021-02-14 Thread Walter Underwood
We’ve looked at and rejected rate limiters as high-maintenance and not sufficient protection. We would have run nginx on each node, sent external traffic to nginx on a different port and let internal traffic stay on the default Solr port. This has other advantages (monitoring), but the rate

Re: Circuit Breakers interaction with Shards

2021-02-14 Thread Atri Sharma
The way I look at it, cluster-level stability calls for rate limiters that limit only external requests. They are "circuit breakers" in the sense of defending against cluster-level instability, which is what you describe. Circuit breakers, in the Solr world, are

Re: Circuit Breakers interaction with Shards

2021-02-14 Thread Walter Underwood
Ideally, it would only affect a few queries. In reality, with a sharded system, the impact will be large. I disagree that the goal is to protect a node. The goal is to make the entire cluster avoid congestion failure when overloaded, while providing good service for the load that it can

Re: Circuit Breakers interaction with Shards

2021-02-14 Thread Atri Sharma
This can still lead to node outages if the fanout for a query is high. Circuit breakers follow a simple rule -- defend the node at the cost of degraded responses. Ideally, only a few requests will be completely rejected -- some will see partial results. Due to this non

Circuit Breakers interaction with Shards

2021-02-14 Thread Walter Underwood
This got zero responses on the solr-user list, so I’ll raise the issue here. Should circuit breakers only kill external search requests and not cluster-internal requests to shards? Circuit breakers can kill any request, whether it is a client request from outside the cluster or an internal
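
The behavior asked about above could be sketched as follows. This is an illustration under stated assumptions, not Solr's implementation: it assumes internal shard requests can be recognized by an `isShard=true` request parameter, and the function names and the `breaker_tripped` flag are hypothetical.

```python
def is_internal_shard_request(params):
    # Assumption: cluster-internal requests to shards carry isShard=true.
    return str(params.get("isShard", "false")).lower() == "true"


def should_reject(params, breaker_tripped):
    """Reject only external requests while the breaker is tripped.

    Internal shard requests are let through so that a query already
    admitted at the coordinating node is not killed partway through.
    """
    return breaker_tripped and not is_internal_shard_request(params)
```

For example, with the breaker tripped, `should_reject({"q": "*:*"}, True)` rejects the external request, while the same query with `"isShard": "true"` is allowed to proceed.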