I just want to flag here that this is a topic I have strong opinions on, but the CEP is not really specific or detailed enough to understand precisely how it will be implemented. So, if a patch is already being produced, most of my feedback will likely come once that patch appears, through the normal review process. I want to flag this now to avoid any surprise.
I will say upfront that, ideally, this system should be designed to have ~zero overhead when disabled, and with minimal coupling (between its own components and C* itself), so that entirely orthogonal approaches can be integrated in future without polluting the codebase.

> On 19 Sep 2024, at 19:14, Patrick McFadin <pmcfa...@gmail.com> wrote:
>
> The work has begun but we don't have a VOTE thread for this CEP. Can one get started?
>
> On Mon, May 6, 2024 at 9:24 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>> Sure, Caleb. I will include the work as part of CASSANDRA-19534 <https://issues.apache.org/jira/browse/CASSANDRA-19534> in CEP-41.
>>
>> Jaydeep
>>
>> On Fri, May 3, 2024 at 7:48 AM Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>> FYI, there is some ongoing sort-of-related work going on in CASSANDRA-19534 <https://issues.apache.org/jira/browse/CASSANDRA-19534>
>>>
>>> On Wed, Apr 10, 2024 at 6:35 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>>>> Just created an official CEP-41 <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-41+%28DRAFT%29+Apache+Cassandra+Unified+Rate+Limiter> incorporating the feedback from this discussion. Feel free to let me know if I may have missed some important feedback in this thread that is not captured in CEP-41.
>>>>
>>>> Jaydeep
>>>>
>>>> On Thu, Feb 22, 2024 at 11:36 AM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>>>>> Thanks, Josh. I will file an official CEP with all the details in a few days and update this thread with that CEP number.
>>>>> Thanks a lot, everyone, for providing valuable insights!
>>>>>
>>>>> Jaydeep
>>>>>
>>>>> On Thu, Feb 22, 2024 at 9:24 AM Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>> Do folks think we should file an official CEP and take it there?
>>>>>> +1 here.
>>>>>>
>>>>>> Synthesizing your gdoc, Caleb's work, and the feedback from this thread into a draft seems like a solid next step.
>>>>>>
>>>>>> On Wed, Feb 7, 2024, at 12:31 PM, Jaydeep Chovatia wrote:
>>>>>>> I see a lot of great ideas discussed or proposed in the past to cover the most common rate-limiter use cases. Do folks think we should file an official CEP and take it there?
>>>>>>>
>>>>>>> Jaydeep
>>>>>>>
>>>>>>> On Fri, Feb 2, 2024 at 8:30 AM Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>>>>>> I just remembered the other day that I had done a quick writeup on the state of compaction stress-related throttling in the project:
>>>>>>>
>>>>>>> https://docs.google.com/document/d/1dfTEcKVidRKC1EWu3SO1kE1iVLMdaJ9uY1WMpS3P_hs/edit?usp=sharing
>>>>>>>
>>>>>>> I'm sure most of it is old news to the people on this thread, but I figured I'd post it just in case :)
>>>>>>>
>>>>>>> On Tue, Jan 30, 2024 at 11:58 AM Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>>
>>>>>>>> 2.) We should make sure the links between the "known" root causes of cascading failures and the mechanisms we introduce to avoid them remain very strong.
>>>>>>> It seems to me that our historical strategy has been to address individual known cases one by one rather than looking for a more holistic load-balancing and load-shedding solution. While the engineer in me likes the elegance of a broad, more inclusive, SEDA-like approach, the pragmatist in me wonders how far we think we are today from a stable set-point.
>>>>>>>
>>>>>>> i.e., are we facing a handful of cases where nodes can still get pushed over and then cascade, which we can surgically address, or are we facing a broader lack of back-pressure that rears its head in different domains (client -> coordinator, coordinator -> replica, internode with other operations, etc.) at surprising times and should be considered more holistically?
>>>>>>>
>>>>>>> On Tue, Jan 30, 2024, at 12:31 AM, Caleb Rackliffe wrote:
>>>>>>>> I almost forgot CASSANDRA-15817, which introduced reject_repair_compaction_threshold, a mechanism to stop repairs while compaction is underwater.
>>>>>>>>
>>>>>>>>> On Jan 26, 2024, at 6:22 PM, Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hey all,
>>>>>>>>>
>>>>>>>>> I'm a bit late to the discussion. I see that we've already discussed CASSANDRA-15013 <https://issues.apache.org/jira/browse/CASSANDRA-15013> and CASSANDRA-16663 <https://issues.apache.org/jira/browse/CASSANDRA-16663>, at least in passing. Having written the latter, I'd be the first to admit it's a crude tool, although it's been useful here and there, and it provides a couple of primitives that may be useful for future work. As Scott mentions, while it is configurable at runtime, it is not adaptive, although we did make configuration easier in CASSANDRA-17423 <https://issues.apache.org/jira/browse/CASSANDRA-17423>. It is also global to the node, although we've lightly discussed some ideas around making it more granular. (For example, keyspace-based limiting, or limiting "domains" tagged by the client in requests, could be interesting.) It also does not deal with inter-node traffic, of course.
>>>>>>>>>
>>>>>>>>> Something we've not yet mentioned (that does address internode traffic) is CASSANDRA-17324 <https://issues.apache.org/jira/browse/CASSANDRA-17324>, which I proposed shortly after working on the native request limiter (and have just not had much time to return to). The basic idea is this:
>>>>>>>>>
>>>>>>>>> When a node is struggling under the weight of a compaction backlog and becomes a cause of increased read latency for clients, we have two safety valves:
>>>>>>>>>
>>>>>>>>> 1.) Disabling the native protocol server, which stops the node from coordinating reads and writes.
>>>>>>>>> 2.) Jacking up the severity on the node, which tells the dynamic snitch to avoid the node for reads from other coordinators.
>>>>>>>>>
>>>>>>>>> These are useful, but we don't appear to have any mechanism that would allow us to temporarily reject internode hint, batch, and mutation messages that could further delay resolution of the compaction backlog.
>>>>>>>>>
>>>>>>>>> Whether it's done as part of a larger framework or on its own, it still feels like a good idea.
>>>>>>>>>
>>>>>>>>> Thinking in terms of opportunity costs here (i.e., where we spend our finite engineering time to holistically improve the experience of operating this database) is healthy, but we probably haven't reached the point of diminishing returns on nodes being able to protect themselves from clients and from other nodes. I would just keep two things in mind:
>>>>>>>>>
>>>>>>>>> 1.) The effectiveness of rate limiting in the system (which includes the database and all clients) as a whole necessarily decreases as we move from the application to the lowest-level database internals. Limiting correctly at the client will save more resources than limiting at the native protocol server, and limiting correctly at the native protocol server will save more resources than limiting after we've dispatched requests to some thread pool for processing.
>>>>>>>>> 2.) We should make sure the links between the "known" root causes of cascading failures and the mechanisms we introduce to avoid them remain very strong.
>>>>>>>>>
>>>>>>>>> In any case, I'd be happy to help out in any way I can as this moves forward (especially as it relates to our past/current attempts to address this problem space).
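[Editor's note] To make the "~zero overhead when disabled" principle from the top of this thread concrete, here is a minimal, purely hypothetical Java sketch (not the CEP-41 design, and none of these class or method names come from the Cassandra codebase): the limiter is pluggable behind a tiny interface, the disabled state is a stateless no-op instance, and the hot path pays only a volatile read plus a trivial virtual call when the feature is off.

```java
// Hypothetical sketch of a pluggable request limiter with a near-free disabled path.
public class RequestLimiterSketch {
    /** Pluggable limiter; a no-op instance doubles as the "disabled" state. */
    interface Limiter {
        boolean tryAcquire();
    }

    /** Disabled state: no locks, no counters, no allocation on the hot path. */
    static final Limiter NO_OP = () -> true;

    /** A deliberately naive token bucket, standing in for a real adaptive limiter. */
    static final class TokenBucket implements Limiter {
        private final long capacity;
        private long tokens;

        TokenBucket(long capacity) {
            this.capacity = capacity;
            this.tokens = capacity;
        }

        @Override
        public synchronized boolean tryAcquire() {
            if (tokens > 0) { tokens--; return true; }
            return false; // budget exhausted: shed the request
        }

        synchronized void refill() { tokens = capacity; }
    }

    // Swapping this reference enables/disables limiting without touching callers,
    // keeping the coupling between the limiter and the request path minimal.
    private static volatile Limiter limiter = NO_OP;

    static void setLimiter(Limiter l) { limiter = l; }

    /** Hot path: one volatile read, then delegate. */
    static boolean admit() { return limiter.tryAcquire(); }

    public static void main(String[] args) {
        // Disabled: everything is admitted.
        int admittedWhileDisabled = 0;
        for (int i = 0; i < 10; i++) if (admit()) admittedWhileDisabled++;

        // Enabled with a 3-token budget: requests beyond it are shed.
        setLimiter(new TokenBucket(3));
        int admittedWhileEnabled = 0;
        for (int i = 0; i < 10; i++) if (admit()) admittedWhileEnabled++;

        System.out.println("disabled=" + admittedWhileDisabled
                           + " enabled=" + admittedWhileEnabled);
        // prints "disabled=10 enabled=3"
    }
}
```

An orthogonal approach (say, an adaptive or keyspace-scoped limiter) would just be another Limiter implementation swapped in behind the same reference, which is one way to read the "minimal coupling" requirement above.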