The work has begun but we don't have a VOTE thread for this CEP. Can one get started?
On Mon, May 6, 2024 at 9:24 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:

> Sure, Caleb. I will include the work as part of CASSANDRA-19534 <https://issues.apache.org/jira/browse/CASSANDRA-19534> in CEP-41.
>
> Jaydeep
>
> On Fri, May 3, 2024 at 7:48 AM Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>
>> FYI, there is some ongoing, sort-of-related work going on in CASSANDRA-19534 <https://issues.apache.org/jira/browse/CASSANDRA-19534>.
>>
>> On Wed, Apr 10, 2024 at 6:35 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>>
>>> Just created an official CEP-41 <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-41+%28DRAFT%29+Apache+Cassandra+Unified+Rate+Limiter> incorporating the feedback from this discussion. Feel free to let me know if I have missed any important feedback from this thread that is not captured in CEP-41.
>>>
>>> Jaydeep
>>>
>>> On Thu, Feb 22, 2024 at 11:36 AM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>>>
>>>> Thanks, Josh. I will file an official CEP with all the details in a few days and update this thread with that CEP number. Thanks a lot, everyone, for providing valuable insights!
>>>>
>>>> Jaydeep
>>>>
>>>> On Thu, Feb 22, 2024 at 9:24 AM Josh McKenzie <jmcken...@apache.org> wrote:
>>>>
>>>>> Do folks think we should file an official CEP and take it there?
>>>>>
>>>>> +1 here.
>>>>>
>>>>> Synthesizing your gdoc, Caleb's work, and the feedback from this thread into a draft seems like a solid next step.
>>>>>
>>>>> On Wed, Feb 7, 2024, at 12:31 PM, Jaydeep Chovatia wrote:
>>>>>
>>>>> I see a lot of great ideas discussed or proposed in the past to cover the most common rate limiter candidate use cases. Do folks think we should file an official CEP and take it there?
>>>>> Jaydeep
>>>>>
>>>>> On Fri, Feb 2, 2024 at 8:30 AM Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>>>>
>>>>> I just remembered the other day that I had done a quick writeup on the state of compaction stress-related throttling in the project:
>>>>>
>>>>> https://docs.google.com/document/d/1dfTEcKVidRKC1EWu3SO1kE1iVLMdaJ9uY1WMpS3P_hs/edit?usp=sharing
>>>>>
>>>>> I'm sure most of it is old news to the people on this thread, but I figured I'd post it just in case :)
>>>>>
>>>>> On Tue, Jan 30, 2024 at 11:58 AM Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>
>>>>> 2.) We should make sure the links between the "known" root causes of cascading failures and the mechanisms we introduce to avoid them remain very strong.
>>>>>
>>>>> Seems to me that our historical strategy was to address individual known cases one by one rather than looking for a more holistic load-balancing and load-shedding solution. While the engineer in me likes the elegance of a broad, more-inclusive *actual SEDA-like* approach, the pragmatist in me wonders how far we think we are today from a stable set point.
>>>>>
>>>>> I.e., are we facing a handful of cases where nodes can still get pushed over and then cascade that we can surgically address, or are we facing a broader lack of back-pressure that rears its head in different domains (client -> coordinator, coordinator -> replica, internode with other operations, etc.) at surprising times and should be considered more holistically?
>>>>>
>>>>> On Tue, Jan 30, 2024, at 12:31 AM, Caleb Rackliffe wrote:
>>>>>
>>>>> I almost forgot CASSANDRA-15817, which introduced reject_repair_compaction_threshold, which provides a mechanism to stop repairs while compaction is underwater.
>>>>> On Jan 26, 2024, at 6:22 PM, Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>>>>
>>>>> Hey all,
>>>>>
>>>>> I'm a bit late to the discussion. I see that we've already discussed CASSANDRA-15013 <https://issues.apache.org/jira/browse/CASSANDRA-15013> and CASSANDRA-16663 <https://issues.apache.org/jira/browse/CASSANDRA-16663>, at least in passing. Having written the latter, I'd be the first to admit it's a crude tool, although it's been useful here and there, and it provides a couple of primitives that may be useful for future work. As Scott mentions, while it is configurable at runtime, it is not adaptive, although we did make configuration easier in CASSANDRA-17423 <https://issues.apache.org/jira/browse/CASSANDRA-17423>. It is also global to the node, although we've lightly discussed some ideas around making it more granular. (For example, keyspace-based limiting, or limiting "domains" tagged by the client in requests, could be interesting.) It also does not deal with inter-node traffic, of course.
>>>>>
>>>>> Something we've not yet mentioned (that does address internode traffic) is CASSANDRA-17324 <https://issues.apache.org/jira/browse/CASSANDRA-17324>, which I proposed shortly after working on the native request limiter (and have just not had much time to return to). The basic idea is this:
>>>>>
>>>>> When a node is struggling under the weight of a compaction backlog and becomes a cause of increased read latency for clients, we have two safety valves:
>>>>>
>>>>> 1.) Disabling the native protocol server, which stops the node from coordinating reads and writes.
>>>>> 2.) Jacking up the severity on the node, which tells the dynamic snitch to avoid the node for reads from other coordinators.
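[Editor's note: for readers unfamiliar with what a "crude", node-global, non-adaptive request limiter looks like in practice, here is a minimal sketch. It is illustrative only, not the actual CASSANDRA-16663 implementation; the class and method names are invented for this example.]

```java
import java.util.concurrent.Semaphore;

// Sketch of a node-global, non-adaptive in-flight request limiter.
// Hypothetical illustration; not Cassandra's native request limiter code.
public class GlobalRequestLimiter {
    // A fixed permit count stands in for a runtime-configurable limit.
    private final Semaphore permits;

    public GlobalRequestLimiter(int maxInFlightRequests) {
        this.permits = new Semaphore(maxInFlightRequests);
    }

    // Try to admit a request; callers shed load when this returns false.
    public boolean tryAdmit() {
        return permits.tryAcquire();
    }

    // Release the permit once the request completes.
    public void release() {
        permits.release();
    }

    public static void main(String[] args) {
        GlobalRequestLimiter limiter = new GlobalRequestLimiter(2);
        System.out.println(limiter.tryAdmit());  // true
        System.out.println(limiter.tryAdmit());  // true
        System.out.println(limiter.tryAdmit());  // false: limit reached, request shed
        limiter.release();
        System.out.println(limiter.tryAdmit());  // true again after a slot frees up
    }
}
```

Note how the sketch makes the limitations from the email concrete: the limit is global (one semaphore for the whole node, no per-keyspace or per-"domain" granularity) and non-adaptive (nothing adjusts the permit count in response to load).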
>>>>> These are useful, but we don't appear to have any mechanism that would allow us to temporarily reject internode hint, batch, and mutation messages that could further delay resolution of the compaction backlog.
>>>>>
>>>>> Whether it's done as part of a larger framework or on its own, it still feels like a good idea.
>>>>>
>>>>> Thinking in terms of opportunity costs here (i.e. where we spend our finite engineering time to holistically improve the experience of operating this database) is healthy, but we probably haven't reached the point of diminishing returns on nodes being able to protect themselves from clients and from other nodes. I would just keep in mind two things:
>>>>>
>>>>> 1.) The effectiveness of rate-limiting in the system (which includes the database and all clients) as a whole necessarily decreases as we move from the application to the lowest-level database internals. Limiting correctly at the client will save more resources than limiting at the native protocol server, and limiting correctly at the native protocol server will save more resources than limiting after we've dispatched requests to some thread pool for processing.
>>>>>
>>>>> 2.) We should make sure the links between the "known" root causes of cascading failures and the mechanisms we introduce to avoid them remain very strong.
>>>>>
>>>>> In any case, I'd be happy to help out in any way I can as this moves forward (especially as it relates to our past/current attempts to address this problem space).
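[Editor's note: the CASSANDRA-17324 idea quoted above — temporarily rejecting internode hint, batch, and mutation messages while compaction is underwater — can be sketched as a simple backlog-aware gate. This is a hypothetical illustration of the concept under assumed names (the class, the threshold, and the metric supplier are all invented here), not the proposed patch.]

```java
import java.util.function.IntSupplier;

// Hypothetical sketch of a compaction-backlog gate for write-path internode
// messages. Names and threshold semantics are illustrative, not Cassandra code.
public class CompactionBacklogGate {
    private final IntSupplier pendingCompactionTasks; // would be fed by compaction metrics
    private final int rejectThreshold;                // would be runtime-configurable

    public CompactionBacklogGate(IntSupplier pendingCompactionTasks, int rejectThreshold) {
        this.pendingCompactionTasks = pendingCompactionTasks;
        this.rejectThreshold = rejectThreshold;
    }

    // Returns true if internode hint/batch/mutation messages should be
    // accepted right now; false means temporarily shed them so the node
    // can work down its compaction backlog.
    public boolean shouldAcceptWriteMessage() {
        return pendingCompactionTasks.getAsInt() < rejectThreshold;
    }

    public static void main(String[] args) {
        // A node with a deep compaction backlog sheds write-path messages...
        CompactionBacklogGate underwater = new CompactionBacklogGate(() -> 500, 100);
        System.out.println(underwater.shouldAcceptWriteMessage()); // false

        // ...while a healthy node keeps accepting them.
        CompactionBacklogGate healthy = new CompactionBacklogGate(() -> 3, 100);
        System.out.println(healthy.shouldAcceptWriteMessage()); // true
    }
}
```

The gate mirrors the reject_repair_compaction_threshold pattern mentioned earlier in the thread (CASSANDRA-15817), applied to internode write messages instead of repairs; rejected messages would presumably be retried or hinted by the sending coordinator.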