I think this would be beneficial, since we have a loose agreement that UCS is a promising option as the new default.
I plan to summarize the views expressed in this thread and propose a plan to make compaction usability smoother for new users of Cassandra, if there are any short-term actions we can agree on to address outstanding issues.

On Sun, 8 Dec 2024 at 12:48 Jordan West <jw...@apache.org> wrote:

> While we continue the discussion here on short term defaults do we all
> feel it would be beneficial to start a new thread on what is required to
> get UCS over the line as a default? So we can have both discussions going
> at once?
>
> On Sun, Dec 8, 2024 at 8:44 AM Paulo Motta <pa...@apache.org> wrote:
>
>>
>> Hi Dave,
>>
>> I appreciate these performance/cost considerations and I believe these
>> should be taken into account when evaluating default changes.
>>
>> I am trying to frame this as a usability issue with the database by
>> shipping with STCS by default.
>>
>> I think it's possible to classify workloads into two types:
>> a) Mutable (superset)
>> b) Immutable (or semi-immutable)
>>
>> The majority of current use cases might be b) Immutable, but shipping
>> with STCS provides a bad user experience to users of a) Mutable use cases.
>> In turn, this reinforces that "cassandra is not good for mutable use cases".
>>
>> I believe the use cases that will be covered by CQL Transactions tend to
>> be a) mutable, and it might make sense to optimize to this reality.
>>
>> Existing users of immutable use cases are familiar with STCS and can
>> remain using this choice.
>>
>> Thanks,
>>
>> Paulo
>>
>> On Sun, Dec 8, 2024 at 10:33 AM Dave Herrington <he...@rhinosource.com>
>> wrote:
>>
>>> …the analysis I describe would need to be weighted by table size. I
>>> have several representative production cluster tablestats analyses that
>>> show r:w ratio by table, including table size. I can check to see how this
>>> analysis plays out on a few of these.
>>>
>>> -Dave
>>>
>>> David A. Herrington II
>>> President and Chief Engineer
>>> RhinoSource, Inc.
>>>
>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>
>>> www.rhinosource.com
>>>
>>>
>>> On Sun, Dec 8, 2024 at 7:22 AM Dave Herrington <he...@rhinosource.com>
>>> wrote:
>>>
>>>> Paulo,
>>>>
>>>> I understand your perspective.
>>>>
>>>> Short of waiting for UCS to prove itself out, I guess it comes down to
>>>> the assertion that a strong majority of Cassandra use cases would benefit
>>>> from using LCS vs. STCS.
>>>>
>>>> The conventional wisdom is that workloads need to be read-heavy to make
>>>> the extra resource consumption of LCS pay off. 4:1 read:write is the
>>>> threshold I use to decide whether or not to use LCS.
>>>>
>>>> I think this ratio is important in this analysis. Has this LCS
>>>> “payoff” threshold changed to 2:1 or better, in favor of LCS? This would
>>>> be good to know.
>>>>
>>>> With an up-to-date threshold in hand, what is the fraction of Cassandra
>>>> use cases that meet this updated threshold?
>>>>
>>>> For example, say this LCS payoff r:w ratio has improved to 2:1. What
>>>> percentage of Cassandra tables across all clusters currently in operation
>>>> are 2:1 read-to-write or more?
>>>>
>>>> If the answer is a solid majority, I think this would justify the
>>>> default change.
>>>>
>>>> -Dave
>>>>
>>>> David A. Herrington II
>>>> President and Chief Engineer
>>>> RhinoSource, Inc.
>>>>
>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>
>>>> www.rhinosource.com
>>>>
>>>>
>>>> On Sun, Dec 8, 2024 at 5:43 AM Paulo Motta <pa...@apache.org> wrote:
>>>>
>>>>> Hi Dave,
>>>>>
>>>>> I'm also in the field and my experience is different.
>>>>>
>>>>> I have seen new users shooting themselves in the foot with the default
>>>>> compaction strategy STCS on a regular basis over the past few years and
>>>>> have been recommending them to switch to LCS and they no longer encounter
>>>>> issues after making this switch. I would like to generalize this
>>>>> recommendation to prevent new users from having bad experiences and
>>>>> abandoning the database.
>>>>>
>>>>> This is not a cost issue, it's an ease of use matter. STCS does not
>>>>> work for mutable workloads and this is a massive functional limitation
>>>>> with the database.
>>>>>
>>>>> I don't want people to download Cassandra 5.1 to try out transactions
>>>>> and start facing issues due to bad STCS performance on mutable data.
>>>>>
>>>>> If you would like to optimize for cost, then you can read the docs or
>>>>> hire a consultant to optimize the cost for you. Otherwise, the database
>>>>> should work out of the box and this is provided by LCS. If LCS can not
>>>>> keep up, it means the cluster is under provisioned and needs to be
>>>>> expanded, it's not a functional issue but a capacity issue.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Paulo
>>>>>
>>>>> On Sun, Dec 8, 2024 at 1:26 AM Dave Herrington <he...@rhinosource.com>
>>>>> wrote:
>>>>>
>>>>>> Chiming in from the field, I think maintaining the familiar status
>>>>>> quo until a panacea compaction strategy proves itself out (could that be
>>>>>> UCS?) makes sense to me. I feel it could be maddening to customers if
>>>>>> LCS started showing up in schemas after an upgrade just because the
>>>>>> default changed. If UCS proves itself as the fits-all solution, then
>>>>>> we’d be doing them a favor by making the default. In time.
>>>>>>
>>>>>> -Dave
>>>>>>
>>>>>> David A. Herrington II
>>>>>> President and Chief Engineer
>>>>>> RhinoSource, Inc.
>>>>>>
>>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>>
>>>>>> www.rhinosource.com
>>>>>>
>>>>>>
>>>>>> On Sat, Dec 7, 2024 at 7:32 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Dec 7, 2024, at 7:08 PM, Mick Semb Wever <m...@apache.org> wrote:
>>>>>>>
>>>>>>> Chiming in with my two cents…
>>>>>>>
>>>>>>>
>>>>>>>> When people have the luxury of working in environments where
>>>>>>>> clusters are massively over provisioned, LCS as a default makes a lot
>>>>>>>> of sense, because there's not much downside. The use cases where you'd
>>>>>>>> actually fall behind in compaction are pretty slim, so the negative
>>>>>>>> impact isn't felt.
>>>>>>>>
>>>>>>>> Most people aren't doing this. Putting LCS as the default
>>>>>>>> significantly changes the performance profile of new clusters in a way
>>>>>>>> that actively harms a portion of the community.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Haddad's statement here resonates above everything else that's been
>>>>>>> said so far. It is this particular audience that I'm thinking first
>>>>>>> about not screwing over, everyone else is a step in front of them wrt
>>>>>>> knowing what compaction is and making an informed decision into
>>>>>>> changing it.
>>>>>>>
>>>>>>>
>>>>>>> “You have to over-provision (iops) to use LCS” isn’t that different
>>>>>>> from “you have to over-provision (space) to use LCS” (by perhaps 50%).
>>>>>>>
>>>>>>> Both of them are sub-optimal and you’re trading off either extra
>>>>>>> space or extra compute/ops.
>>>>>>>