Re: Re-evaluate compaction defaults in 5.1/trunk

Jordan West Sun, 08 Dec 2024 09:48:50 -0800

While we continue the discussion here on short-term defaults, do we all feel
it would be beneficial to start a new thread on what is required to get UCS
over the line as a default, so we can have both discussions going at once?


On Sun, Dec 8, 2024 at 8:44 AM Paulo Motta <pa...@apache.org> wrote:

>
> Hi Dave,
>
> I appreciate these performance/cost considerations and I believe these
> should be taken into account when evaluating default changes.
>
> I am trying to frame shipping with STCS by default as a usability issue
> with the database.
>
> I think it's possible to classify workloads into two types:
> a) Mutable (superset)
> b) Immutable (or semi-immutable)
>
> The majority of current use cases might be b) Immutable, but shipping with
> STCS provides a bad user experience to users of a) Mutable use cases. In
> turn, this reinforces the perception that "Cassandra is not good for
> mutable use cases".
>
> I believe the use cases that will be covered by CQL Transactions tend to
> be a) mutable, and it might make sense to optimize to this reality.
>
> Existing users of immutable use cases are familiar with STCS and can
> remain using this choice.
>
> Thanks,
>
> Paulo
>
> On Sun, Dec 8, 2024 at 10:33 AM Dave Herrington <he...@rhinosource.com>
> wrote:
>
>> …the analysis I describe would need to be weighted by table size.  I have
>> several representative production cluster tablestats analyses that show r:w
>> ratio by table, including table size.  I can check to see how this analysis
>> plays out on a few of these.
>>
>> -Dave
>>
>> David A. Herrington II
>> President and Chief Engineer
>> RhinoSource, Inc.
>>
>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>
>> www.rhinosource.com
>>
>>
>> On Sun, Dec 8, 2024 at 7:22 AM Dave Herrington <he...@rhinosource.com>
>> wrote:
>>
>>> Paulo,
>>>
>>> I understand your perspective.
>>>
>>> Short of waiting for UCS to prove itself out, I guess it comes down to
>>> the assertion that a strong majority of Cassandra use cases would benefit
>>> from using LCS vs. STCS.
>>>
>>> The conventional wisdom is that workloads need to be read-heavy to make
>>> the extra resource consumption of LCS pay off.  4:1 read:write is the
>>> threshold I use to decide whether or not to use LCS.
>>>
>>> I think this ratio is important in this analysis.  Has this LCS “payoff”
>>> threshold changed to 2:1 or better, in favor of LCS?  This would be good to
>>> know.
>>>
>>> With an up-to-date threshold in hand, what is the fraction of Cassandra
>>> use cases that meet this updated threshold?
>>>
>>> For example, say this LCS payoff r:w ratio has improved to 2:1.  What
>>> percentage of Cassandra tables across all clusters currently in operation
>>> are 2:1 read-to-write or more?
>>>
>>> If the answer is a solid majority, I think this would justify the
>>> default change.
>>>
>>> -Dave
>>>
>>> David A. Herrington II
>>> President and Chief Engineer
>>> RhinoSource, Inc.
>>>
>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>
>>> www.rhinosource.com
>>>
>>>
>>> On Sun, Dec 8, 2024 at 5:43 AM Paulo Motta <pa...@apache.org> wrote:
>>>
>>>> Hi Dave,
>>>>
>>>> I'm also in the field and my experience is different.
>>>>
>>>> I have seen new users shooting themselves in the foot with the default
>>>> compaction strategy, STCS, on a regular basis over the past few years. I
>>>> have been recommending that they switch to LCS, and they no longer encounter
>>>> issues after making this switch. I would like to generalize this
>>>> recommendation to prevent new users from having bad experiences and
>>>> abandoning the database.
>>>>
>>>> This is not a cost issue, it's an ease of use matter. STCS does not
>>>> work for mutable workloads and this is a massive functional limitation with
>>>> the database.
>>>>
>>>> I don't want people to download Cassandra 5.1 to try out transactions
>>>> and start facing issues due to bad STCS performance on mutable data.
>>>>
>>>> If you would like to optimize for cost, then you can read the docs or
>>>> hire a consultant to optimize the cost for you. Otherwise, the database
>>>> should work out of the box and this is provided by LCS. If LCS cannot keep
>>>> up, it means the cluster is under-provisioned and needs to be expanded;
>>>> it's not a functional issue but a capacity issue.
>>>>
>>>> Cheers,
>>>>
>>>> Paulo
>>>>
>>>> On Sun, Dec 8, 2024 at 1:26 AM Dave Herrington <he...@rhinosource.com>
>>>> wrote:
>>>>
>>>>> Chiming in from the field, I think maintaining the familiar status quo
>>>>> until a panacea compaction strategy proves itself out (could that be UCS?)
>>>>> makes sense.  I feel it could be maddening to customers if LCS
>>>>> started showing up in schemas after an upgrade just because the default
>>>>> changed.  If UCS proves itself as the fits-all solution, then we’d be
>>>>> doing them a favor by making it the default. In time.
>>>>>
>>>>> -Dave
>>>>>
>>>>> David A. Herrington II
>>>>> President and Chief Engineer
>>>>> RhinoSource, Inc.
>>>>>
>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>
>>>>> www.rhinosource.com
>>>>>
>>>>>
>>>>> On Sat, Dec 7, 2024 at 7:32 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Dec 7, 2024, at 7:08 PM, Mick Semb Wever <m...@apache.org> wrote:
>>>>>>
>>>>>> Chiming in with my two cents…
>>>>>>
>>>>>>
>>>>>> When people have the luxury of working in environments where clusters
>>>>>>> are massively over provisioned, LCS as a default makes a lot of sense,
>>>>>>> because there's not much downside.  The use cases where you'd actually
>>>>>>> fall behind in compaction are pretty slim, so the negative impact isn't felt.
>>>>>>>
>>>>>>> Most people aren't doing this.  Putting LCS as the default
>>>>>>> significantly changes the performance profile of new clusters in a way
>>>>>>> that actively harms a portion of the community.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Haddad's statement here resonates above everything else that's been
>>>>>> said so far.  It is this particular audience that I'm thinking first
>>>>>> about not screwing over; everyone else is a step in front of them wrt
>>>>>> knowing what compaction is and making an informed decision about
>>>>>> changing it.
>>>>>>
>>>>>>
>>>>>> “You have to over-provision (iops) to use LCS” isn’t that different
>>>>>> from “you have to over-provision (space) to use LCS” (by perhaps 50%).
>>>>>>
>>>>>> Both of them are sub-optimal and you’re trading off either extra
>>>>>> space or extra compute/ops.
>>>>>>
>>>>>>
>>>>>>
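
For readers following the read:write threshold question raised earlier in the
thread, here is a minimal sketch of the size-weighted analysis Dave describes.
It is not taken from any message above. The TableStats record, its field
names, and the function names are all hypothetical; in practice the read
counts, write counts, and sizes would have to be pulled from per-table
statistics by whatever means is available. The 2:1 and 4:1 thresholds are the
ones mentioned in the thread.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TableStats:
    """Hypothetical per-table summary; field names are assumptions."""
    name: str
    reads: int
    writes: int
    size_bytes: int


def meets_threshold(table: TableStats, threshold: float) -> bool:
    """True if the table's read:write ratio is at least `threshold`.

    Written as a multiplication so a table with zero writes does not
    cause a division by zero. Tables with no reads never qualify.
    """
    return table.reads > 0 and table.reads >= threshold * table.writes


def fraction_meeting(tables: Iterable[TableStats], threshold: float,
                     weighted: bool = True) -> float:
    """Fraction of tables at or above the r:w threshold.

    With weighted=True each table counts in proportion to its size;
    otherwise each table counts once.
    """
    tables = list(tables)
    weights = [t.size_bytes if weighted else 1 for t in tables]
    total = sum(weights)
    if total == 0:
        return 0.0
    hit = sum(w for t, w in zip(tables, weights)
              if meets_threshold(t, threshold))
    return hit / total


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = [
        TableStats("small_read_heavy", reads=400, writes=100, size_bytes=100),
        TableStats("large_write_heavy", reads=100, writes=100, size_bytes=300),
    ]
    for thr in (2.0, 4.0):
        print(thr,
              fraction_meeting(sample, thr, weighted=False),
              fraction_meeting(sample, thr, weighted=True))
```

The sample data shows why the weighting matters: counting tables and weighting
by size can give quite different answers to "what fraction meets the
threshold" when the read-heavy tables are the small ones.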
