Re: Re-evaluate compaction defaults in 5.1/trunk

Tolbert, Andy Fri, 06 Dec 2024 22:48:00 -0800

> @Andy - you can set the default compaction strategy in C* yaml now.

Oh, this is very cool and I'm happy to see it!  Looks like that landed as
part of the UCS contribution itself (CASSANDRA-18397 Unified Compaction
Strategy <https://issues.apache.org/jira/browse/CASSANDRA-18397>), great
idea.
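
For illustration, a minimal sketch of what an uncommented entry could look
like, assuming the same default_compaction keys as the commented example
quoted below and the UCS "T4" scaling_parameters value suggested further
down the thread. The values are illustrative only, not a recommendation:

```yaml
# Hypothetical cassandra.yaml fragment: applies to tables that do not
# specify their own compaction settings.
default_compaction:
  class_name: UnifiedCompactionStrategy
  parameters:
    # T4 is the setting described in this thread as roughly STCS-like.
    scaling_parameters: T4
```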


> For a very common example, a lot of clusters are now using the k8ssandra
operator in AWS, which needs EBS.  It's incredibly easy to fall behind on
compaction there.

Yeah, this reinforces my feeling that with LCS you feel it early and get
to reason about it right away.  Cluster can't keep up with writes: do I
need more capacity?  Do I need to tune something?  Not to say the situation
of having thousands of pending compactions and SSTables in L0 is a good
user experience....

With STCS, you may not have to reason about this until later, when you
might have nodes get out of sync, and now repairs are going to really
punish you; that could be worse when using STCS if you have large
SSTables.  Or your read performance is really bad because your data is
spread out among a ton of SSTables, or you are touching a ton of
tombstones that aren't going away for some reason.  It is not trivial
for someone to debug or understand why their reads got slow or are
failing suddenly.  There are just a lot of things that I feel can work
better operationally and more predictably with LCS because SSTables have a
capped size and partitions don't overlap many SSTables (reads, streaming,
repairs, anticompaction, etc.).

In any case,  I agree that changing the default to LCS would not be best
right now, even if I feel it is generally better.   Hopefully in the future
UCS is viewed as production ready and can be considered as a possible
default, especially if it can address the issues users encounter with STCS
and LCS in a better way.

Thanks,
Andy

On Fri, Dec 6, 2024 at 11:23 PM Jon Haddad <[email protected]> wrote:

> For a very common example, a lot of clusters are now using the k8ssandra
> operator in AWS, which needs EBS.  It's incredibly easy to fall behind on
> compaction there.  It's why I'm so interested in seeing CASSANDRA-15452 get
> merged in.  I've dealt with quite a few of these clusters, in fact I just
> worked on one this week.  They're now happily running UCS on 5.0.
>
> Like it or not, LCS is a poor fit for a non-trivial number of teams.  Not
> saying STCS doesn't have some poor use cases, but read amplification from
> reading lots of SSTables is generally better for the end user than being
> thousands of compactions behind.  I'm trying to do the least amount of harm
> to the fewest number of teams.
>
> @Andy - you can set the default compaction strategy in C* yaml now.
>
> # default_compaction:
> #   class_name: SizeTieredCompactionStrategy
> #   parameters:
> #     min_threshold: 4
> #     max_threshold: 32
>
> Jon
>
>
> On Fri, Dec 6, 2024 at 8:58 PM Dinesh Joshi <[email protected]> wrote:
>
>> I’m genuinely curious to understand how is defaulting to LCS going to
>> cause a nightmare? I am not sure what the concern is over here.
>>
>> On Fri, Dec 6, 2024 at 8:53 PM Jon Haddad <[email protected]>
>> wrote:
>>
>>> You're ignoring the other side here.  For the folks who *can't* use LCS,
>>> defaulting to it is a nightmare.
>>>
>>> Sorry, but you can't screw over 20% of the community to make life a
>>> little better for the 80%.  This is a terrible tradeoff.
>>>
>>>
>>> Jon
>>>
>>> On Fri, Dec 6, 2024 at 8:36 PM Dinesh Joshi <[email protected]> wrote:
>>>
>>>> I would argue that the vast majority of real-world workloads are read
>>>> heavy. LCS would therefore be a net benefit for the average user.
>>>>
>>>> To mitigate the write amplification concern I would make this change
>>>> and make sure it is well documented for operators so they’re not
>>>> caught off guard.
>>>>
>>>> On Fri, Dec 6, 2024 at 8:06 PM Jeff Jirsa <[email protected]> wrote:
>>>>
>>>>> And it works for that most of the time, so what’s the concern? “You
>>>>> lose throughput because iops / write amplification go up, so the perf of
>>>>> the default install goes down” ? (But the cost per byte goes way down,
>>>>> too)?
>>>>>
>>>>>
>>>>>
>>>>> On Dec 6, 2024, at 8:01 PM, Brad <[email protected]> wrote:
>>>>>
>>>>> > Could you elaborate what you mean by 'disk storage management'?
>>>>>
>>>>> I often see clusters use LCS as an easy fix to avoid the 50% disk free
>>>>> recommendation of STCS without considering the write
>>>>> magnification implications.
>>>>>
>>>>> On Fri, Dec 6, 2024 at 10:46 PM Dinesh Joshi <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Could you elaborate what you mean by 'disk storage management'?
>>>>>>
>>>>>> On Fri, Dec 6, 2024 at 7:30 PM Brad <[email protected]> wrote:
>>>>>>
>>>>>>> I'm -1 on LCS being the default, seen far too many people use it for
>>>>>>> disk storage management
>>>>>>>
>>>>>>> On Fri, Dec 6, 2024 at 10:08 PM Jon Haddad <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm -1 on LCS being the default, since using it in the wrong
>>>>>>>> situations renders clusters inoperable.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Dec 6, 2024 at 7:03 PM Paulo Motta <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> > I'd prefer to see the default go from STCS to UCS
>>>>>>>>>
>>>>>>>>> I’m proposing this for latest unstable (cassandra_latest.yaml)
>>>>>>>>> since it’s a more recent strategy still being adopted. For latest
>>>>>>>>> stable (cassandra.yaml) I’d prefer LCS since it does not need
>>>>>>>>> tuning to support mutable workloads (UPDATE/DELETE) and is
>>>>>>>>> battle-tested.
>>>>>>>>>
>>>>>>>>> On Fri, 6 Dec 2024 at 21:37 Jon Haddad <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I'd prefer to see the default go from STCS to UCS, probably with
>>>>>>>>>> scaling_parameters T4.  That's essentially the same as STCS but
>>>>>>>>>> without the ridiculous SSTable growth, allowing us to leverage the
>>>>>>>>>> fast streaming path more often.  I don't think there are any valid
>>>>>>>>>> use cases for STCS anymore now that we have UCS.
>>>>>>>>>>
>>>>>>>>>> That said, many have taken issue with the state of UCS docs,
>>>>>>>>>> myself included, so that would need to be addressed with any
>>>>>>>>>> default change.
>>>>>>>>>>
>>>>>>>>>> I don't think we should mark TWCS as experimental.  Maybe we
>>>>>>>>>> prevent repairs to tables using TWCS, or do a better job of
>>>>>>>>>> encouraging folks to use incremental repair at higher frequencies.
>>>>>>>>>> It's definitely not experimental though.
>>>>>>>>>>
>>>>>>>>>> Side note: I think experimental has been over-used and has lost
>>>>>>>>>> all meaning.  How is Java 17 experimental?  Very confusing for the
>>>>>>>>>> community.
>>>>>>>>>>
>>>>>>>>>> I think TWCS should use UCS under the hood, which would address
>>>>>>>>>> streaming performance (and thus node density), or UCS could be
>>>>>>>>>> updated to allow for time window options.  Either would solve
>>>>>>>>>> issue #3 in your list.
>>>>>>>>>>
>>>>>>>>>> Jon
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Dec 6, 2024 at 5:36 PM Paulo Motta <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> It’s 2024 and users are still facing issues due to misconfigured
>>>>>>>>>>> compaction when using default configuration.
>>>>>>>>>>>
>>>>>>>>>>> I would like to start a conversation around improving compaction
>>>>>>>>>>> defaults in 5.1/trunk, so users trying out CQL transactions don’t
>>>>>>>>>>> need to worry about tuning compaction.
>>>>>>>>>>>
>>>>>>>>>>> A few suggestions:
>>>>>>>>>>>
>>>>>>>>>>> 1) Make LeveledCompactionStrategy default on cassandra.yaml, UCS
>>>>>>>>>>> default on cassandra_latest.yaml ?
>>>>>>>>>>>
>>>>>>>>>>> 2) Does TWCS work out of the box with repairs and hints? My
>>>>>>>>>>> understanding is that due to CASSANDRA-10496 this causes droppable
>>>>>>>>>>> tombstone issues when in combination with repair and hints (see
>>>>>>>>>>> more on this thread [1]). We should either fix this or mark TWCS
>>>>>>>>>>> experimental.
>>>>>>>>>>>
>>>>>>>>>>> 3) When STCS is used with deletions/TTL, tombstones accumulate
>>>>>>>>>>> in higher-level SSTables when unchecked_tombstone_compaction is
>>>>>>>>>>> disabled (see CASSANDRA-6563). I propose adding a new setting
>>>>>>>>>>> “auto”, enabled by default, that will set this to true when
>>>>>>>>>>> STCS/TWCS is used.
>>>>>>>>>>>
>>>>>>>>>>> I believe addressing these points will improve user experience
>>>>>>>>>>> with Cassandra.
>>>>>>>>>>>
>>>>>>>>>>> I apologize in advance if these topics were discussed in recent
>>>>>>>>>>> threads. I would be happy to get pointers to related discussions
>>>>>>>>>>> on this topic.
>>>>>>>>>>>
>>>>>>>>>>> I will be happy to create JIRA if there’s agreement on
>>>>>>>>>>> addressing these items.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Paulo
>>>>>>>>>>>
>>>>>>>>>>> [1] -
>>>>>>>>>>>
>>>>>>>>>>> https://user.cassandra.apache.narkive.com/VQOacfnT/twcs-repair-create-new-buckets-with-old-data
>>>>>>>>>>>
>>>>>>>>>>
>>>>>
