Not sure if these are what Jeff was referring to, but as a workaround you
can configure the following STCS compaction subproperties (example ALTER
TABLE below):
- min_threshold - set to 2 so that a minimum of only 2 similar-sized
sstables is required to trigger a minor compaction, instead of the default 4
- tombstone_threshold - set to 0.1 so that if at least 10% of an sstable
is tombstones, Cassandra will compact that sstable on its own instead of
waiting for the higher default ratio of 0.2
- unchecked_tombstone_compaction - set to true to allow Cassandra to run
tombstone compactions without first checking whether an sstable is eligible
for compaction
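
A minimal sketch of the corresponding statement (assuming cqlsh; "ks.tbl"
is a placeholder for your own keyspace and table):

    ALTER TABLE ks.tbl WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'min_threshold': '2',                     -- default 4
        'tombstone_threshold': '0.1',             -- default 0.2
        'unchecked_tombstone_compaction': 'true'  -- default false
    };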

WARNING - For future reference, this is just a workaround, not a fix for
clusters with bad data models. Consider it as buying your cluster some
breathing room while you redesign your data model. Cheers!

On Thu, Aug 10, 2017 at 12:27 AM, Jeff Jirsa <jji...@gmail.com> wrote:

> The deleting compaction strategy from ProtectWise (https://github.com/protectwise/cassandra-util/blob/master/deleting-compaction-strategy/README.md)
> was written (I believe) to solve a similar problem - business-based
> deletion rules to enable flexible TTLs. You may want to glance at that.
>
> Other answers inline below
>
>
> --
> Jeff Jirsa
>
>
> On Aug 9, 2017, at 1:41 AM, Steinmaurer, Thomas <
> thomas.steinmau...@dynatrace.com> wrote:
>
> Hello,
>
>
>
> Our top contributor from a data volume perspective is time-series data. We
> have been running STCS since our initial production deployment in 2014,
> with several clusters of varying node counts, currently max. 9 nodes per
> cluster per AWS region, on m4.xlarge / EBS gp2 storage. Our Cassandra
> version history starts at 1.2; we currently run DSC 2.1.15, soon to be
> replaced by Apache Cassandra 2.1.18 across all deployments. Recently we
> switched from Thrift (Astyanax) to Native/CQL (DataStax driver). Overall
> we are pretty happy with the stability and the scale-out capabilities.
>
>
>
> We store time-series data in different resolutions, from 1min up to 1day
> aggregates per "time slot". Each resolution has its own column family /
> table, and a periodic worker executes our aging business logic, rolling
> data up from e.g. the 1min to the 5min (and coarser) resolutions and then
> deleting it from the source resolution according to our per-resolution
> retention policy. So deletions happen much later (e.g. at least 14 days
> after the write). We don't use TTLs on written time-series data (in
> production; see the TWCS testing below), so purging is handled exclusively
> by explicit DELETEs in our aging business logic, which create tombstones.
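>
> A rough, hypothetical sketch of one such resolution table and the aging
> DELETE (the real schema isn't shown in this thread; all names are made
> up). Bucketing the partition key by day lets the worker delete whole
> partitions, which also avoids slice deletes that 2.1 doesn't support:
>
>   CREATE TABLE ts_1min (
>       series_id text,
>       day text,        -- e.g. '2017-07-01', a time bucket
>       slot timestamp,
>       value double,
>       PRIMARY KEY ((series_id, day), slot)
>   );
>
>   -- after rolling 1min data up into the 5min table, drop the source
>   -- partition; this writes a single partition-level tombstone
>   DELETE FROM ts_1min WHERE series_id = 'host42.cpu' AND day = '2017-07-01';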
>
>
>
> Naturally, with STCS and late explicit deletions / tombstones, it takes a
> long time to finally reclaim disk space; even worse, we now run a major
> compaction every X weeks. We are also currently testing with STCS
> min_threshold = 2 etc., but all in all this does not feel like a long-term
> solution. I guess there is nothing else we are missing on the
> configuration/settings side with STCS? Single-SSTable compaction might not
> kick in either, because, checking with sstablemetadata, the estimated
> droppable tombstones value for our time-series SSTables is pretty much 0.0
> all the time. I guess because we don't write with TTL?
>
>
>
> Or you aren't issuing deletes - explicit deletes past GCGS will cause that
> number to increase.
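>
> For context, a tombstone only counts as droppable (for compaction
> purposes) once it is older than gc_grace_seconds. A sketch of lowering
> GCGS so tombstones become droppable sooner, assuming repairs reliably
> finish within the new window ("ks.tbl" is again a placeholder):
>
>   -- default gc_grace_seconds is 864000 (10 days)
>   ALTER TABLE ks.tbl WITH gc_grace_seconds = 345600;  -- 4 days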
>
>
>
> TWCS caught my eye in 2015, I think, and even more at the Cassandra Summit
> 2016 and other tombstone-related talks. Cassandra 3.0 is around 6 months
> away for us, so initial testing was done with 2.1.18 patched with TWCS
> from GitHub.
>
>
>
> TWCS looks like exactly what we need: in our tests, once we start writing
> with TTL, we end up with a single SSTable per elapsed time window, and
> SSTables older than TTL + grace period are automatically removed from
> disk. Even with out-of-order DELETEs from our business logic enabled,
> SSTable purging does not seem to get stuck. Not sure if this is expected.
> Writing with TTL is also a bit problematic in case our retention policy
> changes, either in general or for particular customers.
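>
> A minimal sketch of such a TWCS setup (table name, window settings and
> TTL are illustrative; on a patched 2.1 build the strategy class usually
> has to be referenced by the fully qualified name it was compiled under,
> e.g. com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy):
>
>   ALTER TABLE ks.ts_1min WITH
>     compaction = {
>       'class': 'TimeWindowCompactionStrategy',
>       'compaction_window_unit': 'DAYS',
>       'compaction_window_size': '1'
>     }
>     AND default_time_to_live = 1209600;  -- 14 days, assumed retention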
>
>
> Search for my Cassandra Summit talk from 2016 - there are a few other
> compaction options you probably want to set to more aggressively trigger
> single-sstable compaction to help unstick it.
>
>
>
> A few questions, as we need some short-term (with C* 2.1) and long-term
> (with C* 3.0) mitigation:
>
> ·         With STCS, estimated droppable tombstones is always 0.0 (so
> automatic single-SSTable compaction may never happen): Is this a matter of
> not writing with TTL? If yes, would enabling TTL with STCS improve the
> disk-reclaim situation, because single-SSTable compactions would then kick
> in?
>
> ·         What are the semantics of "default_time_to_live" at table level?
> From http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html :
> "After the default_time_to_live TTL value has been exceed, Cassandra
> tombstones the entire table". What does "entire table" mean?
>
>
> It probably means sstable, but even that isn't really accurate - that's a
> doc bug
>
> Hopefully / I guess I won't end up with an empty table every X passed
> TTLs?
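>
> To illustrate: expiration under default_time_to_live happens per written
> cell, based on each write's own timestamp, never for the whole table at
> once (reusing the hypothetical ts_1min table sketched above):
>
>   ALTER TABLE ks.ts_1min WITH default_time_to_live = 864000;  -- 10 days
>
>   -- inherits the table default; expires 10 days after this write
>   INSERT INTO ks.ts_1min (series_id, day, slot, value)
>   VALUES ('host42.cpu', '2017-08-09', '2017-08-09 00:01:00+0000', 0.42);
>
>   -- an explicit TTL overrides the table default for this write only
>   INSERT INTO ks.ts_1min (series_id, day, slot, value)
>   VALUES ('host42.cpu', '2017-08-09', '2017-08-09 00:02:00+0000', 0.43)
>   USING TTL 3600;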
>
> ·         Anything else I’m missing regarding STCS and reclaiming disk
> space earlier in our TS use case?
>
>
> LCS rewrites much more aggressively on partition updates - if you can
> spare the IO, it's likely going to be more efficient at purging deleted
> data than STCS.
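>
> A sketch of switching to LCS (placeholder names; sstable_size_in_mb is
> shown at its default of 160 just to make the knob visible):
>
>   ALTER TABLE ks.tbl WITH compaction = {
>       'class': 'LeveledCompactionStrategy',
>       'sstable_size_in_mb': '160'
>   };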
>
> ·         I know, changing compaction is a matter of executing ALTER TABLE
> (or temporarily via JMX for a single node), but as we have legacy data
> written without TTL, I wonder if we may end up with stuck SSTables again.
>
> ·         In case of stuck SSTables with any compaction strategy, what is
> the best way to debug/analyze why they got stuck (overlaps etc.)?
>
>
> sstableexpiredblockers
>
>
>
> Thanks a lot and sorry for the lengthy email.
>
>
>
> Thomas
