The deleting compaction strategy from ProtectWise
was written (I believe) to solve a similar problem - business-based deletion
rules to enable flexible TTLs. You may want to glance at that.
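For reference, once its jar is on the classpath, enabling it should be a single
ALTER TABLE. The class name below is from memory and the strategy needs its own
rule options (omitted here), so treat this as a sketch and verify against the
cassandra-util repo; the table name is invented:

    -- Sketch only; verify the fully-qualified class name against the repo.
    ALTER TABLE ks.timeseries WITH compaction = {
      'class': 'com.protectwise.cassandra.db.compaction.DeletingCompactionStrategy'
    };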
Other answers inline below
> On Aug 9, 2017, at 1:41 AM, Steinmaurer, Thomas
> <thomas.steinmau...@dynatrace.com> wrote:
> our top contributor from a data volume perspective is time series data. We
> have been running with STCS since our initial production deployment in 2014,
> with several clusters of varying node counts, currently with at most 9 nodes
> per single cluster per region in AWS, on m4.xlarge / EBS gp2 storage. Our
> Cassandra history runs from 1.2 to DSC 2.1.15 today, soon to be replaced by
> Apache Cassandra 2.1.18 across all deployments. Lately we switched from
> Thrift (Astyanax) to Native/CQL (DataStax driver). Overall we are pretty
> happy with the stability and the scale-out capabilities.
> We store time series data in different resolutions, from 1min up to 1day
> aggregates per “time slot”. Each resolution has its own column family /
> table, and a periodic worker executes our business logic for time series
> aging, e.g. 1min => 5min => … resolution, plus deletion in the source
> resolutions according to our per-resolution retention policy. So deletions
> happen much later (e.g. at least > 14d). We don’t use TTLs on written time
> series data (in production; see TWCS testing below), so purging is handled
> exclusively by explicit DELETEs in our aging business logic, which create
> tombstones.
> Naturally, with STCS and late explicit deletions / tombstones, it takes a
> long time to finally reclaim disk space; even worse, we are now running a
> major compaction every X weeks. We are also currently testing with STCS
> min_threshold = 2 etc., but all in all, this doesn’t feel like a long-term
> solution. I guess there is nothing else we are missing on the
> configuration/settings side of STCS? Single SSTable compaction might not
> kick in either, because checking with sstablemetadata, the estimated
> droppable tombstones value for our time series SSTables is pretty much 0.0
> all the time. I guess because we don’t write with TTL?
Or you aren't issuing deletes; explicit deletes past GCGS (gc_grace_seconds)
will cause that number to increase.
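To make the setup described above concrete, a minimal sketch in CQL - table
and column names are invented for illustration:

    -- Hypothetical per-resolution table (one such table per resolution).
    CREATE TABLE metrics_1min (
      series_id text,
      slot      timestamp,
      value     double,
      PRIMARY KEY (series_id, slot)
    ) WITH compaction = { 'class': 'SizeTieredCompactionStrategy' };

    -- The aging worker rolls 1min up to 5min, then purges the source rows.
    -- On 2.1 a DELETE must name the full primary key (or a whole partition);
    -- each one writes a tombstone rather than freeing disk immediately.
    DELETE FROM metrics_1min
    WHERE series_id = 'host42.cpu' AND slot = '2017-07-01 00:00:00+0000';

    -- If repairs reliably finish faster than the default 10 days, lowering
    -- gc_grace_seconds lets those tombstones become droppable sooner.
    -- Only safe if every node is repaired within this window.
    ALTER TABLE metrics_1min WITH gc_grace_seconds = 259200;  -- 3 days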
> TWCS caught my eye in 2015, I think, and even more at the Cassandra Summit
> 2016 + other tombstone-related talks. Cassandra 3.0 is around 6 months away
> for us, thus initial testing was with 2.1.18 patched with TWCS from GitHub.
> It looks like TWCS is exactly what we need: in testing, once we start
> writing with TTL, we end up with a single SSTable per elapsed window, and
> data (SSTables) older than TTL + grace is automatically removed from disk.
> Even with out-of-order DELETEs from our business logic enabled, purging
> SSTables does not seem to get stuck. Not sure if this is expected. Writing
> with TTL is also a bit problematic in case our retention policy changes,
> either in general or for particular customers.
Search for my Cassandra Summit talk from 2016 - there are a few other compaction
options you probably want to set to more aggressively trigger single-SSTable
compaction to help unstick it.
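Roughly the knobs in question - the values here are illustrative and need
tuning per cluster:

    -- On the 2.1 backport the class is the fully-qualified name from the
    -- GitHub build; from 3.0.8 / 3.8 on, the short name below works.
    ALTER TABLE metrics_1min WITH compaction = {
      'class': 'TimeWindowCompactionStrategy',
      'compaction_window_unit': 'DAYS',
      'compaction_window_size': '1',
      -- allow single-sstable tombstone compactions even when sstables overlap
      'unchecked_tombstone_compaction': 'true',
      -- consider an sstable once ~20% of its data is droppable tombstones
      'tombstone_threshold': '0.2',
      -- recheck a given sstable at most once a day
      'tombstone_compaction_interval': '86400'
    };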
> A few questions, as we need some short-term (with C* 2.1) and long-term (with
> C* 3.0) mitigation:
> · With STCS, estimated droppable tombstones being always 0.0 (thus
> also no automatic single SSTable compaction can happen): Is this a matter of
> not writing with TTL? If yes, would enabling TTL with STCS improve the disk
> reclaim situation, because then single SSTable compactions would kick in?
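(For illustration, writing with a per-insert TTL against the sketch table
above would look like this:)

    -- Per-write TTL; expired cells then show up in the droppable
    -- tombstone estimate once past gc_grace_seconds.
    INSERT INTO metrics_1min (series_id, slot, value)
    VALUES ('host42.cpu', '2017-08-01 00:00:00+0000', 0.73)
    USING TTL 1209600;  -- 14 days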
> · What are the semantics of “default_time_to_live” at table level?
> From: http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html :
> “After the default_time_to_live TTL value has been exceed, Cassandra
> tombstones the entire table”. What does “entire table” mean here?
It probably means sstable, but even that isn't really accurate - that's a doc
bug.
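For completeness, the property itself just applies a default TTL to every
write that doesn't set its own - it expires individual cells, not the table:

    -- Every INSERT/UPDATE without an explicit USING TTL gets this TTL.
    ALTER TABLE metrics_1min WITH default_time_to_live = 1209600;  -- 14 days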
> Hopefully / I guess I won’t end up with an empty table after every X elapsed
> TTLs?
> · Anything else I’m missing regarding STCS and reclaiming disk space
> earlier in our TS use case?
LCS rewrites much more aggressively on partition updates - if you can spare the
IO, it's likely going to be more efficient at purging deleted data than STCS.
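i.e., something along these lines:

    -- Leveled compaction; sstable_size_in_mb is optional (default 160).
    ALTER TABLE metrics_1min WITH compaction = {
      'class': 'LeveledCompactionStrategy',
      'sstable_size_in_mb': '160'
    };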
> · I know changing compaction is a matter of executing ALTER TABLE
> (or temporarily via JMX for a single node), but as we have legacy data
> written without TTL, I wonder if we may end up with stuck SSTables again
> · In case of stuck SSTables with any compaction strategy, what is
> the best way to debug/analyze why they got stuck (overlaps etc.)?
> Thanks a lot and sorry for the lengthy email.