Hello, our top contributor from a data volume perspective is time series data. We have been running STCS since our initial production deployment in 2014, across several clusters with a varying number of nodes, currently max. 9 nodes per cluster per AWS region on m4.xlarge instances with EBS gp2 storage. Our Cassandra history starts at 1.2; we are currently on DSC 2.1.15, soon to be replaced by Apache Cassandra 2.1.18 across all deployments. Recently we switched from Thrift (Astyanax) to Native/CQL (DataStax driver). Overall we are pretty happy with the stability and the scale-out offering.
We store time series data in different resolutions, from 1min up to 1day aggregates per "time slot". Each resolution has its own column family / table, and a periodic worker executes our business logic for time series aging, e.g. rolling data up from the 1min => 5min => ... resolution and then deleting it from the source resolution according to our per-resolution retention policy. So deletions happen quite late (at least > 14d after the write). We don't use TTLs on written time series data in production (see TWCS testing below), so purging is handled exclusively by explicit DELETEs in our aging business logic, which create tombstones.

Naturally, with STCS and such late explicit deletions/tombstones, it takes a long time to finally reclaim disk space; even worse, we are now running a major compaction every X weeks. We are also testing with STCS min_threshold = 2 etc., but all in all this does not feel like a long-term solution. I guess there is nothing else we are missing on the configuration/settings side with STCS? Single-SSTable compaction does not kick in either, because checking with sstablemetadata, the estimated droppable tombstone ratio for our time series SSTables is pretty much 0.0 all the time. I guess that is because we don't write with TTL?

TWCS caught my eye in 2015, I think, and even more at the Cassandra Summit 2016 plus other tombstone-related talks. Cassandra 3.0 is around 6 months away for us, so initial testing was done on 2.1.18 patched with TWCS from GitHub. TWCS looks like exactly what we need: in our tests, once we start writing with TTL, we end up with a single SSTable per passed window, and SSTables older than TTL + grace get automatically removed from disk. Even with out-of-order DELETEs from our business logic enabled, purging of SSTables does not seem to get stuck. Not sure if this is expected. Writing with TTL is also a bit problematic in case our retention policy changes, either in general or for particular customers.
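For reference, the kind of table definition we have been testing with looks roughly like the following CQL sketch. The table and column names are made up for illustration, the TTL value is just an example and not our actual retention, and the fully qualified strategy class is the one from the TWCS GitHub patch for 2.1 (stock Cassandra 3.0+ ships it as org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy instead):

```sql
-- Hypothetical 1min-resolution table, for illustration only.
CREATE TABLE metrics_1min (
    metric_id  text,
    time_slot  timestamp,
    value      double,
    PRIMARY KEY ((metric_id), time_slot)
) WITH compaction = {
        -- class name as used by the TWCS-from-GitHub patch on 2.1
        'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '1'
    }
    -- applied at write time whenever the INSERT itself carries no TTL;
    -- 1209600 s = 14 days, purely an example value
    AND default_time_to_live = 1209600;
```

With this setup, each passed 1-day window compacts down to a single SSTable, which can be dropped wholesale once all its data has expired.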
A few questions, as we need some short-term (with C* 2.1) and long-term (with C* 3.0) mitigation:

* With STCS, the estimated droppable tombstone ratio is always 0.0 (so no automatic single-SSTable compaction can happen): Is this a matter of not writing with TTL? If yes, would enabling TTL with STCS improve the disk reclaim situation, because single-SSTable compactions would then kick in?
* What is the semantic of "default_time_to_live" at table level? From http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html : "After the default_time_to_live TTL value has been exceeded, Cassandra tombstones the entire table". What does "entire table" mean here? Hopefully / I guess I don't end up with an empty table every X passed TTLs?
* Anything else I'm missing regarding STCS and reclaiming disk space earlier in our time series use case?
* I know changing compaction is a matter of executing ALTER TABLE (or temporarily via JMX for a single node), but as we have legacy data written without TTL, I wonder if we may end up with stuck SSTables again.
* In case of stuck SSTables with any compaction strategy, what is the best way to debug/analyze why they got stuck (overlapping etc.)?

Thanks a lot, and sorry for the lengthy email.

Thomas

The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313