Hello,

our top contributor from a data volume perspective is time series data. We have 
been running STCS since our initial production deployment in 2014, across 
several clusters with a varying number of nodes, currently with at most 9 nodes 
per single cluster per AWS region, on m4.xlarge instances with EBS gp2 storage. 
Our Cassandra history runs from 1.2 up to DSC 2.1.15 today, soon to be replaced 
by Apache Cassandra 2.1.18 across all deployments. Lately we switched from 
Thrift (Astyanax) to Native/CQL (DataStax driver). Overall we are quite happy 
with the stability and the scale-out capabilities.

We store time series data at different resolutions, from 1min up to 1day 
aggregates per "time slot". Each resolution has its own column family / table, 
and a periodic worker executes our aging business logic: rolling data up from 
e.g. the 1min to the 5min resolution and so on, and deleting it from the source 
resolution according to our per-resolution retention policy. So deletions 
happen much later (e.g. at least > 14d after the write). We don't use TTLs on 
written time series data in production (see the TWCS testing below), so purging 
is handled exclusively by explicit DELETEs in our aging business logic, which 
create tombstones.
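
For illustration, a per-resolution table looks roughly like the following 
sketch (table and column names are made up here, not our actual schema):

    CREATE TABLE metrics_1min (
        series_id text,        -- hypothetical key identifying one time series
        slot      timestamp,   -- start of the 1min time slot
        value     double,
        PRIMARY KEY (series_id, slot)
    ) WITH compaction = { 'class': 'SizeTieredCompactionStrategy' };

The aging worker reads a range of slots from metrics_1min, writes the 5min 
aggregates into the next table, and later issues explicit DELETEs for the old 
slots in metrics_1min.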

Naturally, with STCS and late explicit deletions / tombstones, it takes a long 
time to finally reclaim disk space; even worse, we are now running a major 
compaction every X weeks. We are currently also testing STCS with 
min_threshold = 2 etc., but all in all this does not feel like a long-term 
solution. I guess there is nothing else we are missing from a 
configuration/setting side with STCS? Single-SSTable compaction might not kick 
in either, because checking with sstablemetadata, the estimated droppable 
tombstones for our time series SSTables are pretty much 0.0 all the time. I 
guess that is because we don't write with TTL?
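
For reference, tuning the tombstone-related STCS sub-properties would look 
roughly like this (the values are only examples, not our production settings):

    ALTER TABLE metrics_1min WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'min_threshold': '2',
        'tombstone_threshold': '0.2',
        'unchecked_tombstone_compaction': 'true'
    };

tombstone_threshold is the ratio of estimated droppable tombstones above which 
a single-SSTable compaction is considered, and unchecked_tombstone_compaction 
allows it even when the SSTable overlaps with others; with our estimates stuck 
at 0.0, neither seems to change anything.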

TWCS caught my eye in 2015, I think, and even more so at the Cassandra Summit 
2016 and other tombstone-related talks. Cassandra 3.0 is still around 6 months 
away for us, so initial testing was done with 2.1.18 patched with TWCS from 
GitHub.

TWCS looks like exactly what we need: in our tests, once we start writing with 
TTL, we end up with a single SSTable per elapsed time window, and SSTables 
older than TTL + grace period are automatically removed from disk. Even with 
out-of-order DELETEs from our business logic enabled, purging SSTables does not 
seem to get stuck. I am not sure if this is expected. Writing with TTL is also 
a bit problematic for us, in case our retention policy changes in general or 
for particular customers.
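
For completeness, the TWCS setup we tested looks roughly like this (window size 
and TTL are illustrative values, not necessarily what we ran with; on a 2.1 
build patched from GitHub the class name may need to be fully qualified):

    ALTER TABLE metrics_1min WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '1'
    };

    -- every write carries the retention as a TTL, here 14 days
    INSERT INTO metrics_1min (series_id, slot, value)
    VALUES ('series-42', '2017-05-01 00:00:00+0000', 0.73)
    USING TTL 1209600;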

A few questions, as we need both a short-term (with C* 2.1) and a long-term 
(with C* 3.0) mitigation:

*         With STCS, estimated droppable tombstones are always 0.0 (so no 
automatic single-SSTable compaction can happen either): is this a matter of not 
writing with TTL? If yes, would enabling TTL with STCS improve the disk reclaim 
situation, because single-SSTable compactions would then kick in?

*         What are the semantics of "default_time_to_live" at the table level? 
From http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_c.html : 
"After the default_time_to_live TTL value has been exceed, Cassandra tombstones 
the entire table". What does "entire table" mean here? I hope / guess I won't 
end up with an empty table every time a TTL period has passed?

*         Anything else I'm missing regarding STCS and reclaiming disk space 
earlier in our TS use case?

*         I know that changing the compaction strategy is a matter of executing 
ALTER TABLE (or temporarily via JMX for a single node), but as we have legacy 
data written without TTL, I wonder whether we may end up with stuck SSTables 
again.

*         In case of stuck SSTables with any compaction strategy, what is the 
best way to debug/analyze why they got stuck (overlapping SSTables etc.)?

Thanks a lot and sorry for the lengthy email.

Thomas
