Hey Jeff,

Do most of those behaviors apply to TWCS too?
-J

On Fri, Jun 17, 2016 at 1:25 PM, Jeff Jirsa <[email protected]> wrote:

> First, DTCS in 2.0.15 has some weird behaviors:
> https://issues.apache.org/jira/browse/CASSANDRA-9572
>
> That said, some other general notes:
>
> Data deleted by TTL isn't the same as issuing a delete – each expiring
> cell internally carries a ttl/timestamp at which it will be converted
> into a tombstone. No tombstone is added to the memtable or flushed to
> disk – Cassandra simply treats the expired cells as tombstones once
> they're past that timestamp.
>
> Cassandra's getFullyExpiredSSTables() will consider an sstable fully
> expired if (and only if) all cells within that sstable are expired
> (current time > max timestamp) AND the sstable's timestamps don't
> overlap with those of other sstables that aren't fully expired. Björn
> talks about this in https://issues.apache.org/jira/browse/CASSANDRA-8243
> – the intent is that explicit deletes (which do create tombstones) won't
> be GC'd from an otherwise fully expired sstable while they're covering
> data in a more recent sstable; without this check, we could accidentally
> bring dead data back to life. In an append-only time series workload
> this would be unusual, but not impossible.
>
> Unfortunately, read repairs (foreground/blocking, which happen if you
> write with CL < ALL and read with CL > ONE) will cause cells written
> with old timestamps to be written into the newly flushed sstables, which
> creates sstables with wide gaps between minTimestamp and maxTimestamp.
> For example, a read repair could pull data that is 23 hours old into a
> new sstable; that one sstable now spans 23 hours and isn't fully expired
> until its oldest data is 47 hours old. There's an open ticket
> (https://issues.apache.org/jira/browse/CASSANDRA-10496) meant to make
> this behavior better in the future by splitting those old read-repaired
> cells out of the newly flushed sstables.
>
> I gave a talk on a lot of this behavior last year at Summit
> (http://www.slideshare.net/JeffJirsa1/cassandra-summit-2015-real-world-dtcs-for-operators)
> – if you're running time series in production on DTCS, it's worth a
> glance.
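To make the TTL-vs-delete distinction concrete, here's a minimal CQL
sketch; the sensor_data table, its columns, and the values are invented
for illustration and aren't from this thread:

    -- Hypothetical table, for illustration only:
    CREATE TABLE sensor_data (
        id text,
        ts timestamp,
        reading double,
        PRIMARY KEY (id, ts)
    );

    -- Expiring write: the cell carries its own ttl/expiry timestamp.
    -- No tombstone is written to the memtable now; the cell is simply
    -- treated as a tombstone once that timestamp passes.
    INSERT INTO sensor_data (id, ts, reading)
    VALUES ('sensor-1', '2016-06-17 12:00:00', 42.0)
    USING TTL 86400;

    -- Explicit delete: this DOES write a tombstone to the memtable,
    -- which is later flushed into an sstable and must survive
    -- gc_grace_seconds before it can be purged.
    DELETE FROM sensor_data
    WHERE id = 'sensor-1' AND ts = '2016-06-17 12:00:00';

Only the DELETE produces a tombstone object; the TTL'd cell expires in
place, which is why no tombstone is added to the memtable or flushed to
disk for it.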
> From: jerome <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, June 17, 2016 at 11:52 AM
> To: "[email protected]" <[email protected]>
> Subject: Understanding when Cassandra drops expired time series data
>
> Hello! Recently I have been trying to familiarize myself with Cassandra
> but don't quite understand when data is removed from disk after it has
> been deleted. The use case I'm particularly interested in is expiring
> time series data with DTCS. As an example, I created the following table:
>
> CREATE TABLE metrics (
>     metric_id text,
>     time timestamp,
>     value double,
>     PRIMARY KEY (metric_id, time)
> ) WITH CLUSTERING ORDER BY (time DESC) AND
>     default_time_to_live = 86400 AND
>     gc_grace_seconds = 3600 AND
>     compaction = {
>         'class': 'DateTieredCompactionStrategy',
>         'timestamp_resolution': 'MICROSECONDS',
>         'base_time_seconds': '3600',
>         'max_sstable_age_days': '365',
>         'min_threshold': '4'
>     };
>
> I understand that Cassandra will create a tombstone for all rows
> inserted into this table 24 hours (86400 seconds) after they are
> inserted. These tombstones will first be written to an in-memory
> Memtable and then flushed to disk as an SSTable when the Memtable
> reaches a certain size. My question is: when will the data that is now
> expired be removed from disk? Is it the next time the SSTable which
> contains the data gets compacted? So, with DTCS and min_threshold set to
> four, would we wait until at least three other SSTables are in the same
> time window as the expired data, at which point those SSTables are
> compacted into an SSTable without the expired data? Is it only during
> this compaction that the data will be removed? It seems to me that this
> would require Cassandra to maintain some metadata on which rows have
> been deleted, since the newer tombstones would likely not be in the
> older SSTables being compacted. Also, I'm aware that Cassandra can drop
> entire SSTables if they contain only expired data, but I'm unsure of
> what qualifies as expired data (is it just SSTables whose maximum
> timestamp is past the default TTL for the table?) and when such SSTables
> are dropped.
>
> Alternatively, do the SSTables which contain the tombstones have to be
> compacted with the SSTables which contain the expired data for the data
> to be removed? It seems to me that this could result in Cassandra
> holding the expired data long after it has expired, since it would be
> waiting for the new tombstones to be compacted with the older expired
> data.
>
> Finally, I was also unsure when the tombstones themselves are removed. I
> know Cassandra does not delete them until after gc_grace_seconds, but it
> can't delete the tombstones until it's sure the expired data has been
> deleted, right? Otherwise it would see the expired data as valid.
> Consequently, it seems to me that the question of when tombstones are
> deleted is intimately tied to the questions above.
>
> Thanks in advance! If it helps, I've been experimenting with version
> 2.0.15 myself.
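For anyone who wants to watch the expiry mechanism against the metrics
table defined above, CQL's TTL() and WRITETIME() functions expose the
per-cell state; the sample row below is invented:

    -- Insert a row; the table's default_time_to_live (86400) applies
    -- automatically, so no explicit USING TTL is needed.
    INSERT INTO metrics (metric_id, time, value)
    VALUES ('cpu.load', '2016-06-17 12:00:00', 0.75);

    -- TTL(value) reports the seconds remaining until the cell is
    -- treated as a tombstone; it starts near 86400 and counts down.
    -- WRITETIME(value) is the cell's write timestamp in microseconds,
    -- matching the table's timestamp_resolution.
    SELECT metric_id, time, value, TTL(value), WRITETIME(value)
    FROM metrics
    WHERE metric_id = 'cpu.load';

Once the TTL runs out, the cell merely reads as deleted; as Jeff
explains above, the bytes stay on disk until a compaction rewrites that
sstable, or until getFullyExpiredSSTables() can drop the sstable whole.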
