I'm using LCS and a relatively large TTL of 2 years for all inserted rows
and I'm concerned about the moment at wich C* would drop the corresponding
tombstones (neither explicit deletes nor updates are being performed).

>From [Missing Manual for Leveled Compaction Strategy](
https://www.youtube.com/watch?v=-5sNVvL8RwI), [Tombstone Compactions in
Cassandra](https://www.youtube.com/watch?v=pher-9jqqC4) and [Deletes
Without Tombstones or TTLs](https://www.youtube.com/watch?v=BhGkSnBZgJA) I
understand that

 - All levels except L0 contain non-overlapping SSTables, but a partition
key may be present in one SSTable in each level (aka distributed in all
levels).
 - For a compaction to be able to drop a tombstone it must be sure that is
compacting all SStables that contains de data to prevent zombie data (this
is done checking bloom filters). It also considers gc_grace_seconds

So, for my particular use case (2 years TTL and write heavy load) I can
conclude that TTLed data will be in highest levels so I'm wondering when
those SSTables with TTLed data will be compacted with the SSTables that
contains the corresponding SSTables.
The main question will be: **Where are tombstones (from ttls) being
created? Are being created at Level 0 so it will take a long time until it
will end up in the highest levels (hence disk space will take long time to
be freed)?**

In a comment from [About deletes and tombstones](
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html)
Alain says that
> Yet using TTLs helps, it reduces the chances of having data being
fragmented between SSTables that will not be compacted together any time
soon. Using any compaction strategy, if the delete comes relatively late in
the row history, as it use to happen, the 'upsert'/'insert' of the
tombstone will go to a new SSTable. It might take time for this tombstone
to get to the right compaction "bucket" (with the rest of the row) and for
Cassandra to be able to finally free space.
**My understanding is that with TTLs the tombstones is created in-place**,
thus it is often and for many reasons easier and safer to get rid of a TTLs
than from a delete.
Another clue to explore would be to use the TTL as a default value if
that's a good fit. TTLs set at the table level with 'default_time_to_live'
should not generate any tombstone at all in C*3.0+. Not tested on my hand,
but I read about this.

I'm not sure what it means with "*in-place*" since SSTables are immutable.
(I also have some doubts about what it says of using `default_time_to_live`
that I've asked in [How default_time_to_live would delete rows without
tombstones in Cassandra?](https://stackoverflow.com/q/52282517/3517383)).

My guess is that is referring to tombstones being created in the same level
(but different SStables) that the TTLed data during a compaction triggered
by one of the following reasons:

 1. "Going from highest level, any level having score higher than 1.001 can
be picked by a compaction thread" [The Missing Manual for Leveled
Compaction Strategy](
https://image.slidesharecdn.com/csummit16lcstalk-161004232416/95/the-missing-manual-for-leveled-compaction-strategy-wei-deng-ryan-svihla-datastax-cassandra-summit-2016-12-638.jpg?cb=1475693117
)
 2. "If we go 25 rounds without compacting in the highest level, we start
bringing in sstables from that level into lower level compactions" [The
Missing Manual for Leveled Compaction Strategy](
https://image.slidesharecdn.com/csummit16lcstalk-161004232416/95/the-missing-manual-for-leveled-compaction-strategy-wei-deng-ryan-svihla-datastax-cassandra-summit-2016-12-638.jpg?cb=1475693117
)
 3. "When there are no other compactions to do, we trigger a single-sstable
compaction if there is more than X% droppable tombstones in the sstable."
[CASSANDRA-7019](https://issues.apache.org/jira/browse/CASSANDRA-7019)
Since tombstones are created during compaction, I think it may be using
SSTable metadata to estimate droppable tombstones.

**So, compactions (2) and (3) should be creating/dropping tombstones in
highest levels hence using LCS with a large TTL should not be an issue per
se.**
With creating/dropping I mean that the same kind of compactions will be
creating tombstones for expired data and/or dropping tombstones if the gc
period has already passed.

A link to source code that clarifies this situation will be great, thanks.

Reply via email to