[
https://issues.apache.org/jira/browse/CASSANDRA-8243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209684#comment-14209684
]
Marcus Eriksson commented on CASSANDRA-8243:
--------------------------------------------
The problem with this is that we might drop tombstones that actually cover data
in other sstables, even though that data is also expired.
I don't see any reason that this would make a difference to users, but I'm
gonna throw up the [~slebresne]-flag here as he said back in CASSANDRA-5228
that we must account for the timestamp of candidates that cover data in other
sstables (in the code in the comment from Mar 21st).
> DTCS can leave time-overlaps, limiting ability to expire entire SSTables
> ------------------------------------------------------------------------
>
> Key: CASSANDRA-8243
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8243
> Project: Cassandra
> Issue Type: Bug
> Reporter: Björn Hegerfors
> Assignee: Björn Hegerfors
> Priority: Minor
> Labels: compaction, performance
> Fix For: 2.0.12, 2.1.3
>
> Attachments: cassandra-trunk-CASSANDRA-8243-aggressiveTTLExpiry.txt
>
>
> CASSANDRA-6602 (DTCS) and CASSANDRA-5228 are supposed to be a perfect match
> for tables where every value is written with a TTL. DTCS makes sure to keep
> old data separate from new data. So shortly after the TTL has passed,
> Cassandra should be able to throw away the whole SSTable containing a given
> data point.
> CASSANDRA-5228 deletes the very oldest SSTables, and only if they don't
> overlap (in terms of timestamps) with another SSTable which cannot be deleted.
> DTCS, however, can't guarantee that SSTables won't overlap (again, in terms of
> timestamps). In a test that I ran, every single SSTable overlapped with its
> nearest neighbors by a very tiny amount. My reasoning for why this could
> happen is that the dumped memtables were already overlapping from the start.
> DTCS will never create an overlap where there is none. I surmised that this
> happened in my case because I sent parallel writes which must have come out
> of order. This was just a local test, and out-of-order writes should be much
> more common non-locally.
> That means that the SSTable removal optimization may never get a chance to
> kick in!
> I can see two solutions:
> 1. Make DTCS split SSTables on time window borders. This will essentially
> only be done on a newly dumped memtable once every base_time_seconds.
> 2. Make TTL SSTable expiry more aggressive. Relax the conditions under which an
> SSTable can be dropped completely, of course without affecting any semantics.
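
To illustrate the overlap problem described in the issue, here is a minimal Python sketch of a conservative expiry rule. It is a simplified model and not Cassandra's actual code: the `SSTable` class, its fields, and the `droppable` function are hypothetical names, timestamps are plain integers in arbitrary units, and touching timestamps are assumed to count as an overlap. A fully expired SSTable is dropped only if it does not overlap, in timestamps, with any SSTable that has to be kept.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTable:
    # Hypothetical model of the metadata relevant to expiry.
    name: str
    min_timestamp: int  # oldest write timestamp in the table
    max_timestamp: int  # newest write timestamp in the table
    expires_at: int     # time by which every cell's TTL has passed


def droppable(sstables: List[SSTable], now: int) -> List[str]:
    """Return names of expired SSTables that overlap no table that must be kept."""
    expired = [s for s in sstables if s.expires_at <= now]
    live = [s for s in sstables if s.expires_at > now]

    # Oldest write timestamp found in any table that must be kept.
    min_blocking = min((s.min_timestamp for s in live), default=float("inf"))

    result = []
    # Newest first: an expired table that is blocked becomes a blocker itself,
    # so a chain of tiny overlaps can block every older table too.
    for s in sorted(expired, key=lambda t: t.max_timestamp, reverse=True):
        if s.max_timestamp >= min_blocking:
            min_blocking = min(min_blocking, s.min_timestamp)
        else:
            result.append(s.name)
    return sorted(result)


# Neighbours overlap by a tiny amount, as in the test described in the issue.
overlapping = [
    SSTable("a", 0, 101, expires_at=500),
    SSTable("b", 100, 201, expires_at=600),
    SSTable("c", 200, 300, expires_at=2000),
]

# Same tables, but split cleanly on the borders.
disjoint = [
    SSTable("a", 0, 99, expires_at=500),
    SSTable("b", 100, 199, expires_at=600),
    SSTable("c", 200, 300, expires_at=2000),
]

print(droppable(overlapping, now=1000))  # []
print(droppable(disjoint, now=1000))     # ['a', 'b']
```

In this model, the tiny overlaps keep both expired tables alive, while the cleanly split tables can be dropped. That is the trade-off between the two proposals: either avoid the overlaps (solution 1) or relax the check (solution 2), with the latter raising the tombstone concern from the comment above.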