[ https://issues.apache.org/jira/browse/CASSANDRA-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114977#comment-15114977 ]
Marcus Eriksson commented on CASSANDRA-11060:
---------------------------------------------
Also see CASSANDRA-11056.
> Allow DTCS old SSTable filtering to use min timestamp instead of max
> --------------------------------------------------------------------
>
> Key: CASSANDRA-11060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11060
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sam Bisbee
> Labels: dtcs
>
> We have observed a DTCS behavior when using TTLs where SSTables are never or
> very rarely fully expired due to compaction, allowing expired data to be
> "stuck" in large partially expired SSTables.
> This is because compaction filtering is performed on the max timestamp, which
> continues to grow as SSTables are compacted together. This means they will
> never move past max_sstable_age_days. With a sufficiently large TTL, like 30
> days, this allows old but not expired SSTables to continue combining and
> never become fully expired, even with a max_sstable_age_days of 1.
> As a result we have seen expired data hang around in large SSTables for over
> six months longer than it should have. This is obviously wasteful and a disk
> capacity issue.
> As a workaround, we have been running an extended version of DTCS called MTCS in
> some deployments. The only change is that it uses min timestamp instead of
> max for compaction filtering (filterOldSSTables()). This allows SSTables to
> move beyond max_sstable_age_days and stop compacting, which means the entire
> SSTable can become fully expired and be dropped off disk as intended.
> You can see and test MTCS here: https://github.com/threatstack/mtcs
> I am not advocating that MTCS be its own standalone compaction strategy.
> However, I would like to see a configuration option for DTCS that allows you
> to specify whether old SSTables should be filtered on min or max timestamp.
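The difference between filtering on max and min timestamp described in the issue can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Cassandra's code: the `SSTable` class, the `filter_old_sstables` function, the `use_min_timestamp` flag, and the hour-based timestamps are all hypothetical names and units chosen for illustration. It assumes the filter keeps an SSTable as a compaction candidate when the chosen timestamp is at or after `now - max_sstable_age`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    name: str
    min_timestamp: int  # oldest write in the SSTable, in hours (illustrative unit)
    max_timestamp: int  # newest write in the SSTable, in hours


def filter_old_sstables(sstables, max_sstable_age, now, use_min_timestamp=False):
    """Return the SSTables still young enough to be compaction candidates.

    use_min_timestamp is a hypothetical switch standing in for the
    configuration option requested in the issue.
    """
    cutoff = now - max_sstable_age

    def age_key(sstable):
        return sstable.min_timestamp if use_min_timestamp else sstable.max_timestamp

    return [s for s in sstables if age_key(s) >= cutoff]


now = 240             # "day 10", expressed in hours
max_sstable_age = 24  # corresponds to max_sstable_age_days = 1

sstables = [
    # Result of repeated compactions: holds day-0 data, but its max timestamp
    # was pulled forward when it was merged with recent writes.
    SSTable("merged", min_timestamp=0, max_timestamp=228),
    # Recently flushed SSTable containing only new data.
    SSTable("fresh", min_timestamp=230, max_timestamp=238),
]

by_max = filter_old_sstables(sstables, max_sstable_age, now)
by_min = filter_old_sstables(sstables, max_sstable_age, now, use_min_timestamp=True)

print([s.name for s in by_max])  # ['merged', 'fresh']
print([s.name for s in by_min])  # ['fresh']
```

With max-timestamp filtering, "merged" stays a compaction candidate and can keep absorbing newer data. With min-timestamp filtering it ages out, stops compacting, and can later be dropped as a whole once all of its data has expired.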
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)