[ https://issues.apache.org/jira/browse/CASSANDRA-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955314#comment-14955314 ]
Björn Hegerfors commented on CASSANDRA-10496:
---------------------------------------------
If we keep using minTimestamp in DTCS, which is what we've used so far, then we
should flip the logic here and split out anything that's too new for the window
instead.
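
As a rough illustration of the flipped logic (a hypothetical sketch, not
Cassandra code): if the window is chosen from the sstable's minTimestamp, the
data that does not fit is whatever is newer than the window. The function name,
the window bounds and the inclusive boundary are all assumptions made for the
example.

```python
def split_out_too_new(timestamps, window_newest):
    """Hypothetical helper: separate data that fits the window (or is older)
    from data that is too new for it.

    window_newest is the newest timestamp the window accepts; treating it as
    inclusive is an assumption of this sketch.
    """
    fits = [t for t in timestamps if t <= window_newest]
    too_new = [t for t in timestamps if t > window_newest]
    return fits, too_new


# Timestamps borrowed from the ticket description; minTimestamp is 10,
# so assume the sstable is assigned to a window [20, 0].
sstable = [100, 99, 98, 97, 75, 50, 10]
fits, too_new = split_out_too_new(sstable, window_newest=20)
print(fits, too_new)  # -> [10] [100, 99, 98, 97, 75, 50]
```
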
Perhaps it's worth having a discussion about the pros and cons of using
minTimestamp vs maxTimestamp for DTCS, though maybe not here, even if it is
quite strongly tied to this ticket: if SSTables were split according to the
time windows, then min/max would make no difference. The choice only matters
for the SSTables that need to be split. Things to take into account are how
non-fitting SSTables could end up on a node, what happens when switching to
this strategy, and what to do with major compaction.
> Make DTCS split partitions based on time during compaction
> ----------------------------------------------------------
>
> Key: CASSANDRA-10496
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10496
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Marcus Eriksson
> Labels: dtcs
> Fix For: 3.x
>
>
> To avoid getting old data in new time windows with DTCS (or related, like
> [TWCS|CASSANDRA-9666]), we need to split out old data into its own sstable
> during compaction.
> My initial idea is to just create two sstables: when we create the compaction
> task we state the start and end times for the window, and any data older than
> the window will be put in its own sstable.
> By creating a single sstable with old data, we will incrementally get the
> windows correct - say we have an sstable with these timestamps:
> {{[100, 99, 98, 97, 75, 50, 10]}}
> and we are compacting in window {{[100, 80]}} - we would create two sstables:
> {{[100, 99, 98, 97]}}, {{[75, 50, 10]}}, and the first window is now
> 'correct'. The next compaction would compact in window {{[80, 60]}} and
> create sstables {{[75]}}, {{[50, 10]}} etc.
> We will probably also want to base the windows on the newest data in the
> sstables so that we actually have older data than the window.
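
The incremental splitting in the quoted description can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
proposed implementation: timestamps are modelled as plain integers, the
function name is hypothetical, and data exactly on the window's oldest bound
is assumed to stay in the window.

```python
def split_out_old(timestamps, window_oldest):
    """Hypothetical helper: split one sstable's timestamps into data that
    belongs to the window and data older than the window.

    window_oldest is the oldest timestamp the window accepts (inclusive by
    assumption). Nothing is checked against the newest bound, since the
    description only splits out older data.
    """
    in_window = [t for t in timestamps if t >= window_oldest]
    older = [t for t in timestamps if t < window_oldest]
    return in_window, older


sstable = [100, 99, 98, 97, 75, 50, 10]

# First compaction, window [100, 80]
first, rest = split_out_old(sstable, window_oldest=80)
print(first, rest)  # -> [100, 99, 98, 97] [75, 50, 10]

# Next compaction, window [80, 60], applied to the sstable of old data
second, remainder = split_out_old(rest, window_oldest=60)
print(second, remainder)  # -> [75] [50, 10]
```

Each pass leaves one window "correct" and pushes everything older into a
single sstable, which the next compaction then splits again.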
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)