[ https://issues.apache.org/jira/browse/CASSANDRA-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988455#comment-15988455 ]
Corentin Chary commented on CASSANDRA-10496: -------------------------------------------- I wanted to give it a shot for TWCS because of CASSANDRA-13418, I was thinking about using a custom CompactionAwareWriter to seggregate data by timestamp in the first window (and also make --split-output work). Currently I'm planning to use partition.stats().minTimestamp, but I'm not sure how it is affect by read-repairs. It may be a better idea to group data by deletion time instead .. > Make DTCS/TWCS split partitions based on time during compaction > --------------------------------------------------------------- > > Key: CASSANDRA-10496 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10496 > Project: Cassandra > Issue Type: Improvement > Reporter: Marcus Eriksson > Labels: dtcs > Fix For: 3.11.x > > > To avoid getting old data in new time windows with DTCS (or related, like > [TWCS|CASSANDRA-9666]), we need to split out old data into its own sstable > during compaction. > My initial idea is to just create two sstables, when we create the compaction > task we state the start and end times for the window, and any data older than > the window will be put in its own sstable. > By creating a single sstable with old data, we will incrementally get the > windows correct - say we have an sstable with these timestamps: > {{[100, 99, 98, 97, 75, 50, 10]}} > and we are compacting in window {{[100, 80]}} - we would create two sstables: > {{[100, 99, 98, 97]}}, {{[75, 50, 10]}}, and the first window is now > 'correct'. The next compaction would compact in window {{[80, 60]}} and > create sstables {{[75]}}, {{[50, 10]}} etc. > We will probably also want to base the windows on the newest data in the > sstables so that we actually have older data than the window. -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org