[ https://issues.apache.org/jira/browse/CASSANDRA-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147177#comment-16147177 ]
ASF GitHub Bot commented on CASSANDRA-10496:
--------------------------------------------
GitHub user iksaif opened a pull request:
https://github.com/apache/cassandra/pull/147
[wip] CASSANDRA-10496
Done:
- --split-output kind of works when running nodetool compact
- Values with timestamps outside of the first window should be isolated
and merged back into the correct sstable (which may be quite costly for now,
as it involves rewriting huge sstables for single values)
TODO:
- Unit tests
- Fix remaining TODOs
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/iksaif/cassandra cassandra-10496-trunk
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/cassandra/pull/147.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #147
----
commit 785182fa9f977c65c201cea135ffc8076170276d
Author: Corentin Chary <[email protected]>
Date: 2017-04-28T09:49:56Z
CASSANDRA-10496
Done:
- --split-output kind of works when running nodetool compact
- Values with timestamps outside of the first window should be isolated
and merged back into the correct sstable (which may be quite costly for now,
as it involves rewriting huge sstables for single values)
TODO:
- Unit tests
- Fix remaining TODOs
----
> Make DTCS/TWCS split partitions based on time during compaction
> ---------------------------------------------------------------
>
> Key: CASSANDRA-10496
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10496
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Marcus Eriksson
> Labels: dtcs
> Fix For: 4.x
>
>
> To avoid getting old data in new time windows with DTCS (or related, like
> [TWCS|CASSANDRA-9666]), we need to split out old data into its own sstable
> during compaction.
> My initial idea is to create just two sstables: when we create the compaction
> task, we state the start and end times for the window, and any data older than
> the window is put in its own sstable.
> By creating a single sstable with old data, we will incrementally get the
> windows correct. Say we have an sstable with these timestamps:
> {{[100, 99, 98, 97, 75, 50, 10]}}
> and we are compacting in window {{[100, 80]}}. We would create two sstables,
> {{[100, 99, 98, 97]}} and {{[75, 50, 10]}}, and the first window is now
> 'correct'. The next compaction would compact in window {{[80, 60]}} and
> create sstables {{[75]}} and {{[50, 10]}}, and so on.
> We will probably also want to base the windows on the newest data in the
> sstables so that we actually have data older than the window.
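
The splitting described in the issue can be sketched roughly as follows. This
is only an illustration in Python, not code from the patch or from Cassandra:
the function name split_by_window is hypothetical, sstables are modelled as
plain lists of timestamps, and the lower bound of a window is assumed to be
inclusive (the issue does not say how boundaries are treated).

```python
from typing import List, Tuple


def split_by_window(timestamps: List[int], window_start: int) -> Tuple[List[int], List[int]]:
    """Split timestamps into (inside the window, older than the window).

    Hypothetical helper: window_start is the oldest timestamp that still
    belongs to the window being compacted (assumed inclusive).
    """
    in_window = [t for t in timestamps if t >= window_start]
    older = [t for t in timestamps if t < window_start]
    return in_window, older


# The example from the issue description.
sstable = [100, 99, 98, 97, 75, 50, 10]

# First compaction, window [100, 80]: old data goes to its own sstable.
first_window, old_data = split_by_window(sstable, 80)

# Next compaction, window [80, 60], applied to the old-data sstable.
second_window, older_data = split_by_window(old_data, 60)

print(first_window, old_data)      # [100, 99, 98, 97] [75, 50, 10]
print(second_window, older_data)   # [75] [50, 10]
```

Each pass only produces two outputs, so the windows become correct one at a
time, which matches the incremental behaviour described in the issue.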