[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155144#comment-14155144
 ] 

Jonathan Ellis edited comment on CASSANDRA-7019 at 10/1/14 5:21 PM:
--------------------------------------------------------------------

bq. The problem with starting in high levels is that it will take a long time 
before that data gets included in a (minor) compaction.

But you still have that problem with the "start at L1 and fill every level up 
as you go" approach, just with 90% of your data instead of 100%.

IMO the two options that make the most sense are:

# Just rewrite the existing tables minus tombstones without merging or changing 
levels, as originally proposed
# Write all the sstables out, then pick a level for them when complete such 
that all the sstables fit in the level (and they don't overlap with anything 
flushed + compacted by other threads in the meantime)
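
A minimal sketch of the second option, to make the "pick a level when complete" step concrete. This is not Cassandra's code: the `SSTable` type, `pick_level`, the fanout of 10, and the capacity rule (`sstable_size * fanout ** level`) are assumptions made for illustration, and choosing the lowest level that qualifies is one possible policy, not something stated in the comment above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable: a token range plus a size."""
    first_token: int  # inclusive
    last_token: int   # inclusive
    size_bytes: int


def level_capacity(level: int, sstable_size: int, fanout: int = 10) -> int:
    # Assumed capacity rule: each level holds `fanout` times more than the previous one.
    return sstable_size * fanout ** level


def overlaps(a: SSTable, b: SSTable) -> bool:
    # Two inclusive token ranges overlap if neither ends before the other starts.
    return a.first_token <= b.last_token and b.first_token <= a.last_token


def pick_level(
    written: List[SSTable],
    existing_by_level: Dict[int, List[SSTable]],
    sstable_size: int,
    fanout: int = 10,
    max_level: int = 8,
) -> Optional[int]:
    """Return the lowest level (>= 1) where all written sstables fit and
    none overlaps an sstable that other threads placed there meanwhile."""
    total = sum(s.size_bytes for s in written)
    for level in range(1, max_level + 1):
        if total > level_capacity(level, sstable_size, fanout):
            continue  # the rewritten sstables do not all fit in this level
        occupied = existing_by_level.get(level, [])
        if any(overlaps(w, e) for w in written for e in occupied):
            continue  # something flushed + compacted in the meantime is in the way
        return level
    return None  # no suitable level; the caller has to fall back to something else
```

The level is chosen only after every sstable has been written, so the check sees whatever other threads flushed and compacted during the rewrite.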


was (Author: jbellis):
bq. The problem with starting in high levels is that it will take a long time 
before that data gets included in a (minor) compaction.

But you already have that problem, just with 90% of your data instead of 100%.

IMO the two options that make the most sense are:

# Just rewrite the existing tables minus tombstones without merging or changing 
levels, as originally proposed
# Write all the sstables out, then pick a level for them when complete such 
that all the sstables fit in the level (and they don't overlap with anything 
flushed + compacted by other threads in the meantime)

> Major tombstone compaction
> --------------------------
>
>                 Key: CASSANDRA-7019
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>              Labels: compaction
>             Fix For: 3.0
>
>
> It should be possible to do a "major" tombstone compaction by including all 
> sstables, but writing them out 1:1, meaning that if you have 10 sstables 
> before, you will have 10 sstables after the compaction with the same data, 
> minus all the expired tombstones.
> We could do this in two ways:
> # a nodetool command that includes _all_ sstables
> # once we detect that an sstable has more than x% (20%?) expired tombstones, 
> we start one of these compactions, and include all overlapping sstables that 
> contain older data.
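
A rough sketch of the second, automatic trigger described in the issue. All names here (`SSTableInfo`, `tombstone_compaction_candidates`, the `expired_tombstone_ratio` field) are hypothetical, and "contains older data" is interpreted as "has a minimum timestamp below the triggering sstable's maximum timestamp", which is an assumption rather than something the description specifies. The 20% threshold is the tentative value from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTableInfo:
    """Hypothetical per-sstable metadata used only for this illustration."""
    name: str
    first_token: int  # inclusive
    last_token: int   # inclusive
    min_timestamp: int
    max_timestamp: int
    expired_tombstone_ratio: float  # fraction in [0, 1]


def ranges_overlap(a: SSTableInfo, b: SSTableInfo) -> bool:
    # Inclusive token ranges overlap if neither ends before the other starts.
    return a.first_token <= b.last_token and b.first_token <= a.last_token


def tombstone_compaction_candidates(
    sstables: List[SSTableInfo], threshold: float = 0.20
) -> List[List[SSTableInfo]]:
    """For each sstable above the threshold, build one group: the trigger
    followed by every overlapping sstable that may contain older data."""
    groups = []
    for trigger in sstables:
        if trigger.expired_tombstone_ratio <= threshold:
            continue
        older_overlapping = [
            other
            for other in sstables
            if other is not trigger
            and ranges_overlap(trigger, other)
            # Assumed meaning of "older data": some of it predates the
            # newest write in the triggering sstable.
            and other.min_timestamp < trigger.max_timestamp
        ]
        groups.append([trigger] + older_overlapping)
    return groups
```

Each returned group would then be rewritten 1:1 as the description proposes; that rewrite step is not shown here.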



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
