[ 
https://issues.apache.org/jira/browse/CASSANDRA-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307963#comment-14307963
 ] 

Björn Hegerfors edited comment on CASSANDRA-7019 at 2/5/15 8:59 PM:
--------------------------------------------------------------------

I posted a related ticket some time ago, CASSANDRA-8359. In particular, the 
side note at the end is essentially this ticket exactly, for DTCS. A solution 
to this ticket may or may not solve the main issue in that ticket, but that's a 
matter for that ticket.

Since DTCS SSTables are (supposed to be) separated into time windows, we have 
the concept of an _oldest_ SSTable in a way that we don't with STCS. To me it 
seems pretty clear that a multi-SSTable tombstone compaction on _n_ SSTables 
should always target the _n_ oldest ones. The oldest one alone is practically 
guaranteed to overlap with any other SSTable, in terms of tokens. So picking 
the right SSTables for multi-tombstone compaction should be as easy as sorting 
by age (min timestamp), taking the oldest one, and including the newer ones in 
succession, checking at which point the tombstone ratio is the highest. Or 
something close to that, anyway. Then we might as well write them back as a 
single SSTable, I don't see why not.
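
To make the selection rule above concrete, here is a minimal sketch in Java. It is not taken from the Cassandra code base: SSTableInfo, OldestPrefixPicker, and the droppableTombstones/totalCells fields are hypothetical stand-ins for whatever per-SSTable estimates are actually available. It also assumes the combined tombstone ratio of a group can be approximated by summing per-SSTable estimates, which ignores that more tombstones become droppable once the data they shadow is included.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical per-SSTable metadata; a stand-in, not Cassandra's actual API. */
final class SSTableInfo {
    final String name;
    final long minTimestamp;        // oldest write timestamp in the SSTable
    final long droppableTombstones; // assumed estimate of droppable tombstones
    final long totalCells;          // assumed estimate of total cells

    SSTableInfo(String name, long minTimestamp, long droppableTombstones, long totalCells) {
        this.name = name;
        this.minTimestamp = minTimestamp;
        this.droppableTombstones = droppableTombstones;
        this.totalCells = totalCells;
    }
}

final class OldestPrefixPicker {
    /**
     * Sorts candidates by age (min timestamp), then grows a prefix starting
     * from the oldest SSTable and returns the prefix whose combined
     * tombstone ratio is highest. Ties keep the shorter prefix.
     */
    static List<SSTableInfo> pick(List<SSTableInfo> candidates) {
        List<SSTableInfo> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong((SSTableInfo s) -> s.minTimestamp));

        long tombstones = 0;
        long cells = 0;
        double bestRatio = -1.0;
        int bestLength = 0;
        for (int i = 0; i < sorted.size(); i++) {
            tombstones += sorted.get(i).droppableTombstones;
            cells += sorted.get(i).totalCells;
            double ratio = cells == 0 ? 0.0 : (double) tombstones / cells;
            if (ratio > bestRatio) {
                bestRatio = ratio;
                bestLength = i + 1;
            }
        }
        return new ArrayList<>(sorted.subList(0, bestLength));
    }
}
```

The returned list is the set that would be compacted together and written back as a single SSTable. A real implementation would presumably also cap _n_ and skip SSTables that are already being compacted.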

EDIT: moved the following to CASSANDRA-7272, where it belongs.

-As for the STCS case, I don't understand why major compaction for STCS isn't 
already optimal. I do see why one might want to compact some but not all 
SSTables in a multi-tombstone compaction (though DTCS should be a better fit 
for anyone wanting this). But if every single SSTable is being rewritten to 
disk, why not write them into one file? As far as I understand, the ultimate 
goal of STCS is to be one SSTable. STCS only gets there, the natural way, once 
in a blue moon. But that's the optimal state that it can be in. Am I 
wrong?-

-The only explanation I can see for splitting the result of compacting all 
SSTables into fragments, is if those fragments are:-
-1. Partitioned smartly. For example into separate token ranges (à la LCS), 
timestamp ranges (à la DTCS) or clustering column ranges (which would be 
interesting). Or a combination of these.-
-2. The structure upheld by the resulting fragments is not subsequently 
demolished by the running compaction strategy going on with its usual business.-


was (Author: bj0rn):
I posted a related ticket some time ago, CASSANDRA-8359. In particular, the 
side note at the end is essentially this ticket exactly, for DTCS. A solution 
to this ticket may or may not solve the main issue in that ticket, but that's a 
matter for that ticket.

Since DTCS SSTables are (supposed to be) separated into time windows, we have 
the concept of an _oldest_ SSTable in a way that we don't with STCS. To me it 
seems pretty clear that a multi-SSTable tombstone compaction on _n_ SSTables 
should always target the _n_ oldest ones. The oldest one alone is practically 
guaranteed to overlap with any other SSTable, in terms of tokens. So picking 
the right SSTables for multi-tombstone compaction should be as easy as sorting 
by age (min timestamp), taking the oldest one, and including the newer ones in 
succession, checking at which point the tombstone ratio is the highest. Or 
something close to that, anyway. Then we might as well write them back as a 
single SSTable, I don't see why not.

EDIT: moved all of the below to CASSANDRA-7272, where it belongs.

-As for the STCS case, I don't understand why major compaction for STCS isn't 
already optimal. I do see why one might want to compact some but not all 
SSTables in a multi-tombstone compaction (though DTCS should be a better fit 
for anyone wanting this). But if every single SSTable is being rewritten to 
disk, why not write them into one file? As far as I understand, the ultimate 
goal of STCS is to be one SSTable. STCS only gets there, the natural way, once 
in a blue moon. But that's the optimal state that it can be in. Am I 
wrong?-

-The only explanation I can see for splitting the result of compacting all 
SSTables into fragments, is if those fragments are:-
-1. Partitioned smartly. For example into separate token ranges (à la LCS), 
timestamp ranges (à la DTCS) or clustering column ranges (which would be 
interesting). Or a combination of these.-
-2. The structure upheld by the resulting fragments is not subsequently 
demolished by the running compaction strategy going on with its usual business.-

> Improve tombstone compactions
> -----------------------------
>
>                 Key: CASSANDRA-7019
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7019
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Branimir Lambov
>              Labels: compaction
>             Fix For: 3.0
>
>
> When there are no other compactions to do, we trigger a single-sstable 
> compaction if there is more than X% droppable tombstones in the sstable.
> In this ticket we should try to include overlapping sstables in those 
> compactions to be able to actually drop the tombstones. Might only be doable 
> with LCS (with STCS we would probably end up including all sstables)


