[
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081442#comment-13081442
]
Benjamin Coverston commented on CASSANDRA-1608:
-----------------------------------------------
bq. Could we allow concurrency between non-overlapping compactions? I assume
that's what makes things tricky.
TL;DR: Yes, but we would have to disable the DataTracker's ability to evict
individual sstables from a currently scheduled compaction and instead have it
remove all of them, effectively canceling the scheduled compaction.
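Roughly, the semantics I'm after look like this (a sketch only; the class and
method names here are made up for illustration, not the real DataTracker API):

{code:java}
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a scheduled compaction's input set.
final class ScheduledCompaction {
    final Set<String> sstables;          // identifiers of the input sstables
    private boolean cancelled = false;

    ScheduledCompaction(Set<String> sstables) {
        this.sstables = new HashSet<>(sstables);
    }

    boolean isCancelled() { return cancelled; }
    void cancel() { cancelled = true; }
}

final class TrackerSketch {
    // Today the tracker can evict a single sstable from a scheduled task.
    // The change sketched here: if any input goes away, cancel the whole
    // task rather than shrinking its input set.
    static void onSSTableRemoved(ScheduledCompaction task, String removed) {
        if (task.sstables.contains(removed)) {
            task.cancel();               // drop the whole compaction ...
            task.sstables.clear();       // ... instead of evicting one input
        }
    }
}
{code}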
A compaction from L0 to L1 causes every SSTable in L1 to be turned over.
Because of that, I can't schedule concurrent {L0, L1} and {L1, L2} compactions
without running the risk of the source SSTables of the {L1, L2} compaction
being evicted by the data tracker.
After L1 it gets a bit easier. If I can force compactions to complete and
enter a serial mode for {L0, L1}, or at least tie L0, L1, and L2 together,
then concurrently compacting the higher levels is a little easier and falls
under a scenario of exclusion.
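Something like the following rule, just to illustrate the exclusion I mean
(the names are invented for the sketch; this isn't our actual scheduler):

{code:java}
import java.util.Collections;
import java.util.Set;

// Hypothetical task: compacts sourceLevel into sourceLevel + 1.
final class CompactionTask {
    final int sourceLevel;
    final Set<String> inputs;            // sstables consumed by this task

    CompactionTask(int sourceLevel, Set<String> inputs) {
        this.sourceLevel = sourceLevel;
        this.inputs = inputs;
    }
}

final class ExclusionRule {
    // Tie L0, L1, and L2 together: any two tasks sourced below L2 run
    // serially. Above that, tasks may run concurrently as long as their
    // input sets are disjoint.
    static boolean mayRunConcurrently(CompactionTask a, CompactionTask b) {
        boolean aLow = a.sourceLevel <= 1;   // touches L0/L1/L2
        boolean bLow = b.sourceLevel <= 1;
        if (aLow && bLow)
            return false;                    // serialize the low levels
        return Collections.disjoint(a.inputs, b.inputs);
    }
}
{code}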
L2 - 1 ... 10
L3 - 1 ... 100
We start by choosing L2.1 and find that it overlaps L3.1-11, so we start a
compaction with 12 SSTables. At this point we don't know what the state of the
levels will look like after the compaction is complete, but we can guess.
If we think that after removing max_sstable_size worth of data from L2 we'll
still need more compactions, we can move on to scheduling a new compaction
using L2.2, but L2.2 may still overlap one of the sstables in L3.1-11. So we
could keep choosing new candidates from that level until we find one that
doesn't overlap the sstables currently being compacted into the next level.
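In rough Java, the selection loop I'm describing (the ranges and names are
stand-ins, not Cassandra internals):

{code:java}
import java.util.List;
import java.util.Set;

final class Candidates {
    // [first, last] token range covered by an sstable, inclusive.
    record Range(long first, long last) {
        boolean overlaps(Range other) {
            return first <= other.last && other.first <= last;
        }
    }

    // Walk the L2 candidates in order and pick the first one whose range
    // does not overlap any sstable already being compacted into L3.
    static Range nextCandidate(List<Range> level2, Set<Range> compactingIntoL3) {
        for (Range candidate : level2) {
            boolean clear = compactingIntoL3.stream()
                                            .noneMatch(candidate::overlaps);
            if (clear)
                return candidate;   // e.g. L2.2, if it misses L3.1-11
        }
        return null;                // every candidate overlaps; wait
    }
}
{code}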
> Redesigned Compaction
> ---------------------
>
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Chris Goffinet
> Assignee: Benjamin Coverston
> Attachments: 1608-v11.txt, 1608-v2.txt
>
>
> After seeing the I/O issues in CASSANDRA-1470, I've been doing some more
> thinking on this subject that I wanted to lay out.
> I propose we redo the concept of how compaction works in Cassandra. At the
> moment, compaction is kicked off based on write access patterns, not read
> access patterns. In most cases, you want the opposite. You want to be able to
> track how well each SSTable is performing in the system. If we were to keep
> in-memory statistics for each SSTable and prioritize them based on access
> frequency and bloom filter hit/miss ratios, we could intelligently group
> sstables that are being read most often and schedule them for compaction. We
> could also schedule lower-priority maintenance on SSTables not often accessed.
> I also propose we limit the size of each SSTable to a fixed size; that gives
> us the ability to better utilize our bloom filters in a predictable manner.
> At the moment, after a certain size, the bloom filters become less reliable.
> This would also allow us to group the most-accessed data. Currently the size
> of an SSTable can grow to a point where large portions of the data might not
> actually be accessed as often.
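To illustrate the prioritization the quoted description proposes, a minimal
sketch of a read-heat ordering (the stats fields here are hypothetical; the
real counters would have to come from the sstable read path):

{code:java}
import java.util.Comparator;
import java.util.List;

final class HeatSketch {
    // Hypothetical per-sstable counters, kept in memory.
    record SSTableStats(String name, long reads, long bloomHits, long bloomMisses) {
        double missRatio() {
            long total = bloomHits + bloomMisses;
            return total == 0 ? 0.0 : (double) bloomMisses / total;
        }
    }

    // Hotter (more reads) and leakier (higher miss ratio) sorts first,
    // so those sstables are scheduled for compaction ahead of cold ones.
    static final Comparator<SSTableStats> PRIORITY =
        Comparator.comparingLong(SSTableStats::reads)
                  .thenComparingDouble(SSTableStats::missRatio)
                  .reversed();

    static List<SSTableStats> prioritize(List<SSTableStats> stats) {
        return stats.stream().sorted(PRIORITY).toList();
    }
}
{code}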