Multithread across compaction buckets
-------------------------------------

                 Key: CASSANDRA-2191
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2191
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Stu Hood
            Priority: Critical
             Fix For: 0.8


This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and 
reasoning are different enough to warrant a separate issue.

The current problem with compactions is that each one compacts only the set of 
sstables that existed at the moment it started. This means that for 
longer-running compactions (even when running as fast as the hardware allows), 
a very large number of new sstables might be created in the meantime. We have 
observed this proliferation of sstables killing read performance during 
major/high-bucketed compactions.

One approach would be to pause compactions in upper buckets (containing larger 
files) when compactions in lower buckets become possible. While this would 
likely solve the read performance problem, it does not actually help us 
perform compaction any faster, which is a reasonable requirement in other 
situations.

Instead, we need to be able to perform any compactions that are currently 
required in parallel, independent of what bucket they might be in.
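
As a rough illustration of the idea (not the actual patch), the sketch below 
groups sstable sizes into buckets and submits every eligible bucket to a thread 
pool at once, so a long compaction in an upper bucket does not block the lower 
ones. All names here (ParallelBucketCompaction, bucket, compact, 
compactAllInParallel) are hypothetical, and the bucketing rule (a file joins a 
bucket while it is no larger than 1.5x the bucket's running average size) and 
the "compaction" itself (summing sizes) are simplifying assumptions, not 
Cassandra's real logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelBucketCompaction {

    // Assumed bucketing rule: walk the sizes in ascending order and start a
    // new bucket whenever a file is more than 1.5x the current bucket average.
    static List<List<Long>> bucket(List<Long> sizes) {
        List<Long> sorted = new ArrayList<>(sizes);
        Collections.sort(sorted);
        List<List<Long>> buckets = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long sum = 0;
        for (long size : sorted) {
            if (!current.isEmpty() && size > 1.5 * sum / current.size()) {
                buckets.add(current);
                current = new ArrayList<>();
                sum = 0;
            }
            current.add(size);
            sum += size;
        }
        if (!current.isEmpty()) {
            buckets.add(current);
        }
        return buckets;
    }

    // Stand-in for a real compaction: "merges" a bucket into a single
    // sstable whose size is the sum of its inputs.
    static long compact(List<Long> bucket) {
        long total = 0;
        for (long size : bucket) {
            total += size;
        }
        return total;
    }

    // Runs one compaction task per bucket that has at least minThreshold
    // files, all concurrently, and returns the resulting sstable sizes
    // (files from buckets that were too small come first, unchanged).
    static List<Long> compactAllInParallel(List<Long> sizes, int minThreshold, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            List<Long> result = new ArrayList<>();
            for (List<Long> b : bucket(sizes)) {
                if (b.size() >= minThreshold) {
                    tasks.add(() -> compact(b));
                } else {
                    result.addAll(b);
                }
            }
            // invokeAll blocks until every task finishes and returns the
            // futures in the same order as the submitted tasks.
            for (Future<Long> f : pool.invokeAll(tasks)) {
                result.add(f.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

With a pool of at least two threads, the small-file bucket and the large-file 
bucket are compacted at the same time. A real implementation would also have to 
pick up sstables flushed while compactions are in flight, which this sketch 
does not attempt.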

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
