Multithread across compaction buckets
-------------------------------------
Key: CASSANDRA-2191
URL: https://issues.apache.org/jira/browse/CASSANDRA-2191
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Stu Hood
Priority: Critical
Fix For: 0.8
This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and
reasoning are different enough to open a separate issue.
The problem with compactions currently is that each one compacts only the set
of sstables that existed at the moment it started. This means that during a
long-running compaction (even one running as fast as the hardware allows), a
very large number of new sstables might be created in the meantime. We have
observed this proliferation of sstables killing read performance during
major/high-bucketed compactions.
One approach would be to pause compactions in upper buckets (containing larger
files) when compactions in lower buckets become possible. While this would
likely solve the problem with read performance, it does not actually help us
perform compaction any faster, which is a reasonable requirement for other
situations.
Instead, we need to be able to run all currently required compactions in
parallel, regardless of which bucket they are in.
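
As a rough illustration of the idea (not a proposed patch), the sketch below
submits every eligible bucket to a thread pool as its own task, so that a long
compaction of large files does not hold up compaction of small, newly created
ones. Everything in it is a hypothetical stand-in: ParallelBucketCompaction,
MIN_THRESHOLD, compactBucket and the representation of an sstable as a bare
size are assumptions made for the example, not Cassandra's actual classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelBucketCompaction {
    // Hypothetical threshold: a bucket is eligible once it holds this many sstables.
    static final int MIN_THRESHOLD = 4;

    // Hypothetical stand-in for merging one bucket. An sstable is modelled as
    // just its size; the "compacted" result is the sum of the input sizes.
    static long compactBucket(List<Long> sstableSizes) {
        long total = 0;
        for (long size : sstableSizes) {
            total += size;
        }
        return total;
    }

    // Submit every eligible bucket as an independent task, then wait for all
    // of them. Results come back in the order the buckets were submitted.
    static List<Long> compactEligibleBuckets(List<List<Long>> buckets, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (List<Long> bucket : buckets) {
                if (bucket.size() >= MIN_THRESHOLD) {
                    Callable<Long> task = () -> compactBucket(bucket);
                    pending.add(pool.submit(task));
                }
            }
            List<Long> mergedSizes = new ArrayList<>();
            for (Future<Long> result : pending) {
                mergedSizes.add(result.get());
            }
            return mergedSizes;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<Long>> buckets = List.of(
            List.of(1L, 1L, 1L, 1L),               // small files, eligible
            List.of(100L, 120L),                   // too few files, skipped
            List.of(1000L, 1100L, 900L, 1000L));   // large files, eligible
        System.out.println(compactEligibleBuckets(buckets, 2)); // prints [4, 4000]
    }
}
```

With two threads, the small bucket and the large bucket are compacted
concurrently instead of one waiting behind the other. A real implementation
would also need to bound total I/O and keep two tasks from claiming the same
sstable, which this sketch does not attempt.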
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira