[ https://issues.apache.org/jira/browse/CASSANDRA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stu Hood updated CASSANDRA-2191:
--------------------------------
Attachment: 0004-Try-harder-to-close-scanners-in-compaction-close.txt
0003-Expose-multiple-compactions-via-JMX-and-deprecate-sing.txt
0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt
0001-Add-a-compacting-set-to-DataTracker.txt
> Multithread across compaction buckets
> -------------------------------------
>
> Key: CASSANDRA-2191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2191
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Stu Hood
> Priority: Critical
> Labels: compaction
> Fix For: 0.8
>
> Attachments: 0001-Add-a-compacting-set-to-DataTracker.txt,
> 0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt,
> 0003-Expose-multiple-compactions-via-JMX-and-deprecate-sing.txt,
> 0004-Try-harder-to-close-scanners-in-compaction-close.txt
>
>
> This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and
> reasoning are different enough to open a separate issue.
> The current problem with compactions is that each one compacts only the set
> of sstables that existed at the moment it started. This means that during
> longer-running compactions (even when running as fast as the hardware
> allows), a very large number of new sstables might be created in the
> meantime. We have observed this proliferation of sstables killing read
> performance during major/high-bucketed compactions.
> One approach would be to pause compactions in upper buckets (containing
> larger files) when compactions in lower buckets become possible. While this
> would likely solve the read performance problem, it would not actually help
> us complete compactions any faster, which is a reasonable requirement in
> other situations.
> Instead, we need to be able to perform any compactions that are currently
> required in parallel, independent of what bucket they might be in.
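
To make the idea concrete, here is a minimal sketch of how a "compacting set" could let compactions in different buckets run in parallel without two of them picking up the same sstable. It is not the code in the attached patches: the class name CompactingSetSketch, the methods markCompacting, unmarkCompacting, compact, and submitBuckets, and the use of plain strings to stand in for sstables are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch only; names and structure are not taken from the patches. */
class CompactingSetSketch {
    // SSTables (represented here by name) currently claimed by a running compaction.
    private final Set<String> compacting = new HashSet<>();

    /** Claims every candidate atomically, or claims nothing if any is already compacting. */
    synchronized boolean markCompacting(Set<String> candidates) {
        for (String sstable : candidates) {
            if (compacting.contains(sstable)) {
                return false;
            }
        }
        compacting.addAll(candidates);
        return true;
    }

    /** Releases sstables once their compaction has finished or failed. */
    synchronized void unmarkCompacting(Set<String> candidates) {
        compacting.removeAll(candidates);
    }

    /** Placeholder for the actual merge work (assumption: does nothing here). */
    void compact(Set<String> bucket) {
    }

    /**
     * Submits one task per bucket whose sstables could all be claimed.
     * Buckets that overlap an in-progress compaction are skipped for now.
     */
    List<Future<?>> submitBuckets(List<Set<String>> buckets, ExecutorService pool) {
        List<Future<?>> started = new ArrayList<>();
        for (Set<String> bucket : buckets) {
            if (!markCompacting(bucket)) {
                continue;
            }
            started.add(pool.submit(() -> {
                try {
                    compact(bucket);
                } finally {
                    // Always release the claim, even if the compaction throws.
                    unmarkCompacting(bucket);
                }
            }));
        }
        return started;
    }
}
```

In this sketch a bucket is claimed all-or-nothing, so a compaction in a lower bucket can start while a long-running one in an upper bucket is still in progress, provided the two do not share an sstable. Skipped buckets would need to be retried on a later scheduling pass, which the sketch leaves out.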