[ https://issues.apache.org/jira/browse/CASSANDRA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stu Hood updated CASSANDRA-2191:
--------------------------------
Attachment: 0005-Add-a-harness-to-allow-compaction-tasks-that-need-to-a.txt
0004-Allow-multithread-compaction-to-be-disabled.txt
0003-Expose-multiple-compactions-via-JMX-and-a-concrete-ser.txt
0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt
0001-Add-a-compacting-set-to-DataTracker.txt
* Implemented "acquire the write lock long enough to schedule" for major
compaction, cleanup, and scrub. It is isolated in patch 0005 for clarity.
* Removed the JMX methods I had previously deprecated, and added a method that
returns serialized objects for more programmatic access. The serialized object
is a concrete CompactionInfo class, which replaces (I|A)CompactionInfo.
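
For readers who have not opened patch 0005, here is a rough sketch of what
"acquire the write lock long enough to schedule" could look like. It is not
the code in the patch. The class and method names are hypothetical, and it
assumes that ordinary compactions hold a shared read lock while they run, so
that a task needing every sstable only takes the write lock while it claims
its inputs and releases it before the long-running work starts.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical harness; none of these names are taken from the patches.
class ExclusiveScheduleSketch {
    private final ReentrantReadWriteLock compactionLock = new ReentrantReadWriteLock();
    private final Set<String> live = new HashSet<>();
    private final Set<String> compacting = new HashSet<>();

    synchronized void add(String sstable) {
        live.add(sstable);
    }

    // Major/cleanup/scrub style task: hold the write lock only while claiming.
    Set<String> scheduleExclusive() {
        compactionLock.writeLock().lock(); // waits for running readers to finish
        try {
            synchronized (this) {
                Set<String> claimed = new HashSet<>(live);
                claimed.removeAll(compacting); // skip files another task owns
                compacting.addAll(claimed);
                return claimed;
            }
        } finally {
            compactionLock.writeLock().unlock(); // the real work runs unlocked
        }
    }

    // Ordinary compaction: assumed to hold the read lock for its whole run.
    boolean runMinor(Set<String> inputs, Runnable work) {
        compactionLock.readLock().lock();
        try {
            synchronized (this) {
                for (String sstable : inputs) {
                    if (compacting.contains(sstable)) {
                        return false; // already claimed elsewhere
                    }
                }
                compacting.addAll(inputs);
            }
            try {
                work.run();
            } finally {
                finish(inputs);
            }
            return true;
        } finally {
            compactionLock.readLock().unlock();
        }
    }

    // Release claimed sstables once a task completes.
    synchronized void finish(Set<String> claimed) {
        compacting.removeAll(claimed);
    }
}
```

The point of the design is that the write lock is never held for the duration
of a major compaction, only for the short window in which its inputs are
marked, so other buckets can be scheduled again as soon as that window closes.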
> Multithread across compaction buckets
> -------------------------------------
>
> Key: CASSANDRA-2191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2191
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Stu Hood
> Assignee: Stu Hood
> Priority: Critical
> Labels: compaction
> Fix For: 0.8
>
> Attachments: 0001-Add-a-compacting-set-to-DataTracker.txt,
> 0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt,
> 0003-Expose-multiple-compactions-via-JMX-and-a-concrete-ser.txt,
> 0004-Allow-multithread-compaction-to-be-disabled.txt,
> 0005-Add-a-harness-to-allow-compaction-tasks-that-need-to-a.txt
>
>
> This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and
> reasoning are different enough to open a separate issue.
>
> The problem with compactions currently is that they compact the set of
> sstables that existed the moment the compaction started. This means that for
> longer running compactions (even when running as fast as possible on the
> hardware), a very large number of new sstables might be created in the
> meantime. We have observed this proliferation of sstables killing performance
> during major/high-bucketed compactions.
>
> One approach would be to pause compactions in upper buckets (containing
> larger files) when compactions in lower buckets become possible. While this
> would likely solve the problem with read performance, it does not actually
> help us perform compaction any faster, which is a reasonable requirement for
> other situations.
>
> Instead, we need to be able to perform any compactions that are currently
> required in parallel, independent of what bucket they might be in.
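
To make the idea in the description concrete, here is a minimal sketch of how
a "compacting set" (the subject of patches 0001 and 0002) might let several
buckets be compacted at once. It is an illustration under assumptions, not the
DataTracker code: sstables are represented as plain strings, and the class and
method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the "compacting set"; not the DataTracker API.
class CompactingSetSketch {
    private final Set<String> compacting = new HashSet<>();

    // Claim every sstable in the bucket, or none of them.
    synchronized boolean tryMarkCompacting(Collection<String> bucket) {
        for (String sstable : bucket) {
            if (compacting.contains(sstable)) {
                return false; // another compaction already owns this file
            }
        }
        compacting.addAll(bucket);
        return true;
    }

    // Release a bucket once its compaction has finished.
    synchronized void unmarkCompacting(Collection<String> bucket) {
        compacting.removeAll(bucket);
    }

    // Return the buckets that were claimed; each could run on its own thread.
    List<List<String>> claimRunnable(List<List<String>> buckets) {
        List<List<String>> claimed = new ArrayList<>();
        for (List<String> bucket : buckets) {
            if (!bucket.isEmpty() && tryMarkCompacting(bucket)) {
                claimed.add(bucket);
            }
        }
        return claimed;
    }
}
```

Because a bucket is only claimed when none of its files are already being
compacted, newly flushed sstables in a lower bucket can be scheduled while a
long compaction of an upper bucket is still running.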