[
https://issues.apache.org/jira/browse/CASSANDRA-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025316#comment-13025316
]
Peter Schuller commented on CASSANDRA-2559:
-------------------------------------------
This may intersect with the problem of AES on small CFs having to wait for
huge, long-running AES jobs on large CFs. I didn't file that because I was
going to figure out whether the concurrent compaction work already addressed
it. I take it that it doesn't, but this would help.
So, that's another potential motivation.
> Distinguish long and short running compactions
> ----------------------------------------------
>
> Key: CASSANDRA-2559
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2559
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Sylvain Lebresne
> Priority: Minor
> Labels: compaction
>
> Unless you have SSDs, multi-threaded compaction is mainly here to avoid
> accumulating lots of newly flushed sstables while a long-lasting compaction
> is running. But too many concurrent compactions are bad for random IO.
> CASSANDRA-2558 will allow limiting the number of such concurrent compactions,
> but choosing the right number there is not easy. If you pick too low a
> number, you risk accumulating "young" sstables if 2 or 3 fairly long
> compactions run at the same time. On the other hand, compacting multiple
> "small" sstables concurrently is likely to be less efficient (on a spinning
> disk) than compacting them serially.
> It seems to me we could have the best of both worlds by distinguishing long
> and short compactions. We could have 2 pools of threads, one for long
> compactions (whatever the exact definition is) and one for short ones. With
> this, even with one thread in each pool you would avoid most of the 'new
> sstable accumulation' problem while making sure you never run too many
> concurrent compactions (note that in theory we could stratify further than
> "short" and "long", but I'm not sure the benefits would outweigh the added
> complexity).
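
The two-pool idea in the description above could be sketched roughly as
follows. This is only an illustration, not Cassandra code: the class
TieredCompactionExecutor, its methods, and the size-based cutoff are all
hypothetical, and the issue deliberately leaves the definition of "long"
open. Here a compaction is assumed to be "long" when its estimated input
size reaches a configurable threshold.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: routes compaction tasks to a "short" or a "long" pool. */
public class TieredCompactionExecutor {
    // Assumed cutoff in bytes; the issue does not define what "long" means.
    private final long longThresholdBytes;
    private final ExecutorService shortPool;
    private final ExecutorService longPool;

    public TieredCompactionExecutor(long longThresholdBytes, int shortThreads, int longThreads) {
        this.longThresholdBytes = longThresholdBytes;
        this.shortPool = Executors.newFixedThreadPool(shortThreads);
        this.longPool = Executors.newFixedThreadPool(longThreads);
    }

    /** True when the estimated input size classifies the compaction as long-running. */
    public boolean isLong(long estimatedBytes) {
        return estimatedBytes >= longThresholdBytes;
    }

    /** Submits the task to the pool matching its estimated size. */
    public Future<?> submit(long estimatedBytes, Runnable compaction) {
        ExecutorService pool = isLong(estimatedBytes) ? longPool : shortPool;
        return pool.submit(compaction);
    }

    /** Stops accepting tasks and waits up to a minute per pool for running ones. */
    public void shutdown() throws InterruptedException {
        shortPool.shutdown();
        longPool.shutdown();
        shortPool.awaitTermination(1, TimeUnit.MINUTES);
        longPool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

With one thread in each pool, at most two compactions run at once, and a
compaction of newly flushed sstables never queues behind a long-running one,
which is the property the description is after.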
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira