[
https://issues.apache.org/jira/browse/CASSANDRA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969829#action_12969829
]
Ryan King commented on CASSANDRA-1083:
--------------------------------------
To be honest, I'm not sure this is the best approach anymore. I think the
fundamental problem is that it's driven by write traffic, not read
traffic.
> Improvement to CompactionManager's submitMinorIfNeeded
> ------------------------------------------------------
>
> Key: CASSANDRA-1083
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1083
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Ryan King
> Assignee: Tyler Hobbs
> Priority: Minor
> Fix For: 0.7.1
>
> Attachments: 1083-configurable-compaction-thresholds.patch,
> compaction_simulation.rb, compaction_simulation.rb
>
>
> We've discovered that we are unable to tune compaction the way we want for
> our production cluster. I think the problem is that the current algorithm
> doesn't sort the sstables by size before bucketing them, which means the
> tuning parameters have unpredictable results.
> I looked at CASSANDRA-792, but it seems like overkill. Here's an alternative
> proposal:
> config options:
> minimumCompactionThreshold
> maximumCompactionThreshold
> targetSSTableCount
> The first two would mean what they currently mean: the bounds on how many
> sstables to compact in one compaction operation. The 3rd is a target for how
> many SSTables you'd like to have.
> Pseudo code algorithm for determining whether or not to do a minor compaction:
> {noformat}
> if sstables.length + minimumCompactionThreshold - 1 > targetSSTableCount
>     sort sstables from smallest to largest
>     compact up to maximumCompactionThreshold of the smallest sstables
> {noformat}
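
For illustration, the pseudocode above could be sketched in Python roughly as
follows. This is only a sketch under stated assumptions, not Cassandra's
implementation: the names CompactionConfig and select_for_minor_compaction are
hypothetical, and sstables are represented by their sizes alone. The proposal
does not say what should happen when fewer than minimumCompactionThreshold
sstables exist, so the sketch does not handle that case specially.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class CompactionConfig:
    # Hypothetical container for the three proposed options.
    minimum_compaction_threshold: int
    maximum_compaction_threshold: int
    target_sstable_count: int


def select_for_minor_compaction(
    sstable_sizes: Sequence[int], config: CompactionConfig
) -> List[int]:
    """Return the sizes of the sstables to compact, or [] if none is needed."""
    count = len(sstable_sizes)
    # Trigger condition taken directly from the pseudocode.
    if count + config.minimum_compaction_threshold - 1 <= config.target_sstable_count:
        return []
    # Sort from smallest to largest, then take at most the maximum threshold.
    smallest_first = sorted(sstable_sizes)
    return smallest_first[: config.maximum_compaction_threshold]


if __name__ == "__main__":
    config = CompactionConfig(
        minimum_compaction_threshold=2,
        maximum_compaction_threshold=3,
        target_sstable_count=5,
    )
    print(select_for_minor_compaction([50, 10, 40, 20, 30], config))
    print(select_for_minor_compaction([50, 10, 40, 20], config))
```

With these example values, five sstables exceed the trigger (5 + 2 - 1 > 5), so
the three smallest are selected; four sstables do not, so nothing is compacted.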