[
https://issues.apache.org/jira/browse/CASSANDRA-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969043#action_12969043
]
Tyler Hobbs edited comment on CASSANDRA-1083 at 12/7/10 5:18 PM:
-----------------------------------------------------------------
One nice thing about this strategy is that in steady state, you're compacting
about 1/target of your total SSTable data by size. This gives you a much
smoother (and tunable) impact from compaction. Recompaction of recently
compacted data shouldn't be any more frequent than with the current strategy;
this is especially true since there would no longer be cascading compactions.
Minor nitpick -- compactions happen after every min_compaction_threshold - 1
flushes, so a default of 5 instead of 4 might be a good idea.
I think this should be easy to code up. Jonathan, do you want me to go
ahead with this?
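
To illustrate the nitpick about compaction cadence, here is a rough Python sketch. It is not the attached compaction_simulation.rb. It assumes every flush produces one sstable of unit size and that each compaction merges exactly min_threshold of the smallest sstables; the function name and the target of 20 are made up for the example.

```python
def compaction_flush_numbers(num_flushes, min_threshold=4, target=20):
    """Return the flush numbers at which a minor compaction is triggered.

    Assumptions (illustrative only): unit-size flushes, and each
    compaction merges exactly min_threshold of the smallest sstables.
    """
    sstables = []   # sizes of the live sstables
    triggered = []  # flush numbers that triggered a compaction
    for flush in range(1, num_flushes + 1):
        sstables.append(1)  # a flush adds one new sstable
        # Trigger condition from the proposal in the issue description.
        if len(sstables) + min_threshold - 1 > target:
            sstables.sort()
            merged = sum(sstables[:min_threshold])
            # min_threshold sstables become one, so the count drops
            # by min_threshold - 1.
            sstables = sstables[min_threshold:] + [merged]
            triggered.append(flush)
    return triggered


if __name__ == "__main__":
    for threshold in (4, 5):
        flushes = compaction_flush_numbers(60, min_threshold=threshold)
        gaps = [b - a for a, b in zip(flushes, flushes[1:])]
        print(threshold, sorted(set(gaps)))
```

Under these assumptions each compaction lowers the sstable count by min_threshold - 1, so the next one fires after that many flushes: every 3 flushes with a threshold of 4, every 4 with a threshold of 5.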
> Improvement to CompactionManager's submitMinorIfNeeded
> ------------------------------------------------------
>
> Key: CASSANDRA-1083
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1083
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Ryan King
> Assignee: Tyler Hobbs
> Priority: Minor
> Fix For: 0.7.1
>
> Attachments: 1083-configurable-compaction-thresholds.patch,
> compaction_simulation.rb
>
>
> We've discovered that we are unable to tune compaction the way we want for
> our production cluster. I think the current algorithm doesn't do this as well
> as it could, since it doesn't sort the sstables by size before doing the
> bucketing, which means the tuning parameters have unpredictable results.
> I looked at CASSANDRA-792, but it seems like overkill. Here's an alternative
> proposal:
> config options:
> minimumCompactionThreshold
> maximumCompactionThreshold
> targetSSTableCount
> The first two would mean what they currently mean: the bounds on how many
> sstables to compact in one compaction operation. The 3rd is a target for how
> many SSTables you'd like to have.
> Pseudo code algorithm for determining whether or not to do a minor compaction:
> {noformat}
> if sstables.length + minimumCompactionThreshold - 1 > targetSSTableCount
>   sort sstables from smallest to largest
>   compact up to maximumCompactionThreshold of the smallest sstables
> {noformat}
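
The pseudo code above could be sketched in Python roughly as follows. This is only an illustration of the proposed check, not code from Cassandra or from the attached patch: the function name is hypothetical, and sstables are represented by nothing more than their sizes.

```python
def select_minor_compaction(sstable_sizes, min_threshold, max_threshold,
                            target_count):
    """Return the sizes of the sstables to compact, or [] if none is needed.

    Hypothetical helper mirroring the pseudo code in the issue
    description; the names are assumptions made for this sketch.
    """
    # Only compact once the sstable count is about to exceed the target.
    if len(sstable_sizes) + min_threshold - 1 <= target_count:
        return []
    # Sort by size first so the smallest sstables are always chosen,
    # then take at most max_threshold of them.
    return sorted(sstable_sizes)[:max_threshold]


if __name__ == "__main__":
    # Five sstables, target of five: 5 + 2 - 1 > 5, so compact.
    print(select_minor_compaction([50, 10, 30, 20, 40], 2, 3, 5))
    # Two sstables, target of twenty: nothing to do.
    print(select_minor_compaction([50, 10], 4, 32, 20))
```

Because the candidates are sorted before selection, the outcome depends only on the sstable sizes and the three settings, which is the predictability the proposal is after.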