[
https://issues.apache.org/jira/browse/CASSANDRA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018063#comment-13018063
]
Sylvain Lebresne commented on CASSANDRA-2191:
---------------------------------------------
Some comments on the new patches:
* I (really) don't like the stopTheWorld function. I don't like the name or
the argument names, and the 'abstraction' it provides is not clear to me. More
importantly, it makes the 'unmarking' of sstables very fragile, since the
markCompacting call *is not* inside the 'try .. finally' block where the
unmark call is. Let's just inline this stopTheWorld function instead (it won't
really be more code).
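To make the point concrete, here is a minimal sketch of the inlined shape being asked for. The class and method names (markCompacting, unmarkCompacting) mirror the patch but the types are simplified stand-ins, not the real DataTracker:

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for the DataTracker marking logic; sstables are
// represented as plain Strings for illustration only.
public class CompactionMarking {
    private final Set<String> compacting = new HashSet<>();

    public synchronized void markCompacting(Set<String> sstables) {
        compacting.addAll(sstables);
    }

    public synchronized void unmarkCompacting(Set<String> sstables) {
        compacting.removeAll(sstables);
    }

    public synchronized boolean isCompacting(String sstable) {
        return compacting.contains(sstable);
    }

    // Inlined shape: the mark happens immediately before the try, and the
    // unmark sits in the finally of the very same block, so the sstables
    // are released even when the task throws.
    public void runCompaction(Set<String> sstables, Runnable task) {
        markCompacting(sstables);
        try {
            task.run();
        } finally {
            unmarkCompacting(sstables);
        }
    }
}
```

With mark and unmark in the same method, there is no window where an exception between the two leaves sstables marked forever.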
* The forcing of a major compaction in doCompaction is buggy, because there
is no guarantee we haven't skipped some sstable (because of the skip option)
or that some sstable hasn't been removed from the set for lack of disk space.
Forcing a major compaction in those cases is wrong. I know that when you
submit a major compaction you could still end up doing a minor one, just
because an sstable got flushed between the time you grabbed the set of
sstables and the time you checked that the set was still complete. But that is
not a new problem (nor a big one, unlike wrongly removing tombstones), so
let's open a separate ticket if we want to fix it.
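The guard being asked for amounts to re-checking, at execution time, that the input set still covers every live sstable. A hypothetical sketch (the names are illustrative, not the patch's code):

```java
import java.util.Set;

// Hypothetical check: only treat a compaction as "major" (i.e. safe to
// drop tombstones) if the final input set still contains every live
// sstable -- nothing skipped via the skip option and nothing dropped
// for lack of disk space.
public class MajorCompactionCheck {
    public static boolean isStillMajor(Set<String> toCompact, Set<String> allLive) {
        return toCompact.containsAll(allLive);
    }
}
```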
* The ksname and cfname fields in CompactionController should be replaced by
getters (over cfs.metadata.{ksname|cfname}), if only to make sure we don't
make the renaming of keyspaces/column families more problematic than it
already is (it's probably not a problem here, but avoiding the duplication of
fields would be good anyway).
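The shape of the suggestion, sketched with trimmed-down, hypothetical mirrors of CFMetaData and ColumnFamilyStore (the real classes carry far more state):

```java
// Hypothetical, trimmed-down mirror classes; the point is only that
// CompactionController reads through cfs.metadata instead of caching
// its own copies of ksname/cfname.
class CFMetaData {
    volatile String ksName;
    volatile String cfName;
    CFMetaData(String ks, String cf) { ksName = ks; cfName = cf; }
}

class ColumnFamilyStore {
    final CFMetaData metadata;
    ColumnFamilyStore(CFMetaData metadata) { this.metadata = metadata; }
}

class CompactionController {
    private final ColumnFamilyStore cfs;
    CompactionController(ColumnFamilyStore cfs) { this.cfs = cfs; }

    // Getters delegate to the single source of truth, so a keyspace or
    // column family rename is reflected with no extra bookkeeping.
    String getKeyspace() { return cfs.metadata.ksName; }
    String getColumnFamily() { return cfs.metadata.cfName; }
}
```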
* As far as I can tell we still have a problem with cache saving, in that two
cache saves can happen at the same time (which would be bad). To answer the
proposal in a previous comment, I am not a fan of using the flush writer for
this, as it has more important things to do (and we would lose the reporting
through JMX). A dedicated executor could solve the problem but is a bit
overkill. Maybe a static AtomicBoolean that cache-writing tasks would
atomically compare to false and set to true at the beginning. If that
succeeds, the task would do the cache writing and set the boolean back to
false; otherwise it would just return directly. It means a cache save would be
discarded if one is already running, but that's probably fine (it's actually
probably what you want).
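A sketch of that guard, with illustrative names (CacheSaver/trySave are not classes in the patch):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// One static AtomicBoolean shared by all cache-writing tasks; a task that
// loses the compare-and-set race returns immediately and its save is
// simply discarded.
public class CacheSaver {
    private static final AtomicBoolean saving = new AtomicBoolean(false);

    // Returns true if this call performed the save, false if one was
    // already in progress.
    public static boolean trySave(Runnable writeCaches) {
        if (!saving.compareAndSet(false, true))
            return false;
        try {
            writeCaches.run();
        } finally {
            saving.set(false); // release the guard even if the write fails
        }
        return true;
    }
}
```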
* For the sake of clarity of the code, I would still prefer an
IdentityHashSet in CompactionExecutor rather than a HashMap whose value is
unused.
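Note that the JDK has no IdentityHashSet class as such; the usual idiom wraps an IdentityHashMap so every call site still reads as a Set:

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class IdentitySetDemo {
    // Set view over an IdentityHashMap: membership is by reference
    // identity (==), not equals().
    public static <T> Set<T> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
    }
}
```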
* In nodetool printCompactionStats, we should keep the plain-English println
as it was before, instead of using the much denser (and much less
user-friendly) CompactionInfo.toString().
* In SSTableWriter.Builder, including the OperationType in the compactionInfo
the way it is done will look ugly, and I think it will confuse the user more
than anything else (the previous message was probably quite enough).
FYI, my Monday morning happens to be quite a few hours before the Monday
morning 'freeze'. So if there are fixes for the issues above by my Monday
morning, I'll look at them first.
> Multithread across compaction buckets
> -------------------------------------
>
> Key: CASSANDRA-2191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2191
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Stu Hood
> Assignee: Stu Hood
> Priority: Critical
> Labels: compaction
> Fix For: 0.8
>
> Attachments: 0001-Add-a-compacting-set-to-DataTracker.txt,
> 0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt,
> 0003-Expose-multiple-compactions-via-JMX-and-a-concrete-ser.txt,
> 0004-Allow-multithread-compaction-to-be-disabled.txt,
> 0005-Add-a-harness-to-allow-compaction-tasks-that-need-to-a.txt
>
>
> This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and
> reasoning are different enough to open a separate issue.
> The problem with compactions currently is that they compact the set of
> sstables that existed the moment the compaction started. This means that for
> longer running compactions (even when running as fast as possible on the
> hardware), a very large number of new sstables might be created in the
> meantime. We have observed this proliferation of sstables killing performance
> during major/high-bucketed compactions.
> One approach would be to pause compactions in upper buckets (containing
> larger files) when compactions in lower buckets become possible. While this
> would likely solve the problem with read performance, it does not actually
> help us perform compaction any faster, which is a reasonable requirement for
> other situations.
> Instead, we need to be able to perform any compactions that are currently
> required in parallel, independent of what bucket they might be in.
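The parallel-by-bucket idea the description ends on could be sketched roughly as follows. This is a hand-wavy illustration, not the attached patches: each bucket becomes an independent task on a thread pool, and a shared "compacting" set keeps two tasks from claiming the same sstable (all names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class ParallelBuckets {
    private final ExecutorService executor;
    // sstables currently claimed by a running compaction.
    private final Set<String> compacting = ConcurrentHashMap.newKeySet();

    public ParallelBuckets(int threads) {
        executor = Executors.newFixedThreadPool(threads);
    }

    // Atomically claim a bucket; refuse it if any sstable is already taken.
    private synchronized boolean markCompacting(Set<String> bucket) {
        for (String sstable : bucket)
            if (compacting.contains(sstable))
                return false;
        compacting.addAll(bucket);
        return true;
    }

    // Submit every claimable bucket; buckets compact in parallel.
    public List<Future<?>> submit(List<Set<String>> buckets, Consumer<Set<String>> compact) {
        List<Future<?>> futures = new ArrayList<>();
        for (Set<String> bucket : buckets) {
            if (!markCompacting(bucket))
                continue; // overlaps a running compaction; skip for now
            futures.add(executor.submit(() -> {
                try {
                    compact.accept(bucket);
                } finally {
                    compacting.removeAll(bucket); // release on success or failure
                }
            }));
        }
        return futures;
    }

    public void shutdown() {
        executor.shutdown();
    }
}
```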
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira