[ https://issues.apache.org/jira/browse/CASSANDRA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015446#comment-13015446 ]

Sylvain Lebresne commented on CASSANDRA-2191:
---------------------------------------------


Here are a number of remarks on this. I wrote them on a plane, so I did not 
have access to what was previously said above, and I'm too lazy to edit them 
now (in any case there is little overlap and nothing is outdated, I think).

Comments:
  * An unfortunate consequence of this is that the submission of a compaction 
can now fail, in particular cleanup or major ones (more precisely, cleanup 
compactions will leave some sstables unclean, and major ones won't be major and 
thus won't clean all tombstones). It will be a pain for users: not only do they 
have to check the log after submitting one of those compactions to see what is 
left to do, it's also inefficient. For cleanups, you'll have to submit a new 
one that redoes a full cleanup compaction (and those are not as efficient as 
they could be); for major compaction, it's even worse. Anyway, grabbing the 
write lock instead of the read lock for those compactions should be good enough 
(see the first sketch after this list).
  * The different compactions weren't initially written to run in parallel, so 
we need a way to deactivate this if something goes wrong. It could either be a 
flag that makes everyone acquire the write lock (the first sketch after this 
list includes such a flag), or an option to make the compaction executor 
mono-threaded.
  * Related to the point above, I think the cache writing tasks were fed to 
the compaction executor to make sure we never do two cache writes at the same 
time. We need to restore that property somehow (see the second sketch after 
this list).
  * There are a number of TODOs in the code. I am not a fan of leaving TODOs 
unless there is a really good reason to; there are JIRA tickets for that. In 
particular, I do not agree that flusherLock and compactionLock should be 
replaced by a list of sstables (at least it is not as simple as that).
  * Let's not use clearUnsafe() to initialize DataTracker, given its name (but 
I'm fine with renaming it to, say, init() and having CFStore.clearUnsafe() call 
init()).
  * The patch about closing the sstableScanners should be another ticket, for 
traceability's sake.
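
To make the locking points above concrete, here is a minimal sketch (the class 
and method names are mine for illustration, not the patch's) of parallel 
compactions sharing a read lock while cleanup/major compactions take the write 
lock, with a killswitch flag that forces everything through the write lock:

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CompactionLocking
{
    private final ReentrantReadWriteLock compactionLock = new ReentrantReadWriteLock();
    // killswitch: when true, every compaction serializes on the write lock
    private volatile boolean forceExclusive = false;

    // minor (bucketed) compactions: may run concurrently under the read lock
    public void runParallelCompaction(Runnable compaction)
    {
        if (forceExclusive)
        {
            runExclusiveCompaction(compaction);
            return;
        }
        compactionLock.readLock().lock();
        try
        {
            compaction.run();
        }
        finally
        {
            compactionLock.readLock().unlock();
        }
    }

    // cleanup and major compactions: exclusive, so they always see every
    // sstable and their submission cannot fail or be partially defeated
    public void runExclusiveCompaction(Runnable compaction)
    {
        compactionLock.writeLock().lock();
        try
        {
            compaction.run();
        }
        finally
        {
            compactionLock.writeLock().unlock();
        }
    }
}
{code}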
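
And a second sketch for the cache writing point: if cache saving no longer 
rides on a mono-threaded compaction executor, a dedicated single-threaded 
executor (again, names are illustrative) restores the guarantee that at most 
one cache write runs at a time:

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CacheSaver
{
    // a single worker thread: queued cache writes never run concurrently
    private static final ExecutorService cacheWriteExecutor =
        Executors.newSingleThreadExecutor();

    public static Future<?> submitCacheWrite(Runnable writeTask)
    {
        return cacheWriteExecutor.submit(writeTask);
    }
}
{code}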


Some more minor remarks:
  * On the JMX reporting: I think it is ok to remove the methods instead of 
deprecating them (this will be in a major release, and there is no indication 
to the user that they are deprecated anyway). Also, since looking at a list of 
strings in jconsole is a bit of a pain, it would be nice to at least expose the 
active count, and maybe a sum over all running compactions of 
bytesCompacted/toCompact (see the first sketch after this list). It would be 
less ugly to expose an MBean for each CompactionInfo, but I'm not sure how 
easy/efficient that would be.
  * About ACompactionInfo: we use an I prefix (ICompactionInfo) for interfaces, 
but an Abstract prefix (AbstractCompactionInfo) for abstract classes. Let's 
keep it that way for the sake of codebase coherence.
  * The compacting set field in CompactionManager is dead code, as is 
MIN_COMPACTION_THRESHOLD in CompactionsSet.
  * I don't find the docstring for markCompacting very clear (it kind of 
suggests that markCompacting itself should go in a finally block, and it is not 
clear that it is this method that does the marking; the second sketch after 
this list shows the intended pattern).
  * I would prefer adding forwarding methods in CFStore for mark/unmark, since 
that's what we've done so far (instead of exposing DataTracker).
  * In validationCompaction, the try should ideally come right after the 
markCompacting() call.
  * I'd prefer moving toString(CompactionInfo) into ICompactionInfo (or 
AbstractCompactionInfo). Also, since this is for JMX reporting, adding the 
object hashcode will confuse users more than anything else.
  * In compactionExecutor, I suppose the HashMap with Object values is just a 
way to deal with the absence of an IdentityHashSet. If so, for clarity I would 
prefer Collections.newSetFromMap(new IdentityHashMap()) (see the second sketch 
after this list). Or really, a plain Set should do, as it would make no sense 
to redefine CompactionInfo equality to be anything other than reference 
equality.
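
For the JMX point, a minimal sketch of what aggregated reporting could look 
like (the CompactionInfoLike interface and its accessors are assumptions, not 
the patch's API):

{code}
import java.util.List;

interface CompactionInfoLike
{
    long getBytesCompacted();
    long getBytesToCompact();
}

public class CompactionAggregates
{
    // number of currently running compactions
    public static int activeCount(List<? extends CompactionInfoLike> active)
    {
        return active.size();
    }

    // overall progress in [0, 1], summed over all running compactions
    public static double overallProgress(List<? extends CompactionInfoLike> active)
    {
        long done = 0;
        long total = 0;
        for (CompactionInfoLike info : active)
        {
            done += info.getBytesCompacted();
            total += info.getBytesToCompact();
        }
        return total == 0 ? 0.0 : (double) done / total;
    }
}
{code}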
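
And a second sketch covering the mark/unmark and identity-set points together 
(again, the names are illustrative, not the patch's): track running compactions 
in an identity-based Set rather than a HashMap with dummy values, and always 
pair the marking with an unmark in a finally block:

{code}
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class ActiveCompactions
{
    // identity semantics: two distinct info objects are never "equal"
    private final Set<Object> active =
        Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    public void runMarked(Object compactionInfo, Runnable compaction)
    {
        active.add(compactionInfo);        // mark: the try must follow immediately
        try
        {
            compaction.run();
        }
        finally
        {
            active.remove(compactionInfo); // unmark, even if the compaction fails
        }
    }
}
{code}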


> Multithread across compaction buckets
> -------------------------------------
>
>                 Key: CASSANDRA-2191
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2191
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>            Priority: Critical
>              Labels: compaction
>             Fix For: 0.8
>
>         Attachments: 0001-Add-a-compacting-set-to-DataTracker.txt, 
> 0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt, 
> 0003-Expose-multiple-compactions-via-JMX-and-deprecate-sing.txt, 
> 0004-Try-harder-to-close-scanners-in-compaction-close.txt
>
>
> This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and 
> reasoning are different enough to open a separate issue.
> The problem with compactions currently is that they compact the set of 
> sstables that existed the moment the compaction started. This means that for 
> longer running compactions (even when running as fast as possible on the 
> hardware), a very large number of new sstables might be created in the 
> meantime. We have observed this proliferation of sstables killing performance 
> during major/high-bucketed compactions.
> One approach would be to pause compactions in upper buckets (containing 
> larger files) when compactions in lower buckets become possible. While this 
> would likely solve the problem with read performance, it does not actually 
> help us perform compaction any faster, which is a reasonable requirement for 
> other situations.
> Instead, we need to be able to perform any compactions that are currently 
> required in parallel, independent of what bucket they might be in.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
