Hi Germán,

Thanks for taking a stab at this!

I don't actually think there will be any tricky race conditions with
flush or schema migration: flush itself has been parallel for a long
time, and we already have the lock in CompactionManager for schema
migration.

To clean this up for submission you'd want to follow the style guide at
http://wiki.apache.org/cassandra/CodeStyle, remove the commented-out
sections, and add a configuration parameter for how many compactions to
allow simultaneously. (IMO it mainly makes sense to have > 1 when you
are running on SSDs, and there's no good way for us to auto-detect
that.)

On Thu, Dec 16, 2010 at 6:04 PM, Germán Kondolf <german.kond...@gmail.com> wrote:
> Hi everybody,
>
> I've just finished the first implementation of a Parallel Compaction
> Patch for the trunk version. Tomorrow I'll test it with a high volume
> of data to see if it works as I expect, but first I want to validate
> the approach with you.
>
> I know it's kinda naive, but maybe it works as a starting point for a
> future production implementation, or at least allows the compaction
> strategy to be made configurable.
> First of all, I don't know the C* code in depth, so maybe I took a few
> shortcuts, and that's why I need a second look from an expert...
>
> I've modified the doCompaction method of CompactionManager, added a
> few static classes (I'm working to remove them, so V2 is coming), and
> simply split the sstables to compact in a balanced order and fired
> each group's compaction in parallel.
>
> The revision I've based the patch on is: 1050234
> The files are attached: the patch and CompactionManager.java
>
> Thanks in advance, I'll appreciate the feedback.
>
> --
> //GK
> german.kond...@gmail.com
> // sites
> http://twitter.com/germanklf
> http://www.facebook.com/germanklf
> http://ar.linkedin.com/in/germankondolf
>

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
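
For readers following the thread, the idea discussed here (split the sstables to compact into balanced groups, then run at most a configured number of group compactions at once) could be sketched roughly as below. This is not the attached patch and not Cassandra code: the class name, the method names, and the maxConcurrentCompactions parameter are all hypothetical, sstables are represented only by their sizes in bytes, and "compacting" a group is simulated by summing those sizes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names exist in Cassandra.
class ParallelCompactionSketch {

    // Greedy balanced split: take sizes largest first and always add the
    // next one to the group whose running total is currently smallest.
    static List<List<Long>> splitBalanced(List<Long> sstableSizes, int groupCount) {
        if (groupCount < 1) {
            throw new IllegalArgumentException("groupCount must be >= 1");
        }
        List<Long> sorted = new ArrayList<>(sstableSizes);
        sorted.sort(Collections.reverseOrder());

        List<List<Long>> groups = new ArrayList<>();
        long[] totals = new long[groupCount];
        for (int i = 0; i < groupCount; i++) {
            groups.add(new ArrayList<>());
        }
        for (long size : sorted) {
            int smallest = 0;
            for (int i = 1; i < groupCount; i++) {
                if (totals[i] < totals[smallest]) {
                    smallest = i;
                }
            }
            groups.get(smallest).add(size);
            totals[smallest] += size;
        }
        return groups;
    }

    // Runs one simulated compaction per group, with at most
    // maxConcurrentCompactions (the assumed config parameter) running at
    // once. Returns the total bytes handled per group, in group order.
    static List<Long> compactGroups(List<List<Long>> groups, int maxConcurrentCompactions) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrentCompactions);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (List<Long> group : groups) {
                Callable<Long> task = () -> {
                    long total = 0;
                    for (long size : group) {
                        total += size; // stand-in for the real merge work
                    }
                    return total;
                };
                futures.add(pool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while compacting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a group compaction failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> sizes = List.of(50L, 40L, 30L, 20L, 10L);
        List<List<Long>> groups = splitBalanced(sizes, 2);
        System.out.println("groups: " + groups);
        System.out.println("bytes per group: " + compactGroups(groups, 2));
    }
}
```

The fixed-size thread pool is what bounds the parallelism: with the hypothetical parameter set to 1 the groups are processed one at a time, and a larger value would be an explicit operator choice (for example on SSDs) rather than something auto-detected.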