I've created the patch ticket: https://issues.apache.org/jira/browse/CASSANDRA-1876
On Fri, Dec 17, 2010 at 12:30 PM, Germán Kondolf <german.kond...@gmail.com> wrote:
> On Fri, Dec 17, 2010 at 11:15 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> On Fri, Dec 17, 2010 at 8:01 AM, Germán Kondolf
>> <german.kond...@gmail.com> wrote:
>>
>>> Thanks Jonathan for the feedback.
>>>
>>> By flush/schema migration you mean the SSTables replace lock? I've put
>>> that lock just to be sure, if it's fine by you I'll remove it.
>>> I'll clean up the code according to the code-style article, add the
>>> parameter to the configuration using a default of "1" and I'll send it
>>> again.
>>>
>>> Why do you think it is only worth it on SSDs?
>>>
>>
>> Because even a single compaction causes a ton of i/o contention. 99% of the
>> time your concern is how to make compaction use _less_ resources, not more.
>> :)
>
> We guess that, depending on the scenario, there is room for different
> strategies in order to use less resources.
>
> With short-lived keys, a fast parallel compaction together with
> CASSANDRA-1074 may mean that the node compacts for only a very
> short period of time, and while this is happening the other nodes could
> handle the load, provided the compaction takes just seconds.
>
> In another scenario, with long-lived keys, we're thinking that if the
> minor compaction compacted just the BF and indexes and left the
> SSTables the way they were, we would save the I/O bandwidth we're
> using in the write phase, and write only the BF and indexes.
>
> The proposed structure of SSTables would change and look like this:
> LogicSSTable
>   Index
>   BloomFilter
>   Collection<SSTableOnDisk>
>
> The LogicSSTable contains the Idx & BF of the given compacted SSTables.
>
> Reading a column would imply using the BF, reading the index,
> which would indicate not only an offset but also a file, and reading
> the corresponding file.
>
> In this way, the minor compaction is just a reading process and not a
> write-intensive process.
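
To make the LogicSSTable idea above a bit more concrete, here is a minimal Java sketch of the read path and of a "minor compaction" that only merges the index and filter. It is an illustration under assumptions, not Cassandra code: apart from LogicSSTable and SSTableOnDisk, every name (IndexEntry, addSSTable, read, append, readAt) is hypothetical, the bloom filter is stood in for by an exact HashSet, data files are kept in memory, and "the most recently added file wins" replaces real timestamp reconciliation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for one immutable SSTable data file (kept in memory here).
final class SSTableOnDisk {
    final String fileName;
    private final Map<Long, String> valuesByOffset = new TreeMap<>();
    private long nextOffset = 0;

    SSTableOnDisk(String fileName) { this.fileName = fileName; }

    // Appends a value and returns the offset it was stored at.
    long append(String value) {
        long offset = nextOffset;
        valuesByOffset.put(offset, value);
        nextOffset += value.length();
        return offset;
    }

    String readAt(long offset) { return valuesByOffset.get(offset); }
}

// Index entry that names a file as well as an offset, as described in the mail.
final class IndexEntry {
    final int fileId;
    final long offset;

    IndexEntry(int fileId, long offset) {
        this.fileId = fileId;
        this.offset = offset;
    }
}

final class LogicSSTable {
    private final List<SSTableOnDisk> files = new ArrayList<>();
    private final Map<String, IndexEntry> index = new TreeMap<>();
    // Assumption: an exact set instead of a real bloom filter (no false positives).
    private final Set<String> bloomFilter = new HashSet<>();

    // "Minor compaction" in this sketch: merge only index and filter entries.
    // The data file itself is neither read nor rewritten.
    void addSSTable(SSTableOnDisk file, Map<String, Long> keyOffsets) {
        int fileId = files.size();
        files.add(file);
        for (Map.Entry<String, Long> e : keyOffsets.entrySet()) {
            // Assumption: the most recently added file wins for a repeated key.
            index.put(e.getKey(), new IndexEntry(fileId, e.getValue()));
            bloomFilter.add(e.getKey());
        }
    }

    // Read path: filter check, then index lookup (file + offset), then file read.
    Optional<String> read(String key) {
        if (!bloomFilter.contains(key)) {
            return Optional.empty();
        }
        IndexEntry entry = index.get(key);
        if (entry == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(files.get(entry.fileId).readAt(entry.offset));
    }
}

final class LogicSSTableDemo {
    public static void main(String[] args) {
        SSTableOnDisk older = new SSTableOnDisk("older-Data.db");
        Map<String, Long> olderOffsets = new TreeMap<>();
        olderOffsets.put("k1", older.append("v1-old"));

        LogicSSTable logic = new LogicSSTable();
        logic.addSSTable(older, olderOffsets);
        System.out.println(logic.read("k1").orElse("<missing>"));
    }
}
```

The trade-off mentioned in the thread shows up directly: addSSTable touches only the index and the filter, so it is cheap, while stale values stay in the older files until a major compaction rewrites them.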
>
> Of course, it depends on the behaviour of the dataset. With short-lived
> keys, this latter strategy just makes the major compaction
> harder. On the other hand, with the current strategy and long-lived
> columns, after a while every column is read and written many
> times just to be left in its original state.
>
> We know that this isn't an easy change, but eventually we will try it at
> home, so your criticism, warnings and advice are welcome.
>
> Regards.
> --
> //GK
> german.kond...@gmail.com
> // sites
> http://twitter.com/germanklf
> http://www.facebook.com/germanklf
> http://ar.linkedin.com/in/germankondolf
>

--
//GK
german.kond...@gmail.com
// sites
http://twitter.com/germanklf