BTW, when I say "major compaction", I mean running the "nodetool compact" command (which does a major compaction for Size Tiered Compaction). I didn't see the distribution of SSTables I expected until I ran that command, in the steps I described below.
-Mike

On Feb 14, 2013, at 3:51 PM, Wei Zhu wrote:

> I haven't tried to switch compaction strategy. We started with LCS.
>
> For us, after massive data imports (5000 writes/second for 6 days), the
> first repair is painful since there is quite a bit of data inconsistency.
> For 150G nodes, repair brought in about 30G and created thousands of
> pending compactions. It took almost a day to clear those. Just be prepared:
> LCS is really slow in 1.1.X. System performance degrades during that time
> since reads could go to more SSTables; we see 20 SSTable lookups for one
> read. (We tried everything we could and couldn't speed it up. I think it's
> single threaded, and it's not recommended to turn on multithreaded
> compaction. We even tried that, and it didn't help.) There is parallel LCS
> in 1.2 which is supposed to alleviate the pain. Haven't upgraded yet, hope
> it works :)
>
> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
>
> Since our cluster is not write intensive (only 100 writes/second), I don't
> see any pending compactions during regular operation.
>
> One thing worth mentioning is the size of the SSTable. The default is 5M,
> which is kind of small for a 200G (all in one CF) data set, and we are on
> SSD. That is more than 150K files in one directory. (200G/5M = 40K SSTables
> and each SSTable creates 4 files on disk.) You might want to watch that and
> decide the SSTable size.
>
> By the way, there is no concept of major compaction for LCS. Just for fun,
> you can look at a file called $CFName.json in your data directory and it
> tells you the SSTable distribution among different levels.
>
> -Wei
>
> From: Charles Brophy <cbro...@zulily.com>
> To: user@cassandra.apache.org
> Sent: Thursday, February 14, 2013 8:29 AM
> Subject: Re: Size Tiered -> Leveled Compaction
>
> I second these questions: we've been looking into changing some of our CFs
> to use leveled compaction as well. If anybody here has the wisdom to answer
> them it would be of wonderful help.
>
> Thanks
> Charles
>
> On Wed, Feb 13, 2013 at 7:50 AM, Mike <mthero...@yahoo.com> wrote:
> Hello,
>
> I'm investigating the transition of some of our column families from Size
> Tiered -> Leveled Compaction. I believe we have some high-read-load column
> families that would benefit tremendously.
>
> I've stood up a test DB node to investigate the transition. I successfully
> altered the column family, and I immediately noticed a large number (1000+)
> of pending compaction tasks become available, but no compactions got
> executed.
>
> I tried running "nodetool sstableupgrade" on the column family, and the
> compaction tasks didn't move.
>
> I also noticed no changes to the size and distribution of the existing
> SSTables.
>
> I then ran a major compaction on the column family. All pending compaction
> tasks got run, and the SSTables have a distribution that I would expect
> from LeveledCompaction (lots and lots of 10MB files).
>
> Couple of questions:
>
> 1) Is a major compaction required to transition from size-tiered to leveled
> compaction?
> 2) Are major compactions as much of a concern for LeveledCompaction as they
> are for Size Tiered?
>
> All the documentation I found concerning transitioning from Size Tiered to
> Leveled compaction discusses the alter table cql command, but I haven't
> found too much on what else needs to be done after the schema change.
>
> I did these tests with Cassandra 1.1.9.
>
> Thanks,
> -Mike
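
For anyone weighing the SSTable size, the file-count arithmetic from Wei's message (200G/5M = 40K SSTables, 4 files each) can be sketched roughly as below. This is only a back-of-the-envelope estimate: it assumes decimal units (1G = 1000M) and the 4 files per SSTable that Wei cites, and the function name estimate_lcs_files is hypothetical, not part of any Cassandra tooling.

```python
import math


def estimate_lcs_files(data_size_gb, sstable_size_mb, files_per_sstable=4):
    """Estimate (sstable_count, file_count) for a leveled column family.

    Hypothetical helper: assumes 1G = 1000M and that every SSTable is
    filled to the configured size, which real data will only approximate.
    """
    data_size_mb = data_size_gb * 1000
    # Round up: a partially filled SSTable still occupies its own files.
    sstable_count = math.ceil(data_size_mb / sstable_size_mb)
    return sstable_count, sstable_count * files_per_sstable


if __name__ == "__main__":
    # Compare a few candidate SSTable sizes for a 200G column family.
    for size_mb in (5, 10, 50):
        sstables, files = estimate_lcs_files(200, size_mb)
        print(f"{size_mb}M SSTables: ~{sstables} SSTables, ~{files} files")
```

With the 5M size this gives about 40,000 SSTables and 160,000 files, consistent with the "more than 150K files" Wei reports; doubling the size to 10M halves both numbers.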