BTW, when I say "major compaction", I mean running the "nodetool compact" 
command (which does a major compaction for Size Tiered Compaction).  I didn't 
see the distribution of SSTables I expected until I ran that command, in the 
steps I described below.  

-Mike

On Feb 14, 2013, at 3:51 PM, Wei Zhu wrote:

> I haven't tried to switch compaction strategy. We started with LCS. 
> 
> For us, after massive data imports (5000 writes/second for 6 days), the 
> first repair was painful because there was quite a lot of data 
> inconsistency. For 150G nodes, repair brought in about 30G and created 
> thousands of pending compactions. It took almost a day to clear those. Just 
> be prepared: LCS is really slow in 1.1.X. System performance degrades during 
> that time since reads can hit more SSTables; we saw 20 SSTable lookups for 
> one read. (We tried everything we could and couldn't speed it up. I think 
> it's single threaded, and it's not recommended to turn on multithreaded 
> compaction. We even tried that; it didn't help.) There is parallel LCS in 
> 1.2, which is supposed to alleviate the pain. We haven't upgraded yet; hope 
> it works :)
> 
> http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
> 
> 
> Our cluster is not write intensive, only 100 writes/second, so I don't see 
> any pending compactions during regular operation. 
> 
> One thing worth mentioning is the size of the SSTable. The default is 5M, 
> which is kind of small for a 200G (all in one CF) data set, and we are on 
> SSD. That is more than 150K files in one directory (200G/5M = 40K SSTables, 
> and each SSTable creates 4 files on disk). You might want to watch that 
> when deciding on the SSTable size. 
> 
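The file-count arithmetic above can be sketched roughly as follows. This is only an illustration, assuming decimal units (1G = 1000M) and the 4 files per SSTable quoted above; estimate_sstable_files is a hypothetical helper name, not anything shipped with Cassandra.

```python
# Rough estimate of on-disk file counts for one column family under LCS.
# Assumptions: decimal units (1G = 1000M) and 4 files per SSTable, as in
# the figures quoted in this thread. estimate_sstable_files is hypothetical.

GB = 1000 ** 3
MB = 1000 ** 2


def estimate_sstable_files(data_bytes, sstable_bytes, files_per_sstable=4):
    """Return (sstable_count, file_count) for a given data and SSTable size."""
    sstables = -(-data_bytes // sstable_bytes)  # ceiling division
    return sstables, sstables * files_per_sstable


if __name__ == "__main__":
    # Compare a few candidate SSTable sizes for a 200G data set.
    for size_mb in (5, 10, 50, 100):
        sstables, files = estimate_sstable_files(200 * GB, size_mb * MB)
        print(f"{size_mb:>4}M: {sstables:>6} SSTables, {files:>7} files")
```

Running it for a few candidate sizes gives a quick feel for how the directory shrinks as the SSTable size grows.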
> By the way, there is no concept of Major compaction for LCS. Just for fun, 
> you can look at a file called $CFName.json in your data directory and it 
> tells you the SSTable distribution among different levels. 
> 
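A small sketch for summarizing that $CFName.json file per level. The key names ("generations", "generation", "members") and the sample content are assumptions about the 1.1 manifest layout, so check them against your own file before relying on this.

```python
import json

# Hypothetical sample manifest. The "generations" / "generation" / "members"
# keys are an assumption about the file layout; verify against a real file.
SAMPLE_MANIFEST = """
{
  "generations": [
    {"generation": 0, "members": [101, 102]},
    {"generation": 1, "members": [90, 91, 92, 93]},
    {"generation": 2, "members": []}
  ]
}
"""


def sstables_per_level(manifest_text):
    """Map each level number to the number of SSTables listed for it."""
    manifest = json.loads(manifest_text)
    return {
        entry["generation"]: len(entry["members"])
        for entry in manifest["generations"]
    }


if __name__ == "__main__":
    for level, count in sorted(sstables_per_level(SAMPLE_MANIFEST).items()):
        print(f"L{level}: {count} SSTables")
```

To use it on a real node, read the manifest file's text and pass it to sstables_per_level instead of the sample string.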
> -Wei
> 
> From: Charles Brophy <cbro...@zulily.com>
> To: user@cassandra.apache.org 
> Sent: Thursday, February 14, 2013 8:29 AM
> Subject: Re: Size Tiered -> Leveled Compaction
> 
> I second these questions: we've been looking into changing some of our CFs to 
> use leveled compaction as well. If anybody here has the wisdom to answer them 
> it would be a wonderful help.
> 
> Thanks
> Charles
> 
> On Wed, Feb 13, 2013 at 7:50 AM, Mike <mthero...@yahoo.com> wrote:
> Hello,
> 
> I'm investigating the transition of some of our column families from Size 
> Tiered -> Leveled Compaction.  I believe we have some high-read-load column 
> families that would benefit tremendously.
> 
> I've stood up a test DB node to investigate the transition.  I successfully 
> altered the column family, and I immediately noticed a large number (1000+) 
> of pending compaction tasks become available, but no compactions got executed.
> 
> I tried running "nodetool upgradesstables" on the column family, and the 
> compaction tasks didn't move.
> 
> I also noticed no changes to the size and distribution of the existing 
> SSTables.
> 
> I then ran a major compaction on the column family.  All pending compaction 
> tasks ran, and the SSTables had the distribution that I would expect from 
> LeveledCompaction (lots and lots of 10MB files).
> 
> Couple of questions:
> 
> 1) Is a major compaction required to transition from size-tiered to leveled 
> compaction?
> 2) Are major compactions as much of a concern for LeveledCompaction as they 
> are for Size Tiered?
> 
> All the documentation I found concerning transitioning from Size Tiered to 
> Leveled compaction discusses the alter table CQL command, but I haven't found 
> much on what else needs to be done after the schema change.
> 
> I did these tests with Cassandra 1.1.9.
> 
> Thanks,
> -Mike
> 
> 
> 
