We doubled the SSTable size to 10M. It still generates a lot of SSTables, and we
don't see much difference in read latency. We are able to finish the
compactions after repair within several hours. We will increase the SSTable
size again if we feel the number of SSTables hurts performance.

----- Original Message -----
From: "Mike" <mthero...@yahoo.com>
To: user@cassandra.apache.org
Sent: Sunday, February 17, 2013 4:50:40 AM
Subject: Re: Size Tiered -> Leveled Compaction


Hello Wei, 

First, thanks for this response.

Out of curiosity, what SSTable size did you choose for your use case, and what
made you decide on that number?

Thanks, 
-Mike 

On 2/14/2013 3:51 PM, Wei Zhu wrote: 




I haven't tried to switch compaction strategy. We started with LCS. 


For us, after massive data imports (5000 writes/second for 6 days), the first
repair was painful since there was quite a lot of data inconsistency. For 150G
nodes, repair brought in about 30G and created thousands of pending
compactions. It took almost a day to clear those. Just be prepared: LCS is
really slow in 1.1.X. System performance degrades during that time since reads
can hit more SSTables; we saw 20 SSTable lookups for one read. (We tried
everything we could and couldn't speed it up. I think it's single threaded, and
it's not recommended to turn on multithreaded compaction. We even tried that;
it didn't help.) There is parallel LCS in 1.2, which is supposed to alleviate
the pain. We haven't upgraded yet; hope it works :)


http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2 





Our cluster is not write intensive, only 100 writes/second, so I don't see any
pending compactions during regular operation.


One thing worth mentioning is the SSTable size. The default is 5M, which is
kind of small for a 200G data set (all in one CF), and we are on SSD. That is
more than 150K files in one directory (200G/5M = 40K SSTables, and each SSTable
creates 4 files on disk). You might want to watch that and decide on the
SSTable size.
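
The arithmetic above can be sketched as a small Python helper. This is only a
rough estimate: the function name is made up for illustration, it uses decimal
units to match the 200G/5M = 40K figure, and it assumes 4 files per SSTable as
stated above.

```python
def lcs_file_estimate(data_gb, sstable_mb, files_per_sstable=4):
    """Rough count of SSTables and on-disk files for one column family."""
    # Decimal units (1G = 1000M), matching the 200G/5M = 40K estimate.
    sstables = (data_gb * 1000) // sstable_mb
    return sstables, sstables * files_per_sstable


# Compare a few candidate SSTable sizes for a 200G data set.
for size_mb in (5, 10, 20):
    sstables, files = lcs_file_estimate(200, size_mb)
    print(f"{size_mb}M: ~{sstables} SSTables, ~{files} files")
```

Doubling the SSTable size halves both counts, so 10M gives roughly 20K
SSTables and 80K files for the same data set.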


By the way, there is no concept of major compaction for LCS. Just for fun, you
can look at a file called $CFName.json in your data directory; it tells you
the SSTable distribution among the different levels.
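
A minimal Python sketch for summarizing that file follows. It assumes the
manifest is JSON with a top-level "generations" list whose entries carry a
"generation" number and a "members" list of SSTable generation numbers. That
layout is an assumption here and may differ between versions, so check your
own file first. The function name and sample data are made up.

```python
import json
from collections import Counter


def sstables_per_level(manifest_text):
    """Count SSTables in each level of an LCS manifest.

    ASSUMPTION: the manifest looks like
    {"generations": [{"generation": N, "members": [...]}, ...]}.
    """
    manifest = json.loads(manifest_text)
    counts = Counter()
    for level in manifest["generations"]:
        counts[level["generation"]] += len(level["members"])
    return dict(sorted(counts.items()))


# Made-up sample in the assumed layout; replace it with the file contents.
sample = (
    '{"generations": ['
    '{"generation": 0, "members": [12, 13]}, '
    '{"generation": 1, "members": [4, 5, 6]}]}'
)
print(sstables_per_level(sample))
```

A large count at level 0 would suggest that compaction is falling behind.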


-Wei 





From: Charles Brophy <cbro...@zulily.com> 
To: user@cassandra.apache.org 
Sent: Thursday, February 14, 2013 8:29 AM 
Subject: Re: Size Tiered -> Leveled Compaction 


I second these questions: we've been looking into changing some of our CFs to
use leveled compaction as well. If anybody here has the wisdom to answer them,
it would be a wonderful help.


Thanks 
Charles 


On Wed, Feb 13, 2013 at 7:50 AM, Mike < mthero...@yahoo.com > wrote: 


Hello, 

I'm investigating the transition of some of our column families from Size 
Tiered -> Leveled Compaction. I believe we have some high-read-load column 
families that would benefit tremendously. 

I've stood up a test DB node to investigate the transition. I successfully
altered the column family and immediately noticed a large number (1000+) of
pending compaction tasks become available, but no compactions got executed.

I tried running "nodetool sstableupgrade" on the column family, and the
compaction tasks didn't move.

I also noticed no changes to the size and distribution of the existing SSTables.

I then ran a major compaction on the column family. All pending compaction
tasks ran, and the SSTables had the distribution I would expect from
LeveledCompaction (lots and lots of 10MB files).

Couple of questions: 

1) Is a major compaction required to transition from size-tiered to leveled 
compaction? 
2) Are major compactions as much of a concern for LeveledCompaction as they
are for Size Tiered?

All the documentation I found concerning the transition from Size Tiered to
Leveled Compaction discusses the ALTER TABLE CQL command, but I haven't found
much on what else needs to be done after the schema change.
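
For reference, the schema change itself can be sketched as a small Python
helper that builds the statement. This assumes the CQL 3 map syntax used from
Cassandra 1.2 onward; the 1.1.x releases may need different syntax, so treat
it as an illustration only. The function name and the keyspace and table names
are hypothetical.

```python
def leveled_compaction_cql(table, sstable_size_mb=10):
    """Build an ALTER TABLE statement that switches a table to LCS.

    ASSUMPTION: CQL 3 map syntax (Cassandra 1.2+); older releases differ.
    """
    return (
        f"ALTER TABLE {table} WITH compaction = "
        f"{{'class': 'LeveledCompactionStrategy', "
        f"'sstable_size_in_mb': {int(sstable_size_mb)}}};"
    )


# Hypothetical keyspace and table name.
print(leveled_compaction_cql("my_keyspace.my_cf"))
```

The statement only changes the schema; the questions above about what happens
to the existing SSTables afterwards still apply.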

I did these tests with Cassandra 1.1.9. 

Thanks, 
-Mike 




