Sweet, thanks for the info.
Dean

From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 25, 2013 7:41 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Size Tiered -> Leveled Compaction

"After running a major compaction, automatic minor compactions are no longer 
triggered,"

... because of the size difference between the big sstable that the major 
compaction generates and the newly flushed or compacted sstables. Compactions 
are not stopped; they are just "no longer triggered" for a while.

"frequently requiring you to manually run major compactions on a routine basis"

... in order to keep read latency good. If you don't run compaction 
periodically and you have some row updates, an increasing number of rows will 
be spread across several sstables. But my guess is that if you have no 
deletes, no updates and no TTLs, only write-once rows, you may keep this big 
sstable uncompacted for as long as you want without any read performance 
degradation.

I think the documentation just doesn't go deep enough in its explanation, or 
maybe this information already exists somewhere else in the documentation.

Wait for confirmation from an expert; I am just a humble user.

Alain


2013/2/25 Hiller, Dean <dean.hil...@nrel.gov>
So what you are saying is that this documentation is not quite accurate 
then... (I am more confused between your statement and the documentation now)

http://www.datastax.com/docs/1.1/operations/tuning

Which says "After running a major compaction, automatic minor compactions are 
no longer triggered, frequently requiring you to manually run major compactions 
on a routine basis"

Which implied that you have to keep running major compactions and that minor 
compactions do not kick in anymore :( :( and we (my project) want minor 
compactions to continue.

Thanks,
Dean


From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 25, 2013 7:15 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Size Tiered -> Leveled Compaction

"I am confused.  I thought running compact turns off the minor compactions and 
users are actually supposed to run upgradesstables????  (maybe I am on old 
documentation?)"

Well, that's not true. What happens is that compaction uses sstables of 
approximately the same size. So if you run a major compaction on a 10GB CF, 
you have almost no chance of getting that (big) sstable compacted again. You 
will have to wait for other sstables to reach this size or run another major 
compaction.
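
A minimal Python sketch of the size-bucketing idea described above. This is 
not Cassandra's code: the function names are hypothetical, and the 0.5x/1.5x 
size window and the minimum of 4 sstables per bucket are assumptions chosen 
for illustration (check the size-tiered options of your own version).

```python
def bucket_by_size(sizes_mb, low=0.5, high=1.5):
    """Group sstable sizes into buckets of roughly similar size.

    An sstable joins a bucket when its size falls within
    [low * avg, high * avg] of that bucket's current average.
    The low/high values are illustrative assumptions.
    """
    buckets = []  # each bucket is a list of sizes
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Return only the buckets holding enough sstables to be compacted."""
    return [b for b in bucket_by_size(sizes_mb) if len(b) >= min_threshold]


# One 10 GB sstable left by a major compaction, plus four freshly flushed ones.
sstables = [10240, 50, 55, 60, 48]
print(bucket_by_size(sstables))
print(compaction_candidates(sstables))
```

Under these assumptions the 10 GB sstable sits alone in its bucket, so only 
the small sstables are candidates for minor compaction until other sstables 
grow to a comparable size.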

But anyway, this doesn't apply here because we are speaking of LCS (leveled 
compaction strategy), which works differently from the traditional STCS 
(size-tiered compaction strategy).

I am not sure about it, but you may run upgradesstables or compact to rebuild 
your sstables after switching from STCS to LCS; I mean that both methods 
trigger an initialization of LCS on the old sstables.

Alain


2013/2/25 Hiller, Dean <dean.hil...@nrel.gov>
I am confused.  I thought running compact turns off the minor compactions and 
users are actually supposed to run upgradesstables????  (maybe I am on old 
documentation?)

Can someone verify that?

Thanks,
Dean

From: Michael Theroux <mthero...@yahoo.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Sunday, February 24, 2013 7:45 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Size Tiered -> Leveled Compaction

Aaron,

Thanks for the response.  I think I speak for many Cassandra users when I say 
we greatly appreciate your help with our questions and issues.  For the 
specific bug I mentioned, I found this comment:

http://data.story.lu/2012/10/15/cassandra-1-1-6-has-been-released

"Automatic fixing of overlapping leveled sstables (CASSANDRA-4644)"

Although I had difficulty putting 2 and 2 together from the comments in 4644 
(it mentioned being fixed in 1.1.6, but also being not reproducible).

We converted two column families yesterday (two we believe would be 
particularly well suited for Leveled Compaction).  We have two more to convert, 
but those will wait until next weekend.  So far there have been no issues, and 
we've seen some positive results.

To help answer some of the questions I posed in this thread, and since others 
have expressed interest in knowing, the steps we followed were:

1) Perform the proper alter table command:

ALTER TABLE X WITH compaction_strategy_class='LeveledCompactionStrategy' AND  
compaction_strategy_options:sstable_size_in_mb=10;

2) Run compact on all nodes:

nodetool compact <keyspace> X

We converted one column family at a time, and temporarily disabled some of our 
regular maintenance activities to decrease load while we converted column 
families, as the compaction was resource heavy and I wished to interfere with 
our operational activities as little as possible.  In our case, the compaction 
after altering the schema took about an hour and a half.
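
As a rough sketch of the two steps above, here is a small Python helper that 
only builds the commands as strings, one column family at a time, without 
executing anything. The helper name, the keyspace name and the host names are 
hypothetical; the ALTER TABLE and nodetool compact forms are the ones quoted 
above, with nodetool's -h option used to target each node.

```python
def conversion_commands(keyspace, column_family, hosts, sstable_size_mb=10):
    """Build the schema change and per-node compaction commands as strings.

    Nothing is executed here; the caller decides how and when to run them.
    """
    alter = (
        f"ALTER TABLE {column_family} "
        "WITH compaction_strategy_class='LeveledCompactionStrategy' "
        f"AND compaction_strategy_options:sstable_size_in_mb={sstable_size_mb};"
    )
    # One compaction command per node, for this column family only.
    compacts = [
        f"nodetool -h {host} compact {keyspace} {column_family}"
        for host in hosts
    ]
    return alter, compacts


# Hypothetical keyspace and node names, for illustration only.
alter, compacts = conversion_commands("my_keyspace", "X", ["node1", "node2"])
print(alter)
for command in compacts:
    print(command)
```

Keeping the commands as plain strings makes it easy to review them before 
running each step by hand during a quiet period.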

Thus far, it appears everything worked without a hitch.  I chose 10 MB for the 
SSTable size, based on Wei's feedback (whose data size is on par with ours) 
and other tidbits I found through searching.  Based on issues people have 
reported in the relatively distant past, I made sure that we've been handling 
the compaction load properly, and I've run test repairs on the specific tables 
we converted.  We also tested restarting a node after the conversion.

Again, I believe the tables we converted were particularly well suited for 
Leveled Compaction.  These particular column families were situations where 
reads outstripped writes by an order of magnitude or two.

So far, our results have been very positive.  We've seen a greater than 50% 
reduction in read I/O, and a large improvement in performance for some 
activities.  We've also seen an improvement in memory utilization.  I imagine 
others' mileage may vary.

If everything is stable over the next week, we will convert the last two tables 
we are considering for Leveled Compaction.

Thanks again!
-Mike

On Feb 24, 2013, at 8:56 PM, aaron morton wrote:

If you did not use LCS until after the upgrade to 1.1.9 I think you are ok.

If in doubt the steps here look like they helped 
https://issues.apache.org/jira/browse/CASSANDRA-4644?focusedCommentId=13456137&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456137

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/02/2013, at 6:56 AM, Mike <mthero...@yahoo.com> wrote:

Hello,

Still doing research before we potentially move one of our column families from 
Size Tiered->Leveled compaction this weekend.  I was doing some research around 
some of the bugs that were filed against leveled compaction in Cassandra and I 
found this:

https://issues.apache.org/jira/browse/CASSANDRA-4644

The bug mentions:

"You need to run the offline scrub (bin/sstablescrub) to fix the sstable 
overlapping problem from early 1.1 releases. (Running with -m to just check for 
overlaps between sstables should be fine, since you already scrubbed online 
which will catch out-of-order within an sstable.)"

We recently upgraded from 1.1.2 to 1.1.9.

Does anyone know if an offline scrub is recommended to be performed when 
switching from STCS->LCS after upgrading from 1.1.2?

Any insight would be appreciated,
Thanks,
-Mike

On 2/17/2013 8:57 PM, Wei Zhu wrote:
We doubled the SSTable size to 10M. It still generates a lot of SSTables, and 
we don't see much difference in read latency.  We are able to finish the 
compactions after repair within several hours. We will increase the SSTable 
size again if we feel the number of SSTables hurts performance.

----- Original Message -----
From: "Mike" <mthero...@yahoo.com>
To: user@cassandra.apache.org
Sent: Sunday, February 17, 2013 4:50:40 AM
Subject: Re: Size Tiered -> Leveled Compaction


Hello Wei,

First thanks for this response.

Out of curiosity, what SSTable size did you choose for your use case, and what 
made you decide on that number?

Thanks,
-Mike

On 2/14/2013 3:51 PM, Wei Zhu wrote:




I haven't tried to switch compaction strategy. We started with LCS.


For us, after massive data imports (5000 writes/second for 6 days), the first 
repair was painful since there was quite a lot of data inconsistency. For 150G 
nodes, repair brought in about 30G and created thousands of pending 
compactions. It took almost a day to clear those. Just be prepared: LCS is 
really slow in 1.1.X. System performance degrades during that time since reads 
may have to touch more SSTables; we saw 20 SSTable lookups for one read. (We 
tried everything we could and couldn't speed it up. I think it's single 
threaded, and it's not recommended to turn on multithreaded compaction. We 
even tried that, and it didn't help.) There is parallel LCS in 1.2, which is 
supposed to alleviate the pain. We haven't upgraded yet; hope it works :)


http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2





Since our cluster is not write intensive (only 100 writes/second), I don't see 
any pending compactions during regular operation.


One thing worth mentioning is the size of the SSTable: the default is 5M, 
which is kind of small for a 200G (all in one CF) data set, and we are on SSD. 
It results in more than 150K files in one directory. (200G/5M = 40K SSTables, 
and each SSTable creates 4 files on disk.) You might want to watch that when 
deciding the SSTable size.
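
The arithmetic above can be checked with a short Python sketch. The function 
name is hypothetical, binary units (1G = 1024M) are an assumption, and the 
figure of 4 files per SSTable is taken from this message rather than from any 
specification.

```python
def estimate_sstable_files(data_gb, sstable_size_mb, files_per_sstable=4):
    """Estimate SSTable and on-disk file counts for a leveled column family."""
    sstables = (data_gb * 1024) // sstable_size_mb
    return sstables, sstables * files_per_sstable


# Compare the 5M default with a doubled SSTable size for a 200G data set.
for size_mb in (5, 10):
    sstables, files = estimate_sstable_files(200, size_mb)
    print(f"{size_mb}M SSTables: about {sstables} SSTables, {files} files")
```

Doubling the SSTable size halves both counts, which is the trade-off to watch 
when choosing sstable_size_in_mb.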


By the way, there is no concept of Major compaction for LCS. Just for fun, you 
can look at a file called $CFName.json in your data directory and it tells you 
the SSTable distribution among different levels.


-Wei





From: Charles Brophy <cbro...@zulily.com>
To: user@cassandra.apache.org
Sent: Thursday, February 14, 2013 8:29 AM
Subject: Re: Size Tiered -> Leveled Compaction


I second these questions: we've been looking into changing some of our CFs to 
use leveled compaction as well. If anybody here has the wisdom to answer them, 
it would be a wonderful help.


Thanks
Charles


On Wed, Feb 13, 2013 at 7:50 AM, Mike <mthero...@yahoo.com> wrote:


Hello,

I'm investigating the transition of some of our column families from Size 
Tiered -> Leveled Compaction. I believe we have some high-read-load column 
families that would benefit tremendously.

I've stood up a test DB node to investigate the transition. I successfully 
altered the column family, and I immediately noticed a large number (1000+) of 
pending compaction tasks become available, but no compactions got executed.

I tried running "nodetool upgradesstables" on the column family, and the 
compaction tasks didn't move.

I also noticed no changes to the size and distribution of the existing 
SSTables.

I then ran a major compaction on the column family. All pending compaction 
tasks got run, and the SSTables now have a distribution that I would expect 
from LeveledCompaction (lots and lots of 10MB files).

Couple of questions:

1) Is a major compaction required to transition from size-tiered to leveled 
compaction?
2) Are major compactions as much of a concern for LeveledCompaction as they 
are for Size Tiered?

All the documentation I found concerning transitioning from Size Tiered to 
Leveled compaction discusses the alter table CQL command, but I haven't found 
much on what else needs to be done after the schema change.

I did these tests with Cassandra 1.1.9.

Thanks,
-Mike









