RE: Performance issue in 3.0.9

Shashank Joshi Thu, 02 Feb 2017 12:26:34 -0800

Hi Matija,
Your experience mirrors ours. Can you please share any lessons learned or 
suggestions you might have?

We are using CMS because that is the default setting that came with 3.0.9.  We 
had read that G1 was supposed to be the default setting in 3.0, but the following 
links made it seem as if CMS was working better for 3.0:
https://issues.apache.org/jira/browse/CASSANDRA-10326 
http://cstar.datastax.com/graph?stats=518e5484-5ee3-11e5-b421-42010af0688f&metric=99.9th_latency&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=865.37&ymin=0&ymax=158.51

If this is incorrect we can certainly try with G1. Would you recommend 
that?

Thank you to all the others who have asked for more details. Here is some more 
information.

Performance is roughly 80 times worse than on 2.1, when the standard 
read-write operations we are running complete at all. In practice it is 
worse still, because a lot of the reads and writes are failing with 
timeouts due to lack of quorum. We also tried a CL of ONE for both reads 
and writes just to see if that worked, but that also failed.

Since we had problems when we upgraded to 3.0 with 2.1 data, we reproduced the 
performance problem by starting clean in 3.0 and creating all the data fresh in 
3.0.  In this case, we loaded the data into one node, and let replication take 
care of updating the other two. We are using RF of 3 with 3 nodes because our 
app uses QUORUM for better consistency and we want to be able to have an HA 
setup where we tolerate the failure of one node at a time.
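
For reference, the quorum arithmetic behind that choice can be sketched in a few lines of Python. This is only an illustration of the standard formula (quorum = floor(RF / 2) + 1); the helper names are made up for this example and are not part of any Cassandra driver or tool.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Number of replicas that can be down while QUORUM is still reachable."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    rf = 3  # the replication factor used in our cluster
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"tolerated failures={tolerated_failures(rf)}")
```

With RF of 3, QUORUM needs 2 replicas, so one node can be down and requests should still succeed.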

Regarding compaction:
We do not update or delete data, nor do we have TTLs on data at this time. So 
compaction, if any, should not be a major concern, but we do see it 
happening. We even tested with autocompaction turned off but did not see any 
improvement.

Thank you for any insights.

-----Original Message-----
From: Matija Gobec [mailto:matija0...@gmail.com] 
Sent: Thursday, February 02, 2017 12:39 AM
To: dev@cassandra.apache.org
Subject: Re: Performance issue in 3.0.9

We ran for months with the same highly tuned setup on 2.1 and once we switched 
to 3.0.9 the performance with the same configuration was crap.
Leveled compaction but a few more nodes. There are differences in how 2.1 and 
3.0 work so I guess you need to revisit your cassandra.yaml and OS settings.
Next to everything Jeff mentioned, is there any reason you have data on all 
nodes and use QUORUM?
Also, is there any reason you are not using G1 with 3.0?

On Thu, Feb 2, 2017 at 6:28 AM, Jeff Jirsa <jji...@gmail.com> wrote:

> Can you quantify "major"?
>
> Latency or throughput?
> GC pauses?
> What did you see before? What do you see now?
> Do you have a stack dump?
>
>
> --
> Jeff Jirsa
>
>
> > On Feb 1, 2017, at 4:23 PM, Shashank Joshi <shashank.jo...@ericsson.com>
> > wrote:
> >
> > We are seeing major performance issues with about 100 GB of data in
> > 3.0.9-E001. The exact same app runs very well in 2.1.
> >
> >
> >
> > It feels to us like something is wrong with our configuration because of
> > the severity of the issues. Thanks in advance for any recommendations or
> > suggestions.
> >
> >
> >
> > Details:
> >
> > Size of data: 100 GB+  all in one table, with a simple schema, couple of
> > bigints and a double
> >
> > Cluster: 3 nodes with RF of 3
> >
> > Client: App uses read and write CL of QUORUM and we have lots of
> > timeouts due to inability to reach quorum
> >
> > Compaction: Leveled
> >
> > Nature of data usage: No updates/deletes, High reads, relatively low
> > writes
> >
> >
> >
> >
> >
> > JVM:
> >
> > Using CMS GC and around 8 GB of max heap
> >
>
