Re: Slow performance after upgrading from 2.0.9 to 2.1.11

Peddi, Praveen Fri, 29 Jan 2016 10:31:51 -0800

Hello,
We have another update on performance on 2.1.11. Changing compression_chunk_size 
didn’t really help much, but we changed concurrent_compactors from the default 
to 64 in 2.1.11 and read latencies improved significantly. However, 2.1.11 read 
latencies are still 1.5x slower than on 2.0.9. One thing we noticed in the JMX 
metrics that could affect read latencies is that 2.1.11 is running 
ReadRepairedBackground and ReadRepairedBlocking far more frequently than 2.0.9, 
even though our read_repair_chance is the same on both. Could anyone shed some 
light on why 2.1.11 could be running read repair 10 to 50 times more often in 
spite of the same configuration on both clusters?


dclocal_read_repair_chance=0.100000 AND
read_repair_chance=0.000000 AND

Here is the table of read repair metrics for both clusters.

Metric                  Window    2.0.9   2.1.11
ReadRepairedBackground  5MinAvg   0.006   0.1
                        15MinAvg  0.009   0.153
ReadRepairedBlocking    5MinAvg   0.002   0.55
                        15MinAvg  0.007   0.91
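
For readers who want the ratios behind these figures, here is a minimal sketch in Python. It uses only the numbers quoted in the table; the dictionary layout and the function name are illustrative and not part of any Cassandra tooling.

```python
# Read repair rates copied from the table above, as (2.0.9, 2.1.11) pairs.
metrics = {
    ("ReadRepairedBackground", "5MinAvg"): (0.006, 0.1),
    ("ReadRepairedBackground", "15MinAvg"): (0.009, 0.153),
    ("ReadRepairedBlocking", "5MinAvg"): (0.002, 0.55),
    ("ReadRepairedBlocking", "15MinAvg"): (0.007, 0.91),
}


def ratios(table):
    """Return how many times higher each 2.1.11 rate is than the 2.0.9 rate."""
    return {key: new / old for key, (old, new) in table.items()}


for (name, window), ratio in ratios(metrics).items():
    print(f"{name:<24}{window:<10}{ratio:8.1f}x")
```

On these figures the background read repair rate is roughly 17 times higher on 2.1.11, and the blocking read repair rate is more than 100 times higher.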

Thanks
Praveen

From: Jeff Jirsa <jeff.ji...@crowdstrike.com>
Reply-To: <user@cassandra.apache.org>
Date: Thursday, January 14, 2016 at 2:58 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

Sorry I wasn’t as explicit as I should have been.

The same buffer size is used by compressed reads as well, but it is tuned with 
the compression_chunk_size table property. It’s likely true that if you lower 
compression_chunk_size, you’ll see improved read performance.

This was covered in the AWS re:Invent youtube link I sent in my original reply.



From: "Peddi, Praveen"
Reply-To: "user@cassandra.apache.org"
Date: Thursday, January 14, 2016 at 11:36 AM
To: "user@cassandra.apache.org", Zhiyan Shao
Cc: "Agrawal, Pratik"
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

Hi,
We will try reducing “rar_buffer_size” to 4KB. However, CASSANDRA-10249 
(https://issues.apache.org/jira/browse/CASSANDRA-10249) says “this only affects 
users who have 1. disabled compression, 2. switched to buffered i/o from 
mmap’d”. Neither of these is true for us, I believe. We use the default 
disk_access_mode, which should be mmap. We also used LZ4Compressor when we 
created the table.

We will let you know if this property has any effect. We were testing with 
2.1.11 and this was only fixed in 2.1.12, so we need to try the latest version.

Praveen





From: Jeff Jirsa <jeff.ji...@crowdstrike.com>
Reply-To: <user@cassandra.apache.org>
Date: Thursday, January 14, 2016 at 1:29 PM
To: Zhiyan Shao <zhiyan.s...@gmail.com>, 
"user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: "Agrawal, Pratik" <paagr...@amazon.com>
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

This may be due to https://issues.apache.org/jira/browse/CASSANDRA-10249 / 
https://issues.apache.org/jira/browse/CASSANDRA-8894 - whether or not this is 
really the case depends on how much of your data is in page cache, and whether 
or not you’re using mmap. Since the original question was asked by someone 
using small RAM instances, it’s possible.

We mitigate this by dropping compression_chunk_size in order to force a smaller 
buffer on reads, so we don’t over read very small blocks. This has other side 
effects (lower compression ratio, more garbage during streaming), but 
significantly speeds up read workloads for us.
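
To illustrate why a smaller chunk helps small reads, here is a rough Python sketch of the over-read arithmetic. It assumes that each read decompresses whole chunks and that the row does not straddle a chunk boundary. The 512-byte row size and the `loadtest.events` keyspace and table names are hypothetical. The `sstable_compression` and `chunk_length_kb` option names are what the 2.1-era CQL compression map uses as far as I recall, so please check them against the documentation for your version.

```python
def over_read_factor(row_bytes, chunk_kb):
    """Bytes decompressed per byte actually wanted, under the assumptions above."""
    chunk_bytes = chunk_kb * 1024
    # A read must decompress whole chunks, so round up to a chunk boundary.
    chunks_needed = -(-row_bytes // chunk_bytes)  # ceiling division
    return chunks_needed * chunk_bytes / row_bytes


def alter_table_cql(keyspace, table, chunk_kb, compressor="LZ4Compressor"):
    """Build a 2.1-style ALTER TABLE statement (option names are assumptions)."""
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compression = "
        f"{{'sstable_compression': '{compressor}', 'chunk_length_kb': {chunk_kb}}};"
    )


for kb in (64, 16, 4):
    factor = over_read_factor(512, kb)
    print(f"{kb:>2} KB chunk, 512-byte row: {factor:.0f}x over-read")

# Hypothetical keyspace and table names, for illustration only.
print(alter_table_cql("loadtest", "events", 4))
```

Under these assumptions a 512-byte row costs a 128x over-read with a 64 KB chunk and 8x with a 4 KB chunk, which is the trade-off against compression ratio described above.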


From: Zhiyan Shao
Date: Thursday, January 14, 2016 at 9:49 AM
To: "user@cassandra.apache.org"
Cc: Jeff Jirsa, "Agrawal, Pratik"
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

Praveen, if you search "Read is slower in 2.1.6 than 2.0.14" in this forum, you 
can find another thread I sent a while ago. The perf test I did indicated that 
read is slower for 2.1.6 than 2.0.14 so we stayed with 2.0.14.

On Tue, Jan 12, 2016 at 9:35 AM, Peddi, Praveen <pe...@amazon.com> wrote:
Thanks Jeff for your reply. Sorry for delayed response. We were running some 
more tests and wanted to wait for the results.

So basically we saw that CPU usage was higher with 2.1.11 than with 2.0.9 (see 
below) for the exact same load test. Memory spikes were also aggressive on 
2.1.11.

To rule out any of our custom settings, we ended up doing some testing with 
the Cassandra stress test and a default Cassandra installation. Here are the 
results we saw for 2.0.9 and 2.1.11. Both are default installations and both 
use the Cassandra stress test with the same params, so this is the closest 
apples-to-apples comparison we can get. As you can see, both read and write 
latencies are 30 to 50% worse in 2.1.11 than in 2.0.9.

Highlights of the test:
Load: 2x reads and 1x writes
CPU: 2.0.9 goes up to 25%, compared to 2.1.11, which goes up to 60%
Local read latency: 0.039 ms for 2.0.9 and 0.066 ms for 2.1.11
Local write latency: 0.033 ms for 2.0.9 vs. 0.030 ms for 2.1.11
One observation: as the number of threads increases, 2.1.11 read latencies 
get worse compared to 2.0.9 (see the table below for 24 threads vs. 54 
threads).
Not sure if anyone has done this kind of comparison before and what their 
thoughts are. I am thinking for this same reason

Version       Threads  Type   total ops  op/s   pk/s   row/s  mean  med  0.95  0.99  0.999  max    time
2.0.9 Plain   16       READ   66854      7205   7205   7205   1.6   1.3  2.8   3.5   9.6    85.3   9.3
2.0.9 Plain   16       WRITE  33146      3572   3572   3572   1.3   1    2.6   3.3   7      206.5  9.3
2.0.9 Plain   16       total  100000     10777  10777  10777  1.5   1.3  2.7   3.4   7.9    206.5  9.3
2.1.11 Plain  16       READ   67096      6818   6818   6818   1.6   1.5  2.6   3.5   7.9    61.7   9.8
2.1.11 Plain  16       WRITE  32904      3344   3344   3344   1.4   1.3  2.3   3     6.5    56.7   9.8
2.1.11 Plain  16       total  100000     10162  10162  10162  1.6   1.4  2.5   3.2   6      61.7   9.8
2.0.9 Plain   24       READ   66414      8167   8167   8167   2     1.6  3.7   7.5   16.7   208    8.1
2.0.9 Plain   24       WRITE  33586      4130   4130   4130   1.7   1.3  3.4   5.4   25.6   45.4   8.1
2.0.9 Plain   24       total  100000     12297  12297  12297  1.9   1.5  3.5   6.2   15.2   208    8.1
2.1.11 Plain  24       READ   66628      7433   7433   7433   2.2   2.1  3.4   4.3   8.4    38.3   9
2.1.11 Plain  24       WRITE  33372      3723   3723   3723   2     1.9  3.1   3.8   21.9   37.2   9
2.1.11 Plain  24       total  100000     11155  11155  11155  2.1   2    3.3   4.1   8.8    38.3   9
2.0.9 Plain   54       READ   67115      13419  13419  13419  2.8   2.6  4.2   6.4   36.9   82.4   5
2.0.9 Plain   54       WRITE  32885      6575   6575   6575   2.5   2.3  3.9   5.6   15.9   81.5   5
2.0.9 Plain   54       total  100000     19993  19993  19993  2.7   2.5  4.1   5.7   13.9   82.4   5
2.1.11 Plain  54       READ   66780      8951   8951   8951   4.3   3.9  6.8   9.7   49.4   69.9   7.5
2.1.11 Plain  54       WRITE  33220      4453   4453   4453   3.5   3.2  5.7   8.2   36.8   68     7.5
2.1.11 Plain  54       total  100000     13404  13404  13404  4     3.7  6.6   9.2   48     69.9   7.5
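
To make the trend across thread counts easier to see, here is a small Python sketch that compares the READ rows of the stress results above. It uses only the op/s and mean latency values from those results; the data layout and function name are illustrative.

```python
# READ rows from the stress results above:
# threads -> {version: (op/s, mean latency)}
reads = {
    16: {"2.0.9": (7205, 1.6), "2.1.11": (6818, 1.6)},
    24: {"2.0.9": (8167, 2.0), "2.1.11": (7433, 2.2)},
    54: {"2.0.9": (13419, 2.8), "2.1.11": (8951, 4.3)},
}


def compare(results, old="2.0.9", new="2.1.11"):
    """Return threads -> (throughput change in percent, mean latency ratio)."""
    out = {}
    for threads, by_version in results.items():
        old_ops, old_mean = by_version[old]
        new_ops, new_mean = by_version[new]
        out[threads] = (
            100.0 * (new_ops - old_ops) / old_ops,
            new_mean / old_mean,
        )
    return out


for threads, (ops_change, latency_ratio) in compare(reads).items():
    print(f"{threads} threads: read op/s {ops_change:+.1f}%, "
          f"mean latency x{latency_ratio:.2f}")
```

On these numbers the read throughput gap grows from about 5% at 16 threads to about 33% at 54 threads, and mean read latency at 54 threads is about 1.5 times higher on 2.1.11.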


From: Jeff Jirsa <jeff.ji...@crowdstrike.com>
Date: Thursday, January 7, 2016 at 1:01 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Peddi Praveen 
<pe...@amazon.com>
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

Anecdotal evidence typically agrees that 2.1 is faster than 2.0 (our experience 
was anywhere from 20-60%, depending on workload).

However, it’s not necessarily true that everything behaves exactly the same – 
in particular, memtables are different, commitlog segment handling is 
different, and GC params may need to be tuned differently for 2.1 than 2.0.

When the system is busy, what’s it actually DOING? Cassandra exposes a TON of 
metrics – have you plugged any into a reporting system to see what’s going on? 
Is your latency due to pegged cpu, iowait/disk queues or gc pauses?

My colleagues spent a lot of time validating different AWS EBS configs (video 
from re:Invent at https://www.youtube.com/watch?v=1R-mgOcOSd4). 2.1 was faster 
in almost every case, but you’re using an instance size I don’t believe we 
tried (too little RAM to be viable in production). c3.2xl only gives you 15G 
of RAM; most “performance” based systems want 2-4x that (people running G1 
heaps usually start at 16G heaps and leave another 16-30G for page cache). 
You’re running fairly small hardware, so it’s possible that 2.1 isn’t “as 
good” on smaller hardware.

(I do see your domain, presumably you know all of this, but just to be sure):

You’re using c3, so presumably you’re using EBS – are you using GP2? Which 
volume sizes? Are they the same between versions? Are you hitting your iops 
limits? Running out of burst tokens? Do you have enhanced networking enabled? 
At load, what part of your system is stressed? Are you cpu bound? Are you 
seeing GC pauses hurt latency? Have you tried changing memtable_allocation_type 
to offheap_objects (available in 2.1, not in 2.0)?

Tuning gc_grace is weird – do you understand what it does? Are you overwriting 
or deleting a lot of data in your test (that’d be unusual)? Are you doing a lot 
of compaction?


From: "Peddi, Praveen"
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, January 6, 2016 at 11:41 AM
To: "user@cassandra.apache.org"
Subject: Slow performance after upgrading from 2.0.9 to 2.1.11

Hi,
We have upgraded Cassandra from 2.0.9 to 2.1.11 in our loadtest environment 
with pretty much the same yaml settings in both (we removed unused yaml 
settings and renamed a few others), and we have noticed that performance on 
2.1.11 is worse compared to 2.0.9. After more investigation we found that 
performance gets worse as we increase the replication factor on 2.1.11, 
whereas on 2.0.9 performance stays more or less the same. Has anything changed 
architecturally as far as replication is concerned in 2.1.11?

All googling only suggested that 2.1.11 should be FASTER than 2.0.9, so we are 
obviously doing something different. However, the client code and load test 
are identical in both cases.

Details:
Nodes: 3 EC2 c3.2xlarge
R/W Consistency: QUORUM
Renamed memtable_total_space_in_mb to memtable_heap_space_in_mb and removed 
unused properties from the yaml file.
We run aggressive compaction with a low gc_grace (15 mins), but this is true 
for both 2.0.9 and 2.1.11.

As you can see, all p50, p90 and p99 latencies stayed within 10% difference on 
2.0.9 when we increased RF from 1 to 3, whereas on 2.1.11 latencies almost 
doubled (reads especially are much slower than writes).

                        2.0.9            2.1.11
# Nodes  RF  # of rows  P50  P90  P99    P50  P90   P99
READ
3        1   450        306  594  747    425  849   1085
3        3   450        358  634  877    708  1274  2642

WRITE
3        1   10         26   80   179    37   131   196
3        3   10         31   96   184    46   166   468
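
As a quick check of how much the READ latencies move when RF goes from 1 to 3, the percentage change per percentile can be computed from the table. This is a minimal Python sketch using only the READ figures above; the names are illustrative.

```python
# READ percentiles from the table above: version -> {RF: (P50, P90, P99)}
read_latency = {
    "2.0.9": {1: (306, 594, 747), 3: (358, 634, 877)},
    "2.1.11": {1: (425, 849, 1085), 3: (708, 1274, 2642)},
}


def rf_increase(latency, low_rf=1, high_rf=3):
    """Percent latency change per percentile when going from low_rf to high_rf."""
    result = {}
    for version, by_rf in latency.items():
        result[version] = tuple(
            round(100.0 * (hi - lo) / lo, 1)
            for lo, hi in zip(by_rf[low_rf], by_rf[high_rf])
        )
    return result


for version, (p50, p90, p99) in rf_increase(read_latency).items():
    print(f"{version}: P50 {p50:+.1f}%  P90 {p90:+.1f}%  P99 {p99:+.1f}%")
```

On these figures the 2.0.9 read latencies rise by roughly 7 to 17% from RF 1 to RF 3, while the 2.1.11 read latencies rise by roughly 50 to 144%.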

Any pointers on how to debug performance issues will be appreciated.

Praveen
