Re: Cassandra 2.1.18 - Question on stream/bootstrap throughput

Reid Pinchback Tue, 22 Oct 2019 10:32:14 -0700

Thanks for the reading, Jon. 😊

From: Jon Haddad <j...@jonhaddad.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, October 22, 2019 at 12:32 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Cassandra 2.1.18 - Question on stream/bootstrap throughput


CPU waiting on memory will look like CPU overhead.   There's a good post on the 
topic by Brendan Gregg: 
http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html

Regarding GC, I agree with Reid.  You're probably not going to saturate your
network card no matter what your settings; Cassandra has way too much overhead
to do that.  It's one of the reasons why the whole zero-copy streaming feature 
was added to Cassandra 4.0: 
http://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html

Reid is also correct in pointing out that the way you're monitoring your
metrics might be problematic.  With Prometheus, the same data can produce
significantly different graphs when using rate() vs irate(), and collecting
only once a minute would hide a lot of useful data.
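
To illustrate the rate vs irate point, here is a minimal Python sketch (not
Prometheus code) that mimics the two calculations on a made-up counter series.
The sample values and the helper names rate_like and irate_like are
assumptions for illustration only; the real Prometheus rate() also handles
counter resets and extrapolates to the window boundaries, which is left out
here.

```python
# Hypothetical counter samples: (timestamp_seconds, cumulative_bytes_streamed).
# The counter grows slowly for 50 s, then jumps during the last 10 s.
samples = [(0, 0), (10, 100), (20, 200), (30, 300),
           (40, 400), (50, 500), (60, 10_500)]


def rate_like(points):
    """Average per-second increase across the whole window (first vs last)."""
    (t0, v0), (t1, v1) = points[0], points[-1]
    return (v1 - v0) / (t1 - t0)


def irate_like(points):
    """Per-second increase between the last two samples only."""
    (t0, v0), (t1, v1) = points[-2], points[-1]
    return (v1 - v0) / (t1 - t0)


print(rate_like(samples))   # 175.0
print(irate_like(samples))  # 1000.0
```

The same samples give 175 bytes/s averaged over the window but 1000 bytes/s
for the most recent interval, so the choice of function changes what the graph
appears to say.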

If you keep digging and find you're not using all your CPU during GC pauses, 
you can try using more GC threads by setting -XX:ParallelGCThreads to match the 
number of cores you have, since by default it won't use them all.  You've got
40 cores in the m4.10xlarge, so try setting -XX:ParallelGCThreads to 40.
Jon



On Tue, Oct 22, 2019 at 11:38 AM Reid Pinchback <rpinchb...@tripadvisor.com> wrote:
Thomas, what is your frequency of metric collection?  If it is minute-level
granularity, that can give a very false impression.  I’ve seen CPU and disk
throttling that doesn’t even become visible until you look at second-level
granularity around the time of the constraining event.  100 ms granularity
is clearer still.
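
As a rough illustration of why minute-level collection can mislead, the
following Python sketch averages a made-up per-second CPU series over one
minute.  The numbers (5% baseline, a 6-second burst at 100%) are invented for
the example and are not measurements from this cluster.

```python
# Hypothetical per-second CPU utilization (%) over one minute:
# a 5% baseline with a 6-second burst at 100%.
per_second = [5.0] * 60
for second in range(30, 36):
    per_second[second] = 100.0

# What a once-a-minute average would report vs. the true peak.
minute_average = sum(per_second) / len(per_second)
peak = max(per_second)

print(minute_average)  # 14.5
print(peak)            # 100.0
```

A box that was fully saturated for six seconds shows up as 14.5% busy at
minute granularity.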

Also, are you monitoring your GC activity at all?  GC bound up in a lot of
memory copies is not going to show up as much CPU; it’s memory bus bandwidth
you are fighting against then.  It is easy to have a box that looks unused but
in reality is struggling.  Given that you’ve opened up the floodgates on
compaction, that seems quite plausibly what you are experiencing.

From: "Steinmaurer, Thomas" <thomas.steinmau...@dynatrace.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, October 22, 2019 at 11:22 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: RE: Cassandra 2.1.18 - Question on stream/bootstrap throughput

Hi Alex,

Increased streaming throughput has been set on the existing nodes only, because
it is meant to limit outgoing traffic only, right? At least judging from the
name, the documentation, etc.

Compaction throughput has been increased on all nodes, although my
understanding is that it would be necessary only on the joining node, so that
it can catch up with compacting the received SSTables.

We see no resource (CPU, network or disk) maxed out on any node that would
explain why the new node receives data at only ~180-200 Mbit/s.

Thanks again,
Thomas

From: Oleksandr Shulgin <oleksandr.shul...@zalando.de>
Sent: Tuesday, October 22, 2019 16:35
To: User <user@cassandra.apache.org>
Subject: Re: Cassandra 2.1.18 - Question on stream/bootstrap throughput

On Tue, Oct 22, 2019 at 12:47 PM Steinmaurer, Thomas
<thomas.steinmau...@dynatrace.com> wrote:

Using 2.1.18, 3 nodes (m4.10xlarge, EBS SSD-based), vnodes=256, RF=3, we are
trying to add a 4th node.

The two options that, to my knowledge, mainly affect throughput, namely stream
output and compaction throttling, have been set to very high values (e.g. stream
output = 800 Mbit/s and compaction throughput = 500 MByte/s) or even set to 0
(unthrottled) in cassandra.yaml, followed by a process restart. In both
scenarios (throttling with high values vs. unthrottled), the 4th node streams
from one node, capped at ~180-200 Mbit/s, according to our SFM.

The nodes have plenty of resources available (10 Gbit network, disk IO/IOPS).
This was also confirmed by e.g. iperf for network throughput, and by disk
write / read tests in the area of 200 MByte/s.
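
A quick unit check on the figures above, as a Python sketch.  It assumes
1 byte = 8 bits and decimal prefixes; the function and variable names are
illustrative only.

```python
BITS_PER_BYTE = 8


def mbit_to_mbyte(mbit_per_s):
    """Convert Mbit/s to MByte/s."""
    return mbit_per_s / BITS_PER_BYTE


observed_low = mbit_to_mbyte(180)     # observed streaming rate, lower bound
observed_high = mbit_to_mbyte(200)    # observed streaming rate, upper bound
nic_capacity = mbit_to_mbyte(10_000)  # 10 Gbit/s link
disk_measured = 200.0                 # MByte/s, as quoted in the message

print(observed_low, observed_high)    # 22.5 25.0
print(nic_capacity)                   # 1250.0
print(observed_high / disk_measured)  # 0.125
```

So the observed 180-200 Mbit/s corresponds to 22.5-25 MByte/s, about one
eighth of the quoted disk throughput and a small fraction of the 10 Gbit link.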

Are there any other known throughput / bootstrap limitations which basically
override the above settings?

Hi Thomas,

Assuming you have 3 Availability Zones and you are adding the new node to one 
of the zones where you already have a node running, it is expected that it only 
streams from that node (its local rack).

Have you increased the streaming throughput on the node it streams from or only
on the new node?  The limit applies to the source node as well.  You can change
it online, without a restart, using the nodetool command.

Have you checked whether the new node is CPU-bound?  That is unlikely, though,
given the big instance type and only one node to stream from; it is more
relevant when streaming from a lot of nodes.

Cheers,
--
Alex
