gc-tuning.html

Reid Pinchback Mon, 21 Oct 2019 10:29:18 -0700

Think of the memory you leave to the OS as something intended to support file 
caching.  As such, the right amount is whatever suits your usage.  If your use 
is almost exclusively reading, then file cache memory doesn’t matter that much 
when your storage is the NVMe SSD drives that the i3s come with.  There is 
already a chunk cache in C* that you should be tuning instead, and how fast it 
gets fed from the OS file cache, assuming compressed SSTables, may turn out to 
be less of a concern.


If you have moderate write activity then your situation changes, because that 
same file cache is how your dirty background pages turn into eventual flushes 
to disk, and so you have to watch the impact of read stalls when the I/O queue 
fills with write requests.  You might not see this so obviously on NVMe 
drives, but that could depend a lot on the distro, the kernel, and how the 
filesystem is mounted.
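
One way to start watching the dirty-page behaviour described above is to sample the kernel's own counters. The sketch below is a minimal illustration, assuming a Linux host where /proc/meminfo exposes the usual `Cached`, `Dirty` and `Writeback` fields; the function names are made up for this example and are not part of any Cassandra tooling.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; unit-less counters are kept as plain integers.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def writeback_summary(fields):
    """Report page cache, dirty and in-flight writeback sizes in MiB."""
    return {
        "cached_mib": fields.get("Cached", 0) / 1024,
        "dirty_mib": fields.get("Dirty", 0) / 1024,
        "writeback_mib": fields.get("Writeback", 0) / 1024,
    }


def read_meminfo(path="/proc/meminfo"):
    """Read the live counters (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling `writeback_summary(read_meminfo())` every few seconds during a stress run, and lining the samples up against read latency, gives a first impression of whether write bursts coincide with read stalls on a given distro and mount configuration.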

My super strong advice on issues like this is to not cargo-cult other people’s 
tunings.  Look at them for ideas, sure. But learn how to do your own 
investigations, and budget the time for it into your project.  Budget a LOT of 
time for it if your measure of “good performance” is based on latency; when 
“good” is defined in terms of throughput your life is easier.  Also, everything 
is always a little different in virtualization, and lord knows you can have 
screwball things appear in AWS. The good news is you don’t need a perfect 
configuration out of the gate; you need a configuration you understand and can 
refine; understanding comes from knowing how to do your own performance 
monitoring.
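
As a small illustration of doing your own measurement when latency is the goal, the sketch below pulls stop-the-world pause durations out of a Java 8 GC log and reports a nearest-rank percentile. It assumes the JVM was started with `-XX:+PrintGCApplicationStoppedTime` in addition to `-Xloggc`, so that the log contains "Total time for which application threads were stopped" lines; the helper names are hypothetical.

```python
import math
import re

# Line emitted by HotSpot (Java 8) when -XX:+PrintGCApplicationStoppedTime is set.
STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def stopped_times_ms(log_text):
    """Return every recorded application pause, in milliseconds."""
    return [float(m.group(1)) * 1000 for m in STOPPED.finditer(log_text)]


def percentile(values, pct):
    """Nearest-rank percentile for pct in (0, 100]."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]
```

For example, `percentile(stopped_times_ms(open("/var/log/cassandra/gc.log").read()), 99)` gives the p99 pause for the period the log covers; comparing that figure before and after a flag change is more informative than comparing averages.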


From: Sergio <lapostadiser...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, October 21, 2019 at 1:16 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: [EXTERNAL] Re: GC Tuning 
https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

Thanks, guys!
I just copied and pasted what I found on our test machines, but I can confirm 
that we have the same settings except for 8GB in production.
I didn't select these settings and I need to verify why they are there.
If any of you want to share your flags for a read-heavy workload, it would be 
appreciated; I would swap them in and test them with tlp-stress.
I am thinking about different approaches (G1GC vs ParNew + CMS).
How much RAM do you dedicate to the OS, as a percentage or as an exact number 
of GB?
Can you share flags for ParNew + CMS that I can play with and test?

Best,
Sergio

On Mon, Oct 21, 2019 at 9:27 AM Reid Pinchback 
<rpinchb...@tripadvisor.com> wrote:
Since the instance size is < 32gb, hopefully swap isn’t being used, so it 
should be moot.

Sergio, also be aware that  -XX:+CMSClassUnloadingEnabled probably doesn’t do 
anything for you.  I believe that only applies to CMS, not G1GC.  I also 
wouldn’t take it as gospel truth that  -XX:+UseNUMA is a good thing on AWS (or 
anything virtualized), you’d have to run your own tests and find out.
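
A quick way to spot leftovers like this in a long command line is to look for CMS-named options while G1 is selected. The sketch below is only a naming heuristic: it assumes that an `-XX` option whose name starts with `CMS` is specific to the CMS collector, which matches the belief stated above but should be confirmed against the JVM documentation for your version. The function name is hypothetical.

```python
def cms_named_flags_under_g1(jvm_args):
    """Return -XX options whose name starts with 'CMS' when -XX:+UseG1GC is set.

    Heuristic only: assumes CMS-specific HotSpot options carry a 'CMS' prefix.
    """
    if "-XX:+UseG1GC" not in jvm_args:
        return []
    suspects = []
    for arg in jvm_args:
        if not arg.startswith("-XX:"):
            continue
        # Strip the "-XX:" prefix, the +/- toggle, and any "=value" suffix.
        name = arg[4:].lstrip("+-").split("=", 1)[0]
        if name.startswith("CMS"):
            suspects.append(arg)
    return suspects
```

Run against the flags quoted later in this thread, it would report `-XX:+CMSClassUnloadingEnabled` as a candidate for review.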

R
From: Jon Haddad <j...@jonhaddad.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, October 21, 2019 at 12:06 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: [EXTERNAL] Re: GC Tuning 
https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

One thing to note, if you're going to use a big heap, cap it at 31GB, not 32.  
Once you go to 32GB, you don't get to use compressed pointers [1], so you get 
less addressable space than at 31GB.

[1] 
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/
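
The 32GB boundary follows from simple arithmetic. Compressed pointers are 32-bit references that HotSpot scales by the object alignment, which is 8 bytes by default. The sketch below just works that number out. The function name is made up for illustration, and the exact cut-off on a real JVM sits slightly below the theoretical value, which is why 31GB is the safe cap.

```python
GIB = 2**30


def max_compressed_oops_heap(object_alignment_bytes=8, reference_bits=32):
    """Largest heap, in bytes, addressable by scaled 32-bit object references."""
    return (2**reference_bits) * object_alignment_bytes


limit_gib = max_compressed_oops_heap() / GIB  # theoretical ceiling, in GiB
```

To confirm what a particular JVM actually does at a given heap size, `java -Xmx31g -XX:+PrintFlagsFinal -version` lists the resolved value of `UseCompressedOops`.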

On Mon, Oct 21, 2019 at 11:39 AM Durity, Sean R 
<sean_r_dur...@homedepot.com> wrote:
I don’t disagree with Jon, who has all kinds of performance tuning experience. 
But for ease of operation, we only use G1GC (on Java 8), because the tuning of 
ParNew+CMS requires a high degree of knowledge and very repeatable testing 
harnesses. It isn’t worth our time. As a previous writer mentioned, there is 
usually better return on our time tuning the schema (aka helping developers 
understand Cassandra’s strengths).

We use 16 – 32 GB heaps, nothing smaller than that.

Sean Durity

From: Jon Haddad <j...@jonhaddad.com>
Sent: Monday, October 21, 2019 10:43 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: GC Tuning 
https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

I still use ParNew + CMS over G1GC with Java 8.  I haven't done a comparison 
with JDK 11 yet, so I'm not sure if it's any better.  I've heard it is, but I 
like to verify first.  The pause times with ParNew + CMS are generally lower 
than G1 when tuned right, but as Chris said it can be tricky.  If you aren't 
willing to spend the time understanding how it works and why each setting 
matters, G1 is a better option.

I wouldn't run Cassandra in production on less than 8GB of heap - I consider it 
the absolute minimum.  For G1 I'd use 16GB, and never 4GB with Cassandra unless 
you're rarely querying it.

I typically use the following as a starting point now:

ParNew + CMS
16GB heap
10GB new gen
2GB memtable cap, otherwise you'll spend a bunch of time copying around 
memtables (cassandra.yaml)
Max tenuring threshold: 2
survivor ratio 6
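
For readers who want to try the starting point listed above, the sketch below turns it into HotSpot (Java 8) command-line flags. Setting `-Xms` equal to `-Xmx` is an assumption carried over from the flags quoted elsewhere in this thread, and the function name is hypothetical. The memtable cap is not a JVM flag: it belongs in cassandra.yaml, where `memtable_heap_space_in_mb: 2048` is one plausible reading of "2GB memtable cap" for Cassandra 3.11.

```python
def parnew_cms_flags(heap_gb, new_gen_gb, max_tenuring_threshold, survivor_ratio):
    """Build Java 8 HotSpot flags for a fixed-size ParNew + CMS heap."""
    if new_gen_gb >= heap_gb:
        raise ValueError("new gen must be smaller than the total heap")
    return [
        "-XX:+UseParNewGC",                # young generation collector
        "-XX:+UseConcMarkSweepGC",         # old generation collector
        f"-Xms{heap_gb}G",                 # initial heap (assumed equal to max)
        f"-Xmx{heap_gb}G",                 # maximum heap
        f"-Xmn{new_gen_gb}G",              # young generation size
        f"-XX:MaxTenuringThreshold={max_tenuring_threshold}",
        f"-XX:SurvivorRatio={survivor_ratio}",
    ]


# The 16GB heap / 10GB new gen starting point from the list above.
starting_point = parnew_cms_flags(16, 10, 2, 6)
```

These are a starting point to measure against, not a recommendation; on Java 9 and later CMS is deprecated, so the sketch only applies to Java 8.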

I've also done some tests with a 30GB heap, 24 GB of which was new gen.  This 
worked surprisingly well in my tests since it essentially keeps everything out 
of the old gen.  New gen allocations are just a pointer bump and are pretty 
fast, so in my (limited) tests of this I was seeing really good p99 times.  I 
was seeing a 200-400 ms pause roughly once a minute running a workload that 
deliberately wasn't hitting a resource limit (testing real world looking stress 
vs overwhelming the cluster).

We built tlp-cluster [1] and tlp-stress [2] to help figure these things out.

[1] https://thelastpickle.com/tlp-cluster/
[2] http://thelastpickle.com/tlp-stress

Jon




On Mon, Oct 21, 2019 at 10:24 AM Reid Pinchback 
<rpinchb...@tripadvisor.com> wrote:
An i3.xlarge has 30.5 GB of RAM but you’re using less than 4 GB for C*.  So 
minus room for other uses of JVM memory and for kernel activity, that’s about 
25 GB for file cache.  You’ll have to see whether you want a bigger heap to 
allow for less frequent GC cycles, or whether you could save money on the 
instance size.  C* generates a lot of medium-lifetime objects which can easily 
end up in old gen.  A larger heap will reduce the cost of frequent old-gen 
collections.  There are no magic numbers to just give because it’ll depend on 
your usage patterns.
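
The rough budget above can be written out as arithmetic. In the sketch below the 30.5 GB figure and the 3821 MB heap come from this thread. The 1.75 GB allowance for non-heap JVM memory and kernel activity is a placeholder assumption chosen only to show the shape of the calculation, and should be replaced with a measured value.

```python
def file_cache_estimate_gb(ram_gb, heap_mb, other_overhead_gb):
    """Estimate the RAM left for the OS file cache, in GB."""
    heap_gb = heap_mb / 1024
    return ram_gb - heap_gb - other_overhead_gb


# 30.5 GB instance, -Xmx3821M heap, assumed 1.75 GB of other overhead.
estimate = file_cache_estimate_gb(30.5, 3821, 1.75)
```

Re-running the same calculation with a 16 GB heap (`heap_mb=16384`) shows how much file cache a larger heap would give up, which is the trade-off being described.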

From: Sergio <lapostadiser...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Sunday, October 20, 2019 at 2:51 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: GC Tuning https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

Thanks for the answer.

This is the JVM version that I have right now.

openjdk version "1.8.0_161"
OpenJDK Runtime Environment (build 1.8.0_161-b14)
OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)

These are the current flags. Would you change anything on an i3.xlarge AWS node?

java -Xloggc:/var/log/cassandra/gc.log 
-Dcassandra.max_queued_native_transport_requests=4096 -ea 
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 
-XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 
-XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseTLAB -XX:+ResizeTLAB 
-XX:+UseNUMA -XX:+PerfDisableSharedMem -Djava.net.preferIPv4Stack=true 
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:+UseG1GC 
-XX:G1RSetUpdatingPauseTimePercent=5 -XX:MaxGCPauseMillis=200 
-XX:InitiatingHeapOccupancyPercent=45 -XX:G1HeapRegionSize=0 
-XX:-ParallelRefProcEnabled -Xms3821M -Xmx3821M 
-XX:CompileCommandFile=/etc/cassandra/conf/hotspot_compiler 
-Dcom.sun.management.jmxremote.port=7199 
-Dcom.sun.management.jmxremote.rmi.port=7199 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.password.file=/etc/cassandra/conf/jmxremote.password
 
-Dcom.sun.management.jmxremote.access.file=/etc/cassandra/conf/jmxremote.access 
-Djava.library.path=/usr/share/cassandra/lib/sigar-bin 
-Djava.rmi.server.hostname=172.24.150.141 -XX:+CMSClassUnloadingEnabled 
-javaagent:/usr/share/cassandra/lib/jmx_prometheus_javaagent-0.3.1.jar=10100:/etc/cassandra/default.conf/jmx-export.yml
 -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra 
-Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid 
-Dcassandra-foreground=yes -cp 
/etc/cassandra/conf:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/asm-5.0.4.jar:/usr/share/cassandra/lib/caffeine-2.2.6.jar:/usr/share/cassandra/lib/cassandra-driver-core-3.0.1-shaded.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.9.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/concurrent-trees-2.4.0.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/ecj-4.4.2.jar:/usr/share/cassandra/lib/guava-18.0.jar:/usr/share/cassandra/lib/HdrHistogram-2.1.9.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/hppc-0.5.4.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.13.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.13.jar:/usr/share/cassandra/lib/jamm-0.3.0.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/share/cassandra/lib/jctools-core-1.2.1.jar:/usr/share/cassandra/lib/jflex-1.6.0.jar:/usr/share/cassandra/lib/jmx_prometheus_javaagent-0.3.1.jar:/usr/share/cassandra/lib/jna-4.2.2.jar:/usr/share/cassandra/lib/joda-time-2.4.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/jstackjunit-0.0.1.jar:/usr/share/cassandra/lib/libthrift-0.9.2.jar:/usr/share/cassandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/share/cassandra/lib/logback-classic-1.1.3.jar:/usr/share/cassandra/lib/logback-core-1.1.3.jar:/usr/share/cassandra/lib/lz4-1.3.0.jar:/usr/share/cassandra/lib/metrics-core-3.1.5.jar:/usr/share/cassandra/lib/metrics-jvm-3.1.5.jar:/usr/share/cassandra/lib/metrics-logback-3.1.5.jar:/usr/share/cassandra/lib/netty-all-4.0.44.Final.jar:/usr/share/cassandra/lib/ohc-core-0.4.4.jar:/usr/share/cassandra/lib/ohc-core-j8-0.4.4.jar:/usr/
share/cassandra/lib/reporter-config3-3.0.3.jar:/usr/share/cassandra/lib/reporter-config-base-3.0.3.jar:/usr/share/cassandra/lib/sigar-1.6.4.jar:/usr/share/cassandra/lib/slf4j-api-1.7.7.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.1.1.7.jar:/usr/share/cassandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-3.11.3.jar:/usr/share/cassandra/apache-cassandra-thrift-3.11.3.jar:/usr/share/cassandra/stress.jar:
 org.apache.cassandra.service.CassandraDaemon

Best,

Sergio

On Sat, Oct 19, 2019 at 2:30 PM Chris Lohfink 
<clohfin...@gmail.com> wrote:
"It depends" on your version and heap size, but G1 is easier to get right, so 
you probably want to stick with that unless you are using small heaps or are 
really interested in tuning it (likely for massively smaller gains than tuning 
your data model). Unfortunately, there is no GC algorithm that is strictly 
better than the others in all scenarios. If your JVM supports it, ZGC or 
Shenandoah are likely going to give you the best latencies.

Chris

On Fri, Oct 18, 2019 at 8:41 PM Sergio Bilello 
<lapostadiser...@gmail.com> wrote:
Hello!

Is it still better to use ParNew + CMS than G1GC these days?

Any recommendations for a read-heavy workload on i3.xlarge nodes?


Thanks,

Sergio
