RE: Reduce Cassandra GC

Viktor Jevdokimov Tue, 16 Apr 2013 02:22:30 -0700

For more than 40 GB of data, a 1 GB heap is too low.

Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer


Email: viktor.jevdoki...@adform.com<mailto:viktor.jevdoki...@adform.com>
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania

From: Joel Samuelsson [mailto:samuelsson.j...@gmail.com]
Sent: Tuesday, April 16, 2013 10:47
To: user@cassandra.apache.org
Subject: Reduce Cassandra GC

Hi,

We have a small production cluster with two nodes. The load on the nodes is 
very light, around 20 reads/sec and about the same for writes. There are 
around 2.5 million keys in the cluster and a replication factor (RF) of 2.

About 2.4 million of the rows are skinny (6 columns) and around 3 KB in size 
(each). Currently, scripts are running that access all of the keys in time 
order to do some calculations.

While running the scripts, the nodes go down and then come back up 6-7 minutes 
later. This seems to be due to GC. I get lines like this in the log:
INFO [ScheduledTasks:1] 2013-04-15 14:00:02,749 GCInspector.java (line 122) GC for ParNew: 338798 ms for 1 collections, 592212416 used; max is 1046937600
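
To make the numbers in that log line easier to read, here is a minimal Python sketch that pulls out the pause length and heap usage. The regular expression is an assumption inferred from this single sample line, not taken from Cassandra's source, and `parse_gc_line` is a hypothetical helper name.

```python
import re

# Pattern inferred from the one sample line above; this is an assumption,
# not something taken from Cassandra's GCInspector source.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_line(line):
    """Return pause and heap figures from a GCInspector line, or None."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    used = int(m.group("used"))          # heap bytes in use after the GC
    max_heap = int(m.group("max"))       # maximum heap size in bytes
    return {
        "collector": m.group("collector"),
        "pause_minutes": int(m.group("ms")) / 60000,
        "collections": int(m.group("count")),
        "heap_used_fraction": used / max_heap,
    }


# The log line from the message, re-joined onto one line.
sample = (
    "INFO [ScheduledTasks:1] 2013-04-15 14:00:02,749 GCInspector.java (line 122) "
    "GC for ParNew: 338798 ms for 1 collections, 592212416 used; max is 1046937600"
)
info = parse_gc_line(sample)
print(info)
```

For the sample line this gives a single ParNew collection of roughly 5.6 minutes with the heap a little over half full, which matches the description of a long pause without a full heap.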

However, the heap is not full. The heap usage has a jagged pattern, going from 
60% up to 70% over 5 minutes and then back down to 60% over the next 5 minutes, 
and so on. I get no "Heap is X full..." messages. Every once in a while, at one 
of these peaks, I get a stop-the-world GC lasting 6-7 minutes. Why does GC take 
up so much time even though the heap isn't full?

I am aware that my access patterns make a high key cache hit ratio very 
unlikely. And indeed, my average key cache hit ratio while the scripts run is 
around 0.5%. I tried disabling key caching on the accessed column family 
(UPDATE COLUMN FAMILY cf WITH caching=none;) through the cassandra-cli but I 
get the same behaviour. Does turning the key cache off take effect immediately?

A stop-the-world GC is fine if it lasts a few seconds, but pauses of several 
minutes don't work. Any other suggestions for getting rid of them?

Best regards,
Joel Samuelsson
