How do you calculate the heap / data size ratio? Is this a linear ratio? Each node has slightly more than 12 GB right now though.
2013/4/16 Viktor Jevdokimov <viktor.jevdoki...@adform.com>

> For >40 GB of data, 1 GB of heap is too low.
>
> Best regards,
> Viktor Jevdokimov
> Senior Developer
>
> From: Joel Samuelsson [mailto:samuelsson.j...@gmail.com]
> Sent: Tuesday, April 16, 2013 10:47
> To: user@cassandra.apache.org
> Subject: Reduce Cassandra GC
>
> Hi,
>
> We have a small production cluster with two nodes. The load on the nodes
> is very small, around 20 reads / sec and about the same for writes. There
> are around 2.5 million keys in the cluster and an RF of 2.
>
> About 2.4 million of the rows are skinny (6 columns) and around 3 KB in
> size (each). Currently, scripts are running, accessing all of the keys in
> time order to do some calculations.
>
> While running the scripts, the nodes go down and then come back up 6-7
> minutes later. This seems to be due to GC. I get lines like this in the
> log:
>
> INFO [ScheduledTasks:1] 2013-04-15 14:00:02,749 GCInspector.java (line 122) GC for ParNew: 338798 ms for 1 collections, 592212416 used; max is 1046937600
>
> However, the heap is not full. The heap usage has a jagged pattern, going
> from 60% up to 70% during 5 minutes and then back down to 60% over the
> next 5 minutes, and so on. I get no "Heap is X full..." messages. Every
> once in a while, at one of these peaks, I get one of these stop-the-world
> GCs lasting 6-7 minutes. Why does GC take up so much time even though the
> heap isn't full?
>
> I am aware that my access patterns make a high key cache hit ratio very
> unlikely. And indeed, my average key cache hit ratio during the run of the
> scripts is around 0.5%. I tried disabling key caching on the accessed
> column family (UPDATE COLUMN FAMILY cf WITH caching=none;) through the
> cassandra-cli, but I get the same behaviour. Is turning the key cache off
> effective immediately?
>
> Stop-the-world GC is fine if it lasts a few seconds, but having pauses of
> several minutes doesn't work. Any other suggestions to remove them?
>
> Best regards,
> Joel Samuelsson
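
To put numbers on the GCInspector line quoted above, here is a minimal Python sketch that pulls the pause time and heap figures out of it. The regular expression and the function name parse_gc_line are my own assumptions, derived only from this one sample line; other Cassandra versions may format the message differently.

```python
import re

# Log line copied from the quoted message, joined back onto one line.
LOG_LINE = (
    "INFO [ScheduledTasks:1] 2013-04-15 14:00:02,749 GCInspector.java (line 122) "
    "GC for ParNew: 338798 ms for 1 collections, 592212416 used; max is 1046937600"
)

# Assumption: this pattern matches the single sample line above; it is not
# taken from any Cassandra documentation.
GC_PATTERN = re.compile(
    r"GC for (?P<collector>\w+): (?P<pause_ms>\d+) ms "
    r"for (?P<collections>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_line(line):
    """Return a dict of GC figures from one log line, or None if it does not match."""
    match = GC_PATTERN.search(line)
    if match is None:
        return None
    pause_ms = int(match.group("pause_ms"))
    used = int(match.group("used"))
    max_heap = int(match.group("max"))
    return {
        "collector": match.group("collector"),
        "pause_ms": pause_ms,
        "pause_minutes": pause_ms / 60000,
        "collections": int(match.group("collections")),
        "used_bytes": used,
        "max_bytes": max_heap,
        "heap_used_fraction": used / max_heap,
    }


if __name__ == "__main__":
    info = parse_gc_line(LOG_LINE)
    print("collector:", info["collector"])
    print("pause (minutes):", round(info["pause_minutes"], 2))
    print("heap used (fraction of max):", round(info["heap_used_fraction"], 3))
```

For the quoted line this works out to a single ParNew collection that took roughly 5.6 minutes while the heap was only about 57% of its roughly 1 GB maximum, which matches the observation that the heap was not full when the pause happened.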