On Wed, Oct 10, 2012 at 11:04 PM, Daniel Woo <[email protected]> wrote:
> I am running a mini cluster with 6 nodes, recently we see very frequent
> ParNewGC on two nodes. It takes 200 - 800 ms on average, sometimes it takes
> 5 seconds. You know, the ParNewGC is stop-the-world GC and our client throws
> SocketTimeoutException every 3 minutes.

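If it helps to put numbers on those pauses, a minimal sketch like the one below reads collection counts and cumulative collection times from the standard `java.lang.management` MXBeans, along with current usage of each memory pool. The class name `GcSnapshot` and its two helper methods are made up for illustration. As written it reports on the JVM it runs in, so checking a Cassandra node would mean reading the same MXBeans over a remote JMX connection (or simply browsing them in jconsole).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra: summarizes GC activity
// and memory pool usage for the JVM this code is running in.
public class GcSnapshot {

    // One line per collector (e.g. ParNew, ConcurrentMarkSweep on a CMS setup).
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();  // -1 if the collector does not report it
            long totalMs = gc.getCollectionTime(); // cumulative, approximate, in milliseconds
            double avgMs = count > 0 ? (double) totalMs / count : 0.0;
            lines.add(String.format("%s: collections=%d totalMs=%d avgMs=%.1f",
                    gc.getName(), count, totalMs, avgMs));
        }
        return lines;
    }

    // One line per memory pool (young, old, and non-heap pools).
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() returns -1 when the pool has no defined maximum.
            lines.add(String.format("%s: usedBytes=%d maxBytes=%d",
                    pool.getName(), usage.getUsed(), usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
        describePools().forEach(System.out::println);
    }
}
```

Taking two snapshots a few minutes apart and diffing the counts and times gives the collection frequency and average pause per collector, and the pool lines show which generation is actually close to its maximum.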
What version of Cassandra? What JVM? Are JNA and Jamm working?

> I checked the load, it seems well balanced, and the two nodes are running on
> the same hardware: 2 * 4 cores xeon with 16G RAM, we give Cassandra 4G
> heap, including 800MB young generation. We did not see any swap usage during
> the GC, any idea about this?

It sounds like the two nodes that are pathological right now have
exhausted the perm gen with actual non-garbage, probably mostly the
Bloom filters and the JMX MBeans.

> Then I took a heap dump, it shows that 5 instances of JmxMBeanServer holds
> 500MB memory and most of the referenced objects are JMX mbean related, it's
> kind of weird to me and looks like a memory leak.

Do you have a "large" number of ColumnFamilies? How large is the data
stored per node?

=Rob

--
=Robert Coli
AIM&GTALK - [email protected]
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb
