Hi Dmitry and Andrew,

Apache Cassandra makes heavy use of JMX to expose its operational metrics. Often, these metrics are consumed by heterogeneous systems like Graphite, OpenNMS, Cacti, et al. that don't necessarily take care to use persistent connections. Even when connection pooling is done properly, normal service restarts will result in some reconnections.

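For illustration, the connect-per-poll pattern described above might look roughly like the sketch below. It is not taken from any of the monitoring tools named, and it is not claimed to reproduce the leak by itself. The class name `ReconnectingPoller`, the `demo` helper, and the choice of the `java.lang:type=Runtime` `Uptime` attribute are all assumptions made for the example; an in-process RMI connector server stands in for a Cassandra node so the sketch is self-contained.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical example class: a poller that opens a fresh JMX connection
// for every sample instead of keeping one persistent connection.
class ReconnectingPoller {

    // Connects, reads one attribute, and disconnects, `polls` times.
    // Returns the number of polls that completed.
    static int poll(JMXServiceURL url, int polls) throws Exception {
        ObjectName runtime = new ObjectName("java.lang:type=Runtime");
        int completed = 0;
        for (int i = 0; i < polls; i++) {
            // A new connector per iteration: this is the reconnect pattern.
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                conn.getAttribute(runtime, "Uptime");
                completed++;
            }
        }
        return completed;
    }

    // Starts an in-process RMI connector server (a stand-in for a real node),
    // polls it, and shuts it down again.
    static int demo(int polls) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        JMXConnectorServer server = JMXConnectorServerFactory.newJMXConnectorServer(
                new JMXServiceURL("service:jmx:rmi://"), null, mbs);
        server.start();
        try {
            // getAddress() returns the URL clients can use once the server is started.
            return poll(server.getAddress(), polls);
        } finally {
            server.stop();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("completed polls: " + demo(5));
    }
}
```

A client that held one `JMXConnector` open across polls would only reconnect after a restart or a network failure, which is the second case mentioned above.
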
What we've seen is that reconnecting to recent JDK6 and JDK7 builds leaks the CCL tree, which can cause JVMs to OOM even in low allocation rate scenarios. (In fact, high allocation rates are likely to hide the problem if they force a STW GC.)

We've tracked the problem to this change:
http://hg.openjdk.java.net/icedtea/jdk7/jdk/rev/985b53122cf8

We've attempted to file a bug but have not gotten a response. Is this considered a regression, or the New Normal behavior going forward?

We are enabling CMSClassUnloadingEnabled as a workaround but would prefer not to have to do this long term because of the performance implications.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced