Hi Dmitry and Andrew,

Apache Cassandra makes heavy use of JMX to expose its operational
metrics.  Often, these metrics are consumed by heterogeneous systems
such as Graphite, OpenNMS, and Cacti that don't necessarily take care
to use persistent connections.  Even when connection pooling is done
properly, normal service restarts will result in some reconnections.

What we've seen is that reconnecting to recent JDK6 and JDK7 builds
leaks the CCL tree, which can cause JVMs to run out of memory even in
low-allocation-rate scenarios.  (In fact, high allocation rates are
likely to hide the problem if they force a stop-the-world GC.)  We've
tracked the problem to this change:

http://hg.openjdk.java.net/icedtea/jdk7/jdk/rev/985b53122cf8

We've attempted to file a bug but have not received a response.

Is this considered a regression, or is it the New Normal behavior
going forward?  We are enabling -XX:+CMSClassUnloadingEnabled as a
workaround but would prefer not to have to do this long term because
of the performance implications.
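
For reference, a minimal sketch of the workaround.  It assumes the JVM
options are set through a JVM_OPTS variable in a startup script such as
conf/cassandra-env.sh; the file and variable name are assumptions and
may differ per deployment.  The flag only has an effect when the CMS
collector is in use.

```shell
# Assumed location: conf/cassandra-env.sh (adjust for your deployment).
# Class unloading under CMS is only relevant when CMS itself is enabled.
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
# Workaround: let CMS cycles unload classes so the leaked loaders can be collected.
JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled"
```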

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced
