We have a 6-node Cassandra 0.6.8 cluster running on boxes with 4 GB of
RAM.  Over the course of several weeks, cached memory slowly decreases
until Cassandra is restarted or something bad happens (i.e. the OOM
killer).  Performance suffers as that memory is no longer available
for caching.  Here is a graph of memory over a two-week period; the big
jump is a Cassandra restart:

http://img194.imageshack.us/img194/383/2weekmem.png

Tomcat and the standard Linux services are also running on the box.
I saved the output of /proc/$CASSANDRA_PID/status every 10 seconds from
yesterday afternoon to this morning and graphed the resident set size
(RSS).  While the JVM and the Linux virtual memory system do all sorts
of clever things, over a period of 12+ hours I believe RSS must go in a
direction other than up or there is a problem.

http://img24.imageshack.us/img24/1754/cassandrarss.png
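
For reference, the sampling behind that graph amounts to roughly the
following.  This is a minimal sketch, not the exact script we ran: the
class name RssSampler and the "timestamp kB" output format are made up
for illustration, and it assumes the usual "VmRSS:   <n> kB" line that
Linux prints in /proc/<pid>/status.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: polls /proc/<pid>/status and prints the resident set size.
public class RssSampler {

    // Returns the VmRSS value in kB from the lines of /proc/<pid>/status,
    // or -1 if no VmRSS line is present (e.g. for a kernel thread).
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // The line looks like "VmRSS:\t 1843200 kB"; keep only the number.
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 1) {
            System.err.println("usage: java RssSampler <pid>");
            return;
        }
        while (true) {
            List<String> lines = Files.readAllLines(
                    Paths.get("/proc", args[0], "status"), StandardCharsets.UTF_8);
            // One sample per line: epoch seconds, then RSS in kB.
            System.out.println(System.currentTimeMillis() / 1000 + " " + parseVmRssKb(lines));
            Thread.sleep(10000L); // same 10 second interval as described above
        }
    }
}
```

Running it as "java RssSampler $CASSANDRA_PID > rss.log" gives a
two-column file that can be fed to any plotting tool.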

swappiness is currently set to zero, and a graph of swap use over a two
week period does not show any growth corresponding to the lost cached space.

We are using <DiskAccessMode>standard</DiskAccessMode> and no JNA
interfaces.  I think this makes Cassandra a totally normal Java program
living in the heap, so I am confused about how this growth could be happening.
Similar experiences or explanations would be most welcome.  I can
provide further information if necessary.
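
One comparison that might help narrow this down is what the JVM itself
reports as heap and non-heap usage versus the RSS the kernel reports.
The sketch below reads the standard java.lang.management MemoryMXBean;
the class name JvmMemoryReport is made up for illustration.  As written
it reports on its own JVM, so to look at Cassandra the same
java.lang:type=Memory MBean would have to be read over the JMX port
already opened on the command line below.  Note that the "non-heap"
figure covers JVM-managed pools only, not thread stacks or memory
malloc'd by native code, so it would not necessarily add up to RSS.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints the JVM's own view of heap and non-heap memory.
public class JvmMemoryReport {

    // Formats one MemoryUsage snapshot as "label: used=N kB, committed=M kB".
    static String describe(String label, MemoryUsage usage) {
        return label + ": used=" + (usage.getUsed() / 1024) + " kB, committed="
                + (usage.getCommitted() / 1024) + " kB";
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        // Heap is what -Xms/-Xmx bound; non-heap is the other JVM-managed pools.
        System.out.println(describe("heap", memory.getHeapMemoryUsage()));
        System.out.println(describe("non-heap", memory.getNonHeapMemoryUsage()));
    }
}
```

If committed heap plus non-heap stays flat while RSS keeps climbing, the
growth is presumably outside the pools the JVM accounts for.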


##### Info dump
uname: 2.6.18-194.8.1.el5 #1 SMP Wed Jun 23 10:52:51 EDT 2010 x86_64
x86_64 x86_64 GNU/Linux

java -version
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)

cmd line arg (paths edited):
/usr/java/jdk1.6.0_20/bin/java -Xms1500M -Xmx1500M -ea -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
-XX:+HeapDumpOnOutOfMemoryError -XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42 -Dcassandra.compaction.priority=1
-Dcom.sun.management.jmxremote.port=10101
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dstorage-config=~/conf -Dcassandra-pidfile=~/cassandra.pid -cp
~/conf::~/lib/antlr-3.1.3.jar:~/lib/apache-cassandra-0.6.8.jar:~/lib/clhm-production.jar:~/lib/commons-cli-1.1.jar:~/lib/commons-codec-1.2.jar:~/lib/commons-collections-3.2.1.jar:~/lib/commons-lang-2.4.jar:~/lib/google-collections-1.0.jar:~/lib/hadoop-core-0.20.1.jar:~/lib/high-scale-lib.jar:~/lib/ivy-2.1.0.jar:~/lib/jackson-core-asl-1.4.0.jar:~/lib/jackson-mapper-asl-1.4.0.jar:~/lib/jline-0.9.94.jar:~/lib/json-simple-1.1.jar:~/lib/libthrift-r917130.jar:~/lib/log4j-1.2.14.jar:~/lib/slf4j-api-1.5.8.jar:~/lib/slf4j-log4j12-1.5.8.jar:~/lib/zapcass-1.0.0.jar:~/lib/zapcat-1.3-beta.jar
com.clearspring.cassandra.ZapCassDaemon

ZapCassDaemon just gets Cassandra and Zabbix to talk to each other.  I
do not believe we are unique in doing this:
public class ZapCassDaemon {
    public EmbeddedCassandraService cassandra;
    public ZabbixAgent agent;

    public ZapCassDaemon() throws Exception {
        // Zabbix agent, created in the same JVM as Cassandra
        agent = new ZabbixAgent();

        // Embedded Cassandra service: initialize, then run
        cassandra = new EmbeddedCassandraService();
        cassandra.init();
        cassandra.run();
    }

    // Static reference to the single daemon instance
    public static ZapCassDaemon instance;

    public static void main(String[] args) throws Exception {
        instance = new ZapCassDaemon();
    }
}
