We have a 6-node Cassandra 0.6.8 cluster running on boxes with 4 GB of RAM. Over the course of several weeks, cached memory slowly decreases until Cassandra is restarted or something bad happens (i.e. the OOM killer). Performance obviously suffers as cached memory is no longer available. Here is a graph of memory over a two-week period; the big jump is a Cassandra restart:
http://img194.imageshack.us/img194/383/2weekmem.png

Tomcat and the standard Linux services are also running on the box.

I saved the output of /proc/$CASSANDRA_PID/status every 10 seconds from yesterday afternoon to this morning and graphed resident set size. While the JVM and the Linux virtual memory system do all sorts of clever things, I believe that over a period of 12+ hours RSS must go in a direction other than up or there is a problem.

http://img24.imageshack.us/img24/1754/cassandrarss.png

Swappiness is currently set to zero, and a graph of swap use over a two-week period does not show any growth corresponding to the lost cached space.

We are using `<DiskAccessMode>standard</DiskAccessMode>` and no JNA interfaces. I think this makes Cassandra a totally normal Java program living in heap, so I am confused about how this growth could be happening.

Similar experiences or explanations would be most welcome. I can provide further information if necessary.

##### Info dump

uname:

    2.6.18-194.8.1.el5 #1 SMP Wed Jun 23 10:52:51 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

java -version:

    java version "1.6.0_20"
    Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
    Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)

cmd line arg (paths edited):

    /usr/java/jdk1.6.0_20/bin/java -Xms1500M -Xmx1500M -ea -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:+HeapDumpOnOutOfMemoryError -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Dcassandra.compaction.priority=1 -Dcom.sun.management.jmxremote.port=10101 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dstorage-config=~/conf -Dcassandra-pidfile=~/cassandra.pid -cp ~/conf::~/lib/antlr-3.1.3.jar:~/lib/apache-cassandra-0.6.8.jar:~/lib/clhm-production.jar:~/lib/commons-cli-1.1.jar:~/lib/commons-codec-1.2.jar:~/lib/commons-collections-3.2.1.jar:~/lib/commons-lang-2.4.jar:~/lib/google-collections-1.0.jar:~/lib/hadoop-core-0.20.1.jar:~/lib/high-scale-lib.jar:~/lib/ivy-2.1.0.jar:~/lib/jackson-core-asl-1.4.0.jar:~/lib/jackson-mapper-asl-1.4.0.jar:~/lib/jline-0.9.94.jar:~/lib/json-simple-1.1.jar:~/lib/libthrift-r917130.jar:~/lib/log4j-1.2.14.jar:~/lib/slf4j-api-1.5.8.jar:~/lib/slf4j-log4j12-1.5.8.jar:~/lib/zapcass-1.0.0.jar:~/lib/zapcat-1.3-beta.jar com.clearspring.cassandra.ZapCassDaemon

ZapCassDaemon just gets Cassandra and Zabbix talking to each other. I do not believe we are unique in doing this:

    public class ZapCassDaemon {
        public EmbeddedCassandraService cassandra;
        public ZabbixAgent agent;

        public ZapCassDaemon() throws Exception {
            agent = new ZabbixAgent();
            cassandra = new EmbeddedCassandraService();
            cassandra.init();
            cassandra.run();
        }

        static public ZapCassDaemon instance;

        static public void main(String[] args) throws Exception {
            instance = new ZapCassDaemon();
        }
    }
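
For anyone who wants to reproduce the RSS graph: the sampling described above (reading /proc/$CASSANDRA_PID/status every 10 seconds) could be done roughly as follows. This is only a minimal sketch, not the script that produced the graph. `RssSampler` and `parseVmRssKb` are hypothetical names, and it assumes a Linux /proc filesystem and a Java 8+ runtime for the sampler itself, which is separate from the 1.6 JVM that runs Cassandra.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper for illustration; not part of Cassandra or ZapCassDaemon.
class RssSampler {

    // Extracts the VmRSS value (in kB) from the lines of /proc/<pid>/status.
    // Returns -1 if no VmRSS line is present.
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // The line looks like "VmRSS:\t 1623456 kB"
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        String pid = args[0]; // PID of the Cassandra JVM
        while (true) {
            List<String> lines = Files.readAllLines(Paths.get("/proc", pid, "status"));
            // Print "<unix seconds> <rss in kB>" so the output is easy to graph
            System.out.println(System.currentTimeMillis() / 1000 + " " + parseVmRssKb(lines));
            Thread.sleep(10_000); // 10-second interval, as in the post
        }
    }
}
```

Run it with the Cassandra PID as the only argument and redirect the output to a file; each line holds a timestamp and the resident set size in kB.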