Re: Cassandra as storage for cache data

2013-07-02 Thread Dmitry Olshansky
In our case we have a continuous flow of data to be cached. Every second
we receive tens of PUT requests. Each request has a payload of about
500 KB on average and a TTL of about 20 minutes.


On the other side we have a similar flow of GET requests. Every GET
request is translated into a get-by-key query to Cassandra.


This is a very simple and straightforward solution:
- one CF
- one key that directly corresponds to the cache entry key
- one value of type bytes that holds the cache entry payload
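
For illustration, the layout above could be expressed in CQL3 roughly as follows. This is only a sketch: the table name `cache_entries` and the column names are hypothetical, the thread does not say whether CQL or Thrift is used, and the 20-minute TTL is the figure quoted above.

```python
# Hypothetical CQL3 statements for a single-CF cache layout.
# The table and column names are assumptions made for this sketch.

TTL_SECONDS = 20 * 60  # "TTL about 20 minutes" from the message

# One column family: a blob key and a blob payload.
CREATE_TABLE = (
    "CREATE TABLE cache_entries ("
    "key blob PRIMARY KEY, "
    "value blob)"
)

# Every GET becomes a lookup by key.
GET_STATEMENT = "SELECT value FROM cache_entries WHERE key = ?"


def put_statement(ttl_seconds=TTL_SECONDS):
    """Build the INSERT used for a PUT; the TTL is attached per write."""
    return (
        "INSERT INTO cache_entries (key, value) "
        "VALUES (?, ?) USING TTL %d" % ttl_seconds
    )


if __name__ == "__main__":
    print(CREATE_TABLE)
    print(put_statement())
    print(GET_STATEMENT)
```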

To be honest, I don't see how we can switch this solution to a multi-CF
scheme that plays with time-based snapshots.
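
One possible reading of such a multi-CF scheme is sketched below: writes go to a column family named after the current time bucket, reads check every bucket that may still hold a live entry, and a bucket is dropped as a whole once everything in it has expired. This is an interpretation, not something spelled out in this thread; the `cache_` prefix, the bucket length and all function names are hypothetical.

```python
# Sketch of time-bucketed column families (an assumed interpretation of
# the "multi-CF scheme"); names and bucket length are hypothetical.

def bucket_index(ts_seconds, bucket_seconds):
    """Index of the time bucket that contains the given timestamp."""
    return int(ts_seconds // bucket_seconds)


def cf_name(index, prefix="cache_"):
    """Column family name for a bucket index."""
    return "%s%d" % (prefix, index)


def write_target(now, bucket_seconds):
    """PUTs always go to the column family of the current bucket."""
    return cf_name(bucket_index(now, bucket_seconds))


def read_targets(now, ttl_seconds, bucket_seconds):
    """Column families that may hold an entry that is still alive, newest first."""
    newest = bucket_index(now, bucket_seconds)
    oldest = bucket_index(now - ttl_seconds, bucket_seconds)
    return [cf_name(i) for i in range(newest, oldest - 1, -1)]


def droppable(index, now, ttl_seconds, bucket_seconds):
    """True once every entry written into the bucket must have expired."""
    bucket_end = (index + 1) * bucket_seconds
    return bucket_end + ttl_seconds <= now


if __name__ == "__main__":
    now, ttl, bucket = 3700, 1200, 1200
    print(write_target(now, bucket))
    print(read_targets(now, ttl, bucket))
    print(droppable(1, now, ttl, bucket))
```

The point of such a layout would be that old data disappears by dropping or truncating a whole column family instead of being compacted away; the price is that a GET may have to query more than one column family.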


Today this solution crashed again with overload symptoms:
- almost non-stop compactions on every node in the cluster
- high I/O wait on the system
- clients start failing with timeout exceptions

At the same time we see that Cassandra uses only half of the Java heap.
How can we force it to use all available resources (namely RAM)?


Best regards,
Dmitry Olshansky


Re: Cassandra as storage for cache data

2013-07-01 Thread Dmitry Olshansky

Hello,

thanks to all for your answers and comments.

What we've done:
- increased Java heap memory to 6 GB
- changed replication factor to 1
- set durable_writes to false
- set memtable_total_space_in_mb to 5000
- set commitlog_total_space_in_mb to 6000

If I understand correctly, the last parameter does not matter, since we
set durable_writes to false.


Now the overall performance is much better but still not outstanding. We 
continue observing quite frequent compactions on every node.


According to the OpsCenter graphs, the Java heap never grows above
3.5 GB. So there is enough memory to keep the memtables. Why do they
still get flushed to disk, triggering compactions?
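
A rough back-of-the-envelope sketch of how long a memtable budget lasts under this load. It assumes 20 PUTs per second of 500 KB each (figures from the first message of this thread), writes spread evenly over 4 nodes, and that expired entries are not removed from a memtable in place. That last point is an assumption, not something confirmed in this thread; if it holds, every PUT keeps adding to the memtable until it is flushed, so the budget is eventually reached regardless of the TTL.

```python
# Time until a node's memtable budget is full, under assumed load figures.

def seconds_to_fill(memtable_space_mb, puts_per_second, payload_kb,
                    nodes, replication_factor):
    """Seconds until one node has received memtable_space_mb of payload."""
    cluster_mb_per_s = puts_per_second * payload_kb / 1024.0
    per_node_mb_per_s = cluster_mb_per_s * replication_factor / nodes
    return memtable_space_mb / per_node_mb_per_s


if __name__ == "__main__":
    # memtable_total_space_in_mb = 5000, replication factor 1, 4 nodes
    seconds = seconds_to_fill(5000, 20, 500, nodes=4, replication_factor=1)
    print(seconds)         # 2048.0 seconds
    print(seconds / 60.0)  # about 34 minutes
```

This counts raw payload only and ignores any per-entry overhead, so the real figure would be somewhat lower.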


--
Best regards,
Dmitry Olshansky


Cassandra as storage for cache data

2013-06-25 Thread Dmitry Olshansky

Hello,

we are using Cassandra as data storage for our caching system. Our
application generates about 20 put and get requests per second. The
average size of one cache item is about 500 KB.


Cache items are placed into one column family with TTL set to 20-60
minutes. Keys and values are bytes (not UTF-8 strings). The compaction
strategy is SizeTieredCompactionStrategy.


We set up a Cassandra 1.2.6 cluster of 4 nodes. The replication factor
is 2. Each node has 10 GB of RAM and enough HDD space.
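
To put these numbers in perspective, here is a rough estimate of how much live (not yet expired) payload each node holds. It is a sketch under assumptions: 20 PUTs per second (the message gives 20 for puts and gets together), an even spread of data over the nodes, and no storage overhead.

```python
# Estimate of live cache payload per node, under assumed load figures.

def live_data_gb_per_node(puts_per_second, payload_kb, ttl_seconds,
                          nodes, replication_factor):
    """GB of unexpired payload per node, replicas included."""
    total_kb = puts_per_second * payload_kb * ttl_seconds * replication_factor
    return total_kb / 1024.0 / 1024.0 / nodes


if __name__ == "__main__":
    # TTL of 20 minutes: about 5.7 GB per node
    print(live_data_gb_per_node(20, 500, 20 * 60, nodes=4, replication_factor=2))
    # TTL of 60 minutes: about 17.2 GB per node
    print(live_data_gb_per_node(20, 500, 60 * 60, nodes=4, replication_factor=2))
```

Either figure is larger than the heap a node runs with, so under these assumptions the live data cannot all stay in memtables.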


Now, when we put this cluster under load, it quickly fills with our
runtime data (about 5 GB on every node) and we start observing
performance degradation with frequent timeouts on the client side.


We see that on each node compaction starts very frequently and takes
several minutes to complete. It seems that each node is busy with
compaction most of the time.


Here are the questions:

What is the recommended configuration for our use case?

Does it make sense to somehow tell Cassandra to keep all data in memory
(memtables) to avoid flushing it to disk (sstables), thus decreasing the
number of compactions? How can we achieve this behavior?


Cassandra is started with the default shell script, which produces the
following command line:


jsvc.exec -user cassandra -home 
/usr/lib/jvm/java-6-openjdk-amd64/jre/bin/../ -pidfile 
/var/run/cassandra.pid -errfile 1 -outfile 
/var/log/cassandra/output.log -cp CLASSPATH_SKIPPED 
-Dlog4j.configuration=log4j-server.properties 
-Dlog4j.defaultInitOverride=true 
-XX:HeapDumpPath=/var/lib/cassandra/java_1371805844.hprof 
-XX:ErrorFile=/var/lib/cassandra/hs_err_1371805844.log -ea 
-javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar 
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms2500M -Xmx2500M 
-Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss180k -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled 
-XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly 
-XX:+UseTLAB -Djava.net.preferIPv4Stack=true 
-Dcom.sun.management.jmxremote.port=7199 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
org.apache.cassandra.service.CassandraDaemon
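
The -Xms2500M/-Xmx2500M and -Xmn400M values in this command line look like the automatic sizing done by the stock cassandra-env.sh of that era. The sketch below reproduces that rule as it is commonly described (heap = max(min(RAM/2, 1024 MB), min(RAM/4, 8192 MB)), young generation = min(100 MB per core, heap/4)); treat the formula as an assumption to be checked against the actual script. The 10000 MB of RAM and the 4 cores are guesses chosen because they reproduce the flags above.

```python
# Sketch of the default heap sizing rule (assumed, see the note above).

def default_heap_mb(system_memory_mb, cpu_cores):
    """Return (max_heap_mb, young_gen_mb) for the given machine."""
    half_capped = min(system_memory_mb // 2, 1024)
    quarter_capped = min(system_memory_mb // 4, 8192)
    max_heap = max(half_capped, quarter_capped)
    # Young generation: 100 MB per core, but at most a quarter of the heap.
    young_gen = min(100 * cpu_cores, max_heap // 4)
    return max_heap, young_gen


if __name__ == "__main__":
    print(default_heap_mb(10000, 4))  # (2500, 400)
```

If this is where the values come from, the heap stays at roughly a quarter of RAM unless it is set explicitly.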


--
Best regards,
Dmitry Olshansky