Re: Cassandra as storage for cache data

2013-07-02 Thread Dmitry Olshansky
In our case we have a continuous flow of data to be cached. Every second we receive tens of PUT requests. Each request has a payload of about 500 Kb on average and a TTL of about 20 minutes. On the other side we have a similar flow of GET requests. Every GET request is transformed into a get-by-key query

Re: Cassandra as storage for cache data

2013-07-02 Thread Terje Marthinussen
If this is a tombstone problem as suggested by some, and it is OK to turn off replication as suggested by others, it may be an idea to do an optimization in Cassandra where if replication_factor is 1: do not create tombstones Terje On Jul 2, 2013, at 11:11 PM, Dmitry Olshansky

Re: Cassandra as storage for cache data

2013-07-01 Thread Dmitry Olshansky
Hello, thanks to all for your answers and comments. What we've done: - increased Java heap memory up to 6 Gb - changed replication factor to 1 - set durable_writes to false - set memtable_total_space_in_mb to 5000 - set commitlog_total_space_in_mb to 6000 If I understand correctly the last

Re: Cassandra as storage for cache data

2013-07-01 Thread Robert Coli
The most effective way to deal with obsolete Tombstones in the short lived cache case seems to be to drop them on the floor en masse... :D a) have two column families that the application alternates between, modulo time_period b) truncate and populate the cold one c) read from the hot one d)
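A minimal sketch of steps a) to c) above, showing only how an application might pick the hot and cold column family from the clock. The column family names and the one-hour period are hypothetical (the message does not give them), and the truncate, write and read calls against Cassandra are left out.

```python
import time

# Hypothetical names and period: the message above does not specify them.
COLUMN_FAMILIES = ("cache_a", "cache_b")
PERIOD_SECONDS = 60 * 60


def period_index(now, period=PERIOD_SECONDS):
    """Number of whole periods elapsed since the epoch."""
    return int(now // period)


def hot_cf(now, period=PERIOD_SECONDS):
    """Column family to read from during the current period."""
    return COLUMN_FAMILIES[period_index(now, period) % 2]


def cold_cf(now, period=PERIOD_SECONDS):
    """The other column family: the one to truncate and populate."""
    return COLUMN_FAMILIES[(period_index(now, period) + 1) % 2]


if __name__ == "__main__":
    now = time.time()
    print("read from:", hot_cf(now))
    print("truncate and populate:", cold_cf(now))
```

The two roles swap at every period boundary, so expired data is discarded by truncating a whole column family rather than by compacting tombstones away.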

Re: Cassandra as storage for cache data

2013-06-28 Thread aaron morton
https://issues.apache.org/jira/browse/CASSANDRA-2958 Thanks - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 28/06/2013, at 6:30 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Jun 26, 2013 at 9:51 PM, aaron morton

Re: Cassandra as storage for cache data

2013-06-27 Thread Robert Coli
On Wed, Jun 26, 2013 at 9:51 PM, aaron morton aa...@thelastpickle.com wrote: WARNING: disabling durable_writes means that writes are only in memory and will not be committed to disk until the CFs are flushed. You should *always* use nodetool drain before shutting down a node in this case.

Re: Cassandra as storage for cache data

2013-06-26 Thread aaron morton
I'll also add that you are probably running into some memory issues; 2.5 GB is a low heap size: -Xms2500M -Xmx2500M -Xmn400M If you really do have a cache and want to reduce the disk activity, disable durable_writes on the KS. That will stop the writes from going to the commit log which is

Cassandra as storage for cache data

2013-06-25 Thread Dmitry Olshansky
Hello, we are using Cassandra as data storage for our caching system. Our application generates about 20 put and get requests per second. The average size of one cache item is about 500 Kb. Cache items are placed into one column family with TTL set to 20 - 60 minutes. Keys and values are
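For scale, a rough back-of-the-envelope sketch of how much unexpired data these figures imply. It assumes, as an upper bound, that all 20 requests per second are puts, that "500 Kb" means 500 * 1024 bytes, and that replication and on-disk overhead are ignored; none of this is stated in the message.

```python
def live_cache_bytes(puts_per_second, item_bytes, ttl_seconds):
    """Bytes written during one TTL window, i.e. data that has not yet expired."""
    return puts_per_second * item_bytes * ttl_seconds


ITEM_BYTES = 500 * 1024   # assumption: "500 Kb" read as 500 KiB
PUTS_PER_SECOND = 20      # assumption: upper bound, every request is a put

low = live_cache_bytes(PUTS_PER_SECOND, ITEM_BYTES, 20 * 60)   # TTL of 20 minutes
high = live_cache_bytes(PUTS_PER_SECOND, ITEM_BYTES, 60 * 60)  # TTL of 60 minutes

if __name__ == "__main__":
    print(f"TTL 20 min: {low / 1e9:.1f} GB live")
    print(f"TTL 60 min: {high / 1e9:.1f} GB live")
```

Under these assumptions the live data set is on the order of 12 to 37 GB, and the same volume expires again within each TTL window.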

Re: Cassandra as storage for cache data

2013-06-25 Thread Jeremy Hanna
If you have rapidly expiring data, then tombstones are probably filling your disk and your heap (depending on how you order the data on disk). To check whether your queries are affected by tombstones, you might try using the query tracing that's built into 1.2. See:

Re: Cassandra as storage for cache data

2013-06-25 Thread sankalp kohli
Apart from what Jeremy said, you can try these: 1) Use replication = 1. It is cache data and you don't need persistence. 2) Try playing with memtable size. 3) Use the Netflix client library as it will reduce one hop. It will choose the node with data as the coordinator. 4) Work on your schema. You might