Re: Nodes frozen in GC

2011-03-15 Thread Peter Schuller
Sorry about the delay, I do believe there is a fundamental issue with compactions allocating too much memory and incurring too many garbage collections (at least with 0.6.12). [snip a lot of good info] You certainly seem to have a real issue, though I don't get the feel it's the same as the

Re: Nodes frozen in GC

2011-03-10 Thread Peter Schuller
I think it would be very useful to get to the bottom of this but without further details (like the asked for GC logs) I'm not sure what to do/suggest. It's clear that a single CF with a 64 MB memtable flush threshold and without key cache and row cache and some bulk insertion, should not be

RE: Nodes frozen in GC

2011-03-10 Thread Gregory Szorc
else, just ask, and I'll see what I can do. Gregory Szorc gregory.sz...@xobni.com -Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller Sent: Thursday, March 10, 2011 10:36 AM To: ruslan usifov Cc: user@cassandra.apache.org Subject: Re: Nodes

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Chris Goffinet c...@chrisgoffinet.com How large are your SSTables on disk? My thought was because you have so many on disk, we have to store the bloom filter + every 128 keys from index in memory. 0.5 GB. But as I understand it, storing in memory happens only when a read happens; I do only

Re: Nodes frozen in GC

2011-03-08 Thread David Boxenhorn
If RF=2 and CL=QUORUM, you're getting no benefit from replication. When a node is in GC it stops everything. Set RF=3, so when one node is busy the cluster will still work. On Tue, Mar 8, 2011 at 11:46 AM, ruslan usifov ruslan.usi...@gmail.com wrote: 2011/3/8 Chris Goffinet
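The arithmetic behind the RF=2 versus RF=3 advice can be sketched as follows. This is a minimal illustration that assumes the usual quorum rule of floor(RF / 2) + 1 replicas; the function names are made up for the example and are not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM request: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_down_replicas(replication_factor: int) -> int:
    """Replicas that can be unavailable (e.g. paused in GC) while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, can lose {tolerated_down_replicas(rf)} replica(s)")
```

With RF=2 the quorum is both replicas, so a single node stuck in GC blocks the request; with RF=3 the quorum is two, which leaves room for one paused node.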

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
(1) I cannot stress this one enough: Run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output. Actually, I wonder if it's worth someone getting this enabled by default, with the obvious problems associated with getting the log output placed appropriately and

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
Also: * What is the frequency of the pauses? Are we talking every few seconds, minutes, hours, days? * If you, say, decrease the load down to 25%, are you seeing the same effect but at 1/4th the frequency, or does it remain unchanged, or does the problem go away completely? -- / Peter Schuller

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Peter Schuller peter.schul...@infidyne.com (1) I cannot stress this one enough: Run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output. (2) Attach to your process with jconsole or some similar tool. (3) Observe the behavior of the heap over time.

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
Add:
JVM_OPTS="$JVM_OPTS -XX:+PrintGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps"
And you will see significantly more detail in the GC log. -- /
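Once the GC log is being written, the pauses can be pulled out of it with a small script. The following is a rough sketch, not a tool from this thread: it assumes the log contains the "Total time for which application threads were stopped" lines that -XX:+PrintGCApplicationStoppedTime produces, and the sample lines and the one-second threshold are invented for illustration.

```python
import re

# Matches the line emitted by -XX:+PrintGCApplicationStoppedTime.
STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: (\d+(?:\.\d+)?) seconds"
)


def stopped_pauses(lines):
    """Return every stop-the-world pause length (in seconds) found in the log lines."""
    pauses = []
    for line in lines:
        match = STOPPED_RE.search(line)
        if match:
            pauses.append(float(match.group(1)))
    return pauses


def summarize(pauses, threshold=1.0):
    """Count pauses, total stopped time, longest pause, and pauses at or above the threshold."""
    return {
        "count": len(pauses),
        "total": sum(pauses),
        "max": max(pauses, default=0.0),
        "long": sum(1 for p in pauses if p >= threshold),
    }


# Hypothetical sample lines; a real run would read /var/log/cassandra/gc.log instead.
sample = [
    "Total time for which application threads were stopped: 0.0123450 seconds",
    "12.345: [GC 12.345: [ParNew: 1024K->128K(2048K), 0.0100000 secs]",
    "Total time for which application threads were stopped: 4.5000000 seconds",
]
print(summarize(stopped_pauses(sample)))
```

To use it on a real log, pass an open file object (for example `open("/var/log/cassandra/gc.log")`) to `stopped_pauses` in place of `sample`.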

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
$client->batch_mutate($mutations, cassandra_ConsistencyLevel::QUORUM); Btw, what are the mutations? Are you doing something like inserting both very small values and very large ones? In any case: My main reason to butt back into this thread is that under normal circumstances you

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
Also, why is there so much garbage collection to begin with?  Memcache uses a slab allocator to reuse blocks to prevent allocation/deallocation of blocks from consuming all the cpu time.  Are there any plans to reuse blocks so the garbage collector doesn't have to work so hard? And to address

Re: Nodes frozen in GC

2011-03-08 Thread Paul Pak
Hi Ruslan, Is it possible for you to tell us the details on what you have done which measurably helped your situation, so we can start a best practices doc on growing cassandra systems? So far, I see that under load, cassandra is rarely ready to take heavy load in its default configuration and

Re: Nodes frozen in GC

2011-03-08 Thread ruslan usifov
2011/3/8 Paul Pak p...@yellowseo.com Hi Ruslan, Is it possible for you to tell us the details on what you have done which measurably helped your situation, so we can start a best practices doc on growing cassandra systems? So far, I see that under load, cassandra is rarely ready to take

Re: Nodes frozen in GC

2011-03-07 Thread ruslan usifov
2011/3/6 aaron morton aa...@thelastpickle.com Your node is under memory pressure, after the GC there is still 5.7GB in use. In fact it looks like memory usage went up during the GC process. Can you reduce the memtable size, caches or the number of CF's or increase the JVM size? Also is this

Re: Nodes frozen in GC

2011-03-07 Thread Aaron Morton
It's always possible to run out of memory. Can you provide... - number of CFs and their Memtable settings - any row or key cache settings - any other buffer or memory settings you may have changed in cassandra.yaml. - what load you are putting on the cluster, e.g. Inserting x rows/columns per

Re: Nodes frozen in GC

2011-03-07 Thread Jonathan Ellis
It sounds like you're complaining that the JVM sometimes does stop-the-world GC. You can mitigate this but not (for most workloads) eliminate it with GC option tuning. That's simply the state of the art for Java garbage collection right now. On Sun, Mar 6, 2011 at 2:18 AM, ruslan usifov

Re: Nodes frozen in GC

2011-03-07 Thread ruslan usifov
2011/3/8 Jonathan Ellis jbel...@gmail.com It sounds like you're complaining that the JVM sometimes does stop-the-world GC. You can mitigate this but not (for most workloads) eliminate it with GC option tuning. That's simply the state of the art for Java garbage collection right now. Hm,

Re: Nodes frozen in GC

2011-03-07 Thread Paul Pak
So, are you saying this is normal and expected from Cassandra? So, under load, we can expect java garbage collection to stop the Cassandra process on that server from time to time, essentially taking out the node for short periods of time while it does garbage collection? Also, why is there so

Re: Nodes frozen in GC

2011-03-07 Thread Paul Pak
Hi Ruslan, It looks like Jonathan and Stu have already been working to reduce garbage collection on v.8 The ticket is at https://issues.apache.org/jira/browse/CASSANDRA-2252 Jonathan, is there any way to apply the patch to .73 and have ruslan test it to see if it fixes his issue with Garbage

Re: Nodes frozen in GC

2011-03-07 Thread Chris Goffinet
Can you tell me how many SSTables on disk when you see GC pauses? In your 3 node cluster, what's the RF factor? On Mon, Mar 7, 2011 at 1:50 PM, ruslan usifov ruslan.usi...@gmail.com wrote: 2011/3/8 Jonathan Ellis jbel...@gmail.com It sounds like you're complaining that the JVM sometimes does

Re: Nodes frozen in GC

2011-03-07 Thread ruslan usifov
2011/3/8 Chris Goffinet c...@chrisgoffinet.com Can you tell me how many SSTables on disk when you see GC pauses? In your 3 node cluster, what's the RF factor? About 30-40, and I use RF=2 and insert rows with QUORUM consistency level.

Re: Nodes frozen in GC

2011-03-07 Thread Chris Goffinet
The rows you are inserting, what is your update ratio to those rows? On Mon, Mar 7, 2011 at 4:03 PM, ruslan usifov ruslan.usi...@gmail.com wrote: 2011/3/8 Chris Goffinet c...@chrisgoffinet.com Can you tell me how many SSTables on disk when you see GC pauses? In your 3 node cluster, what's

Re: Nodes frozen in GC

2011-03-07 Thread ruslan usifov
2011/3/8 Chris Goffinet c...@chrisgoffinet.com The rows you are inserting, what is your update ratio to those rows? I don't update them, only insert, at a rate of 16,000 per second.

Re: Nodes frozen in GC

2011-03-07 Thread Chris Goffinet
How large are your SSTables on disk? My thought was because you have so many on disk, we have to store the bloom filter + every 128 keys from index in memory. On Mon, Mar 7, 2011 at 4:35 PM, ruslan usifov ruslan.usi...@gmail.com wrote: 2011/3/8 Chris Goffinet c...@chrisgoffinet.com The rows
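A back-of-the-envelope estimate of the sampled index footprint mentioned above might look like the sketch below. It only models the "every 128 keys" sampling and leaves the bloom filter out; the key count, SSTable count, and bytes-per-entry figures are hypothetical placeholders, not measurements from this cluster or values taken from Cassandra's source.

```python
import math


def sampled_index_entries(keys_in_sstable: int, interval: int = 128) -> int:
    """Number of index entries held in memory when every `interval`-th key is sampled."""
    return math.ceil(keys_in_sstable / interval)


def estimated_sample_bytes(keys_per_sstable: int, sstable_count: int,
                           bytes_per_entry: int, interval: int = 128) -> int:
    """Rough heap cost of the index samples across all SSTables (bloom filters excluded)."""
    return sampled_index_entries(keys_per_sstable, interval) * sstable_count * bytes_per_entry


# Hypothetical inputs: 1,000,000 keys per SSTable, 40 SSTables, 64 bytes per sampled entry.
print(estimated_sample_bytes(1_000_000, 40, 64))
```

The estimate grows linearly with both the number of SSTables and the number of keys in each, which is why the SSTable count and size matter for the question being asked.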

Re: Nodes frozen in GC

2011-03-06 Thread ruslan usifov
2011/3/6 aaron morton aa...@thelastpickle.com Your node is under memory pressure, after the GC there is still 5.7GB in use. In fact it looks like memory usage went up during the GC process. Can you reduce the memtable size, caches or the number of CF's or increase the JVM size? Also is this

Re: Nodes frozen in GC

2011-03-06 Thread Peter Schuller
Do you have row cache enabled? Disable it. If it fixes it and you want it, re-enable but consider row sizes and the cap on the cache size. -- / Peter Schuller