Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-12-03 Thread Paulo Ricardo Motta Gomes
Thanks a lot for the help, Graham and Robert! I will try increasing the heap and see how it goes. Here are my GC settings, if they're still helpful (they're mostly the defaults): -Xms6G -Xmx6G -Xmn400M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
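For readers following along, here is a rough sketch of what those flags imply for generation sizes. It assumes the usual HotSpot meaning of -XX:SurvivorRatio (the ratio of eden to one survivor space, with two survivor spaces in the young generation). The function name generation_layout is made up for illustration, and the JVM may round or align the real sizes slightly differently.

```python
def generation_layout(xmx_mb, xmn_mb, survivor_ratio):
    """Estimate heap generation sizes in MB from -Xmx, -Xmn and -XX:SurvivorRatio.

    Hypothetical helper for illustration only; not part of Cassandra or the JVM.
    """
    # SurvivorRatio is eden : one survivor space. The young generation holds
    # eden plus two survivor spaces, hence the divisor of (ratio + 2).
    survivor_mb = xmn_mb / (survivor_ratio + 2)
    eden_mb = survivor_mb * survivor_ratio
    # Whatever is not young generation is left for the old (tenured) generation.
    old_mb = xmx_mb - xmn_mb
    return {"eden_mb": eden_mb, "survivor_mb": survivor_mb, "old_mb": old_mb}


# Settings quoted above: -Xmx6G -Xmn400M -XX:SurvivorRatio=8
layout = generation_layout(xmx_mb=6 * 1024, xmn_mb=400, survivor_ratio=8)
print(layout)
```

With the settings quoted above this works out to roughly 320 MB of eden, two 40 MB survivor spaces, and about 5744 MB of old generation.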

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-12-02 Thread Robert Coli
On Mon, Dec 1, 2014 at 11:07 PM, Jason Wee peich...@gmail.com wrote: Hi Rob, is there any recommended documentation explaining the configuration of the JVM heap and permanent generation? We are stuck in this same situation too. :( The archives of this list are chock full of explorations of

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-12-02 Thread Jason Wee
Ack, and many thanks for the tips and help. Jason On Wed, Dec 3, 2014 at 4:49 AM, Robert Coli rc...@eventbrite.com wrote: On Mon, Dec 1, 2014 at 11:07 PM, Jason Wee peich...@gmail.com wrote: Hi Rob, is there any recommended documentation explaining the configuration of the JVM heap and

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-12-01 Thread Robert Coli
On Fri, Nov 28, 2014 at 12:55 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote: We restart the whole cluster every 1 or 2 months, to avoid machines getting into this crazy state. We tried tuning GC sizes and parameters, and different Cassandra versions (1.1, 1.2, 2.0), but this

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-12-01 Thread Jason Wee
Hi Rob, is there any recommended documentation explaining the configuration of the JVM heap and permanent generation? We are stuck in this same situation too. :( Jason On Tue, Dec 2, 2014 at 3:42 AM, Robert Coli rc...@eventbrite.com wrote: On Fri, Nov 28, 2014 at 12:55 PM, Paulo Ricardo Motta

Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-11-28 Thread Paulo Ricardo Motta Gomes
Hello, This is a recurring behavior of JVM GC in Cassandra that I have never completely understood: when a node has been up for many days (or even months), or receives a very high load spike (3x-5x normal load), CMS GC pauses become very frequent and slow, causing periodic timeouts in Cassandra.

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-11-28 Thread graham sanderson
Your GC settings would be helpful, though you can guesstimate them by eyeballing (assuming settings are the same across all 4 images). Bursty load can be a big cause of old gen fragmentation (as small working set objects tend to get spilled (promoted) along with memtable slabs which aren’t

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-11-28 Thread graham sanderson
I should note that the young gen size is just a tuning suggestion, not directly related to your problem at hand. You might want to make sure you don’t have issues with key/row cache. Also, I’m assuming that your extra load isn’t hitting tables that you wouldn’t normally be hitting. On Nov