Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
Thanks a lot for the help, Graham and Robert! I will try increasing the heap and see how it goes. Here are my GC settings, in case they are still helpful (they are mostly the defaults):

-Xms6G -Xmx6G -Xmn400M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways

On Wed, Dec 3, 2014 at 2:17 AM, Jason Wee peich...@gmail.com wrote:
> ack and many thanks for the tips and help.. jason
>
> On Wed, Dec 3, 2014 at 4:49 AM, Robert Coli rc...@eventbrite.com wrote:
>> On Mon, Dec 1, 2014 at 11:07 PM, Jason Wee peich...@gmail.com wrote:
>>> Hi Rob, any recommended documentation describing the configuration of the JVM heap and permanent generation? We are stuck in this same situation too. :(
>>
>> The archives of this list are chock full of explorations of various cases. Your best bet is to look for a good Aaron Morton reference where he breaks down the math between the generations. I swear there was a blog post of his on this subject, but the best I can find is this slide deck: http://www.slideshare.net/aaronmorton/cassandra-tk-2014-large-nodes
>>
>> =Rob

--
Paulo Motta
Chaordic | Platform
www.chaordic.com.br
+55 48 3232.3200
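For readers trying the same change: a minimal sketch of where the heap and new gen sizes are usually overridden on Cassandra 1.2/2.0, namely conf/cassandra-env.sh. The 8G/800M figures below are only illustrative guesses to show the mechanism, not values recommended in this thread:

    # conf/cassandra-env.sh -- override the auto-calculated sizes
    MAX_HEAP_SIZE="8G"    # was 6G (-Xms/-Xmx); more headroom before CMSInitiatingOccupancyFraction is reached
    HEAP_NEWSIZE="800M"   # was 400M (-Xmn); a larger new gen so fewer short-lived objects get promoted

cassandra-env.sh turns these two variables into the -Xms/-Xmx/-Xmn flags, so they take the place of the explicit values listed above.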
Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
On Mon, Dec 1, 2014 at 11:07 PM, Jason Wee peich...@gmail.com wrote:
> Hi Rob, any recommended documentation describing the configuration of the JVM heap and permanent generation? We are stuck in this same situation too. :(

The archives of this list are chock full of explorations of various cases. Your best bet is to look for a good Aaron Morton reference where he breaks down the math between the generations. I swear there was a blog post of his on this subject, but the best I can find is this slide deck: http://www.slideshare.net/aaronmorton/cassandra-tk-2014-large-nodes

=Rob
Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
ack and many thanks for the tips and help..

jason

On Wed, Dec 3, 2014 at 4:49 AM, Robert Coli rc...@eventbrite.com wrote:
> On Mon, Dec 1, 2014 at 11:07 PM, Jason Wee peich...@gmail.com wrote:
>> Hi Rob, any recommended documentation describing the configuration of the JVM heap and permanent generation? We are stuck in this same situation too. :(
>
> The archives of this list are chock full of explorations of various cases. Your best bet is to look for a good Aaron Morton reference where he breaks down the math between the generations. I swear there was a blog post of his on this subject, but the best I can find is this slide deck: http://www.slideshare.net/aaronmorton/cassandra-tk-2014-large-nodes
>
> =Rob
Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
On Fri, Nov 28, 2014 at 12:55 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote:
> We restart the whole cluster every 1 or 2 months to avoid machines getting into this crazy state. We have tried tuning GC sizes and parameters and different Cassandra versions (1.1, 1.2, 2.0), but this behavior keeps happening. More recently, during Black Friday, we received about 5x our normal load, and some machines started presenting this behavior. Once again, we restarted the nodes and the GC behaved normally again.
> ...
> You can clearly notice that some memory is actually reclaimed during GC on healthy nodes, while on sick machines very little memory is reclaimed. Also, since GC is executed more frequently on sick machines, it uses about 2x more CPU than non-sick nodes.
>
> Have you ever observed this behavior in your cluster? Could this be related to heap fragmentation? Would using the G1 collector help in this case? Any GC tuning or monitoring advice to troubleshoot this issue?

The specific combination of symptoms does in fact sound like being close to heap exhaustion with your working set, and then fragmentation putting you over the top.

I would probably start by increasing your heap, which will help avoid the pre-fail condition from your working set. But for tuning, examine the contents of each generation when the JVM gets into this state. You are probably exhausting the permanent generation, but depending on what that says, you could change the relative sizing of the generations.

=Rob
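A low-overhead way to do that on a node that has already entered this state is to watch per-generation occupancy with jstat against the running Cassandra JVM (the pid lookup below is just one way to find the process; the column meanings are for Java 7):

    pid=$(pgrep -f CassandraDaemon | head -n1)
    jstat -gcutil "$pid" 1000
    # columns: S0 S1 E O P YGC YGCT FGC FGCT GCT -- on a "sick" node you would expect the old gen (O)
    # and/or perm gen (P) to stay pinned near 100% while FGC keeps climbing every second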
Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
Hi Rob, any recommended documentation describing the configuration of the JVM heap and permanent generation? We are stuck in this same situation too. :(

Jason

On Tue, Dec 2, 2014 at 3:42 AM, Robert Coli rc...@eventbrite.com wrote:
> On Fri, Nov 28, 2014 at 12:55 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote:
>> We restart the whole cluster every 1 or 2 months to avoid machines getting into this crazy state. We have tried tuning GC sizes and parameters and different Cassandra versions (1.1, 1.2, 2.0), but this behavior keeps happening. More recently, during Black Friday, we received about 5x our normal load, and some machines started presenting this behavior. Once again, we restarted the nodes and the GC behaved normally again.
>> ...
>> You can clearly notice that some memory is actually reclaimed during GC on healthy nodes, while on sick machines very little memory is reclaimed. Also, since GC is executed more frequently on sick machines, it uses about 2x more CPU than non-sick nodes.
>>
>> Have you ever observed this behavior in your cluster? Could this be related to heap fragmentation? Would using the G1 collector help in this case? Any GC tuning or monitoring advice to troubleshoot this issue?
>
> The specific combination of symptoms does in fact sound like being close to heap exhaustion with your working set, and then fragmentation putting you over the top.
>
> I would probably start by increasing your heap, which will help avoid the pre-fail condition from your working set. But for tuning, examine the contents of each generation when the JVM gets into this state. You are probably exhausting the permanent generation, but depending on what that says, you could change the relative sizing of the generations.
>
> =Rob
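There isn't one canonical document, but the configured size and current usage of each generation (including the permanent generation on Java 7) can be read straight off the running JVM, which is often more instructive than reading about the defaults. A quick check might look like:

    jmap -heap <cassandra-pid>
    # prints the Heap Configuration (NewSize, MaxNewSize, SurvivorRatio, PermSize, ...) followed by
    # current usage of eden, the survivor spaces, the CMS old generation and the perm gen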
Nodes get stuck in crazy GC loop after some time, leading to timeouts
Hello,

This is a recurrent behavior of the JVM GC in Cassandra that I never completely understood: when a node is UP for many days (or even months), or receives a very high load spike (3x-5x the normal load), CMS GC pauses start becoming very frequent and slow, causing periodic timeouts in Cassandra. Trying to run GC manually doesn't free up memory. The only solution when a node reaches this state is to restart it.

We restart the whole cluster every 1 or 2 months to avoid machines getting into this crazy state. We have tried tuning GC sizes and parameters and different Cassandra versions (1.1, 1.2, 2.0), but this behavior keeps happening. More recently, during Black Friday, we received about 5x our normal load, and some machines started presenting this behavior. Once again, we restarted the nodes and the GC behaved normally again.

I'm attaching a few pictures comparing the heap of healthy and sick nodes: http://imgur.com/a/Tcr3w

You can clearly notice that some memory is actually reclaimed during GC on healthy nodes, while on sick machines very little memory is reclaimed. Also, since GC is executed more frequently on sick machines, it uses about 2x more CPU than non-sick nodes.

Have you ever observed this behavior in your cluster? Could this be related to heap fragmentation? Would using the G1 collector help in this case? Any GC tuning or monitoring advice to troubleshoot this issue?

Any advice or pointers will be kindly appreciated.

Cheers,

--
Paulo Motta
Chaordic | Platform
www.chaordic.com.br
+55 48 3232.3200
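Beyond the heap graphs, the GC log itself usually shows whether the long pauses come from the old gen filling up or from promotion failures (the classic fragmentation symptom). A sketch of the standard HotSpot logging flags for this, added to JVM_OPTS in cassandra-env.sh (the stock file ships a similar commented-out GC-logging block):

    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
    JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"  # how much survives each young GC and gets promoted
    JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"      # logs CMS promotion failures caused by a fragmented old gen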
Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
Your GC settings would be helpful, though I can guesstimate by eyeballing (assuming the settings are the same across all 4 images).

Bursty load can be a big cause of old gen fragmentation (as small working-set objects tend to get spilled (promoted) along with memtable slabs which aren't flushed quickly enough). That said, empty fragmentation holes wouldn't show up as "used" in your graph, and it clearly looks like you are above your CMSInitiatingOccupancyFraction and CMS is running continuously, so they probably aren't the issue here.

Other than trying a slightly larger heap to give you more headroom, I'd also suggest from eyeballing that you have probably let the JVM pick its own new gen size, and I'd suggest it is too small. What to set it to really depends on your workload, but you could try something in the 0.5 GB range unless that makes your young gen pauses too long. In that case (or indeed anyway) make sure you also have the latest GC settings (e.g. -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways) on newer JVMs, to help the young GC pauses.

On Nov 28, 2014, at 2:55 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote:
> Hello,
>
> This is a recurrent behavior of the JVM GC in Cassandra that I never completely understood: when a node is UP for many days (or even months), or receives a very high load spike (3x-5x the normal load), CMS GC pauses start becoming very frequent and slow, causing periodic timeouts in Cassandra. Trying to run GC manually doesn't free up memory. The only solution when a node reaches this state is to restart it.
>
> We restart the whole cluster every 1 or 2 months to avoid machines getting into this crazy state. We have tried tuning GC sizes and parameters and different Cassandra versions (1.1, 1.2, 2.0), but this behavior keeps happening. More recently, during Black Friday, we received about 5x our normal load, and some machines started presenting this behavior. Once again, we restarted the nodes and the GC behaved normally again.
>
> I'm attaching a few pictures comparing the heap of healthy and sick nodes: http://imgur.com/a/Tcr3w
>
> You can clearly notice that some memory is actually reclaimed during GC on healthy nodes, while on sick machines very little memory is reclaimed. Also, since GC is executed more frequently on sick machines, it uses about 2x more CPU than non-sick nodes.
>
> Have you ever observed this behavior in your cluster? Could this be related to heap fragmentation? Would using the G1 collector help in this case? Any GC tuning or monitoring advice to troubleshoot this issue?
>
> Any advice or pointers will be kindly appreciated.
>
> Cheers,
>
> --
> Paulo Motta
> Chaordic | Platform
> www.chaordic.com.br
> +55 48 3232.3200
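As a rough sketch of what this suggestion looks like as concrete options (the 512M value is only the ballpark mentioned above; on Cassandra the new gen size is normally set via HEAP_NEWSIZE in cassandra-env.sh rather than a raw -Xmn):

    # conf/cassandra-env.sh
    HEAP_NEWSIZE="512M"   # explicit new gen instead of letting the JVM pick one
    JVM_OPTS="$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways"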
Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts
I should note that the young gen size is just a tuning suggestion, not directly related to your problem at hand. You might want to make sure you don't have issues with the key/row cache. Also, I'm assuming that your extra load isn't hitting tables that you wouldn't normally be hitting.

On Nov 28, 2014, at 6:54 PM, graham sanderson gra...@vast.com wrote:
> Your GC settings would be helpful, though I can guesstimate by eyeballing (assuming the settings are the same across all 4 images).
>
> Bursty load can be a big cause of old gen fragmentation (as small working-set objects tend to get spilled (promoted) along with memtable slabs which aren't flushed quickly enough). That said, empty fragmentation holes wouldn't show up as "used" in your graph, and it clearly looks like you are above your CMSInitiatingOccupancyFraction and CMS is running continuously, so they probably aren't the issue here.
>
> Other than trying a slightly larger heap to give you more headroom, I'd also suggest from eyeballing that you have probably let the JVM pick its own new gen size, and I'd suggest it is too small. What to set it to really depends on your workload, but you could try something in the 0.5 GB range unless that makes your young gen pauses too long. In that case (or indeed anyway) make sure you also have the latest GC settings (e.g. -XX:+CMSParallelInitialMarkEnabled -XX:+CMSEdenChunksRecordAlways) on newer JVMs, to help the young GC pauses.
>
> On Nov 28, 2014, at 2:55 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote:
>> Hello,
>>
>> This is a recurrent behavior of the JVM GC in Cassandra that I never completely understood: when a node is UP for many days (or even months), or receives a very high load spike (3x-5x the normal load), CMS GC pauses start becoming very frequent and slow, causing periodic timeouts in Cassandra. Trying to run GC manually doesn't free up memory. The only solution when a node reaches this state is to restart it.
>>
>> We restart the whole cluster every 1 or 2 months to avoid machines getting into this crazy state. We have tried tuning GC sizes and parameters and different Cassandra versions (1.1, 1.2, 2.0), but this behavior keeps happening. More recently, during Black Friday, we received about 5x our normal load, and some machines started presenting this behavior. Once again, we restarted the nodes and the GC behaved normally again.
>>
>> I'm attaching a few pictures comparing the heap of healthy and sick nodes: http://imgur.com/a/Tcr3w
>>
>> You can clearly notice that some memory is actually reclaimed during GC on healthy nodes, while on sick machines very little memory is reclaimed. Also, since GC is executed more frequently on sick machines, it uses about 2x more CPU than non-sick nodes.
>>
>> Have you ever observed this behavior in your cluster? Could this be related to heap fragmentation? Would using the G1 collector help in this case? Any GC tuning or monitoring advice to troubleshoot this issue?
>>
>> Any advice or pointers will be kindly appreciated.
>>
>> Cheers,
>>
>> --
>> Paulo Motta
>> Chaordic | Platform
>> www.chaordic.com.br
>> +55 48 3232.3200
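For the cache check, a quick way to see how much heap the key and row caches occupy and whether they are actually being hit (the command exists on the 1.2/2.0 versions discussed here; the exact output wording varies a little by version):

    nodetool info
    # look at the "Key Cache" and "Row Cache" lines: entries, size vs. capacity, and recent hit rate;
    # the capacities themselves are set in cassandra.yaml via key_cache_size_in_mb / row_cache_size_in_mb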