RE: Garbage collection freezes cassandra node
Thanks for this very helpful info. It is indeed a production site which I cannot easily upgrade. I will try the various GC knobs and post any positive results.

-----Original Message-----
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Friday, January 20, 2012 4:23
To: user@cassandra.apache.org
Subject: Re: Garbage collection freezes cassandra node

> On node 172.16.107.46, I see the following:
>
> 21:53:27.192+0100: 1335393.834: [GC 1335393.834: [ParNew (promotion failed): 319468K->324959K(345024K), 0.1304456 secs]1335393.964: [CMS: 6000844K->3298251K(8005248K), 10.8526193 secs] 6310427K->3298251K(8350272K), [CMS Perm : 26355K->26346K(44268K)], 10.9832679 secs] [Times: user=11.15 sys=0.03, real=10.98 secs]
>
> 21:53:38,174 GC for ConcurrentMarkSweep: 10856 ms for 1 collections, 3389079904 used; max is 8550678528
>
> I have not yet tested the -XX:+DisableExplicitGC switch. Is the right thing to do to decrease the CMSInitiatingOccupancyFraction setting?

* Increasing the total heap size can definitely help; the only kink is that if you need to increase the heap size unacceptably much, it is not helpful.

* Decreasing the occupancy trigger can help, yes, but you get very much diminishing returns as your trigger fraction approaches the actual live size of data on the heap.

* I just re-checked your original message - you're on Cassandra 0.7? I *strongly* suggest upgrading to 1.x. That holds true in general, but specifically relevant here are significant improvements in memory allocation behavior that reduce the probability and/or frequency of promotion failures and full GCs.

* Increasing the size of the young generation can help by causing less promotion into old-gen (see the cassandra.in.sh script, or its equivalent for Windows).

* Increasing the number of parallel threads used by CMS can help CMS complete its marking phase quicker, but at the cost of a greater impact on the mutator (Cassandra).
I think the most important thing is: upgrade to 1.x before you run these benchmarks. In particular, detailed tuning of GC issues is pretty useless on 0.7 given the significant changes in 1.0. Don't even bother spending time on this until you're on 1.0, unless this is about a production cluster that you cannot upgrade for some reason.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
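[Editor's note] The knobs listed in the message above translate into concrete HotSpot flags roughly as follows. This is only a sketch for the pre-Java-9 JVMs of that era; the values, and the use of ParallelCMSThreads (the CMS marking-thread flag of the time, later replaced by ConcGCThreads), are illustrative assumptions, not recommendations:

```shell
# Illustrative additions to JVM_OPTS in cassandra.in.sh (or the Windows
# equivalent). All values are placeholders to be tuned per node.
JVM_OPTS=""
# Start the CMS concurrent mark earlier than the default trigger:
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=65"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
# Bigger young generation => less promotion into old-gen:
JVM_OPTS="$JVM_OPTS -Xmn1600M"
# More CMS marking threads => marking finishes sooner, but with a
# greater impact on the mutator (Cassandra) while it runs:
JVM_OPTS="$JVM_OPTS -XX:ParallelCMSThreads=4"
echo "$JVM_OPTS"
```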
Re: Garbage collection freezes cassandra node
> Thanks for this very helpful info. It is indeed a production site which I cannot easily upgrade. I will try the various gc knobs and post any positive results.

*IF* your data size, or at least your hot set, is small enough that you're not extremely reliant on the current size of the page cache, then in terms of short-term relief I recommend:

* Significantly increasing the heap size - like double it, or more.

* Decreasing the occupancy trigger such that it kicks in around the time it already does (in terms of the absolute amount of heap used).

* Increasing the young generation size (to lessen promotion into old-gen).

Experiment on a single node, making sure you're not causing too much disk I/O by stealing memory otherwise used by the page cache. Once you have something that works, you might try slowly going back down.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
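[Editor's note] The first two bullets above interact: after doubling the heap, the occupancy *fraction* must drop for CMS to still start at the same absolute old-gen usage. A back-of-envelope sketch, assuming this thread's 8 GB heap and an assumed 75% starting trigger:

```shell
# Keep the absolute CMS trigger point constant while doubling the heap.
OLD_HEAP_MB=8192
OLD_FRACTION=75                        # assumed current trigger fraction
NEW_HEAP_MB=16384                      # doubled heap
# Absolute usage at which CMS currently kicks in:
TRIGGER_MB=$(( OLD_HEAP_MB * OLD_FRACTION / 100 ))   # 6144 MB
# Fraction of the new heap corresponding to the same absolute usage:
NEW_FRACTION=$(( TRIGGER_MB * 100 / NEW_HEAP_MB ))   # 37
echo "-Xms${NEW_HEAP_MB}M -Xmx${NEW_HEAP_MB}M -XX:CMSInitiatingOccupancyFraction=${NEW_FRACTION}"
```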
RE: Garbage collection freezes cassandra node
Thanks for your comments. The application is indeed suffering from a freezing Cassandra node. Queries are taking longer than 10 seconds at the moment of a full garbage collect. Here is an example from the logs.

I have a three node cluster. At some point I see the following log on one node:

21:53:35,986 InetAddress /172.16.107.46 is now dead.

On node 172.16.107.46, I see the following:

21:53:27.192+0100: 1335393.834: [GC 1335393.834: [ParNew (promotion failed): 319468K->324959K(345024K), 0.1304456 secs]1335393.964: [CMS: 6000844K->3298251K(8005248K), 10.8526193 secs] 6310427K->3298251K(8350272K), [CMS Perm : 26355K->26346K(44268K)], 10.9832679 secs] [Times: user=11.15 sys=0.03, real=10.98 secs]

21:53:38,174 GC for ConcurrentMarkSweep: 10856 ms for 1 collections, 3389079904 used; max is 8550678528

I have not yet tested the -XX:+DisableExplicitGC switch. Is the right thing to do to decrease the CMSInitiatingOccupancyFraction setting?

Thanks!

Rene

-----Original Message-----
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Tuesday, December 20, 2011 6:38
To: user@cassandra.apache.org
Subject: Re: Garbage collection freezes cassandra node

I should add: If you are indeed actually pausing due to promotion failure or concurrent mode failure (which you will see in the GC log if you enable it with the options I suggested), the first things I would try in mitigation are:

* Decrease the occupancy trigger (search for "occupancy") of CMS to a lower percentage, making the concurrent mark phase start earlier.

* Increase the heap size significantly (probably not necessary based on your graph, but for good measure).

If it then goes away, report back and we can perhaps figure out details. There are other things that can be done.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
Re: Garbage collection freezes cassandra node
What's the version of Java you use? Can you try reducing NewSize and increasing the old generation? If you are on an old version of Java, I also recommend upgrading that.

On Thu, Jan 19, 2012 at 3:27 AM, Rene Kochen <rene.koc...@emea.schange.com> wrote:

> Thanks for your comments. The application is indeed suffering from a freezing Cassandra node. Queries are taking longer than 10 seconds at the moment of a full garbage collect. Here is an example from the logs.
>
> I have a three node cluster. At some point I see the following log on one node:
>
> 21:53:35,986 InetAddress /172.16.107.46 is now dead.
>
> On node 172.16.107.46, I see the following:
>
> 21:53:27.192+0100: 1335393.834: [GC 1335393.834: [ParNew (promotion failed): 319468K->324959K(345024K), 0.1304456 secs]1335393.964: [CMS: 6000844K->3298251K(8005248K), 10.8526193 secs] 6310427K->3298251K(8350272K), [CMS Perm : 26355K->26346K(44268K)], 10.9832679 secs] [Times: user=11.15 sys=0.03, real=10.98 secs]
>
> 21:53:38,174 GC for ConcurrentMarkSweep: 10856 ms for 1 collections, 3389079904 used; max is 8550678528
>
> I have not yet tested the -XX:+DisableExplicitGC switch. Is the right thing to do to decrease the CMSInitiatingOccupancyFraction setting?
>
> Thanks!
>
> Rene
>
> -----Original Message-----
> From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
> Sent: Tuesday, December 20, 2011 6:38
> To: user@cassandra.apache.org
> Subject: Re: Garbage collection freezes cassandra node
>
> I should add: If you are indeed actually pausing due to promotion failure or concurrent mode failure (which you will see in the GC log if you enable it with the options I suggested), the first things I would try in mitigation are:
>
> * Decrease the occupancy trigger (search for "occupancy") of CMS to a lower percentage, making the concurrent mark phase start earlier.
>
> * Increase the heap size significantly (probably not necessary based on your graph, but for good measure).
>
> If it then goes away, report back and we can perhaps figure out details. There are other things that can be done.
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
Re: Garbage collection freezes cassandra node
> On node 172.16.107.46, I see the following:
>
> 21:53:27.192+0100: 1335393.834: [GC 1335393.834: [ParNew (promotion failed): 319468K->324959K(345024K), 0.1304456 secs]1335393.964: [CMS: 6000844K->3298251K(8005248K), 10.8526193 secs] 6310427K->3298251K(8350272K), [CMS Perm : 26355K->26346K(44268K)], 10.9832679 secs] [Times: user=11.15 sys=0.03, real=10.98 secs]
>
> 21:53:38,174 GC for ConcurrentMarkSweep: 10856 ms for 1 collections, 3389079904 used; max is 8550678528
>
> I have not yet tested the -XX:+DisableExplicitGC switch. Is the right thing to do to decrease the CMSInitiatingOccupancyFraction setting?

* Increasing the total heap size can definitely help; the only kink is that if you need to increase the heap size unacceptably much, it is not helpful.

* Decreasing the occupancy trigger can help, yes, but you get very much diminishing returns as your trigger fraction approaches the actual live size of data on the heap.

* I just re-checked your original message - you're on Cassandra 0.7? I *strongly* suggest upgrading to 1.x. That holds true in general, but specifically relevant here are significant improvements in memory allocation behavior that reduce the probability and/or frequency of promotion failures and full GCs.

* Increasing the size of the young generation can help by causing less promotion into old-gen (see the cassandra.in.sh script, or its equivalent for Windows).

* Increasing the number of parallel threads used by CMS can help CMS complete its marking phase quicker, but at the cost of a greater impact on the mutator (Cassandra).

I think the most important thing is: upgrade to 1.x before you run these benchmarks. In particular, detailed tuning of GC issues is pretty useless on 0.7 given the significant changes in 1.0. Don't even bother spending time on this until you're on 1.0, unless this is about a production cluster that you cannot upgrade for some reason.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
RE: Garbage collection freezes cassandra node
Thanks for your quick response! I am currently running the performance tests with extended GC logging. I will post the GC logging if clients time out at the same moment that the full garbage collect runs.

Thanks,

Rene

-----Original Message-----
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Tuesday, December 20, 2011 6:36
To: user@cassandra.apache.org
Subject: Re: Garbage collection freezes cassandra node

> During the garbage collections, Cassandra freezes for about ten seconds. I observe the following log entries:
>
> "GC for ConcurrentMarkSweep: 11597 ms for 1 collections, 1887933144 used; max is 8550678528"

Ok, first off: Are you certain that it is actually pausing, or are you assuming that due to the log entry above? Because the log entry in no way indicates a 10-second pause; it only indicates that CMS took 10 seconds - which is entirely expected, and most of CMS is concurrent and implies only short pauses. A full pause can happen, but that log entry is expected and is not in and of itself indicative of a stop-the-world 10-second pause.

It is fully expected with the CMS collector that you'll have a sawtooth pattern as the young gen is being collected, and then a sudden drop as CMS does its job concurrently without pausing the application for a long period of time.

I will second the recommendation to run with -XX:+DisableExplicitGC (or -XX:+ExplicitGCInvokesConcurrent) to eliminate that as a source. I would also run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps and report back the results (i.e., the GC log around the time of the pause).

Your graph is looking very unusual for CMS. It's possible that everything is otherwise as it should be and CMS is simply kicking in too late, but I am kind of skeptical towards that given the extremely smooth look of your graph.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
AW: Garbage collection freezes cassandra node
Tuning garbage collection is really hard, especially if you do not know why garbage collection stalls. In general I must say I have never seen software ship with as good a garbage collection configuration as Cassandra.

The thing that looks suspicious is that the major collections appear regularly, at 1-hour intervals. The only cause of that pattern I know of is RMI's explicit collections (which I am not certain Cassandra uses). You could avoid those with

-XX:+DisableExplicitGC

in the Cassandra start script. But I assume that Cassandra itself makes use of explicit GC as well, so this might have some nasty side effects. Unless someone tells you never to do that on Cassandra, I would give it a try and see what happens. Keep flushes and deletions of sstables in mind, though; they are somehow tied to GC, I think.

Another option (if it is RMI) is to make the RMI full-GC interval larger, so the timeouts occur less often:

-Dsun.rmi.dgc.client.gcInterval=60
-Dsun.rmi.dgc.server.gcInterval=60

The number is the ms until a GC is triggered. Anyway, the same warning as before applies.

Cheers.

From: Rene Kochen [mailto:rene.koc...@emea.schange.com]
Sent: Monday, December 19, 2011 16:35
To: user@cassandra.apache.org
Subject: Garbage collection freezes cassandra node

I recently see the following garbage collect behavior in our performance tests (the attached chart shows the heap size in MB):

[attachment: image001.jpg]

During the garbage collections, Cassandra freezes for about ten seconds. I observe the following log entries:

"GC for ConcurrentMarkSweep: 11597 ms for 1 collections, 1887933144 used; max is 8550678528"

I use Windows Server 2008 and Cassandra 0.7.10 with min and max heap size set to 8 GB.

What can I do to make Cassandra not freeze? Just allocate more memory?

Thanks,

Rene
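[Editor's note] The "every hour" observation above matches the JVM's default RMI explicit-GC interval, which is one hour (3600000 ms). A quick sanity check of that figure, plus the two mitigation options above as they would be appended to JVM_OPTS in the Cassandra start script; the 24-hour interval value is an illustrative choice, not the truncated figure quoted in the mail:

```shell
# Sanity-check: the JVM's default sun.rmi.dgc.*.gcInterval is one hour.
DEFAULT_RMI_INTERVAL_MS=3600000
HOURS=$(( DEFAULT_RMI_INTERVAL_MS / 1000 / 60 / 60 ))
echo "default RMI gc interval: ${HOURS} hour(s)"

JVM_OPTS=""
# Option A: suppress System.gc() entirely (may have the side effects
# discussed above if Cassandra relies on explicit GC -- test first):
JVM_OPTS="$JVM_OPTS -XX:+DisableExplicitGC"
# Option B (instead of A): keep explicit GC but trigger it far less
# often, e.g. every 24 h (86400000 ms is an illustrative value):
# JVM_OPTS="$JVM_OPTS -Dsun.rmi.dgc.client.gcInterval=86400000"
# JVM_OPTS="$JVM_OPTS -Dsun.rmi.dgc.server.gcInterval=86400000"
echo "$JVM_OPTS"
```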
Re: Garbage collection freezes cassandra node
Increasing memory in this case may not solve the problem. Share some information about your workload: cluster configuration, cache sizes, etc. You can also try getting a Java heap histogram for more info on what's on the heap.

On Mon, Dec 19, 2011 at 7:35 AM, Rene Kochen <rene.koc...@emea.schange.com> wrote:

> I recently see the following garbage collect behavior in our performance tests (the attached chart shows the heap size in MB):
>
> During the garbage collections, Cassandra freezes for about ten seconds. I observe the following log entries:
>
> "GC for ConcurrentMarkSweep: 11597 ms for 1 collections, 1887933144 used; max is 8550678528"
>
> I use Windows Server 2008 and Cassandra 0.7.10 with min and max heap size set to 8 GB.
>
> What can I do to make Cassandra not freeze? Just allocate more memory?
>
> Thanks,
>
> Rene
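[Editor's note] One way to get the heap histogram suggested above is jmap against the Cassandra process, as sketched below. The assumption that the main class shows up in jps output as "CassandraDaemon" is just that - an assumption; verify with a plain `jps` listing first. On this thread's Windows setup, the same jmap command works from a console as long as the JDK tools are on PATH:

```shell
# Find the Cassandra JVM and dump the top of its heap histogram.
PID="$(jps 2>/dev/null | awk '/CassandraDaemon/ {print $1}')"
if [ -n "$PID" ]; then
    # Top classes by instance count and retained bytes:
    jmap -histo "$PID" | head -n 25
else
    echo "no CassandraDaemon found; run 'jps' to list JVM pids"
fi
```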
Re: Garbage collection freezes cassandra node
> During the garbage collections, Cassandra freezes for about ten seconds. I observe the following log entries:
>
> "GC for ConcurrentMarkSweep: 11597 ms for 1 collections, 1887933144 used; max is 8550678528"

Ok, first off: Are you certain that it is actually pausing, or are you assuming that due to the log entry above? Because the log entry in no way indicates a 10-second pause; it only indicates that CMS took 10 seconds - which is entirely expected, and most of CMS is concurrent and implies only short pauses. A full pause can happen, but that log entry is expected and is not in and of itself indicative of a stop-the-world 10-second pause.

It is fully expected with the CMS collector that you'll have a sawtooth pattern as the young gen is being collected, and then a sudden drop as CMS does its job concurrently without pausing the application for a long period of time.

I will second the recommendation to run with -XX:+DisableExplicitGC (or -XX:+ExplicitGCInvokesConcurrent) to eliminate that as a source. I would also run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps and report back the results (i.e., the GC log around the time of the pause).

Your graph is looking very unusual for CMS. It's possible that everything is otherwise as it should be and CMS is simply kicking in too late, but I am kind of skeptical towards that given the extremely smooth look of your graph.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
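[Editor's note] The logging flags recommended above, in the form they would be added to the Cassandra start script (cassandra.in.sh, or its Windows equivalent on this thread's setup). This is pre-Java-9 HotSpot syntax, and the -Xloggc path is an illustrative assumption:

```shell
# GC diagnostics: detailed GC log with wall-clock and relative timestamps.
JVM_OPTS=""
JVM_OPTS="$JVM_OPTS -XX:+PrintGC -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps"
# Write to a file instead of stdout (path is an illustrative choice):
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
echo "$JVM_OPTS"
```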
Re: Garbage collection freezes cassandra node
I should add: If you are indeed actually pausing due to promotion failure or concurrent mode failure (which you will see in the GC log if you enable it with the options I suggested), the first things I would try in mitigation are:

* Decrease the occupancy trigger (search for "occupancy") of CMS to a lower percentage, making the concurrent mark phase start earlier.

* Increase the heap size significantly (probably not necessary based on your graph, but for good measure).

If it then goes away, report back and we can perhaps figure out details. There are other things that can be done.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)