On 12/6/2014 3:00 PM, Shawn Heisey wrote:
> On 12/5/2014 2:42 PM, Erick Erickson wrote:
>> Saw this on the Cloudera website:
>>
>> http://blog.cloudera.com/blog/2014/12/tuning-java-garbage-collection-for-hbase/
>>
>> Original post here:
>> https://software.intel.com/en-us/blogs/2014/06/18/part-1-tuning-java-garbage-collection-for-hbase
>>
>> Although it's for hbase, I thought the presentation went into enough
>> detail about what improvements they'd seen that I can see it being
>> useful for Solr folks. And we have some people on this list who are
>> interested in this sort of thing....
>
> Very interesting. My own experiences with G1 and Solr (which I haven't
> repeated since early Java 7 releases, something like 7u10 or 7u13) would
> show even worse spikes compared to the blue lines on those graphs ...
> and my heap isn't anywhere even CLOSE to 100GB. Solr probably has
> different garbage creation characteristics than hbase.

Followup with graphs. I've cc'd Rory at Oracle too, with hopes that
this info will ultimately reach those who work on G1. I can provide the
actual GC logs as well.

Here's a graph of a GC log lasting over two weeks with a tuned CMS
collector and Oracle Java 7u25 and a 6GB heap.

https://www.dropbox.com/s/mygjeviyybqqnqd/cms-7u25.png?dl=0

CMS was tuned using these settings:

http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

This graph shows that virtually all collection pauses were a little
under half a second. There were exactly three full garbage collections,
and each one took around six seconds. While that is a significant
pause, having only three such collections over a period of 16 days
sounds pretty good to me.

Here's about half as much runtime (8 days) on the same server running
G1 with Oracle 7u72 and the same 6GB heap. G1 is untuned, because I do
not know how:

https://www.dropbox.com/s/2kgx60gj988rflj/g1-7u72.png?dl=0

Most of these collections were around a tenth of a second ... which is
certainly better than nearly half a second ... but there are a LOT of
collections that take longer than a second, and a fair number of them
that took between 3 and 5 seconds.

It's difficult to say which of these graphs is actually better. The
CMS graph is certainly more consistent, and does a LOT fewer full GCs
... but is the 4 to 1 improvement in a typical GC enough to reveal
significantly better performance? My instinct says that it would NOT
be enough for that, especially with so many collections taking 1-3
seconds.

If the server was really busy (mine isn't), I wonder whether the GC
graph would look similar, or whether it would be really different. A
busy server would need to collect a lot more garbage, so I fear that
the yellow and black parts of the G1 graph would dominate more than
they do in my graph, which would be overall a bad thing. Only real
testing on busy servers can tell us that.
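
For anyone who wants to compare two collectors numerically rather than
by eye, here is a minimal sketch of the kind of summary that could be
computed once pause durations have been pulled out of a GC log. The
function name, the 1-second threshold, and the sample lists are all
hypothetical and chosen only for illustration; the numbers are made up
and are NOT taken from the logs behind the graphs above. Extracting the
pauses from a real log is left out, since the log format depends on the
JVM flags in use.

```python
from statistics import median


def summarize_pauses(pauses_sec, slow_threshold=1.0):
    """Summarize GC pause durations given in seconds.

    slow_threshold is an arbitrary cut-off (an assumption, not a
    standard) above which a pause is counted as "slow".
    """
    if not pauses_sec:
        raise ValueError("no pauses to summarize")
    slow = [p for p in pauses_sec if p > slow_threshold]
    return {
        "count": len(pauses_sec),
        "median": median(pauses_sec),
        "max": max(pauses_sec),
        "slow_count": len(slow),
        "total_pause": sum(pauses_sec),
    }


# Illustrative, made-up inputs (not real measurements).
consistent = [0.45, 0.48, 0.44, 0.47, 6.0]
spiky = [0.10, 0.10, 0.12, 0.09, 1.5, 3.2, 0.11, 4.0]

for name, pauses in (("consistent", consistent), ("spiky", spiky)):
    print(name, summarize_pauses(pauses))
```

Looking at the median alongside the count of slow pauses and the total
pause time is one way to weigh a faster typical collection against more
frequent long ones.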

I can tell you for sure that the G1 graph looks a lot better than it
did in early Java 7 releases, but additional work by Oracle (and
perhaps some G1 tuning options) might significantly improve it.

Thanks,
Shawn
