It's possible the time attributed to GC is actually spent somewhere else; a multitude of tasks may occur during the same safepoint as a GC. We've seen some batch revoke of biased locks take a long time, for instance; *if* this is happening in your case, and we can track down which objects, I would consider it a bug and we may be able to fix it.
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 On 29 January 2014 16:23, Joel Samuelsson <samuelsson.j...@gmail.com> wrote: > Hi, > > We've been trying to figure out why we have so long and frequent > stop-the-world GC even though we have basically no load. > > Today we got a log of a weird GC that I wonder if you have any theories of > why it might have happened. > > A plot of our heap at the time, paired with the GC time from the Cassandra > log: > http://imgur.com/vw5rOzj > -The blue line is the ratio of Eden space used (i.e. 1.0 = full) > -The red line is the ratio of Survivor0 space used > -The green line is the ratio of Survivor1 space used > -The teal line is the ratio of Old Gen space used > -The pink line shows during which period of time a GC happened (from the > Cassandra log) > > Eden space is filling up and being cleared as expected in the first and > last hill but on the middle one, it takes two seconds to clear Eden (note > that Eden has ratio 1 for 2 seconds). Neither the survivor spaces nor old > generation increase significantly afterwards. > > Any ideas of why this might be happening? > We have swap disabled, JNA enabled, no CPU spikes at the time, no disk I/O > spikes at the time. What else could be causing this? > > /Joel Samuelsson >