Jenny,

There's a log attached with those options.
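For reference, the launch line for this kind of run looks roughly like the sketch below; the heap size, pause target and logging flags are placeholders rather than the exact values from the attached log:

  java -Xms<heap> -Xmx<heap> \
       -XX:+UseG1GC -XX:MaxGCPauseMillis=<pause-target> \
       -XX:ParallelGCThreads=2 \
       -XX:+UnlockExperimentalVMOptions \
       -XX:+G1TraceEagerReclaimHumongousObjects \
       -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
       <application main class and arguments>

(-XX:G1HeapRegionSize=<size> can be added later, once the humongous-object output suggests a value, per your note below.)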
Termination times are lower with 2 workers, but eden is much smaller
(~2 GB instead of ~7 GB) and evac pauses are more frequent. So it's a
nice improvement, but not ideal. It also looks like work distribution
in the 2-worker case degenerates over time (termination attempts keep
increasing), which mirrors what I see in the runs with higher worker
counts.

Thanks for your help so far!

William

On Fri, Sep 2, 2016 at 1:34 AM, yu.zh...@oracle.com <yu.zh...@oracle.com> wrote:
> William,
>
> I think it might be related to humongous objects. Can you run some
> experiments with
>
> -XX:+UnlockExperimentalVMOptions -XX:+G1TraceEagerReclaimHumongousObjects
>
> Can you also remove -XX:G1LogLevel=finest?
>
> Based on the output, you can then adjust the heap region size.
> But I think termination time is already lower with 2 worker threads.
>
> Thanks
> Jenny
>
> On 09/01/2016 08:12 AM, William Good wrote:
>>
>> Yep, the marking cycle seems to help. I just don't know why. Objects
>> in old regions should die very infrequently, as everything produced
>> either survives indefinitely or is a byproduct of loading or
>> evaluation (a very fast operation, especially compared to the
>> frequency of evac pauses times the number of survivor regions). So a
>> marking cycle shouldn't reveal much to collect in old regions, and
>> my understanding is that all the survivor regions are marked and
>> evacuated on each evac pause anyway.
>>
>> I first tried with 12 workers (which happens to be the number of
>> physical cores on my machine) and got the same pathological
>> behavior. Then I tried with 2 and still see large termination time
>> increases. Log file attached.
>>
>> William
>>
>> On Wed, Aug 31, 2016 at 8:18 AM, yu.zh...@oracle.com
>> <yu.zh...@oracle.com> wrote:
>>>
>>> It seems that after marking (cleanup), the termination time drops.
>>> Maybe that is why you need a very low IHOP, so that you get more
>>> marking cycles.
>>>
>>> The work distribution seems fine, but system time is high. Maybe
>>> some lock contention.
>>>
>>> I would agree with trying fewer GC threads: -XX:ParallelGCThreads=<n>
>>>
>>> Jenny
>>>
>>>
>>> On 08/30/2016 04:08 PM, Vitaly Davidovich wrote:
>>>
>>> William,
>>>
>>> Have you tried running with a lower number (than the current 18) of
>>> parallel workers?
>>>
>>> On Tuesday, August 30, 2016, William Good <bkg...@gmail.com> wrote:
>>>>
>>>> I've been experiencing an issue in a production application using
>>>> G1 for quite some time, across a handful of 1.8.0 builds. The
>>>> application is relatively simple: it spends about 60s reading some
>>>> parameters from files on disk, and then starts serving web
>>>> requests which merge some input with those parameters, perform
>>>> some computation and return a result. We're aiming to keep the max
>>>> total request time (as seen by remote hosts) below 100 ms, but
>>>> from previous experience with ParNew and CMS (and with G1 on
>>>> previous projects, for that matter) I didn't anticipate this being
>>>> a problem.
>>>>
>>>> The symptom is an ever-increasing time spent in evacuation pauses,
>>>> with high parallel worker termination times sticking out. With the
>>>> recommended set of G1 settings (max heap size and pause time
>>>> target), pause times increase sharply until I start seeing 500ms+
>>>> pauses and have to kill the JVM.
>>>>
>>>> I found some time ago that first forcing a bunch of full GCs with
>>>> System.gc() at the phase (load -> serve) change and then forcing
>>>> frequent concurrent cycles with
>>>> -XX:InitiatingHeapOccupancyPercent=1 seems to mitigate the
>>>> problem. I'd prefer to do neither, as the former makes
>>>> redeployments very slow and the latter adds a couple of
>>>> neighboring 40ms remark and cleanup pauses that aren't good for
>>>> the request time target.
>>>>
>>>> I'm attaching a log file that covers a short run, with the phase
>>>> change at about 60s from start. After a few evacuation pauses, one
>>>> lasts 160ms, with 100-120ms of that spent in the parallel workers'
>>>> 'termination' phase. After this, a concurrent cycle runs and
>>>> everything goes back to normal. The java parameters are at the top
>>>> of the file.
>>>>
>>>> Generally this happens over a much longer period of time
>>>> (especially if I haven't set the low
>>>> -XX:InitiatingHeapOccupancyPercent value) and across many
>>>> different builds of 1.8.0; this one was b101. It's running alone
>>>> on a fairly hefty dual-socket Xeon box with 128GB of RAM on
>>>> CentOS 7.
>>>>
>>>> I'd be more than happy to hear any ideas on what's going on here
>>>> and how it could be fixed.
>>>>
>>>> Best,
>>>> William
>>>
>>>
>>>
>>> --
>>> Sent from my phone
>>>
>>>
>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use@openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>
>>>
>
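P.S. Since it comes up again further down the thread: the phase-change
workaround is nothing more elaborate than the sketch below (the class
and method names are made up for the example), with the alternative
being to leave -XX:InitiatingHeapOccupancyPercent very low so that
concurrent cycles run frequently.

  // Illustrative only; names are invented for the example.
  final class GcPhaseChange {
      /** Called once after the ~60s load phase, before request serving starts. */
      static void forceCleanupAfterLoad() {
          // A handful of explicit full GCs to clear out load-phase garbage.
          // Note: System.gc() is a no-op with -XX:+DisableExplicitGC, and with
          // -XX:+ExplicitGCInvokesConcurrent it triggers a concurrent cycle
          // instead of a full collection.
          for (int i = 0; i < 3; i++) {
              System.gc();
          }
      }
  }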
2workers_tracehuge.log.gz
Description: GNU Zip compressed data
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use