Hi golang-nuts,

I've got a service that is seeing some confusing GC behavior. The behavior 
manifests as long request latencies when the GC runs. At first I thought 
this would just be stop-the-world behavior from the GC, but as I dig into 
it more, it's clear that the STW phase is actually quite short. It's 
actually that the concurrent phase is completely using all cores at 100% 
utilization while it runs. Because there's no CPU left to run the service 
itself, the service is unavailable during this window, which seems to be a 
few hundred ms. I'm not especially excited about the service being 
unavailable for this length of time, to say the least. It's worth noting 
that > 100ms is the worst case, maybe one out of every 10 GC runs, but all 
of them spike the CPU pretty hard for at least 10ms.

I've tried adjusting GOGC to a smaller number, but this just makes the cpu 
spikes come more frequently. I've been poking around pprof and it looks 
like the service has roughly 20M long-lived objects and ~80M garbage 
objects on each GC run. If it matters, this program has a pretty large LRU 
cache implemented with a map and a container/list. It's also got quite a 
few mutexes, though I'm not sure these meaningfully interact with Go's GC 
in any way.

1) Is this common behavior for Go's GC?

2) Is there possibly something about how the service allocates objects 
that's causing undue work, say perhaps an unexpected behavior during the 
mark phase?

3) Is it possible to convince Go to spread its concurrent phase out over 
more time? From the gc trace, it appears to be doing about 2s of work (wall 
clock), but it runs every 120s, so it feels like this work could be spread 
out in a way that doesn't interfere with the service.

I'm compiled with Go 1.10.3, 4 cores. Here's a sample trace from gctrace

GC forced
pacer: assist ratio=+8.611156e-002 (scan 960 MB in 15690->21268 MB) 
workers=1++0.000000e+000
pacer: H_m_prev=11151043456 h_t=+9.500000e-001 H_T=21744534739 
h_a=+4.765627e-001 H_a=16465215016 h_g=+1.000000e+000 H_g=22302086912 
u_a=+9.186874e-001 u_g=+3.000000e-001 W_a=347113784 goalΔ=+5.000000e-002 
actualΔ=-4.734373e-001 u_a/u_g=+3.062291e+000
gc 36 @1197.776s 1%: 0.59+2210+0.29 ms clock, 2.3+5915/2192/0+1.1 ms cpu, 
15690->15702->10671 MB, 21268 MB goal, 4 P
scvg7: inuse: 15834, idle: 107, sys: 15942, released: 0, consumed: 15942 
(MB)
pacer: sweep done at heap size 10851MB; allocated 179MB during sweep; swept 
2026864 pages at +1.906868e-004 pages/byte

Thank you, and thanks for all the great work on Go!
-Brian

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to