A couple of ideas for getting a profile under duress:

a) Rent a bigger box with plenty of CPU and RAM for 15 minutes and capture your profile there.

b) Stop the pprof CPU profile programmatically after a short amount of time; e.g.:
    f, err := os.Create("my_cpu_profile")
    if err != nil {
        panic(err)
    }
    if err := pprof.StartCPUProfile(f); err != nil {
        panic(err)
    }
    go func() {
        time.Sleep(5 * time.Second)
        pprof.StopCPUProfile()
        f.Close()
    }()

On Tuesday, March 21, 2023 at 3:26:59 PM UTC-5 Marco P. wrote:
> I have a program, normal memory usage is <50MB and CPU ~5%. This doesn't
> change over time.
>
> Rebuilding with `-race` shows memory <100MB and CPU ~25%.
> (Consistent with overhead described here:
> https://go.dev/doc/articles/race_detector#Runtime_Overheads)
>
> However, with `-race` enabled, after a couple of minutes the CPU suddenly
> jumps to 100%, and memory skyrockets to multiple GBs within seconds.
>
> e.g.:
>
> T0: CPU:25%, MEM:95MB
> T1: CPU:25%, MEM:95MB
> (...)
> T100: CPU:25%, MEM:95MB
> T101: CPU:99%, MEM:500MB
> T102: CPU:99%, MEM:2GB
> T103: CPU:99%, MEM:4GB
> T104: CPU:99%, MEM:6GB
> T105: CPU:99%, MEM:8GB
> => OOM
>
> The CPU jump is drastic and instantaneous, and the memory seems to grow as
> fast as it can be allocated.
>
> The race detector docs say:
>
> > The race detector currently allocates an extra 8 bytes per defer and
> > recover statement. Those extra allocations are not recovered until the
> > goroutine exits. This means that if you have a long-running goroutine that
> > is periodically issuing defer and recover calls, the program memory usage
> > may grow without bound. These memory allocations will not show up in the
> > output of runtime.ReadMemStats or runtime/pprof.
>
> Here's my question.
>
> I would like to:
> 1. Confirm the extra memory is due to the race detector overhead related
> to defer/recover (as opposed to some other bug in the program that only
> surfaces when building with `-race`).
> 2. Find the goroutine(s) responsible for that defer/recover.
>
> Any idea on how to investigate?
>
> I have tried capturing with pprof.
> Even if the race detector allocations are
> not visible ("These memory allocations will not show up in the output of
> runtime.ReadMemStats or runtime/pprof."), I could at least confirm it's not
> the program code allocating.
>
> However, pprof does not work for a different reason: once the program is in
> "100% CPU" mode, pprof times out.
> So I can't ever capture a trace/heap/profile while the system is showing
> the behavior (because CPU and memory are already too pegged to handle a
> pprof dump).
>
> Anything else I could try to get to the bottom of this?
>
> For example, is there a way to trace all defer/recover calls?
> Or is there a way to attach a debugger and pause when memory usage exceeds
> a certain amount?
>
> I searched for these and more, but couldn't find much. Maybe some wizard
> on this list has some ideas or pointers.
>
> [go1.19.7.linux-amd64]
>
> Thank you,
> M.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/3e680fee-d047-45ff-bedf-d473a24f866bn%40googlegroups.com.