I have a program whose normal memory usage is <50MB and CPU usage is ~5%. This doesn't change over time.

Rebuilding with `-race` shows memory <100MB and CPU ~25%, which is consistent with the overhead described here: https://go.dev/doc/articles/race_detector#Runtime_Overheads

However, with `-race` enabled, after a couple of minutes the CPU suddenly jumps to 100% and memory skyrockets to multiple GBs within seconds, e.g.:

T0: CPU:25%, MEM:95MB
T1: CPU:25%, MEM:95MB
(...)
T100: CPU:25%, MEM:95MB
T101: CPU:99%, MEM:500MB
T102: CPU:99%, MEM:2GB
T103: CPU:99%, MEM:4GB
T104: CPU:99%, MEM:6GB
T105: CPU:99%, MEM:8GB => OOM

The CPU jump is drastic and instantaneous, and the memory seems to grow as fast as it can be allocated.

The race detector docs say:

> The race detector currently allocates an extra 8 bytes per defer and recover
> statement. Those extra allocations are not recovered until the goroutine
> exits. This means that if you have a long-running goroutine that is
> periodically issuing defer and recover calls, the program memory usage may
> grow without bound. These memory allocations will not show up in the output
> of runtime.ReadMemStats or runtime/pprof.

Here's my question. I would like to:

1. Confirm the extra memory is due to the race detector overhead related to defer/recover (as opposed to some other bug in the program that only surfaces when building with `-race`)
2. Find the goroutine(s) responsible for that defer/recover

Any idea on how to investigate?

I have tried capturing with pprof. Even if the race detector allocations are not visible ("These memory allocations will not show up in the output of runtime.ReadMemStats or runtime/pprof."), I could at least confirm it's not the program code allocating. However, pprof does not work for a different reason: once the program is in "100% CPU" mode, pprof times out. So I can never capture a trace/heap/profile while the system is showing the behavior (because CPU and memory are already too pegged to handle a pprof dump).

Anything else I could try to get to the bottom of this?
For example, is there a way to trace all defer/recover calls? Or is there a way to attach a debugger and pause when memory usage exceeds a certain amount?

I searched for these and more, but couldn't find much. Maybe some wizard on this list has some ideas or pointers.

[go1.19.7.linux-amd64]

Thank you,
M.