https://github.com/open-telemetry/opentelemetry-go-contrib/issues/6625
On Wednesday, January 15, 2025 at 11:02:37 PM UTC-8 John wrote:
> Thanks Kurtis for the advice. I was heading in that direction.
>
> This is definitely an OTEL problem. The minimal version required to
> create the issue:
>
> metrics.go
> ```go
> package metrics
>
> import (
> 	_ "go.opentelemetry.io/contrib/instrumentation/host"
> )
> ```
>
> metrics_test.go
> ```go
> package metrics
> ```
>
> `go test -race`
>
> That will immediately cause the issue. You don't even require tests; it
> fails before it even gets there.
>
> I'll make my way over to the OTEL bugs tomorrow.
>
> For those that are interested in some random debugger output, here is a
> little from lldb and delve (which lets me see they are calling C from
> purego):
>
> Process 58447 launched: '/Users/jdoak/base/concurrency/sync/sync.test'
> (arm64)
> warning: (arm64)
> /Users/jdoak/base/concurrency/sync/sync.test(0x0000000100000000) address
> 0x0000000100000000 maps to more than one section: sync.test.__TEXT and
> sync.test.__TEXT
> warning: (arm64)
> /Users/jdoak/base/concurrency/sync/sync.test(0x0000000100000000) address
> 0x0000000101bbc000 maps to more than one section: sync.test.__DATA_CONST
> and sync.test.__DATA_CONST
> warning: (arm64)
> /Users/jdoak/base/concurrency/sync/sync.test(0x0000000100000000) address
> 0x0000000102b18000 maps to more than one section: sync.test.__DATA and
> sync.test.__DATA
> Process 58447 stopped
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS
> (code=1, address=0x10)
> frame #0: 0x000000010000423c sync.test`__tsan_func_enter + 16
> sync.test`__tsan_func_enter:
> -> 0x10000423c <+16>: ldr x8, [x0, #0x10]
> 0x100004240 <+20>: add w9, w8, #0x8
> 0x100004244 <+24>: tst x9, #0xff0
> 0x100004248 <+28>: b.eq 0x1000042a0 ; <+116>
> Target 0: (sync.test) stopped.
> (lldb) thread backtrace all
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS
> (code=1, address=0x10)
> * frame #0: 0x000000010000423c sync.test`__tsan_func_enter + 16
> frame #1: 0x0000000101706e34 sync.test`
> github.com/ebitengine/purego/internal/fakecgo.x_cgo_notify_runtime_init_done
> + 20
> frame #2: 0x00000001017073f0
> sync.test`x_cgo_notify_runtime_init_done_trampoline + 16
> thread #2
> frame #0: 0x00000001945e64e8 libsystem_kernel.dylib`__semwait_signal +
> 8
> frame #1: 0x00000001944c56f0 libsystem_c.dylib`nanosleep + 220
> frame #2: 0x00000001944c5608 libsystem_c.dylib`usleep + 68
> frame #3: 0x00000001000c6304 sync.test`runtime.usleep_trampoline.abi0
> + 20
> thread #3
> frame #0: 0x00000001945e66ec libsystem_kernel.dylib`__psynch_cvwait + 8
> frame #1: 0x0000000194624894
> libsystem_pthread.dylib`_pthread_cond_wait + 1204
> frame #2: 0x00000001000c6688
> sync.test`runtime.pthread_cond_wait_trampoline.abi0 + 24
> frame #3: 0x00000001000c4838 sync.test`runtime.asmcgocall.abi0 + 200
> thread #4
> frame #0: 0x00000001945e66ec libsystem_kernel.dylib`__psynch_cvwait + 8
> frame #1: 0x0000000194624894
> libsystem_pthread.dylib`_pthread_cond_wait + 1204
> frame #2: 0x00000001000c6688
> sync.test`runtime.pthread_cond_wait_trampoline.abi0 + 24
> frame #3: 0x00000001000c4838 sync.test`runtime.asmcgocall.abi0 + 200
> thread #5
> frame #0: 0x00000001945e66ec libsystem_kernel.dylib`__psynch_cvwait + 8
> frame #1: 0x0000000194624894
> libsystem_pthread.dylib`_pthread_cond_wait + 1204
> frame #2: 0x00000001000c6688
> sync.test`runtime.pthread_cond_wait_trampoline.abi0 + 24
> frame #3: 0x00000001000c4838 sync.test`runtime.asmcgocall.abi0 + 200
> thread #6
> frame #0: 0x00000001945e66ec libsystem_kernel.dylib`__psynch_cvwait + 8
> frame #1: 0x0000000194624894
> libsystem_pthread.dylib`_pthread_cond_wait + 1204
> frame #2: 0x00000001000c6688
> sync.test`runtime.pthread_cond_wait_trampoline.abi0 + 24
> frame #3: 0x00000001000c4838
sync.test`runtime.asmcgocall.abi0 + 200
>
>
>
> (dlv) continue
> > [runtime-fatal-throw] runtime.fatalsignal()
> /usr/local/go/src/runtime/signal_unix.go:831 (hits goroutine(1):1 total:1)
> (PC: 0x104f027bc)
> Warning: debugging optimized function
> 826: printDebugLog()
> 827:
> 828: exit(2)
> 829: }
> 830:
> => 831: func fatalsignal(sig uint32, c *sigctxt, gp *g, mp *m) *g {
> 832: if sig < uint32(len(sigtable)) {
> 833: print(sigtable[sig].name, "\n")
> 834: } else {
> 835: print("Signal ", sig, "\n")
> 836: }
> (dlv) stack
> 0 0x0000000104f027bc in runtime.fatalsignal
> at /usr/local/go/src/runtime/signal_unix.go:831
> 1 0x0000000104f02390 in runtime.sighandler
> at /usr/local/go/src/runtime/signal_unix.go:754
> 2 0x0000000104f01cac in runtime.sigtrampgo
> at /usr/local/go/src/runtime/signal_unix.go:490
> 3 0x0000000104e6c23c in ???
> at ?:-1
> 4 0x0000000106569974 in
> github.com/ebitengine/purego/internal/fakecgo.x_cgo_notify_runtime_init_done
> at /Users/jdoak/go/pkg/mod/
> github.com/ebitengine/pur...@v0.8.1/internal/fakecgo/go_libinit.go:22
> <http://github.com/ebitengine/purego@v0.8.1/internal/fakecgo/go_libinit.go:22>
> 5 0x000000016af95d88 in ???
> at ?:-1
> 6 0x0000000104f2cadc in runtime.asmcgocall
> at /usr/local/go/src/runtime/asm_arm64.s:1000
> 7 0x0000000104f2daa8 in racecall
> at /usr/local/go/src/runtime/race_arm64.s:476
> 8 0x0000000000000000 in ???
> at :0
> error: NULL address
> (truncated)
>
> On Wednesday, January 15, 2025 at 9:41:47 PM UTC-8 Kurtis Rader wrote:
>
>> On Wed, Jan 15, 2025 at 8:31 PM John <johns...@gmail.com> wrote:
>>
>>> Hey Kurtis,
>>>
>>> Thanks for responding.
>>>
>>> Unfortunately, this does look like some type of OTEL problem. I was
>>> able to make a copy and strip out all the OTEL code. As soon as I did
>>> this, this stopped happening. Which means it is some type of OTEL issue
>>> that I should probably track down with the OTEL people.
>>>
>>> As a note for someone who stumbles on this with a similar problem, the
>>> OTEL packages included:
>>>
>>> "go.opentelemetry.io/otel/attribute"
>>> "go.opentelemetry.io/otel/trace"
>>> "go.opentelemetry.io/otel/metric"
>>>
>>> These packages are at v1.33.0
>>>
>>
>> Note that simply removing the references to the above-mentioned OTEL
>> package does not guarantee the problem is with that package. The failure
>> could still be due to how you are using the package. Having said that, any
>> public package should validate its inputs and provide a more meaningful
>> failure than a SIGSEGV fault. So even if the proximate cause of the failure
>> is a mistake in your code, there is clearly room for improvement in the
>> package you are using.
>>
>> As a retired software support engineer who has spent thousands of hours
>> debugging these types of problems, I can't stress enough how important it
>> is to create a minimal reproducible example as the quickest way to get to
>> the root cause of the problem. A minimal reproducible example will allow
>> others, such as the OTEL package maintainers, to employ tools, such as gdb
>> or lldb, which you may not be comfortable using.
>>
>> --
>> Kurtis Rader
>> Caretaker of the exceptional canines Junior and Hank
>>
>
--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/d899fb29-2c7c-4983-9947-7e7fbfa65cb6n%40googlegroups.com.