chenBright opened a new pull request, #3297:
URL: https://github.com/apache/brpc/pull/3297

   libunwind ships its `src/unwind/*.c` (the GCC `_Unwind_*` ABI compatibility 
layer) as exported symbols of `libexternal_S~libunwind.so`. At runtime the 
dynamic loader resolves `_Unwind_*` lookups (from `pthread_exit`, libstdc++'s 
`__gxx_personality_v0`, etc.) to libunwind's DWARF-based implementation instead 
of `libgcc_s.so.1`. That implementation hits an uninitialized internal context 
and crashes on the no-return cleanup chain triggered by `pthread_exit` / C++ 
exception unwinding. For example, `BthreadTest.bthread_exit` segfaults 
deterministically when `--define=with_bthread_tracer=true` is on.
   
   This is purely an ELF runtime symbol-resolution-order issue and reproduces 
identically on GCC and Clang, since both default to `libstdc++ + libgcc_s` on 
Linux.
   
   ### What problem does this PR solve?
   
   Issue Number: resolve 
   
   Problem Summary:
   
   When `--define=with_bthread_tracer=true` is enabled under Bazel
   (`bazel test //test:bthread_unittest`), tests such as
   `BthreadTest.bthread_exit` segfault inside `pthread_exit` /
   `__gxx_personality_v0`, with a stack like:
   
   ```
   #0  0x0000000000000000 in ?? ()
   #1  0x00007fa2b5d6458a in _ULx86_64_dwarf_find_proc_info ()
      from /root/.cache/bazel/_bazel_root/743b333b2429a1dbd390ef66b59c771d/execroot/_main/bazel-out/k8-fastbuild/bin/test/../_solib_k8/libexternal_Slibunwind~_Slibunwind.so
   #2  0x00007fa2b5d6668d in fetch_proc_info ()
      from /root/.cache/bazel/_bazel_root/743b333b2429a1dbd390ef66b59c771d/execroot/_main/bazel-out/k8-fastbuild/bin/test/../_solib_k8/libexternal_Slibunwind~_Slibunwind.so
   #3  0x00007fa2b5d681a1 in _ULx86_64_dwarf_make_proc_info ()
      from /root/.cache/bazel/_bazel_root/743b333b2429a1dbd390ef66b59c771d/execroot/_main/bazel-out/k8-fastbuild/bin/test/../_solib_k8/libexternal_Slibunwind~_Slibunwind.so
   #4  0x00007fa2b5d70cfd in _ULx86_64_get_proc_info ()
      from /root/.cache/bazel/_bazel_root/743b333b2429a1dbd390ef66b59c771d/execroot/_main/bazel-out/k8-fastbuild/bin/test/../_solib_k8/libexternal_Slibunwind~_Slibunwind.so
   #5  0x00007fa2b5d6c775 in __libunwind_Unwind_GetLanguageSpecificData ()
      from /root/.cache/bazel/_bazel_root/743b333b2429a1dbd390ef66b59c771d/execroot/_main/bazel-out/k8-fastbuild/bin/test/../_solib_k8/libexternal_Slibunwind~_Slibunwind.so
   #6  0x00007fa2b503c6df in __gxx_personality_v0 () from /lib/x86_64-linux-gnu/libstdc++.so.6
   #7  0x00007fa2b5452ce5 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
   #8  0x00007fa2b54533c0 in _Unwind_ForcedUnwind () from /lib/x86_64-linux-gnu/libgcc_s.so.1
   #9  0x00007fa2b4ca57a4 in __GI___pthread_unwind (buf=<optimized out>) at ./nptl/unwind.c:130
   #10 0x00007fa2b4c9dd22 in __do_cancel () at ../sysdeps/nptl/pthreadP.h:271
   #11 __GI___pthread_exit (value=0x0) at ./nptl/pthread_exit.c:36
   #12 0x0000000000000000 in ?? ()
   ```
   
   Root cause: libunwind's `src/unwind/*.c` exports the `_Unwind_*` GCC ABI
   symbols, which collide with the same symbols in `libgcc_s.so.1`. Bazel's
   default fastbuild mode wraps each `cc_library` into an intermediate `.so`,
   so `libexternal_S~libunwind.so` ends up earlier in the binary's
   `DT_NEEDED` than `libgcc_s.so.1`, and the runtime dynamic linker resolves
   the `_Unwind_*` calls coming from `pthread_exit` / libstdc++ to
   libunwind's DWARF-based implementation instead of libgcc_s. 
   
   This is a runtime symbol-resolution-order issue independent of the
   compiler -- Clang's default toolchain on Linux uses the same
   `libstdc++ + libgcc_s` runtime and reproduces the exact same crash.
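
The lookup-order effect described above can be illustrated with a toy model. This is only a sketch: it reduces the dynamic linker's search to "first loaded object that exports the symbol wins", and the library names and export sets are illustrative assumptions, not dumps of the real libraries.

```python
# Toy model of ELF global symbol lookup: the first object in load order
# that exports a symbol provides its definition. The real search walks the
# dependency graph; the names and export sets below are assumptions.

def resolve(symbol, load_order, exports):
    """Return the first library in load_order that exports symbol, else None."""
    for lib in load_order:
        if symbol in exports.get(lib, set()):
            return lib
    return None

SYMBOL = "_Unwind_GetLanguageSpecificData"

# libunwind listed before libgcc_s, as in the Bazel fastbuild case.
LOAD_ORDER = ["libunwind.so", "libgcc_s.so.1"]

# Both libraries export the symbol (the collision).
EXPORTS = {
    "libunwind.so": {SYMBOL},
    "libgcc_s.so.1": {SYMBOL},
}

# libunwind built without src/unwind/*.c no longer exports it.
EXPORTS_HIDDEN = {
    "libunwind.so": set(),
    "libgcc_s.so.1": {SYMBOL},
}

before = resolve(SYMBOL, LOAD_ORDER, EXPORTS)
after = resolve(SYMBOL, LOAD_ORDER, EXPORTS_HIDDEN)
print(before, after)
```

In this model the load order is unchanged in both cases; only the export set of libunwind differs, which is the lever this PR uses.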
   
   The make / cmake CI does not hit this because it links the
   autoconf-built libunwind, which hides `_Unwind_*` by default.
   Distro packages (`libunwind-dev`, `libunwind-devel`) typically
   re-export these symbols as well, so they carry the same risk.
   
   ### What is changed and the side effects?
   
   Changed:
   
   * New self-maintained Bazel module registry under
     `registry/modules/libunwind/`, hosting two variants of libunwind:
     `1.8.1.brpc-no-unwind` and `1.8.3.brpc-no-unwind`. Both are forks of
     the corresponding modules in [Bazel Central Registry](https://github.com/bazelbuild/bazel-central-registry/tree/main/modules/libunwind),
     Apache-2.0 licensed.
     License attribution and a per-file change list are added at the bottom
     of the top-level `LICENSE` file.
   * The variant overlays add a `hide_unwind_symbols` `config_setting`
     gated by `--define=libunwind_hide_unwind_symbols=true`. When the switch
     is on, the `src/unwind/*.c` glob is dropped from the build via
     `select()`, so libunwind no longer provides `_Unwind_*` symbols and
     `libgcc_s.so.1` wins the runtime lookup.
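
One way to check that the switch took effect is to list the `_Unwind_*` symbols a built shared object still exports. The following is a minimal sketch, assuming binutils `nm` is on `PATH`; the helper names `unwind_exports` and `check_library` are hypothetical, and `SAMPLE` is made-up text in `nm -D` format rather than output captured from a real build.

```python
import subprocess

def unwind_exports(nm_output):
    """Return the sorted _Unwind_* names found in `nm -D --defined-only` output."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The symbol name is the last column; drop any @VERSION suffix.
        name = fields[-1].split("@")[0]
        if name.startswith("_Unwind_"):
            names.add(name)
    return sorted(names)

def check_library(path):
    """Run nm on a shared object; the path is supplied by the caller."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        check=True, capture_output=True, text=True,
    )
    return unwind_exports(result.stdout)

# Made-up sample in nm's "address type name" layout, for illustration only.
SAMPLE = """\
0000000000012340 T _Unwind_Backtrace
0000000000012400 T _Unwind_GetLanguageSpecificData@@GCC_3.0
0000000000009000 T unw_backtrace
"""

print(unwind_exports(SAMPLE))
```

Calling `check_library` on the libunwind `.so` under `bazel-bin` should return an empty list when `--define=libunwind_hide_unwind_symbols=true` is set, and a non-empty list otherwise.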
   
   Side effects:
   - Performance effects:
   
   - Breaking backward compatibility: 
   
   ---
   ### Check List:
   - Please make sure your changes are compilable.
   - When providing us with a new feature, it is best to add related tests.
   - Please follow [Contributor Covenant Code of Conduct](https://github.com/apache/brpc/blob/master/CODE_OF_CONDUCT.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

