fausturs opened a new issue, #2726:
URL: https://github.com/apache/brpc/issues/2726

   **Describe the bug**
   We tried the latest brpc code, which includes this commit:
   
https://github.com/apache/brpc/commit/e0c9c441922c070c887649718d8aa7d2ba459ea6
   
   This commit overrides the pthread_mutex_trylock symbol. We also use jemalloc (5.2.1), and the combination appears to cause a deadlock.
   
   Here is our stack trace:
   ```
   (gdb) bt
   #0  futex_wait (private=0, expected=1, futex_word=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at ../sysdeps/nptl/futex-internal.h:141
   #1  futex_wait_simple (private=0, expected=1, futex_word=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at ../sysdeps/nptl/futex-internal.h:172
   #2  __pthread_once_slow (once_control=0x25a1b1c <bthread::init_sys_mutex_lock_once>, init_routine=0x12cfb30 <bthread::init_sys_mutex_lock()>) at pthread_once.c:105
   #3  0x00000000012cfc1c in bthread::first_sys_pthread_mutex_trylock (mutex=0x2f4e2a0 <init_lock+64>) at src/bthread/mutex.cpp:453
   #4  0x00000000012d109c in bthread::internal::pthread_mutex_trylock_internal (mutex=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at src/bthread/mutex.cpp:583
   #5  bthread::internal::pthread_mutex_trylock_impl<pthread_mutex_t> (mutex=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at src/bthread/mutex.cpp:664
   #6  bthread::pthread_mutex_trylock_impl (mutex=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at src/bthread/mutex.cpp:717
   #7  pthread_mutex_trylock (__mutex=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at src/bthread/mutex.cpp:939
   #8  0x00000000016905a9 in malloc_mutex_trylock_final (mutex=<optimized out>) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/include/jemalloc/internal/mutex.h:161
   #9  malloc_mutex_lock (tsdn=0x0, mutex=<optimized out>) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/include/jemalloc/internal/mutex.h:220
   #10 malloc_init_hard () at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:1739
   #11 malloc_init () at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:223
   #12 imalloc_init_check (sopts=<optimized out>, dopts=<optimized out>) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:2229
   #13 imalloc (sopts=<optimized out>, dopts=<optimized out>) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:2260
   #14 calloc (num=num@entry=1, size=size@entry=32) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:2494
   #15 0x00007f8b0cbcec05 in _dlerror_run (operate=operate@entry=0x7f8b0cbce490 <dlsym_doit>, args=args@entry=0x7ffde1e4d880) at dlerror.c:148
   #16 0x00007f8b0cbce525 in __dlsym (handle=<optimized out>, name=0x1b97c44 "pthread_mutex_trylock") at dlsym.c:70
   #17 0x00000000012cfbbc in bthread::init_sys_mutex_lock () at src/bthread/mutex.cpp:435
   #18 0x00007f8b0cd3447f in __pthread_once_slow (once_control=0x25a1b1c <bthread::init_sys_mutex_lock_once>, init_routine=0x12cfb30 <bthread::init_sys_mutex_lock()>) at pthread_once.c:116
   #19 0x00000000012cfc1c in bthread::first_sys_pthread_mutex_trylock (mutex=0x2f4e2a0 <init_lock+64>) at src/bthread/mutex.cpp:453
   #20 0x00000000012d109c in bthread::internal::pthread_mutex_trylock_internal (mutex=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at src/bthread/mutex.cpp:583
   #21 bthread::internal::pthread_mutex_trylock_impl<pthread_mutex_t> (mutex=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at src/bthread/mutex.cpp:664
   #22 bthread::pthread_mutex_trylock_impl (mutex=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at src/bthread/mutex.cpp:717
   #23 pthread_mutex_trylock (__mutex=0x25a1b1c <bthread::init_sys_mutex_lock_once>) at src/bthread/mutex.cpp:939
   #24 0x000000000168e302 in malloc_mutex_trylock_final (mutex=<optimized out>) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/include/jemalloc/internal/mutex.h:161
   #25 malloc_mutex_lock (tsdn=0x0, mutex=<optimized out>) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/include/jemalloc/internal/mutex.h:220
   #26 malloc_init_hard () at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:1739
   #27 malloc_init () at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:223
   #28 imalloc_init_check (sopts=<optimized out>, dopts=<optimized out>) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:2229
   #29 imalloc (sopts=<optimized out>, dopts=<optimized out>) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:2260
   #30 je_malloc_default (size=72704) at /root/.conan/data/jemalloc/5.2.1/_/_/build/55d721bf422a34e3db4f17a58c2f8d839c0b6932/src/src/jemalloc.c:2289
   #31 0x00007f8b0cdeba9a in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
   #32 0x00007f8b0cf4cb8a in ?? () from /lib64/ld-linux-x86-64.so.2
   #33 0x00007f8b0cf4cc91 in ?? () from /lib64/ld-linux-x86-64.so.2
   #34 0x00007f8b0cf3c13a in ?? () from /lib64/ld-linux-x86-64.so.2
   #35 0x0000000000000007 in ?? ()
   #36 0x00007ffde1e4f62c in ?? ()
   #37 0x00007ffde1e4f64c in ?? ()
   #38 0x00007ffde1e4f689 in ?? ()
   #39 0x00007ffde1e4f6b4 in ?? ()
   #40 0x00007ffde1e4f6cd in ?? ()
   #41 0x00007ffde1e4f6e3 in ?? ()
   #42 0x00007ffde1e4f701 in ?? ()
   #43 0x0000000000000000 in ?? ()
   
   ```
   
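   As the trace shows:
   
   Frame 24 is jemalloc calling pthread_mutex_trylock, which enters brpc at frame 23 and proceeds down to frame 16, where __dlsym is called; frame 15 then calls _dlerror_run.
   Frame 14 then calls calloc, which re-enters jemalloc, and frame 8 reaches the same location as frame 24 to trylock again. Frames 2 through 0 then wait on the same once_control (bthread::init_sys_mutex_lock_once) whose init routine is still running in frame 18 on this same thread, so at this point it appears to be deadlocked.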
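The cycle in the backtrace can be modeled in a few lines. This is a minimal sketch, not brpc or jemalloc code: every name in it (Trace, hooked_trylock, allocator_init, run) is hypothetical, and it only records the re-entry instead of blocking the way a real pthread_once would.

```cpp
#include <cassert>

// Hypothetical model of the call cycle in the backtrace. No real pthread,
// dlsym, or jemalloc calls are made; all names are illustrative only.
namespace model {

enum class OnceState { kNotStarted, kInProgress, kDone };

struct Trace {
    OnceState once = OnceState::kNotStarted;
    int trylock_calls = 0;
    bool reentered_during_init = false;
};

void allocator_init(Trace& t);

// Stands in for the hooked pthread_mutex_trylock (frames 23 and 7).
int hooked_trylock(Trace& t) {
    ++t.trylock_calls;
    if (t.once == OnceState::kInProgress) {
        // Frames 2..0: a real pthread_once would block here, on the same
        // thread that is still running the init routine. The model only
        // records the event and returns.
        t.reentered_during_init = true;
        return -1;
    }
    if (t.once == OnceState::kNotStarted) {
        t.once = OnceState::kInProgress;
        // Frames 17..14: the symbol lookup allocates, which re-enters
        // the allocator before initialization has finished.
        allocator_init(t);
        t.once = OnceState::kDone;
    }
    return 0;
}

// Stands in for the allocator taking its init lock
// (frames 26..24 and frames 10..8).
void allocator_init(Trace& t) {
    hooked_trylock(t);
}

// Runs the model once, starting from the first allocation (frame 30).
Trace run() {
    Trace t;
    allocator_init(t);
    // Unlike the real program, the model always runs to completion.
    assert(t.once == OnceState::kDone);
    return t;
}

}  // namespace model
```

Calling model::run() leaves reentered_during_init set, which corresponds to the nested call at frames 7 through 0 arriving while the init routine from frame 18 has not yet returned.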
   
   
   Judging from a comment in the code (https://github.com/apache/brpc/blob/b4d4acb7cd9a677039f662f18df37d4be7172ed3/src/bthread/mutex.cpp#L390), this looks like similar behavior to what is described there.
   Is there any way to fix this?
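One direction that might be worth discussing is a per-thread guard around the lazy symbol lookup, so that a nested call arriving from inside the lookup does not start the lookup again. The sketch below is only an illustration under assumptions. All names are hypothetical, and direct_trylock stands in for some trylock that can be reached without dlsym; whether one exists depends on the libc and is not established here. The shared pointer is not synchronized, so the sketch covers the single-threaded startup case only.

```cpp
#include <cassert>

// Hypothetical sketch; none of these names are brpc, jemalloc, or glibc APIs.
namespace sketch {

using TrylockFn = int (*)(void* mutex);

// Assumption: a trylock that can be called without any symbol lookup.
int direct_trylock(void* /*mutex*/) { return 0; }

TrylockFn g_resolved = nullptr;           // not synchronized: startup only
thread_local bool tls_resolving = false;  // true while this thread resolves
int g_fallback_calls = 0;                 // counts nested calls, for inspection

int hooked_trylock(void* mutex);

// Stand-in for the dlsym lookup. The lookup allocates, and the allocator
// takes a lock through the hook, as in frames 16 through 7 of the trace.
TrylockFn resolve() {
    int allocator_lock = 0;
    hooked_trylock(&allocator_lock);
    return &direct_trylock;
}

int hooked_trylock(void* mutex) {
    if (g_resolved != nullptr) {
        return g_resolved(mutex);
    }
    if (tls_resolving) {
        // Nested call from inside resolve(): skip the lookup instead of
        // waiting for an initialization this thread is still running.
        ++g_fallback_calls;
        return direct_trylock(mutex);
    }
    tls_resolving = true;
    g_resolved = resolve();
    tls_resolving = false;
    assert(g_resolved != nullptr);
    return g_resolved(mutex);
}

}  // namespace sketch
```

In this model the first call to sketch::hooked_trylock returns normally and the nested call is served by the fallback exactly once. The open question for a real fix is what the fallback can safely be.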
   
   **To Reproduce**
   
   
   **Expected behavior**
   
   
   **Versions**
   OS:
   Compiler:
   brpc:
   protobuf:
   
   **Additional context/screenshots**
   
   This is where line 161 of the jemalloc source we use is located.
   
![image](https://github.com/user-attachments/assets/6c442135-228e-48da-a6cc-a5895ec2cb83)
   
   



