daming6 opened a new issue, #3295:
URL: https://github.com/apache/brpc/issues/3295

   **Describe the bug**
   After adding the TSan compile and link options, run the following commands in two separate shell windows:
   taskset -c 0-95 ./kpl_tools server -use_rdma 0 -thread_num 96
   taskset -c 48-95 ./kpl_tools client -thread_num 48 -queue_depth 85 -attachment_size 131072 -use_rdma 0 -connection_type pooled
   TSan then reports several errors, all of them data races. They fall into two classes: "Location is global" and "Location is heap block of size xx at 0xxx allocated by main thread".
   An example of the first class:
   ==================
   WARNING: ThreadSanitizer: data race (pid=6737)
     Write of size 8 at 0xfffff6e40618 by thread T3:
       #0 logging::add_vlog_site(int const**, char const*, int, int) 
src/butil/logging.cc:1863 (libbrpc.so+0x22548c)
       #1 bthread::TaskControl::worker_thread(void*) 
src/bthread/task_control.cpp:120 (libbrpc.so+0x2db1b4)
       #2 <null> <null> (libtsan.so.2+0x3c874)
   
     Previous read of size 8 at 0xfffff6e40618 by thread T5:
       #0 bthread::TaskControl::worker_thread(void*) 
src/bthread/task_control.cpp:120 (libbrpc.so+0x2dad30)
       #1 <null> <null> (libtsan.so.2+0x3c874)
   
     Location is global 'bthread::TaskControl::worker_thread(void*)::vlocal' of 
size 8 at 0xfffff6e40618 (libbrpc.so+0x930618)
   
     Thread T3 'brpc_wkr:0-0' (tid=6741, running) created by main thread at:
       #0 pthread_create <null> (libtsan.so.2+0x6ad40)
       #1 bthread::TaskControl::init(int) src/bthread/task_control.cpp:263 
(libbrpc.so+0x2de51c)
       #2 bthread::get_or_new_task_control() src/bthread/bthread.cpp:117 
(libbrpc.so+0x2b1b9c)
       #3 bthread::start_from_non_worker(unsigned long*, bthread_attr_t const*, 
void* (*)(void*), void*) src/bthread/bthread.cpp:274 (libbrpc.so+0x2af8dc)
       #4 bthread_start_background src/bthread/bthread.cpp:356 
(libbrpc.so+0x2af8dc)
       #5 GlobalInitializeOrDieImpl src/brpc/global.cpp:651 
(libbrpc.so+0x40f91c)
       #6 pthread_once <null> (libtsan.so.2+0x4c654)
       #7 brpc::GlobalInitializeOrDie() src/brpc/global.cpp:656 
(libbrpc.so+0x40d690)
       #8 brpc::Server::InitializeOnce() src/brpc/server.cpp:676 
(libbrpc.so+0x486a3c)
       #9 brpc::Server::StartInternal(butil::EndPoint const&, brpc::PortRange 
const&, brpc::ServerOptions const*) src/brpc/server.cpp:862 
(libbrpc.so+0x494020)
       #10 brpc::Server::Start(butil::EndPoint const&, brpc::ServerOptions 
const*) src/brpc/server.cpp:1279 (libbrpc.so+0x496bd0)
       #11 brpc::Server::Start(int, brpc::ServerOptions const*) 
src/brpc/server.cpp:1298 (libbrpc.so+0x496e64)
       #12 brpc::StartDummyServerAt(int, brpc::ProfilerLinker) 
src/brpc/server.cpp:1974 (libbrpc.so+0x497144)
       #13 main 
/home/caolx5/brpc-master/example/rdma_performance/client.cpp:282 
(client+0x40a3c4)
   
     Thread T5 'brpc_wkr:0-1' (tid=6743, running) created by main thread at:
       #0 pthread_create <null> (libtsan.so.2+0x6ad40)
       #1 bthread::TaskControl::init(int) src/bthread/task_control.cpp:263 
(libbrpc.so+0x2de51c)
       #2 bthread::get_or_new_task_control() src/bthread/bthread.cpp:117 
(libbrpc.so+0x2b1b9c)
       #3 bthread::start_from_non_worker(unsigned long*, bthread_attr_t const*, 
void* (*)(void*), void*) src/bthread/bthread.cpp:274 (libbrpc.so+0x2af8dc)
       #4 bthread_start_background src/bthread/bthread.cpp:356 
(libbrpc.so+0x2af8dc)
       #5 GlobalInitializeOrDieImpl src/brpc/global.cpp:651 
(libbrpc.so+0x40f91c)
       #6 pthread_once <null> (libtsan.so.2+0x4c654)
       #7 brpc::GlobalInitializeOrDie() src/brpc/global.cpp:656 
(libbrpc.so+0x40d690)
       #8 brpc::Server::InitializeOnce() src/brpc/server.cpp:676 
(libbrpc.so+0x486a3c)
       #9 brpc::Server::StartInternal(butil::EndPoint const&, brpc::PortRange 
const&, brpc::ServerOptions const*) src/brpc/server.cpp:862 
(libbrpc.so+0x494020)
       #10 brpc::Server::Start(butil::EndPoint const&, brpc::ServerOptions 
const*) src/brpc/server.cpp:1279 (libbrpc.so+0x496bd0)
       #11 brpc::Server::Start(int, brpc::ServerOptions const*) 
src/brpc/server.cpp:1298 (libbrpc.so+0x496e64)
       #12 brpc::StartDummyServerAt(int, brpc::ProfilerLinker) 
src/brpc/server.cpp:1974 (libbrpc.so+0x497144)
       #13 main 
/home/caolx5/brpc-master/example/rdma_performance/client.cpp:282 
(client+0x40a3c4)
   
   SUMMARY: ThreadSanitizer: data race src/butil/logging.cc:1863 in 
logging::add_vlog_site(int const**, char const*, int, int)
   ==================
   The stack points to the BAIDU_VLOG_IS_ON function-like macro in logging.h:
   # define BAIDU_VLOG_IS_ON(verbose_level, filepath)                      \
       ({ static const int* vlocal = &::logging::VLOG_UNINITIALIZED;       \
           const int saved_verbose_level = (verbose_level);                \
           (saved_verbose_level >= 0)/*VLOG(-1) is forbidden*/ &&          \
               (*vlocal >= saved_verbose_level) &&                         \
               ((vlocal != &::logging::VLOG_UNINITIALIZED) ||              \
                (::logging::add_vlog_site(&vlocal, filepath, __LINE__,     \
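                                          saved_verbose_level))); })   // add_vlog_site writes vlocal through &vlocal
   
   The logging module applies no lock or other inter-thread synchronization when BAIDU_VLOG_IS_ON reads vlocal and add_vlog_site writes it through &vlocal, so a data race between threads occurs.
   Alternatively, if the logging module is protected by some higher-level asynchronous-queue mechanism, this report would be a false positive.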
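
To illustrate what a race-free form of this caching pattern could look like, here is a minimal C++ sketch. It is not brpc's code and not a proposed patch: it only assumes the cached pointer could be held in a `std::atomic`. The names `kUninitialized`, `g_site_level`, `register_site`, `vlog_is_on`, and `count_enabled_from_threads` are hypothetical stand-ins for `VLOG_UNINITIALIZED`, the per-site verbose level, `add_vlog_site`, and `BAIDU_VLOG_IS_ON`.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <thread>
#include <vector>

namespace sketch {

// Hypothetical stand-ins; these are NOT brpc's real definitions.
const int kUninitialized = INT_MAX;  // plays the role of VLOG_UNINITIALIZED
const int g_site_level = 1;          // level that a registered site points at

// Plays the role of add_vlog_site(): publishes the site's level pointer.
bool register_site(std::atomic<const int*>* site, int verbose_level) {
    site->store(&g_site_level, std::memory_order_release);
    return g_site_level >= verbose_level;
}

// Same shape as the macro, but the cached pointer is read and written
// atomically, so concurrent first calls are not a data race.
bool vlog_is_on(int verbose_level) {
    static std::atomic<const int*> vlocal{&kUninitialized};
    const int* p = vlocal.load(std::memory_order_acquire);
    return verbose_level >= 0 &&
           *p >= verbose_level &&
           (p != &kUninitialized || register_site(&vlocal, verbose_level));
}

// Calls vlog_is_on() from several threads at once and counts how many
// of them saw logging enabled at the given level.
int count_enabled_from_threads(int thread_count, int verbose_level) {
    std::atomic<int> enabled{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([&enabled, verbose_level] {
            if (vlog_is_on(verbose_level)) {
                enabled.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread& t : workers) {
        t.join();
    }
    return enabled.load();
}

}  // namespace sketch
```

Several threads may still call `register_site` during the first evaluation, but each of them stores the same pointer, so the repeated registration is harmless in this sketch. Whether the real `add_vlog_site` tolerates repeated registration is not something the report establishes.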
   
   An example of the second class:
   ==================
   WARNING: ThreadSanitizer: data race (pid=6737)
     Write of size 8 at 0xffffeb401e60 by thread T7:
       #0 bthread::TimerThread::Bucket::schedule(void (*)(void*), void*, 
timespec const&) src/bthread/timer_thread.cpp:212 (libbrpc.so+0x309b1c)
       #1 bthread::TimerThread::schedule(void (*)(void*), void*, timespec 
const&) src/bthread/timer_thread.cpp:231 (libbrpc.so+0x309d9c)
       #2 bthread::TaskGroup::_add_sleep_event(void*) 
src/bthread/task_group.cpp:940 (libbrpc.so+0x2ff160)
       #3 bthread::TaskGroup::sched_to(bthread::TaskGroup**, 
bthread::TaskMeta*) src/bthread/task_group.cpp:792 (libbrpc.so+0x2fea6c)
       #4 bthread::TaskGroup::sched_to(bthread::TaskGroup**, unsigned long) 
src/bthread/task_group_inl.h:82 (libbrpc.so+0x300ea8)
       #5 bthread::TaskGroup::sched(bthread::TaskGroup**) 
src/bthread/task_group.cpp:700 (libbrpc.so+0x300ea8)
       #6 bthread::TaskGroup::usleep(bthread::TaskGroup**, unsigned long) 
src/bthread/task_group.cpp:986 (libbrpc.so+0x301560)
       #7 bthread_usleep src/bthread/bthread.cpp:569 (libbrpc.so+0x2b0c6c)
       #8 GlobalUpdate src/brpc/global.cpp:248 (libbrpc.so+0x40dfa8)
       #9 bthread::TaskGroup::task_runner(long) src/bthread/task_group.cpp:391 
(libbrpc.so+0x301798)
       #10 bthread_make_fcontext <null> (libbrpc.so+0x2b7734)
       #11 bthread::TaskGroup::sched_to(bthread::TaskGroup**, unsigned long) 
src/bthread/task_group_inl.h:82 (libbrpc.so+0x301de4)
       #12 bthread::TaskGroup::run_main_task() src/bthread/task_group.cpp:209 
(libbrpc.so+0x301de4)
       #13 bthread::TaskControl::worker_thread(void*) 
src/bthread/task_control.cpp:126 (libbrpc.so+0x2daed4)
       #14 <null> <null> (libtsan.so.2+0x3c874)
   
     Previous read of size 8 at 0xffffeb401e60 by thread T2:
       #0 bthread::TimerThread::Bucket::consume_tasks() 
src/bthread/timer_thread.cpp:174 (libbrpc.so+0x309110)
       #1 bthread::TimerThread::run() src/bthread/timer_thread.cpp:364 
(libbrpc.so+0x30ab10)
       #2 bthread::TimerThread::run_this(void*) 
src/bthread/timer_thread.cpp:125 (libbrpc.so+0x30b9a8)
       #3 <null> <null> (libtsan.so.2+0x3c874)
   
     Location is heap block of size 896 at 0xffffeb401c00 allocated by main 
thread:
       #0 operator new[](unsigned long, std::align_val_t, std::nothrow_t 
const&) <null> (libtsan.so.2+0x928b0)
       #1 bthread::TimerThread::start(bthread::TimerThreadOptions const*) 
src/bthread/timer_thread.cpp:159 (libbrpc.so+0x308ef0)
       #2 init_global_timer_thread src/bthread/timer_thread.cpp:476 
(libbrpc.so+0x309674)
       #3 pthread_once <null> (libtsan.so.2+0x4c654)
       #4 bthread::get_or_create_global_timer_thread() 
src/bthread/timer_thread.cpp:485 (libbrpc.so+0x3098f4)
       #5 bthread::TaskControl::init(int) src/bthread/task_control.cpp:248 
(libbrpc.so+0x2de460)
       #6 bthread::get_or_new_task_control() src/bthread/bthread.cpp:117 
(libbrpc.so+0x2b1b9c)
       #7 bthread::start_from_non_worker(unsigned long*, bthread_attr_t const*, 
void* (*)(void*), void*) src/bthread/bthread.cpp:274 (libbrpc.so+0x2af8dc)
       #8 bthread_start_background src/bthread/bthread.cpp:356 
(libbrpc.so+0x2af8dc)
       #9 GlobalInitializeOrDieImpl src/brpc/global.cpp:651 
(libbrpc.so+0x40f91c)
       #10 pthread_once <null> (libtsan.so.2+0x4c654)
       #11 brpc::GlobalInitializeOrDie() src/brpc/global.cpp:656 
(libbrpc.so+0x40d690)
       #12 brpc::Server::InitializeOnce() src/brpc/server.cpp:676 
(libbrpc.so+0x486a3c)
       #13 brpc::Server::StartInternal(butil::EndPoint const&, brpc::PortRange 
const&, brpc::ServerOptions const*) src/brpc/server.cpp:862 
(libbrpc.so+0x494020)
       #14 brpc::Server::Start(butil::EndPoint const&, brpc::ServerOptions 
const*) src/brpc/server.cpp:1279 (libbrpc.so+0x496bd0)
       #15 brpc::Server::Start(int, brpc::ServerOptions const*) 
src/brpc/server.cpp:1298 (libbrpc.so+0x496e64)
       #16 brpc::StartDummyServerAt(int, brpc::ProfilerLinker) 
src/brpc/server.cpp:1974 (libbrpc.so+0x497144)
       #17 main 
/home/caolx5/brpc-master/example/rdma_performance/client.cpp:282 
(client+0x40a3c4)
   
     Thread T7 'brpc_wkr:0-3' (tid=6745, running) created by main thread at:
       #0 pthread_create <null> (libtsan.so.2+0x6ad40)
       #1 bthread::TaskControl::init(int) src/bthread/task_control.cpp:263 
(libbrpc.so+0x2de51c)
       #2 bthread::get_or_new_task_control() src/bthread/bthread.cpp:117 
(libbrpc.so+0x2b1b9c)
       #3 bthread::start_from_non_worker(unsigned long*, bthread_attr_t const*, 
void* (*)(void*), void*) src/bthread/bthread.cpp:274 (libbrpc.so+0x2af8dc)
       #4 bthread_start_background src/bthread/bthread.cpp:356 
(libbrpc.so+0x2af8dc)
       #5 GlobalInitializeOrDieImpl src/brpc/global.cpp:651 
(libbrpc.so+0x40f91c)
       #6 pthread_once <null> (libtsan.so.2+0x4c654)
       #7 brpc::GlobalInitializeOrDie() src/brpc/global.cpp:656 
(libbrpc.so+0x40d690)
       #8 brpc::Server::InitializeOnce() src/brpc/server.cpp:676 
(libbrpc.so+0x486a3c)
       #9 brpc::Server::StartInternal(butil::EndPoint const&, brpc::PortRange 
const&, brpc::ServerOptions const*) src/brpc/server.cpp:862 
(libbrpc.so+0x494020)
       #10 brpc::Server::Start(butil::EndPoint const&, brpc::ServerOptions 
const*) src/brpc/server.cpp:1279 (libbrpc.so+0x496bd0)
       #11 brpc::Server::Start(int, brpc::ServerOptions const*) 
src/brpc/server.cpp:1298 (libbrpc.so+0x496e64)
       #12 brpc::StartDummyServerAt(int, brpc::ProfilerLinker) 
src/brpc/server.cpp:1974 (libbrpc.so+0x497144)
       #13 main 
/home/caolx5/brpc-master/example/rdma_performance/client.cpp:282 
(client+0x40a3c4)
   
     Thread T2 'brpc_timer' (tid=6740, running) created by main thread at:
       #0 pthread_create <null> (libtsan.so.2+0x6ad40)
       #1 bthread::TimerThread::start(bthread::TimerThreadOptions const*) 
src/bthread/timer_thread.cpp:164 (libbrpc.so+0x308f70)
       #2 init_global_timer_thread src/bthread/timer_thread.cpp:476 
(libbrpc.so+0x309674)
       #3 pthread_once <null> (libtsan.so.2+0x4c654)
       #4 bthread::get_or_create_global_timer_thread() 
src/bthread/timer_thread.cpp:485 (libbrpc.so+0x3098f4)
       #5 bthread::TaskControl::init(int) src/bthread/task_control.cpp:248 
(libbrpc.so+0x2de460)
       #6 bthread::get_or_new_task_control() src/bthread/bthread.cpp:117 
(libbrpc.so+0x2b1b9c)
       #7 bthread::start_from_non_worker(unsigned long*, bthread_attr_t const*, 
void* (*)(void*), void*) src/bthread/bthread.cpp:274 (libbrpc.so+0x2af8dc)
       #8 bthread_start_background src/bthread/bthread.cpp:356 
(libbrpc.so+0x2af8dc)
       #9 GlobalInitializeOrDieImpl src/brpc/global.cpp:651 
(libbrpc.so+0x40f91c)
       #10 pthread_once <null> (libtsan.so.2+0x4c654)
       #11 brpc::GlobalInitializeOrDie() src/brpc/global.cpp:656 
(libbrpc.so+0x40d690)
       #12 brpc::Server::InitializeOnce() src/brpc/server.cpp:676 
(libbrpc.so+0x486a3c)
       #13 brpc::Server::StartInternal(butil::EndPoint const&, brpc::PortRange 
const&, brpc::ServerOptions const*) src/brpc/server.cpp:862 
(libbrpc.so+0x494020)
       #14 brpc::Server::Start(butil::EndPoint const&, brpc::ServerOptions 
const*) src/brpc/server.cpp:1279 (libbrpc.so+0x496bd0)
       #15 brpc::Server::Start(int, brpc::ServerOptions const*) 
src/brpc/server.cpp:1298 (libbrpc.so+0x496e64)
       #16 brpc::StartDummyServerAt(int, brpc::ProfilerLinker) 
src/brpc/server.cpp:1974 (libbrpc.so+0x497144)
       #17 main 
/home/caolx5/brpc-master/example/rdma_performance/client.cpp:282 
(client+0x40a3c4)
   
   SUMMARY: ThreadSanitizer: data race src/bthread/timer_thread.cpp:212 in 
bthread::TimerThread::Bucket::schedule(void (*)(void*), void*, timespec const&)
   ==================
   The stack points to the following two functions in timer_thread.cpp:
   TimerThread::Task* TimerThread::Bucket::consume_tasks() {
       Task* head = NULL;
       **if (_task_head)** { // NOTE: schedule() and consume_tasks() are sequenced   **// read of _task_head**
           // by TimerThread._nearest_run_time and fenced by TimerThread._mutex.
           // We can avoid touching the mutex and related cacheline when the
           // bucket is actually empty.
           **BAIDU_SCOPED_LOCK(_mutex);**
           if (_task_head) {
               head = _task_head;
               _task_head = NULL;
               _nearest_run_time = std::numeric_limits<int64_t>::max();
           }
       }
       return head;
   }
   
   TimerThread::Bucket::ScheduleResult
   TimerThread::Bucket::schedule(void (*fn)(void*), void* arg,
                                 const timespec& abstime) {
       butil::ResourceId<Task> slot_id;
       Task* task = butil::get_resource<Task>(&slot_id);
       if (task == NULL) {
           ScheduleResult result = { INVALID_TASK_ID, false };
           return result;
       }
       task->next = NULL;
       task->fn = fn;
       task->arg = arg;
       task->run_time = butil::timespec_to_microseconds(abstime);
       uint32_t version = task->version.load(butil::memory_order_relaxed);
       if (version == 0) {  // skip 0.
           task->version.fetch_add(2, butil::memory_order_relaxed);
           version = 2;
       }
       const TaskId id = make_task_id(slot_id, version);
       task->task_id = id;
       bool earlier = false;
       {
           **BAIDU_SCOPED_LOCK(_mutex);**
           task->next = _task_head;
           **_task_head = task;   // write of _task_head**
           if (task->run_time < _nearest_run_time) {
               _nearest_run_time = task->run_time;
               earlier = true;
           }
       }
       ScheduleResult result = { id, earlier };
       return result;
   }
   
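   It looks as though TSan cannot recognize the wrapper macro BAIDU_SCOPED_LOCK(_mutex), so it concludes that there is no lock synchronization between the threads: one thread writes _task_head while another thread reads it at the same time (see the annotations in the code above), hence the reported data race.
   This second class of report may share a root cause with two other issues (https://github.com/apache/brpc/issues/2864 and https://github.com/apache/brpc/issues/1687): brpc has not been properly adapted to TSan.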
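
One other possible reading, offered with caution: the read annotated in consume_tasks() is the `if (_task_head)` pre-check, which runs before BAIDU_SCOPED_LOCK(_mutex) is taken, so TSan may be flagging that unlocked read rather than failing to see the lock. The C++ sketch below shows the same check-then-lock shape with the head pointer held in a `std::atomic`, which makes the unlocked check a defined atomic load. It is an illustration under that assumption, not brpc's implementation; the simplified `Task` and `Bucket` types and the `bool` return value of `schedule` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <limits>
#include <mutex>

namespace sketch {

// Simplified, hypothetical task node; the real Task has more fields.
struct Task {
    Task* next = nullptr;
    int64_t run_time = 0;
};

class Bucket {
public:
    // Pushes a task; returns true if it is now the earliest in the bucket.
    bool schedule(Task* task) {
        std::lock_guard<std::mutex> guard(_mutex);
        task->next = _task_head.load(std::memory_order_relaxed);
        _task_head.store(task, std::memory_order_release);
        if (task->run_time < _nearest_run_time) {
            _nearest_run_time = task->run_time;
            return true;
        }
        return false;
    }

    // Detaches and returns the whole list, or nullptr if the bucket is empty.
    Task* consume_tasks() {
        Task* head = nullptr;
        // Unlocked fast path: an atomic load instead of a plain read, so an
        // empty bucket is skipped without touching the mutex.
        if (_task_head.load(std::memory_order_acquire) != nullptr) {
            std::lock_guard<std::mutex> guard(_mutex);
            head = _task_head.load(std::memory_order_relaxed);
            if (head != nullptr) {
                _task_head.store(nullptr, std::memory_order_relaxed);
                _nearest_run_time = std::numeric_limits<int64_t>::max();
            }
        }
        return head;
    }

private:
    std::mutex _mutex;
    std::atomic<Task*> _task_head{nullptr};
    int64_t _nearest_run_time = std::numeric_limits<int64_t>::max();
};

}  // namespace sketch
```

All modifications of the list still happen under the mutex; only the emptiness check is lock-free. Whether this matches the intent of the NOTE comment in the original code is for the maintainers to judge.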
   
   **To Reproduce**
   After adding the TSan compile and link options, run the following commands in two separate shell windows:
   taskset -c 0-95 ./kpl_tools server -use_rdma 0 -thread_num 96
   taskset -c 48-95 ./kpl_tools client -thread_num 48 -queue_depth 85 -attachment_size 131072 -use_rdma 0 -connection_type pooled
   
   **Expected behavior**
   If brpc has not been adapted to TSan, then TSan cannot be used with it. If it has been adapted, the data race reports above are not expected.
   
   **Versions**
   OS: openEuler 24.03 (LTS-SP2)
   Compiler: gcc 12.3.1
   brpc: 1.16
   protobuf: protobuf-25.1-12.oe2403sp2.aarch64
   
   **Additional context/screenshots**
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

