steel2013 edited a comment on issue #1064: URL: https://github.com/apache/incubator-brpc/issues/1064#issuecomment-618229774
@jamesge Occasionally bthread_worker_usage is 2 to 3 times bthread_worker_count. Once this happens it persists: the service stalls and API responses become very slow. When I take this service node out of the LVS load balancer, bthread_worker_usage returns to normal and stays at about 1. When I put the node back under load, bthread_worker_usage again rises to 2 to 3 times bthread_worker_count. The stacks captured at that point are as follows:

Thread 16 (Thread 0x7fe05b9f9700 (LWP 27903)):
#0 0x00007fe09056dba9 in syscall () from /lib64/libc.so.6
#1 0x00007fe091ebaa3e in bthread::TaskGroup::wait_task(unsigned long*) () from ./libbrpc.so
#2 0x00007fe091ebcb3b in bthread::TaskGroup::run_main_task() () from ./libbrpc.so
#3 0x00007fe091eb87ee in bthread::TaskControl::worker_thread(void*) () from ./libbrpc.so
#4 0x00007fe090260e65 in start_thread () from /lib64/libpthread.so.0
#5 0x00007fe09057388d in clone () from /lib64/libc.so.6

A large number of threads show the stack above.

The stack from another occurrence:

Thread 20 (Thread 0x7f1c0cdfa700 (LWP 22801)):
#0 0x00007f1c3702b57c in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007f1c36f8e7c4 in _L_lock_62 () from /lib64/libc.so.6
#2 0x00007f1c36f8e64e in fwrite () from /lib64/libc.so.6
#3 0x00007f1c3789b642 in std::ostream::write(char const*, long) () from /lib64/libstdc++.so.6
#4 0x00007f1c389192ca in butil::operator<<(std::ostream&, butil::IOBuf const&) () from ./libbrpc.so
#5 0x00007f1c38a22323 in brpc::policy::PrintMessage(butil::IOBuf const&, bool, bool) () from ./libbrpc.so
#6 0x00007f1c38a25ffa in brpc::policy::SendHttpResponse(brpc::Controller*, google::protobuf::Message const*, google::protobuf::Message const*, brpc::Server const*, brpc::MethodStatus*, long) () from ./libbrpc.so
#7 0x00007f1c38a2822a in brpc::internal::FunctionClosure6<brpc::Controller*, google::protobuf::Message const*, google::protobuf::Message const*, brpc::Server const*, brpc::MethodStatus*, long>::Run() () from ./libbrpc.so
#8 0x00007f1c389d1299 in brpc::VarsService::default_method(google::protobuf::RpcController*, brpc::VarsRequest const*, brpc::VarsResponse*, google::protobuf::Closure*) () from ./libbrpc.so
#9 0x00007f1c38abd4f5 in brpc::vars::CallMethod(google::protobuf::MethodDescriptor const*, google::protobuf::RpcController*, google::protobuf::Message const*, google::protobuf::Message*, google::protobuf::Closure*) () from ./libbrpc.so
#10 0x00007f1c38a274e6 in brpc::policy::ProcessHttpRequest(brpc::InputMessageBase*) () from ./libbrpc.so
#11 0x00007f1c38a01d1a in brpc::ProcessInputMessage(void*) () from ./libbrpc.so
#12 0x00007f1c38a02d04 in brpc::InputMessenger::OnNewMessages(brpc::Socket*) () from ./libbrpc.so
#13 0x00007f1c38a9a8bd in brpc::Socket::ProcessEvent(void*) () from ./libbrpc.so
#14 0x00007f1c389668ea in bthread::TaskGroup::task_runner(long) () from ./libbrpc.so
#15 0x00007f1c3894f741 in bthread_make_fcontext () from ./libbrpc.so
#16 0x0000000000000000 in ?? ()

After restarting the service and re-enabling the load, everything returns to normal.
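
The usage-to-count ratio described in this comment could be tracked with a small script rather than by eye. Below is a minimal Python sketch, assuming the two bvars have been dumped as plain-text `name : value` lines (the layout is an assumption here, not something stated in the comment). `parse_vars` and `worker_usage_ratio` are hypothetical helper names, and `SAMPLE` is made-up data, not output from the affected service.

```python
def parse_vars(text):
    """Parse 'name : value' lines into a dict of floats, skipping other lines."""
    values = {}
    for line in text.splitlines():
        name, sep, raw = line.partition(":")
        if not sep:
            continue  # no separator, not a 'name : value' line
        try:
            values[name.strip()] = float(raw.strip())
        except ValueError:
            continue  # non-numeric value, ignore it
    return values


def worker_usage_ratio(values):
    """Return bthread_worker_usage divided by bthread_worker_count."""
    return values["bthread_worker_usage"] / values["bthread_worker_count"]


# Made-up sample (assumed format), not taken from the affected service.
SAMPLE = """bthread_worker_count : 9
bthread_worker_usage : 22.5
"""

ratio = worker_usage_ratio(parse_vars(SAMPLE))
print(f"usage/count = {ratio:.2f}")
```

Logging this ratio periodically, together with a timestamp, would make it easier to line up the moment it jumps with the stack dumps captured at the same time.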
