acelyc111 opened a new issue #1188:
URL: https://github.com/apache/incubator-brpc/issues/1188
**Describe the bug**
A Doris process that uses the brpc library crashes with the following coredump stack:
```
Core was generated by
`/home/work/app/doris/c3prc-bigbi/be/package/be/lib/palo_be'.
Program terminated with signal 11, Segmentation fault.
#0 bthread::id_create_impl (id=id@entry=0x7f09b1140290,
data=data@entry=0x7ac5688, on_error=on_error@entry=0x0,
on_error2=on_error2@entry=0x1b9fea0
<brpc::Controller::HandleSocketFailed(bthread_id_t, void*, int,
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&)>)
at
/root/doris/doris-dev/thirdparty/src/incubator-brpc-0.9.5/src/bthread/id.cpp:333
333
/root/doris/doris-dev/thirdparty/src/incubator-brpc-0.9.5/src/bthread/id.cpp:
No such file or directory.
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-157.el7_3.1.x86_64 libgcc-4.8.5-28.el7_5.1.x86_64
zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 bthread::id_create_impl (id=id@entry=0x7f09b1140290,
data=data@entry=0x7ac5688, on_error=on_error@entry=0x0,
on_error2=on_error2@entry=0x1b9fea0
<brpc::Controller::HandleSocketFailed(bthread_id_t, void*, int,
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&)>)
at
/root/doris/doris-dev/thirdparty/src/incubator-brpc-0.9.5/src/bthread/id.cpp:333
#1 0x0000000001d1387d in bthread_id_create2 (id=id@entry=0x7f09b1140290,
data=data@entry=0x7ac5688,
on_error=on_error@entry=0x1b9fea0
<brpc::Controller::HandleSocketFailed(bthread_id_t, void*, int,
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&)>)
at
/root/doris/doris-dev/thirdparty/src/incubator-brpc-0.9.5/src/bthread/id.cpp:693
#2 0x0000000001b9a86d in brpc::Controller::call_id
(this=this@entry=0x7ac5688) at
/root/doris/doris-dev/thirdparty/src/incubator-brpc-0.9.5/src/brpc/controller.cpp:1213
#3 0x0000000001b9634d in brpc::Channel::CallMethod (this=0x1a31af00,
method=0x21555800, controller_base=0x7ac5688, request=0x1efb80260,
response=0x7ac58d8, done=0x7ac5680) at
/root/doris/doris-dev/thirdparty/src/incubator-brpc-0.9.5/src/brpc/channel.cpp:394
#4 0x00000000013659bf in palo::PInternalService_Stub::transmit_data
(this=<optimized out>, controller=0x7ac5688, request=0x1efb80260,
response=0x7ac58d8, done=0x7ac5680) at
/builds/olap/doris/gensrc/build/gen_cpp/palo_internal_service.pb.cc:319
#5 0x00000000015fb4a1 in doris::DataStreamSender::Channel::send_batch
(this=this@entry=0x1efb80160, batch=batch@entry=0x0, eos=eos@entry=true) at
/builds/olap/doris/be/src/runtime/data_stream_sender.cpp:232
#6 0x00000000015fc03a in doris::DataStreamSender::Channel::close_internal
(this=0x1efb80160) at
/builds/olap/doris/be/src/runtime/data_stream_sender.cpp:289
#7 0x00000000015fc215 in close (state=0x905baa00, this=<optimized out>) at
/builds/olap/doris/be/src/runtime/data_stream_sender.cpp:296
#8 doris::DataStreamSender::close (this=0xad029c0, state=0x905baa00,
exec_status=...) at /builds/olap/doris/be/src/runtime/data_stream_sender.cpp:607
#9 0x00000000010208d3 in doris::PlanFragmentExecutor::open_internal
(this=this@entry=0x2655c5930) at
/builds/olap/doris/be/src/runtime/plan_fragment_executor.cpp:326
#10 0x0000000001020acc in doris::PlanFragmentExecutor::open
(this=this@entry=0x2655c5930) at
/builds/olap/doris/be/src/runtime/plan_fragment_executor.cpp:259
#11 0x0000000000fb1267 in doris::FragmentExecState::execute
(this=0x2655c58c0) at /builds/olap/doris/be/src/runtime/fragment_mgr.cpp:211
#12 0x0000000000fb2d16 in
doris::FragmentMgr::exec_actual(std::shared_ptr<doris::FragmentExecState>,
std::function<void (doris::PlanFragmentExecutor*)>) (this=0x692fc00,
exec_state=..., cb=...) at
/builds/olap/doris/be/src/runtime/fragment_mgr.cpp:394
#13 0x0000000000fb96b8 in __invoke_impl<void, void
(doris::FragmentMgr::*&)(std::shared_ptr<doris::FragmentExecState>,
std::function<void(doris::PlanFragmentExecutor*)>), doris::FragmentMgr*&,
std::shared_ptr<doris::FragmentExecState>&,
std::function<void(doris::PlanFragmentExecutor*)>&> (__t=@0x20fbf210:
0x692fc00, __f=
@0x20fbf1d0: (void (doris::FragmentMgr::*)(doris::FragmentMgr * const,
std::shared_ptr<doris::FragmentExecState>,
std::function<void(doris::PlanFragmentExecutor*)>)) 0xfb2cf0
<doris::FragmentMgr::exec_actual(std::shared_ptr<doris::FragmentExecState>,
std::function<void (doris::PlanFragmentExecutor*)>)>) at
/usr/include/c++/7.3.0/bits/invoke.h:73
#14 __invoke<void
(doris::FragmentMgr::*&)(std::shared_ptr<doris::FragmentExecState>,
std::function<void(doris::PlanFragmentExecutor*)>), doris::FragmentMgr*&,
std::shared_ptr<doris::FragmentExecState>&,
std::function<void(doris::PlanFragmentExecutor*)>&> (__fn=
@0x20fbf1d0: (void (doris::FragmentMgr::*)(doris::FragmentMgr * const,
std::shared_ptr<doris::FragmentExecState>,
std::function<void(doris::PlanFragmentExecutor*)>)) 0xfb2cf0
<doris::FragmentMgr::exec_actual(std::shared_ptr<doris::FragmentExecState>,
std::function<void (doris::PlanFragmentExecutor*)>)>) at
/usr/include/c++/7.3.0/bits/invoke.h:95
#15 __call<void, 0, 1, 2> (__args=..., this=0x20fbf1d0) at
/usr/include/c++/7.3.0/functional:632
#16 operator()<> (this=0x20fbf1d0) at /usr/include/c++/7.3.0/functional:718
#17
boost::detail::function::void_function_obj_invoker0<std::_Bind_result<void,
void (doris::FragmentMgr::*(doris::FragmentMgr*,
std::shared_ptr<doris::FragmentExecState>, std::function<void
(doris::PlanFragmentExecutor*)>))(std::shared_ptr<doris::FragmentExecState>,
std::function<void (doris::PlanFragmentExecutor*)>)>,
void>::invoke(boost::detail::function::function_buffer&) (function_obj_ptr=...)
at
/var/local/thirdparty/installed/include/boost/function/function_template.hpp:159
#18 0x0000000000fb24d4 in operator() (this=0x3b1cd01c0) at
/var/local/thirdparty/installed/include/boost/function/function_template.hpp:759
#19 doris::fragment_executor (param=0x3b1cd01c0) at
/builds/olap/doris/be/src/runtime/fragment_mgr.cpp:419
#20 0x00007f0b08218dc5 in start_thread () from /lib64/libpthread.so.0
#21 0x00007f0b0852473d in clone () from /lib64/libc.so.6
(gdb) f 0
#0 bthread::id_create_impl (id=id@entry=0x7f09b1140290,
data=data@entry=0x7ac5688, on_error=on_error@entry=0x0,
on_error2=on_error2@entry=0x1b9fea0
<brpc::Controller::HandleSocketFailed(bthread_id_t, void*, int,
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&)>)
at
/root/doris/doris-dev/thirdparty/src/incubator-brpc-0.9.5/src/bthread/id.cpp:333
333 in
/root/doris/doris-dev/thirdparty/src/incubator-brpc-0.9.5/src/bthread/id.cpp
(gdb) p butex
$1 = (uint32_t *) 0x0
(gdb)
```
There is also a similar stack, this time aborting with signal 6 instead of segfaulting:
```
Core was generated by
`/home/work/app/doris/c3prc-whalecore/be/package/be/lib/palo_be'.
Program terminated with signal 6, Aborted.
#0 0x00007fafe031f1d7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-157.el7_3.1.x86_64 libgcc-4.8.5-28.el7_5.1.x86_64
zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007fafe031f1d7 in raise () from /lib64/libc.so.6
#1 0x00007fafe03208c8 in abort () from /lib64/libc.so.6
#2 0x000000000230f3b6 in google::DumpStackTraceAndExit () at
src/utilities.cc:147
#3 0x00000000023066bd in google::LogMessage::Fail () at src/logging.cc:1599
#4 0x0000000002308544 in google::LogMessage::SendToLog
(this=0x7faf5a0f28a0) at src/logging.cc:1553
#5 0x00000000023061e4 in google::LogMessage::Flush (this=0x7faf5a0f28a0) at
src/logging.cc:1422
#6 0x0000000002308f79 in google::LogMessageFatal::~LogMessageFatal
(this=<optimized out>, __in_chrg=<optimized out>) at src/logging.cc:2125
#7 0x000000000259b0a0 in bthread::id_create_impl
(id=id@entry=0x7faf5a0f2900, data=data@entry=0x83f09408,
on_error=on_error@entry=0x0,
on_error2=on_error2@entry=0x2427bb0
<brpc::Controller::HandleSocketFailed(bthread_id_t, void*, int,
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&)>)
at
/var/local/incubator-doris/thirdparty/src/incubator-brpc-0.9.5/src/bthread/id.cpp:331
#8 0x000000000259b5cd in bthread_id_create2 (id=id@entry=0x7faf5a0f2900,
data=data@entry=0x83f09408,
on_error=on_error@entry=0x2427bb0
<brpc::Controller::HandleSocketFailed(bthread_id_t, void*, int,
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
const&)>)
at
/var/local/incubator-doris/thirdparty/src/incubator-brpc-0.9.5/src/bthread/id.cpp:693
#9 0x000000000242257d in brpc::Controller::call_id
(this=this@entry=0x83f09408) at
/var/local/incubator-doris/thirdparty/src/incubator-brpc-0.9.5/src/brpc/controller.cpp:1213
#10 0x000000000241e05d in brpc::Channel::CallMethod (this=0xcd14600,
method=0x10bd2400, controller_base=0x83f09408, request=0x139d77180,
response=0x83f09658, done=0x83f09400) at
/var/local/incubator-doris/thirdparty/src/incubator-brpc-0.9.5/src/brpc/channel.cpp:394
#11 0x000000000134fbff in palo::PInternalService_Stub::transmit_data
(this=<optimized out>, controller=0x83f09408, request=0x139d77180,
response=0x83f09658, done=0x83f09400) at
/builds/olap/doris/gensrc/build/gen_cpp/palo_internal_service.pb.cc:319
#12 0x00000000015d8a91 in doris::DataStreamSender::Channel::send_batch
(this=this@entry=0x139d77080, batch=batch@entry=0x139d77138,
eos=eos@entry=true) at
/builds/olap/doris/be/src/runtime/data_stream_sender.cpp:232
#13 0x00000000015d8d64 in
doris::DataStreamSender::Channel::send_current_batch
(this=this@entry=0x139d77080, eos=eos@entry=true) at
/builds/olap/doris/be/src/runtime/data_stream_sender.cpp:275
#14 0x00000000015d9661 in doris::DataStreamSender::Channel::close_internal
(this=0x139d77080) at
/builds/olap/doris/be/src/runtime/data_stream_sender.cpp:287
#15 0x00000000015d9805 in close (state=0x1712ed800, this=<optimized out>) at
/builds/olap/doris/be/src/runtime/data_stream_sender.cpp:296
#16 doris::DataStreamSender::close (this=0x48cc6820, state=0x1712ed800,
exec_status=...) at /builds/olap/doris/be/src/runtime/data_stream_sender.cpp:607
#17 0x0000000001054f13 in doris::PlanFragmentExecutor::open_internal
(this=this@entry=0x863465f0) at
/builds/olap/doris/be/src/runtime/plan_fragment_executor.cpp:351
#18 0x0000000001055114 in doris::PlanFragmentExecutor::open
(this=this@entry=0x863465f0) at
/builds/olap/doris/be/src/runtime/plan_fragment_executor.cpp:284
#19 0x0000000000fdc7d7 in doris::FragmentExecState::execute
(this=0x86346580) at /builds/olap/doris/be/src/runtime/fragment_mgr.cpp:209
#20 0x0000000000fde5f6 in
doris::FragmentMgr::exec_actual(std::shared_ptr<doris::FragmentExecState>,
std::function<void (doris::PlanFragmentExecutor*)>) (this=0x6e9b180,
exec_state=..., cb=...) at
/builds/olap/doris/be/src/runtime/fragment_mgr.cpp:393
#21 0x0000000000fe4724 in operator() (a2=<error reading variable: access
outside bounds of object referenced via synthetic pointer>, a1=...,
p=<optimized out>, this=<optimized out>) at
/var/local/thirdparty/installed/include/boost/bind/mem_fn_template.hpp:280
#22 operator()<boost::_mfi::mf2<void, doris::FragmentMgr,
std::shared_ptr<doris::FragmentExecState>,
std::function<void(doris::PlanFragmentExecutor*)> >, boost::_bi::list0>
(a=<synthetic pointer>, f=..., this=<optimized out>)
at /var/local/thirdparty/installed/include/boost/bind/bind.hpp:398
#23 operator() (this=<optimized out>) at
/var/local/thirdparty/installed/include/boost/bind/bind.hpp:1294
#24
boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void,
boost::_mfi::mf2<void, doris::FragmentMgr,
std::shared_ptr<doris::FragmentExecState>, std::function<void
(doris::PlanFragmentExecutor*)> >,
boost::_bi::list3<boost::_bi::value<doris::FragmentMgr*>,
boost::_bi::value<std::shared_ptr<doris::FragmentExecState> >,
boost::_bi::value<std::function<void (doris::PlanFragmentExecutor*)> > > >,
void>::invoke(boost::detail::function::function_buffer&) (function_obj_ptr=...)
at
/var/local/thirdparty/installed/include/boost/function/function_template.hpp:159
#25 0x0000000000edc7e8 in operator() (this=0x7faf5a0f2fc0) at
/var/local/thirdparty/installed/include/boost/function/function_template.hpp:759
#26 doris::ThreadPool::work_thread (this=0x6e9b200, thread_id=<optimized
out>) at /builds/olap/doris/be/src/util/thread_pool.hpp:120
#27 0x0000000001a20a1d in thread_proxy ()
#28 0x00007fafe00d5dc5 in start_thread () from /lib64/libpthread.so.0
#29 0x00007fafe03e173d in clone () from /lib64/libc.so.6
(gdb)
```
Related code:
https://github.com/apache/incubator-brpc/blob/a6ccc96aeb92d178b38885dc7ca3c525e5699648/src/bthread/id.cpp#L321-L345
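To make the two failure modes easier to discuss, here is a minimal sketch of what the stacks suggest, not the actual brpc code. The first core segfaults at id.cpp:333 with `butex == 0x0`, and the second aborts through glog's `LogMessageFatal` at id.cpp:331. The `Slot` type, the `pending_q` field, and `init_slot_checked` are hypothetical names used only for illustration. The sketch assumes the fatal check at line 331 guards against leftover state in a reused slot, and that line 333 dereferences the slot's butex pointer.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for the per-id slot; NOT the real brpc type.
struct Slot {
    uint32_t* butex = nullptr;  // version word; null in the first core dump
    std::deque<int> pending_q;  // assumed leftover-state queue, expected empty
};

// Validate a slot before touching it. Returns 0 on success, ENOMEM when no
// slot was obtained, EINVAL when the slot is in a state that would crash.
int init_slot_checked(Slot* slot) {
    if (slot == nullptr) return ENOMEM;
    if (!slot->pending_q.empty()) return EINVAL;  // analogous to the abort at line 331
    if (slot->butex == nullptr) return EINVAL;    // analogous to the segfault at line 333
    if (*slot->butex == 0) *slot->butex = 1;      // keep the version word non-zero
    return 0;
}

// Small self-check exercising the healthy case and both bad states.
void demo() {
    uint32_t word = 0;
    Slot ok;
    ok.butex = &word;
    assert(init_slot_checked(&ok) == 0 && word == 1);

    Slot no_butex;  // butex left null, as printed by gdb in the first stack
    assert(init_slot_checked(&no_butex) == EINVAL);

    Slot dirty;     // non-empty queue, the assumed trigger of the fatal check
    dirty.butex = &word;
    dirty.pending_q.push_back(7);
    assert(init_slot_checked(&dirty) == EINVAL);
}
```

Both symptoms point at a slot that is already in a bad state when `id_create_impl` obtains it. A check like this would only turn the crash into an error return; it would not explain how the slot got into that state.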
**To Reproduce**
There are no known reproduction steps, but the crash occurs fairly frequently.
**Expected behavior**
The process runs normally without crashing.
**Versions**
OS:
Compiler:
brpc: 0.9.5
protobuf:
**Additional context/screenshots**