shutao917 opened a new issue, #17558: URL: https://github.com/apache/doris/issues/17558
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

1.1.5

### What's Wrong?

1. The number of FragmentMgrThreadPool threads keeps growing, by about 40 per day on average.
2. Once it reaches 512 (fragment_pool_thread_num_max=512), the cluster hangs and no queries can be executed.
3. After tracking this for several days, I found that for many FragmentMgrThreadPool threads the Cumulative User CPU (s), Cumulative Kernel CPU (s), and Cumulative IO-wait (s) values never change.
4. pstack shows that a large number of threads are in the following state:

Thread 251 (Thread 0x7f88d5734700 (LWP 18283)):
#0 0x00007f8a16334a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000055bda565ea6c in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
#2 0x000055bda1928e37 in doris::BufferControlBlock::add_batch(std::unique_ptr<doris::TFetchDataResult, std::default_delete<doris::TFetchDataResult> >&) () at /root/doris/doris/be/src/runtime/buffer_control_block.cpp:98
#3 0x000055bda30437a0 in doris::vectorized::VMysqlResultWriter::append_block(doris::vectorized::Block&) () at /root/doris/doris/be/src/vec/sink/mysql_result_writer.cpp:331
#4 0x000055bda2fcb399 in doris::vectorized::VResultSink::send(doris::RuntimeState*, doris::vectorized::Block*) () at /var/local/ldb-toolchain/include/c++/11/bits/shared_ptr_base.h:1290
#5 0x000055bda1936a75 in doris::PlanFragmentExecutor::open_vectorized_internal() () at /root/doris/doris/be/src/util/stopwatch.hpp:65
#6 0x000055bda193818f in doris::PlanFragmentExecutor::open() () at /root/doris/doris/be/src/runtime/plan_fragment_executor.cpp:259
#7 0x000055bda18a5cdf in doris::FragmentExecState::execute() () at /root/doris/doris/be/src/runtime/fragment_mgr.cpp:248
#8 0x000055bda18aa87a in doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>) () at /var/local/ldb-toolchain/include/c++/11/bits/shared_ptr_base.h:1290
#9 0x000055bda18b3b5c in __invoke_impl<void, void (doris::FragmentMgr::*&)(std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>), doris::FragmentMgr*&, std::shared_ptr<doris::FragmentExecState>&, std::function<void(doris::PlanFragmentExecutor*)>&> (__f=@0x55c00685ca50: (void (doris::FragmentMgr::*)(doris::FragmentMgr * const, std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>)) 0x55bda18aa330 <doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>)>, __t=@0x55c00685ca90: 0x55bda8c12000, __f=@0x55c00685ca50: (void (doris::FragmentMgr::*)(doris::FragmentMgr * const, std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>)) 0x55bda18aa330 <doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>)>, __t=@0x55c00685ca90: 0x55bda8c12000) at /var/local/ldb-toolchain/include/c++/11/ext/atomicity.h:109
#10 __invoke_r<void, void (doris::FragmentMgr::*&)(std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>), doris::FragmentMgr*&, std::shared_ptr<doris::FragmentExecState>&, std::function<void(doris::PlanFragmentExecutor*)>&> (__fn=@0x55c00685ca50: (void (doris::FragmentMgr::*)(doris::FragmentMgr * const, std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>)) 0x55bda18aa330 <doris::FragmentMgr::_exec_actual(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>)>) at /var/local/ldb-toolchain/include/c++/11/bits/invoke.h:111
#11 __call<void, 0, 1, 2> (__args=<optimized out>, this=0x55c00685ca50) at /var/local/ldb-toolchain/include/c++/11/functional:570
#12 operator()<> (this=0x55c00685ca50) at /var/local/ldb-toolchain/include/c++/11/functional:629
#13 __invoke_impl<void, std::_Bind_result<void, void (doris::FragmentMgr::*(doris::FragmentMgr*, std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>))(std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>)>&> (__f=...) at /var/local/ldb-toolchain/include/c++/11/bits/invoke.h:61
#14 __invoke_r<void, std::_Bind_result<void, void (doris::FragmentMgr::*(doris::FragmentMgr*, std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>))(std::shared_ptr<doris::FragmentExecState>, std::function<void(doris::PlanFragmentExecutor*)>)>&> (__fn=...) at /var/local/ldb-toolchain/include/c++/11/bits/invoke.h:111
#15 std::_Function_handler<void (), std::_Bind_result<void, void (doris::FragmentMgr::*(doris::FragmentMgr*, std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>))(std::shared_ptr<doris::FragmentExecState>, std::function<void (doris::PlanFragmentExecutor*)>)> >::_M_invoke(std::_Any_data const&) (__functor=...) at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:291
#16 0x000055bda1a6f51b in operator() (this=0x55be1d8a0e98) at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:556
#17 run (this=0x55be1d8a0e90) at /root/doris/doris/be/src/util/threadpool.cpp:42
#18 doris::ThreadPool::dispatch_thread() () at /root/doris/doris/be/src/util/threadpool.cpp:570
#19 0x000055bda1a68b1f in operator() (this=0x55c0332e5e18) at /var/local/ldb-toolchain/include/c++/11/bits/std_function.h:556
#20 doris::Thread::supervise_thread(void*) () at /root/doris/doris/be/src/util/thread.cpp:406
#21 0x00007f8a16330ea5 in start_thread () from /lib64/libpthread.so.0
#22 0x00007f8a16643b0d in clone () from /lib64/libc.so.6

### What You Expected?

Normal behavior.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
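For readers of the pstack trace above: the threads are parked in `std::condition_variable::wait` called from `doris::BufferControlBlock::add_batch`. One plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the result sink waits for space in a bounded result buffer and is never woken because nothing fetches or cancels the result. The sketch below is a simplified, hypothetical model of that pattern in C++. `BoundedResultBuffer`, its methods, and `producer_stalls_until_cancel` are illustrative names only and are not Doris code.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical bounded buffer, NOT the real doris::BufferControlBlock.
class BoundedResultBuffer {
public:
    explicit BoundedResultBuffer(std::size_t limit) : limit_(limit) {}

    // Producer side: blocks while the buffer is full and not cancelled.
    // Returns false if the buffer was cancelled.
    bool add_batch(int batch) {
        std::unique_lock<std::mutex> lock(mu_);
        space_available_.wait(lock, [this] {
            return queue_.size() < limit_ || cancelled_;
        });
        if (cancelled_) return false;
        queue_.push_back(batch);
        return true;
    }

    // Consumer side: removes one batch and wakes a waiting producer.
    bool get_batch(int* out) {
        std::lock_guard<std::mutex> lock(mu_);
        if (queue_.empty()) return false;
        *out = queue_.front();
        queue_.pop_front();
        space_available_.notify_one();
        return true;
    }

    // Cancellation: wakes every blocked producer so its thread can exit.
    void cancel() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            cancelled_ = true;
        }
        space_available_.notify_all();
    }

private:
    std::mutex mu_;
    std::condition_variable space_available_;
    std::deque<int> queue_;
    std::size_t limit_;
    bool cancelled_ = false;
};

// Returns true when the producer was still stuck while nobody fetched
// results and only finished after cancel() was called.
bool producer_stalls_until_cancel() {
    BoundedResultBuffer buffer(1);
    std::atomic<bool> done{false};
    bool second_add = true;

    std::thread producer([&] {
        buffer.add_batch(1);               // fits in the empty buffer
        second_add = buffer.add_batch(2);  // buffer full, no consumer: waits
        done = true;
    });

    // Simulate a client that never fetches results.
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    bool stalled = !done.load();

    buffer.cancel();  // without this call the thread would wait forever
    producer.join();
    assert(done.load());
    return stalled && !second_add;
}
```

In this model, a pool thread running `add_batch` is released only by `get_batch` or `cancel`. If neither is ever called for a query, the thread accumulates no CPU or IO-wait time and never returns to the pool, which would match the symptoms reported above under the stated assumption.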
