lyh20093867 opened a new issue, #27998: URL: https://github.com/apache/doris/issues/27998
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

doris-1.2.5-rc01

### What's Wrong?

The BE crashes abnormally when the job is run by a scheduled task.

Contents of be.out:

```
start time: Tue Dec 5 10:19:52 CST 2023
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: 5dc89c8f82254364-90b96ccf1b8e5a95 ***
*** Aborted at 1701742826 (unix time) try "date -d @1701742826" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 19770 (TID 0x7f1d892ac700) from PID 0; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
1# os::Linux::chained_handler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
4# 0x00007F1E9C320400 in /lib64/libc.so.6
5# je_arena_dalloc_promoted at ../src/arena.c:1604
6# __pthread_create_2_1 in /lib64/libpthread.so.0
7# std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) at ../../../../../libstdc++-v3/src/c++11/thread.cc:150
8# std::thread* doris::ThreadGroup::create_thread<std::_Bind_result<void, std::_Mem_fn<void (doris::PriorityThreadPool::*)(int)> (doris::PriorityThreadPool*, int)> >(std::_Bind_result<void, std::_Mem_fn<void (doris::PriorityThreadPool::*)(int)> (doris::PriorityThreadPool*, int)>) at /root/doris/be/src/util/thread_group.h:65
9# doris::PriorityThreadPool::PriorityThreadPool(unsigned int, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /root/doris/be/src/util/priority_thread_pool.hpp:58
10# doris::KafkaDataConsumerGroup::KafkaDataConsumerGroup() at /root/doris/be/src/runtime/routine_load/data_consumer_group.h:67
11# doris::DataConsumerPool::get_consumer_grp(doris::StreamLoadContext*, std::shared_ptr<doris::DataConsumerGroup>*) at /root/doris/be/src/runtime/routine_load/data_consumer_pool.cpp:73
12# doris::RoutineLoadTaskExecutor::exec_task(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>) at /root/doris/be/src/runtime/routine_load/routine_load_task_executor.cpp:267
13# std::_Function_handler<void (), std::_Bind_result<void, void (doris::RoutineLoadTaskExecutor::*(doris::RoutineLoadTaskExecutor*, doris::StreamLoadContext*, doris::DataConsumerPool*, doris::RoutineLoadTaskExecutor::submit_task(doris::TRoutineLoadTask const&)::$_0))(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>)> >::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
14# doris::PriorityThreadPool::work_thread(int) at /root/doris/be/src/util/priority_thread_pool.hpp:146
15# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
16# start_thread in /lib64/libpthread.so.0
17# clone in /lib64/libc.so.6
start time: Tue Dec 5 10:20:29 CST 2023
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: 96480545f8c447ab-9e217a469d6bd219 ***
*** Aborted at 1701743541 (unix time) try "date -d @1701743541" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 22785 (TID 0x7f48d06e6700) from PID 0; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
1# os::Linux::chained_handler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
4# 0x00007F49DF81E400 in /lib64/libc.so.6
5# je_arena_dalloc_promoted at ../src/arena.c:1604
6# __GI__dl_deallocate_tls in /lib64/ld-linux-x86-64.so.2
7# __free_stacks in /lib64/libpthread.so.0
8# __deallocate_stack in /lib64/libpthread.so.0
9# pthread_join in /lib64/libpthread.so.0
10# std::thread::join() at ../../../../../libstdc++-v3/src/c++11/thread.cc:114
11# doris::ThreadGroup::join_all() in /opt/module/be/lib/doris_be
12# doris::KafkaDataConsumerGroup::start_all(doris::StreamLoadContext*) at /root/doris/be/src/runtime/routine_load/data_consumer_group.cpp:141
13# doris::RoutineLoadTaskExecutor::exec_task(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>) at /root/doris/be/src/runtime/routine_load/routine_load_task_executor.cpp:306
14# std::_Function_handler<void (), std::_Bind_result<void, void (doris::RoutineLoadTaskExecutor::*(doris::RoutineLoadTaskExecutor*, doris::StreamLoadContext*, doris::DataConsumerPool*, doris::RoutineLoadTaskExecutor::submit_task(doris::TRoutineLoadTask const&)::$_0))(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>)> >::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
15# doris::PriorityThreadPool::work_thread(int) at /root/doris/be/src/util/priority_thread_pool.hpp:146
16# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
17# start_thread in /lib64/libpthread.so.0
18# clone in /lib64/libc.so.6
start time: Tue Dec 5 10:32:23 CST 2023
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: 53d2fd854e40455b-917387cc504136c1 ***
*** Aborted at 1701744844 (unix time) try "date -d @1701744844" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 12521 (TID 0x7f7e24e74700) from PID 0; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
1# os::Linux::chained_handler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
4# 0x00007F7F3DCA1400 in /lib64/libc.so.6
5# je_arena_dalloc_promoted at ../src/arena.c:1604
6# __GI__dl_deallocate_tls in /lib64/ld-linux-x86-64.so.2
7# __free_stacks in /lib64/libpthread.so.0
8# __deallocate_stack in /lib64/libpthread.so.0
9# pthread_join in /lib64/libpthread.so.0
10# std::thread::join() at ../../../../../libstdc++-v3/src/c++11/thread.cc:114
11# doris::ThreadGroup::join_all() in /opt/module/be/lib/doris_be
12# doris::KafkaDataConsumerGroup::start_all(doris::StreamLoadContext*) at /root/doris/be/src/runtime/routine_load/data_consumer_group.cpp:141
13# doris::RoutineLoadTaskExecutor::exec_task(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>) at /root/doris/be/src/runtime/routine_load/routine_load_task_executor.cpp:306
14# std::_Function_handler<void (), std::_Bind_result<void, void (doris::RoutineLoadTaskExecutor::*(doris::RoutineLoadTaskExecutor*, doris::StreamLoadContext*, doris::DataConsumerPool*, doris::RoutineLoadTaskExecutor::submit_task(doris::TRoutineLoadTask const&)::$_0))(doris::StreamLoadContext*, doris::DataConsumerPool*, std::function<void (doris::StreamLoadContext*)>)> >::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
15# doris::PriorityThreadPool::work_thread(int) at /root/doris/be/src/util/priority_thread_pool.hpp:146
16# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
17# start_thread in /lib64/libpthread.so.0
18# clone in /lib64/libc.so.6
start time: Tue Dec 5 10:54:06 CST 2023
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
*** Query id: bdc5fe65d3e7446b-9eba5514ca9d6637 ***
*** Aborted at 1701745459 (unix time) try "date -d @1701745459" if you are using GNU date ***
*** Current BE git commitID: Unknown ***
*** SIGSEGV address not mapped to object (@0x20000) received by PID 18914 (TID 0x7f9cf18c8700) from PID 131072; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:420
1# os::Linux::chained_handler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo*, void*) in /opt/module//openjdk/jre/lib/amd64/server/libjvm.so
4# 0x00007FA0CD920400 in /lib64/libc.so.6
5# jemalloc_usable_size at ../src/jemalloc.c:3740
6# doris_free at /root/doris/be/src/runtime/memory/jemalloc_hook.cpp:43
7# __GI__dl_deallocate_tls in /lib64/ld-linux-x86-64.so.2
8# __free_stacks in /lib64/libpthread.so.0
9# __deallocate_stack in /lib64/libpthread.so.0
10# start_thread in /lib64/libpthread.so.0
11# clone in /lib64/libc.so.6
start time: Tue Dec 5 11:04:21 CST 2023
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/be/lib/hadoop_hdfs/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/be/lib/java-udf-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
```

CPU information

```
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 1
Core(s) per socket: 24
Socket(s): 1
NUMA node(s): 2
Vendor ID: AuthenticAMD
CPU family: 25
Model: 1
Model name: AMD EPYC 7763 64-Core Processor
Stepping: 1
CPU MHz: 2445.405
BogoMIPS: 4890.81
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 32768K
NUMA node0 CPU(s): 0-11
NUMA node1 CPU(s): 12-23
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl tsc_reliable nonstop_tsc extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext invpcid_single retpoline_amd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero arat umip pku ospke vaes vpclmulqdq overflow_recov succor
```

System information

```
Linux app2 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
```

Memory information

```
              total        used        free      shared  buff/cache   available
Mem:            68G         10G        3.0G        1.0G         55G         56G
Swap:            0B          0B          0B
```

Doris cluster information

```
show backends\G
*************************** 1. row ***************************
BackendId: 10003
Cluster: default_cluster
IP: 192.168.0.2
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2023-12-05 11:04:22
LastHeartbeat: 2023-12-05 11:49:41
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 2245
DataUsedCapacity: 97.423 GB
AvailCapacity: 14.453 TB
TotalCapacity: 14.551 TB
UsedPct: 0.67 %
MaxDiskUsedPct: 0.72 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: doris-1.2.5-rc01-Unknown
Status: {"lastSuccessReportTabletsTime":"2023-12-05 11:49:38","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
HeartbeatFailureCounter: 0
NodeRole: mix
*************************** 2. row ***************************
BackendId: 10004
Cluster: default_cluster
IP: 192.168.0.3
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2023-12-05 11:33:59
LastHeartbeat: 2023-12-05 11:49:41
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 2093
DataUsedCapacity: 105.563 GB
AvailCapacity: 14.445 TB
TotalCapacity: 14.551 TB
UsedPct: 0.73 %
MaxDiskUsedPct: 0.76 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: doris-1.2.5-rc01-Unknown
Status: {"lastSuccessReportTabletsTime":"2023-12-05 11:48:46","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
HeartbeatFailureCounter: 0
NodeRole: mix
*************************** 3. row ***************************
BackendId: 10005
Cluster: default_cluster
IP: 192.168.0.4
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2023-12-05 11:35:00
LastHeartbeat: 2023-12-05 11:49:41
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 2128
DataUsedCapacity: 103.338 GB
AvailCapacity: 14.445 TB
TotalCapacity: 14.551 TB
UsedPct: 0.73 %
MaxDiskUsedPct: 0.79 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "default"}
ErrMsg:
Version: doris-1.2.5-rc01-Unknown
Status: {"lastSuccessReportTabletsTime":"2023-12-05 11:48:42","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
HeartbeatFailureCounter: 0
NodeRole: mix
*************************** 4. row ***************************
BackendId: 10006
Cluster: default_cluster
IP: 192.168.0.5
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2023-12-05 10:04:45
LastHeartbeat: 2023-12-05 11:49:41
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 1886
DataUsedCapacity: 0.000
AvailCapacity: 14.551 TB
TotalCapacity: 14.551 TB
UsedPct: 0.00 %
MaxDiskUsedPct: 0.00 %
RemoteUsedCapacity: 0.000
Tag: {"location" : "origin_data"}
ErrMsg:
Version: doris-1.2.5-rc01-Unknown
Status: {"lastSuccessReportTabletsTime":"2023-12-05 11:48:56","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
HeartbeatFailureCounter: 0
NodeRole: mix
```

### What You Expected?

The specific cause of the abnormal BE exit and how to resolve it.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
