Yes.

On Fri, Nov 11, 2022 at 10:30 陈奇昌 <1879467...@qq.com.invalid> wrote:
> Hello, I am on release 1.1.2. Should I upgrade to 1.1.3?
>
> ------------------ Original Message ------------------
> From: "dev" <chji...@gmail.com>;
> Sent: Friday, November 11, 2022, 10:20 AM
> To: "dev"<dev@doris.apache.org>;
> Subject: Re: BE nodes will not start after crashing
>
> There was a bug in compaction in earlier versions. Upgrade and try again.
>
> On Thu, Nov 10, 2022 at 17:28 陈奇昌 <1879467...@qq.com.invalid> wrote:
>
> > Hi,
> >
> > I am using apache-doris-be-1.1.2. I wrote a demo that loops 60,000 times, running a single INSERT INTO statement on each iteration. After it ran, all 3 nodes went down, and they now stop immediately after being restarted. With disable_auto_compaction=true they do start, but that does not fix the root cause. The be.out log is as follows:
> >
> > start time: Thu Nov 10 17:11:43 CST 2022
> > WARNING: Logging before InitGoogleLogging() is written to STDERR
> > I1110 17:11:43.721869  4706 env.cpp:46] Env init successfully.
> > *** Aborted at 1668071504 (unix time) try "date -d @1668071504" if you are using GNU date ***
> > *** SIGSEGV unkown detail explain (@0x0) received by PID 4706 (TID 0x7f0bf7546700) from PID 0; stack trace: ***
> >  0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk2/apache-doris/be/src/common/signal_handler.h:420
> >  1# 0x00007F0C5595C400 in /lib64/libc.so.6
> >  2# doris::BaseFieldtypeTraits<(doris::FieldType)9>::equal(void const*, void const*) at /mnt/disk2/apache-doris/be/src/olap/types.h:491
> >  3# doris::TupleReader::_unique_key_next_row(doris::RowCursor*, doris::MemPool*, doris::ObjectPool*, bool*) at /mnt/disk2/apache-doris/be/src/olap/tuple_reader.cpp:197
> >  4# doris::Merger::merge_rowsets(std::shared_ptr<doris::Tablet>, doris::ReaderType, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, doris::Merger::Statistics*) in /home/cv/apache-doris-be-1.1.2/be/lib/doris_be
> >  5# doris::Compaction::do_compaction_impl(long) in /home/cv/apache-doris-be-1.1.2/be/lib/doris_be
> >  6# doris::Compaction::do_compaction(long)
> >     at /mnt/disk2/apache-doris/be/src/olap/compaction.cpp:112
> >  7# doris::CumulativeCompaction::execute_compact_impl() in /home/cv/apache-doris-be-1.1.2/be/lib/doris_be
> >  8# doris::Compaction::execute_compact() at /mnt/disk2/apache-doris/be/src/olap/compaction.cpp:50
> >  9# doris::Tablet::execute_compaction(doris::CompactionType) in /home/cv/apache-doris-be-1.1.2/be/lib/doris_be
> > 10# std::_Function_handler<void (), doris::StorageEngine::_submit_compaction_task(std::shared_ptr<doris::Tablet>, doris::CompactionType)::$_12>::_M_invoke(std::_Any_data const&) at /mnt/disk2/ygl/installs/ldbtools/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
> > 11# doris::ThreadPool::dispatch_thread() at /mnt/disk2/apache-doris/be/src/util/threadpool.cpp:578
> > 12# doris::Thread::supervise_thread(void*) at /mnt/disk2/apache-doris/be/src/util/thread.cpp:407
> > 13# start_thread in /lib64/libpthread.so.0
> > 14# clone in /lib64/libc.so.6
> >
> > The be.INFO log is as follows:
> >
> > I1110 17:11:43.801707  4706 daemon.cpp:240]  version 1.1.2-rc05 RELEASE (build git://hk-dev01/mnt/disk2/apache-doris@a8323dae4f93cc4653b9b071607090449208fd7c)
> > Built on Fri, 09 Sep 2022 18:12:12 CST by ygl@hk-dev01
> > I1110 17:11:43.806525  4706 mem_info.cpp:89] Physical Memory: 31.26 GB
> > I1110 17:11:43.810570  4706 daemon.cpp:272] Cpu Info:
> >   Model: Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
> >   Cores: 32
> >   Max Possible Cores: 32
> >   L1 Cache: 32.00 KB (Line: 64.00 B)
> >   L2 Cache: 1.00 MB (Line: 64.00 B)
> >   L3 Cache: 11.00 MB (Line: 64.00 B)
> >   Hardware Supports:
> >     ssse3
> >     sse4_1
> >     sse4_2
> >     popcnt
> >     avx
> >     avx2
> >   Numa Nodes: 2
> >   Numa Nodes of Cores: 0->0 | 1->0 | 2->0 |
> >   3->0 | 4->0 | 5->0 | 6->0 | 7->0 | 8->0 | 9->0 | 10->0 | 11->0 | 12->0 | 13->0 | 14->0 | 15->0 | 16->1 | 17->1 | 18->1 | 19->1 | 20->1 | 21->1 | 22->1 | 23->1 | 24->1 | 25->1 | 26->1 | 27->1 | 28->1 | 29->1 | 30->1 | 31->1 |
> > I1110 17:11:43.810644  4706 daemon.cpp:273] Disk Info:
> >   Num disks 3: sda, sr, dm-
> > I1110 17:11:43.810652  4706 daemon.cpp:274] Physical Memory: 31.26 GB
> > Memory Limt: 25.01 GB
> > Current Usage: 0
> > CGroup Info: Process CGroup Info: memory.limit_in_bytes=9223372036854771712, cpu cfs limits: unlimited
> > I1110 17:11:43.811475  4706 backend_options.cpp:88] priority cidrs in conf: 10.10.11.0/24
> > I1110 17:11:43.811609  4706 backend_options.cpp:76] local host ip=10.10.11.151
> > I1110 17:11:43.815408  4706 exec_env_init.cpp:118] scan thread pool use PriorityWorkStealingThreadPool
> > I1110 17:11:43.816560  4771 fragment_mgr.cpp:699] FragmentMgr cancel worker start working.
> > I1110 17:11:43.873380  4706 load_path_mgr.cpp:58] Load path configured to [/home/cv/apache-doris-be-1.1.2/be/storage/mini_download]
> > I1110 17:11:43.873428  4848 result_buffer_mgr.cpp:142] result buffer manager cancel thread begin.
> > I1110 17:11:43.873529  4706 exec_env_init.cpp:223] Using global memory limit: 25.01 GB, origin config value: 80%
> > I1110 17:11:43.873634  4706 exec_env_init.cpp:264] Buffer pool memory limit: 5.00 GB, origin config value: 20%.
> > clean pages limit: 2.50 GB, origin config value: 50%
> > I1110 17:11:43.876905  4706 exec_env_init.cpp:280] Storage page cache memory limit: 5.00 GB, origin config value: 20%
> > I1110 17:11:43.878834  4706 tmp_file_mgr.cc:113] Using scratch directory /home/cv/apache-doris-be-1.1.2/be/storage/doris-scratch on disk 2
> > I1110 17:11:43.878902  4706 exec_env_init.cpp:309] Chunk allocator memory limit: 2.00 GB, origin config value: 2147483648
> > I1110 17:11:43.881217  4706 storage_engine.cpp:100] starting backend using uid:bf427e76d6265c55-0ff40281990f2189
> > I1110 17:11:43.881491  4854 data_dir.cpp:739] path: /home/cv/apache-doris-be-1.1.2/be/storage total capacity: 458232905728, available capacity: 380353990656
> > I1110 17:11:43.881717  4854 data_dir.cpp:204] path: /home/cv/apache-doris-be-1.1.2/be/storage, hash: 5426859093020019257
> > I1110 17:11:43.976120  4706 storage_engine.cpp:256] stream load record path: /home/cv/apache-doris-be-1.1.2/be/storage
> > I1110 17:11:44.043159  4940 data_dir.cpp:386] start to load tablets from /home/cv/apache-doris-be-1.1.2/be/storage
> > I1110 17:11:44.043560  4940 data_dir.cpp:379] successfully check incompatible old format meta /home/cv/apache-doris-be-1.1.2/be/storage
> > I1110 17:11:44.043581  4940 data_dir.cpp:396] begin loading rowset from meta
> > I1110 17:11:44.043814  4940 data_dir.cpp:416] load rowset from meta finished, data dir: /home/cv/apache-doris-be-1.1.2/be/storage
> > I1110 17:11:44.043828  4940 data_dir.cpp:421] begin loading tablet from meta
> > I1110 17:11:44.242991  4940 data_dir.cpp:470] load tablet from meta finished, loaded tablet: 4539, error tablet: 0, path: /home/cv/apache-doris-be-1.1.2/be/storage
> > I1110 17:11:44.243028  4940 data_dir.cpp:543] finish to load tablets from /home/cv/apache-doris-be-1.1.2/be/storage, total rowset meta:
> > 3, invalid rowset num: 0
> > I1110 17:11:44.244043  4706 storage_engine.cpp:104] success to init storage engine.
> > I1110 17:11:44.244122  4706 olap_server.cpp:50] unused rowset monitor thread started
> > I1110 17:11:44.244285  4706 olap_server.cpp:56] garbage sweeper thread started
> > I1110 17:11:44.244421  4706 olap_server.cpp:62] disk stat monitor thread started
> > I1110 17:11:44.248678  4706 olap_server.cpp:114] compaction tasks producer thread started
> > I1110 17:11:44.248697  4985 olap_server.cpp:426] try to start compaction producer process!
> > I1110 17:11:44.248818  4706 olap_server.cpp:128] tablet checkpoint tasks producer thread started
> > I1110 17:11:44.248914  4986 olap_server.cpp:331] begin to produce tablet meta checkpoint tasks.
> > I1110 17:11:44.248952  4706 olap_server.cpp:134] fd cache clean thread started
> > I1110 17:11:44.250069  4988 olap_server.cpp:312] try to perform path scan!
> > I1110 17:11:44.250088  4988 data_dir.cpp:660] start to scan data dir path:/home/cv/apache-doris-be-1.1.2/be/storage
> > I1110 17:11:44.250088  4706 olap_server.cpp:156] path scan/gc threads started. number:1
> > I1110 17:11:44.250108  4706 olap_server.cpp:159] all storage engine's background threads are started.
> > I1110 17:11:44.250173  4990 olap_server.cpp:287] try to start path gc thread!
> > I1110 17:11:44.250198  4990 olap_server.cpp:290] try to perform path gc by tablet!
> > I1110 17:11:44.254791  4974 compaction.cpp:138] start cumulative compaction.
> > tablet=223773.1946636312.9142e80eeff35fb3-beb419325a9c37b4, output_version=[2-60], permits: 59
> > I1110 17:11:44.260146  4989 tablet_manager.cpp:1202] finish to do meta checkpoint on dir: /home/cv/apache-doris-be-1.1.2/be/storage, number: 0, cost(ms): 1
> > I1110 17:11:44.262894  4706 agent_server.cpp:99] Register user resource listener
> > I1110 17:11:44.262941  4706 backend_service.cpp:82] DorisInternalService listening on 9060
> > I1110 17:11:44.263602  4706 thrift_server.cpp:355] ThriftServer 'backend' started on port: 9060
> > I1110 17:11:44.270102  4973 compaction.cpp:138] start cumulative compaction. tablet=223777.1946636312.2449a4f3990d0cf0-5192af096b13689b, output_version=[2-60], permits: 59
> > I1110 17:11:44.285086  4706 server.cpp:1066] Server[doris::PInternalServiceImpl<doris::PBackendService>] is serving on port=8060.
> > I1110 17:11:44.285102  4706 server.cpp:1069] Check out http://localhost.localdomain:8060 in web browser.
> > I1110 17:11:44.309863  4706 thrift_server.cpp:355] ThriftServer 'heartbeat' started on port: 9050
> > I1110 17:11:44.350798  4988 data_dir.cpp:709] scan data dir path: /home/cv/apache-doris-be-1.1.2/be/storage finished. path size: 5914
> > I1110 17:11:44.351671  4990 data_dir.cpp:567] start to path gc by tablet schemahash.
> > I1110 17:11:44.410344  4990 data_dir.cpp:606] finished one time path gc by tablet.
> > I1110 17:11:44.410378  4990 olap_server.cpp:293] try to perform path gc by rowsetid!
> > I1110 17:11:44.410387  4990 data_dir.cpp:617] start to path gc by rowsetid.
> > I1110 17:11:44.431969  4990 data_dir.cpp:650] finished one time path gc by rowsetid.