morningman opened a new issue #3406:
URL: https://github.com/apache/incubator-doris/issues/3406
**Describe the bug**
Sometimes, after a BE is restarted, it crashes several seconds later, and
`be.out` shows:
```
*** Aborted at 1588019672 (unix time) try "date -d @1588019672" if you are using GNU date ***
PC: @ 0xf57722 doris::fs::fs_util::block_manager()
*** SIGSEGV (@0x248) received by PID 25024 (TID 0x7efe9c1e2700) from PID 584; stack trace: ***
@ 0x7efeabffb3b0 (unknown)
@ 0xf57722 doris::fs::fs_util::block_manager()
@ 0x157ecfa doris::segment_v2::Segment::_parse_footer()
@ 0x158045a doris::segment_v2::Segment::_open()
@ 0x1580910 doris::segment_v2::Segment::open()
@ 0xf114ea doris::BetaRowset::do_load()
@ 0xefa6fe doris::Rowset::load()
@ 0xf1366f doris::BetaRowsetReader::init()
@ 0xe85eb8 doris::Reader::_capture_rs_readers()
@ 0xe890e3 doris::Reader::init()
@ 0xe6fa1f doris::Merger::merge_rowsets()
@ 0xe648ad doris::Compaction::do_compaction_impl()
@ 0xe66a2d doris::Compaction::do_compaction()
@ 0xe66b90 doris::CumulativeCompaction::compact()
@ 0xdeb254 doris::StorageEngine::_perform_cumulative_compaction()
@ 0xe7ef5b doris::StorageEngine::_cumulative_compaction_thread_callback()
@ 0x26181df execute_native_thread_routine
@ 0x7efeabdb0e65 start_thread
@ 0x7efeac0c388d __clone
```
This is caused by the wrong order of instance initialization.
In file `src/service/doris_main.cpp`:
```
180 doris::StorageEngine* engine = nullptr;
181 auto st = doris::StorageEngine::open(options, &engine);
182 if (!st.ok()) {
183 LOG(FATAL) << "fail to open StorageEngine, res=" << st.get_error_msg();
184 exit(-1);
185 }
186
187 // start backend service for the coordinator on be_port
188 auto exec_env = doris::ExecEnv::GetInstance();
189 doris::ExecEnv::init(exec_env, paths);
190 exec_env->set_storage_engine(engine);
```
`Line 181` opens the `StorageEngine` first, which starts all background
threads, such as the base and cumulative compaction threads. These threads
call `fs::fs_util::block_manager()`, but `block_manager()` is only available
after `line 190`. If a compaction thread runs before then, it dereferences a
null pointer, which crashes the BE.
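
One possible way to avoid this kind of ordering bug is to separate opening the engine from starting its background threads, so that threads start only after their dependencies exist. The sketch below is a minimal illustration of that idea, not the actual Doris code or the fix that was applied: `Env`, `Engine`, `BlockManager`, `start_bg_threads()` and `open_block()` are hypothetical stand-ins.

```cpp
#include <atomic>
#include <memory>
#include <thread>

// Hypothetical stand-in for the block manager; not the real Doris class.
struct BlockManager {
    int open_block() const { return 42; }
};

// Hypothetical stand-in for the execution environment.
struct Env {
    std::unique_ptr<BlockManager> block_mgr;  // null until init() runs
    void init() { block_mgr = std::make_unique<BlockManager>(); }
};

// Hypothetical stand-in for the storage engine.
class Engine {
public:
    // Constructing the engine only records state; no thread is spawned here.
    explicit Engine(Env* env) : _env(env) {}
    ~Engine() { stop(); }

    // Start background work only if the block manager already exists.
    // Returns false instead of letting a thread dereference a null pointer.
    bool start_bg_threads() {
        if (_env == nullptr || _env->block_mgr == nullptr) {
            return false;
        }
        _worker = std::thread([this] {
            _result = _env->block_mgr->open_block();
        });
        return true;
    }

    // Wait for the background thread to finish, if one was started.
    void stop() {
        if (_worker.joinable()) {
            _worker.join();
        }
    }

    int result() const { return _result; }

private:
    Env* _env;
    std::thread _worker;
    std::atomic<int> _result{0};
};
```

With this split, the caller would construct the engine, initialize the environment, and only then call `start_bg_threads()`. Calling it too early fails visibly rather than crashing a few seconds later.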