morningman opened a new issue #3406:
URL: https://github.com/apache/incubator-doris/issues/3406


   **Describe the bug**
   Sometimes a BE crashes a few seconds after it is restarted, and `be.out` shows:
   
   ```
   *** Aborted at 1588019672 (unix time) try "date -d @1588019672" if you are using GNU date ***
   PC: @           0xf57722 doris::fs::fs_util::block_manager()
   *** SIGSEGV (@0x248) received by PID 25024 (TID 0x7efe9c1e2700) from PID 584; stack trace: ***
       @     0x7efeabffb3b0 (unknown)
       @           0xf57722 doris::fs::fs_util::block_manager()
       @          0x157ecfa doris::segment_v2::Segment::_parse_footer()
       @          0x158045a doris::segment_v2::Segment::_open()
       @          0x1580910 doris::segment_v2::Segment::open()
       @           0xf114ea doris::BetaRowset::do_load()
       @           0xefa6fe doris::Rowset::load()
       @           0xf1366f doris::BetaRowsetReader::init()
       @           0xe85eb8 doris::Reader::_capture_rs_readers()
       @           0xe890e3 doris::Reader::init()
       @           0xe6fa1f doris::Merger::merge_rowsets()
       @           0xe648ad doris::Compaction::do_compaction_impl()
       @           0xe66a2d doris::Compaction::do_compaction()
       @           0xe66b90 doris::CumulativeCompaction::compact()
       @           0xdeb254 doris::StorageEngine::_perform_cumulative_compaction()
       @           0xe7ef5b doris::StorageEngine::_cumulative_compaction_thread_callback()
       @          0x26181df execute_native_thread_routine
       @     0x7efeabdb0e65 start_thread
       @     0x7efeac0c388d __clone
   ```
   
   This is caused by the wrong order of instance initialization.
   In file `src/service/doris_main.cpp`:
   
   ```
   180     doris::StorageEngine* engine = nullptr;
   181     auto st = doris::StorageEngine::open(options, &engine);
   182     if (!st.ok()) {
   183         LOG(FATAL) << "fail to open StorageEngine, res=" << st.get_error_msg();
   184         exit(-1);
   185     }
   186
   187     // start backend service for the coordinator on be_port
   188     auto exec_env = doris::ExecEnv::GetInstance();
   189     doris::ExecEnv::init(exec_env, paths);
   190     exec_env->set_storage_engine(engine);
   ```
   
   Line 181 opens the `StorageEngine` first, which starts all background threads,
   such as the base and cumulative compaction threads. These threads call
   `fs::fs_util::block_manager()`.
   
   But `block_manager()` is only available after line 190. A compaction thread
   that runs before that point may therefore access a null pointer, which
   crashes the BE.
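
The ordering problem can be illustrated with a minimal, self-contained sketch. It is not the Doris code: `Env`, `Engine`, `BlockManager`, and `start_in_safe_order()` are hypothetical stand-ins for `ExecEnv`, `StorageEngine`, and the object behind `fs::fs_util::block_manager()`. It assumes that one way to avoid the crash is to finish initializing the environment before any background thread is started.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the object returned by fs::fs_util::block_manager().
struct BlockManager {
    int open_block() const { return 1; }
};

// Hypothetical stand-in for ExecEnv: the manager is only reachable after init().
class Env {
public:
    static Env& instance() {
        static Env env;
        return env;
    }
    void init() { _manager.store(&_storage); }
    BlockManager* block_manager() const { return _manager.load(); }

private:
    BlockManager _storage;
    std::atomic<BlockManager*> _manager{nullptr};
};

// Hypothetical stand-in for StorageEngine: open() starts a background thread at once.
class Engine {
public:
    void open() {
        _worker = std::thread([this] {
            BlockManager* bm = Env::instance().block_manager();
            // Check the pointer instead of dereferencing it blindly,
            // so this sketch reports the problem rather than crashing.
            _saw_manager.store(bm != nullptr && bm->open_block() == 1);
        });
    }
    bool join_and_report() {
        _worker.join();
        return _saw_manager.load();
    }

private:
    std::thread _worker;
    std::atomic<bool> _saw_manager{false};
};

// Safe order: initialize the environment first, then start the engine threads.
bool start_in_safe_order() {
    Env::instance().init();
    Engine engine;
    engine.open();
    return engine.join_and_report();
}
```

Calling `start_in_safe_order()` returns `true` because the worker thread can only start after `init()` has completed. If `open()` ran before `init()`, as in the snippet from `doris_main.cpp`, the worker could observe a null manager, depending on thread timing.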

