liaoxin01 opened a new pull request, #59769:
URL: https://github.com/apache/doris/pull/59769
### What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
```
start BE in local mode
00:47:11 AddressSanitizer: CHECK failed: sanitizer_posix_libcdep.cpp:319
"((14)) == ((write_errno))" (0xe, 0x20) (tid=3088213)
00:47:11 #0 0x55bf40a248e1 in __asan::CheckUnwind()
(/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x237f68e1)
00:47:11 #1 0x55bf40a3f182 in __sanitizer::CheckFailed(char const*,
int, char const*, unsigned long long, unsigned long long)
(/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x23811182)
00:47:11 #2 0x55bf40a416cf in
__sanitizer::IsAccessibleMemoryRange(unsigned long, unsigned long)
(/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x238136cf)
00:47:11 #3 0x55bf40a5e37a in __ubsan::checkDynamicType(void*,
void*, unsigned long)
(/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x2383037a)
00:47:11 #4 0x55bf40a5d712 in
HandleDynamicTypeCacheMiss(__ubsan::DynamicTypeCacheMissData*, unsigned long,
unsigned long,
__ubsan::ReportOptions)
(/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x2382f712)
00:47:11 #5 0x55bf40a5d6e3 in __ubsan_handle_dynamic_type_cache_miss
(/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x2382f6e3)
00:47:11 #6 0x55bf48f94a83 in
std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count()
/usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/shared_ptr_base.h:1069:11
00:47:11 #7 0x55bf48f94a83 in std::__shared_ptr<evhttp,
(__gnu_cxx::_Lock_policy)2>::~__shared_ptr()
/usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/shared_ptr_base.h:1531:31
00:47:11 #8 0x55bf48f94a83 in
doris::EvHttpServer::start()::$_0::operator()() const
/root/doris/be/src/http/ev_http_server.cpp:140:9
00:47:11 #9 0x55bf48f94a83 in void std::__invoke_impl<void,
doris::EvHttpServer::start()::$_0&>(std::__invoke_other,
doris::EvHttpServer::start()::$_0&)
/usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/invoke.h:63:14
00:47:11 #10 0x55bf48f94a83 in std::enable_if<is_invocable_r_v<void,
doris::EvHttpServer::start()::$_0&>, void>::type std::__invoke_r<void,
doris::EvHttpServer::start()::$_0&>(doris::EvHttpServer::start()::$_0&)
/usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/invoke.h:113:2
00:47:11 #11 0x55bf48f94a83 in std::_Function_handler<void (),
doris::EvHttpServer::start()::$_0>::_M_invoke(std::_Any_data const&)
/usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:292:9
00:47:11 #12 0x55bf464e534d in doris::ThreadPool::dispatch_thread()
/root/doris/be/src/util/threadpool.cpp:616:24
00:47:11 #13 0x55bf464b8706 in std::function<void ()>::operator()()
const
/usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:593:9
00:47:11 #14 0x55bf464b8706 in
doris::Thread::supervise_thread(void*) /root/doris/be/src/util/thread.cpp:460:5
00:47:11 #15 0x55bf40a14d26 in asan_thread_start(void*)
(/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x237e6d26)
00:47:11 #16 0x7f64a847c608 in start_thread
/build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
00:47:11 #17 0x7f64a838f132 in __clone
/build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
```
The crash occurred because `evhttp` was created as a local `shared_ptr` inside the
lambda in `EvHttpServer::start()`, so its destruction happened only after
`event_base_dispatch` returned. This led to a use-after-free when ASAN/UBSAN tried
to verify the object's dynamic type during `shared_ptr` destruction.
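To make the ordering concrete, here is a minimal sketch that uses stand-in types instead of libevent. `FakeBase`, `FakeHttp`, `run_worker`, and `demo_local_handle_order` are hypothetical names, and the assumption that the base is released while the worker is still dispatching is only an illustration of the pattern described above, not a copy of the Doris code. The sketch records the order of destruction and never touches freed memory.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for event_base; logs its own destruction.
struct FakeBase {
    explicit FakeBase(std::vector<std::string>* log) : log_(log) {}
    ~FakeBase() { log_->push_back("base destroyed"); }
    std::vector<std::string>* log_;
};

// Hypothetical stand-in for evhttp. It refers to its base without owning it,
// so the pointer may dangle; this sketch never dereferences it.
struct FakeHttp {
    FakeHttp(FakeBase* base, std::vector<std::string>* log) : base_(base), log_(log) {}
    ~FakeHttp() { log_->push_back("http destroyed"); }
    FakeBase* base_;
    std::vector<std::string>* log_;
};

// Mimics a worker body that keeps the http handle as a local shared_ptr.
// `dispatch` stands in for event_base_dispatch and returns when the loop ends.
void run_worker(FakeBase* base, std::vector<std::string>* log,
                const std::function<void()>& dispatch) {
    assert(base != nullptr);
    auto http = std::make_shared<FakeHttp>(base, log);
    dispatch();
    log->push_back("dispatch returned");
    // `http` goes out of scope here, after everything that happened in dispatch.
}

// Returns the recorded order of events.
std::vector<std::string> demo_local_handle_order() {
    std::vector<std::string> log;
    auto base = std::make_unique<FakeBase>(&log);
    // Assumption for illustration: the owner releases the base while the
    // worker is still inside its dispatch call.
    run_worker(base.get(), &log, [&base] { base.reset(); });
    return log;
}
```

In this sketch the dependent object is the last one destroyed, which is the ordering a sanitizer would flag if the dependent's destructor had to inspect memory that was already released.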
Fix by:
1. Moving `evhttp` objects to the class member `_evhttp_servers` for explicit lifecycle management
2. Reordering resource cleanup in `stop()`:
    - Close `server_fd` first to reject new connections
    - Break event loops to make dispatch return
    - Wait for worker threads to finish
    - Clear `evhttp` before `event_base` (correct dependency order)
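The shutdown order listed above can be sketched with stand-in types and plain `std::thread` workers. This is a hedged illustration only: `MiniServer`, `FakeBase`, `FakeHttp`, `EventLog`, and `demo_stop_order` are hypothetical names, the atomic flag stands in for breaking the event loop, and a boolean stands in for `server_fd`. None of it is the actual `EvHttpServer` code.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Thread-safe event log shared by all stand-in objects.
struct EventLog {
    void add(const std::string& e) {
        std::lock_guard<std::mutex> guard(mu);
        events.push_back(e);
    }
    std::mutex mu;
    std::vector<std::string> events;
};

struct FakeBase {  // hypothetical stand-in for event_base
    explicit FakeBase(EventLog* log) : log_(log) {}
    ~FakeBase() { log_->add("base destroyed"); }
    EventLog* log_;
};

struct FakeHttp {  // hypothetical stand-in for evhttp; depends on its base
    FakeHttp(FakeBase* base, EventLog* log) : base_(base), log_(log) {}
    ~FakeHttp() { log_->add("http destroyed"); }
    FakeBase* base_;
    EventLog* log_;
};

class MiniServer {
public:
    MiniServer(int workers, EventLog* log) : _log(log) {
        assert(workers > 0);
        for (int i = 0; i < workers; ++i) {
            _bases.push_back(std::make_shared<FakeBase>(log));
            _https.push_back(std::make_shared<FakeHttp>(_bases.back().get(), log));
        }
    }
    ~MiniServer() { if (!_workers.empty()) stop(); }

    void start() {
        _listening = true;
        for (std::size_t i = 0; i < _bases.size(); ++i) {
            _workers.emplace_back([this] {
                // Stand-in for event_base_dispatch: poll until asked to break.
                while (!_break.load()) {
                    std::this_thread::sleep_for(std::chrono::milliseconds(1));
                }
                _log->add("worker exited");
            });
        }
    }

    void stop() {
        _listening = false;                  // 1. reject new connections
        _log->add("listener closed");
        _log->add("loop break requested");
        _break.store(true);                  // 2. make every dispatch loop return
        for (auto& t : _workers) t.join();   // 3. wait for worker threads
        _workers.clear();
        _log->add("workers joined");
        _https.clear();                      // 4. release dependents first
        _bases.clear();                      // 5. then the bases they refer to
    }

private:
    EventLog* _log;
    bool _listening = false;
    std::atomic<bool> _break{false};
    // Declared before _https so implicit destruction also frees _https first.
    std::vector<std::shared_ptr<FakeBase>> _bases;
    std::vector<std::shared_ptr<FakeHttp>> _https;
    std::vector<std::thread> _workers;
};

// Runs one start/stop cycle and returns the recorded order of events.
std::vector<std::string> demo_stop_order(int workers) {
    EventLog log;
    MiniServer server(workers, &log);
    server.start();
    server.stop();
    return log.events;
}
```

Keeping the handles as members means their lifetime is decided in one place, `stop()`, rather than by whenever each worker lambda happens to unwind.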
### Release note
None
### Check List (For Author)
- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->