[
https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149737#comment-16149737
]
Yan Xu commented on MESOS-7921:
-------------------------------
[~benjaminhindman]
In the newly attached
FetcherCacheTest.CachedCustomOutputFileWithSubdirectory.log.txt:
{noformat:title=}
W0831 22:06:30.170509 32070 process.cpp:3240] Attempted to spawn already running process [email protected]:43674
W0831 22:06:30.179316 32070 process.cpp:3240] Attempted to spawn already running process [email protected]:43674
{noformat}
{noformat:title=}
*** Aborted at 1504217191 (unix time) try "date -d @1504217191" if you are using GNU date ***
PC: @ 0x7f43f8cb7956 process::EventQueue::Consumer::empty()
*** SIGSEGV (@0x8) received by PID 32070 (TID 0x7f43fa98c800) from PID 8; stack trace: ***
@ 0x7f43f070f390 (unknown)
@ 0x7f43f8cb7956 process::EventQueue::Consumer::empty()
@ 0x7f43f8ca2be5 process::ProcessManager::resume()
@ 0x7f43f8ca3b80 process::ProcessManager::wait()
@ 0x7f43f8ca8d7d process::wait()
@ 0x7f43f8c4c3c7 process::Latch::await()
@ 0x1ea6749 process::Future<>::await()
@ 0x7f43f7e897f0 mesos::internal::slave::FetcherProcess::Metrics::~Metrics()
@ 0x7f43f7e89d16 mesos::internal::slave::FetcherProcess::~FetcherProcess()
@ 0x195c578 mesos::internal::tests::MockFetcherProcess::~MockFetcherProcess()
@ 0x195c5ce mesos::internal::tests::MockFetcherProcess::~MockFetcherProcess()
@ 0x13b68c1 process::Owned<>::Data::~Data()
@ 0x13bfdfe std::_Sp_counted_ptr<>::_M_dispose()
@ 0xd065ce std::_Sp_counted_base<>::_M_release()
@ 0xd04875 std::__shared_count<>::~__shared_count()
@ 0x139f916 std::__shared_ptr<>::~__shared_ptr()
@ 0x139f932 std::shared_ptr<>::~shared_ptr()
@ 0x139f94e process::Owned<>::~Owned()
@ 0x7f43f7e87b38 mesos::internal::slave::Fetcher::~Fetcher()
@ 0x7f43f7e87b7c mesos::internal::slave::Fetcher::~Fetcher()
@ 0x115ee95 process::Owned<>::Data::~Data()
@ 0x1160f5c std::_Sp_counted_ptr<>::_M_dispose()
@ 0xd065ce std::_Sp_counted_base<>::_M_release()
@ 0xd04875 std::__shared_count<>::~__shared_count()
@ 0x1150d3c std::__shared_ptr<>::~__shared_ptr()
@ 0x1150d58 std::shared_ptr<>::~shared_ptr()
@ 0x1150d74 process::Owned<>::~Owned()
@ 0x13a0054 mesos::internal::tests::FetcherCacheTest::~FetcherCacheTest()
@ 0x13bf7b2 mesos::internal::tests::FetcherCacheTest_CachedCustomOutputFileWithSubdirectory_Test::~FetcherCacheTest_CachedCustomOutputFileWithSubdirectory_Test()
@ 0x13bf7e2 mesos::internal::tests::FetcherCacheTest_CachedCustomOutputFileWithSubdirectory_Test::~FetcherCacheTest_CachedCustomOutputFileWithSubdirectory_Test()
@ 0x23a9eee testing::Test::DeleteSelf_()
@ 0x23b65bf testing::internal::HandleSehExceptionsInMethodIfSupported<>()
Segmentation fault (core dumped)
{noformat}
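For reference, the abort timestamp printed at the top of the trace can be decoded without GNU date. The following is only a minimal sketch: the helper name {{formatUnixTimeUtc}} is made up for illustration and is not part of Mesos or libprocess, and {{gmtime_r}} assumes a POSIX system.

```cpp
#include <ctime>
#include <string>

// Hypothetical helper (not part of Mesos): formats a Unix timestamp,
// given in seconds since the epoch, as a UTC date string.
std::string formatUnixTimeUtc(std::time_t seconds)
{
  std::tm tm = {};

  // gmtime_r is the POSIX thread-safe variant of gmtime().
  if (gmtime_r(&seconds, &tm) == nullptr) {
    return "";
  }

  char buffer[32];

  // strftime returns 0 if the result does not fit into the buffer.
  if (std::strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S UTC", &tm) == 0) {
    return "";
  }

  return buffer;
}
```

Calling {{formatUnixTimeUtc(1504217191)}} yields 2017-08-31 22:06:31 UTC, about one second after the "Attempted to spawn already running process" warnings above (assuming the build machine logs in UTC). The timestamp 1503937885 from the trace in the issue description decodes to 2017-08-28 16:31:25 UTC.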
> process::EventQueue sometimes crashes
> -------------------------------------
>
> Key: MESOS-7921
> URL: https://issues.apache.org/jira/browse/MESOS-7921
> Project: Mesos
> Issue Type: Bug
> Components: libprocess
> Affects Versions: 1.4.0
> Environment: autotools,gcc,--verbose,GLOG_v=1
> MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details:
> https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
> Reporter: Yan Xu
> Priority: Blocker
> Attachments:
> FetcherCacheTest.CachedCustomOutputFileWithSubdirectory.log.txt,
> MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt
>
>
> The following segfault was found on
> [ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
> in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}}. It is flaky
> and also shows up in other tests and environments (with or without
> --enable-lock-free-event-queue).
> {noformat: title=Configuration}
> ./bootstrap '&&' ./configure --verbose '&&' make -j6 distcheck
> {noformat}
> {noformat:title=}
> *** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are using GNU date ***
> PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> *** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack trace: ***
> @ 0x2b9e29d26330 (unknown)
> @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> @ 0x2b9e25800a40 process::ProcessManager::resume()
> @ 0x2b9e2580f891 process::ProcessManager::init_threads()::$_9::operator()()
> @ 0x2b9e2580f7d5 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
> @ 0x2b9e2580f77c std::thread::_Impl<>::_M_run()
> @ 0x2b9e29fe5a60 (unknown)
> @ 0x2b9e29d1e184 start_thread
> @ 0x2b9e2a851ffd (unknown)
> make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
> {noformat}
> A [email protected] query shows many such instances:
> https://lists.apache.org/[email protected]:lte=1M:process%3A%3AEventQueue%3A%3AConsumer%3A%3Aempty