[ https://issues.apache.org/jira/browse/MESOS-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137967#comment-15137967 ]
Joseph Wu commented on MESOS-4604:
----------------------------------

I'm convinced this is a [problem in docker|https://github.com/docker/docker/issues/12738]. The tests fail when [{{docker stop}}|https://github.com/apache/mesos/blob/7aafb8e44d347a03cbef83d3f7ee4705b9d23c09/src/slave/containerizer/docker.cpp#L1547] hangs indefinitely.

Note: docker [no longer supports Ubuntu 15.04|https://github.com/docker/docker/pull/18809].

To get rid of the {{__cxa_pure_virtual}} abort, we'll need to do some refactoring (see [MESOS-2017|https://issues.apache.org/jira/browse/MESOS-2017]). At the moment, most of our containerizer tests must call {{Shutdown()}} before they exit the scope of the test. Otherwise, {{MesosTest}} will call {{Shutdown()}} and dereference stack-allocated containerizers that have already gone out of scope.

I propose:
* Change {{MesosTest::StartSlave}} to take a {{Shared<Containerizer>}}. Change all tests to dynamically allocate containerizers.
* Remove all manual {{Shutdown()}} calls if they occur at the end of the test.

> ROOT_DOCKER_DockerHealthyTask is flaky.
> ---------------------------------------
>
> Key: MESOS-4604
> URL: https://issues.apache.org/jira/browse/MESOS-4604
> Project: Mesos
> Issue Type: Bug
> Components: tests
> Environment: CentOS 6/7, Ubuntu 15.04 on AWS.
> Reporter: Jan Schlicht
> Assignee: Joseph Wu
> Labels: flaky-test, mesosphere, test
>
> Log from Teamcity that is running {{sudo ./bin/mesos-tests.sh}} on AWS EC2 instances:
> {noformat}
> [18:27:14][Step 8/8] [----------] 8 tests from HealthCheckTest
> [18:27:14][Step 8/8] [ RUN ] HealthCheckTest.HealthyTask
> [18:27:17][Step 8/8] [ OK ] HealthCheckTest.HealthyTask (2222 ms)
> [18:27:17][Step 8/8] [ RUN ] HealthCheckTest.ROOT_DOCKER_DockerHealthyTask
> [18:27:36][Step 8/8] ../../src/tests/health_check_tests.cpp:388: Failure
> [18:27:36][Step 8/8] Failed to wait 15secs for termination
> [18:27:36][Step 8/8] F0204 18:27:35.981302 23085 logging.cpp:64] RAW: Pure virtual method called
> [18:27:36][Step 8/8] @ 0x7f7077055e1c google::LogMessage::Fail()
> [18:27:36][Step 8/8] @ 0x7f707705ba6f google::RawLog__()
> [18:27:36][Step 8/8] @ 0x7f70760f76c9 __cxa_pure_virtual
> [18:27:36][Step 8/8] @ 0xa9423c mesos::internal::tests::Cluster::Slaves::shutdown()
> [18:27:36][Step 8/8] @ 0x1074e45 mesos::internal::tests::MesosTest::ShutdownSlaves()
> [18:27:36][Step 8/8] @ 0x1074de4 mesos::internal::tests::MesosTest::Shutdown()
> [18:27:36][Step 8/8] @ 0x1070ec7 mesos::internal::tests::MesosTest::TearDown()
> [18:27:36][Step 8/8] @ 0x16eb7b2 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> [18:27:36][Step 8/8] @ 0x16e61a9 testing::internal::HandleExceptionsInMethodIfSupported<>()
> [18:27:36][Step 8/8] @ 0x16c56aa testing::Test::Run()
> [18:27:36][Step 8/8] @ 0x16c5e89 testing::TestInfo::Run()
> [18:27:36][Step 8/8] @ 0x16c650a testing::TestCase::Run()
> [18:27:36][Step 8/8] @ 0x16cd1f6 testing::internal::UnitTestImpl::RunAllTests()
> [18:27:36][Step 8/8] @ 0x16ec513 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> [18:27:36][Step 8/8] @ 0x16e6df1 testing::internal::HandleExceptionsInMethodIfSupported<>()
> [18:27:36][Step 8/8] @ 0x16cbe26 testing::UnitTest::Run()
> [18:27:36][Step 8/8] @ 0xe54c84 RUN_ALL_TESTS()
> [18:27:36][Step 8/8] @ 0xe54867 main
> [18:27:36][Step 8/8] @ 0x7f7071560a40 (unknown)
> [18:27:36][Step 8/8] @ 0x9b52d9 _start
> [18:27:36][Step 8/8] Aborted (core dumped)
> [18:27:36][Step 8/8] Process exited with code 134
> {noformat}
> Happens with Ubuntu 15.04, CentOS 6, CentOS 7 _quite_ often.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
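
The ownership change proposed in the comment can be illustrated with a rough sketch. This is not Mesos code: {{FakeCluster}}, {{FakeDockerContainerizer}}, {{startSlave}} and {{runTestBody}} are hypothetical stand-ins, and {{std::shared_ptr}} is used here as an assumed substitute for {{Shared<Containerizer>}}. The point is only that when the fixture shares ownership of a heap-allocated containerizer, a later {{shutdown()}} from teardown no longer touches an object whose lifetime ended with the test body.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a containerizer interface (not the Mesos class).
class Containerizer {
public:
  virtual ~Containerizer() = default;
  virtual std::string destroy() = 0;
};

// Hypothetical concrete containerizer used by a test.
class FakeDockerContainerizer : public Containerizer {
public:
  std::string destroy() override { return "destroyed"; }
};

// Hypothetical stand-in for the fixture that tracks running slaves.
class FakeCluster {
public:
  // Shared ownership: the containerizer stays alive until shutdown(),
  // even if the test body that created it has already returned.
  void startSlave(std::shared_ptr<Containerizer> containerizer) {
    assert(containerizer != nullptr);
    containerizers_.push_back(std::move(containerizer));
  }

  // Safe to call from teardown: every pointer still refers to a live object.
  std::vector<std::string> shutdown() {
    std::vector<std::string> results;
    for (const auto& containerizer : containerizers_) {
      results.push_back(containerizer->destroy());
    }
    containerizers_.clear();
    return results;
  }

  std::size_t size() const { return containerizers_.size(); }

private:
  std::vector<std::shared_ptr<Containerizer>> containerizers_;
};

// Mimics a test body that returns without calling shutdown() itself.
void runTestBody(FakeCluster& cluster) {
  auto containerizer = std::make_shared<FakeDockerContainerizer>();
  cluster.startSlave(containerizer);
}  // The local handle is released here; the cluster still holds a reference.
```

With this shape, a test body can simply return and leave cleanup to the fixture, which is what makes the manual end-of-test {{Shutdown()}} calls redundant in the sketch.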