[
https://issues.apache.org/jira/browse/MESOS-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089240#comment-15089240
]
Jan Schlicht commented on MESOS-4035:
-------------------------------------
I assume that this was in a virtual machine and that something like {{sudo
./bin/mesos-tests.sh}} had been run prior to this? I tried reproducing this and
am pretty sure that I've seen the exact same error before, but could only find
something that is quite similar and probably has the same cause:
Some virtual machines (e.g. VirtualBox) don't provide _CPU performance
counters_ for their guests. This affects some root tests of Mesos that try to
use {{perf}} to sample the {{cycles}} event. One of these tests is
{{PerfEventIsolatorTest.ROOT_CGROUPS_Sample}}. Running {{sudo
./bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_Sample"}} in such an
environment fails and leaves a child process running, which blocks some
cgroups from being removed. This affects every test process run afterwards
that tries to clean up cgroups before it starts.
{{UserCgroupIsolatorTest.ROOT_CGROUPS_UserCgroup}} is one of those. Restarting
the VM clears this state.
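One way to check whether a guest is affected is to ask {{perf}} for the
{{cycles}} event directly. This is only a minimal sketch, assuming that
{{perf stat}} marks unavailable events with {{<not supported>}} in its output.
The helper name {{cycles_supported}} is made up for illustration and is not
part of Mesos:

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos): succeeds unless the given
# `perf stat` output flags an event as "<not supported>".
# $1: combined stdout/stderr of a `perf stat -e cycles <command>` run
cycles_supported() {
  case "$1" in
    *"<not supported>"*) return 1 ;;  # counter unavailable in this guest
    *) return 0 ;;                    # no "<not supported>" marker found
  esac
}
```

It could be used as {{cycles_supported "$(sudo perf stat -e cycles true 2>&1)"
|| echo "no cycles counter"}}. Note that it only looks for the marker, so other
{{perf}} failures (e.g. missing permissions) would need a separate check.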
So, in a fresh VM, running {{sudo ./bin/mesos-tests.sh
--gtest_filter="*ROOT_CGROUPS_UserCgroup"}} should pass, but doing this after
running {{sudo ./bin/mesos-tests.sh --gtest_filter="*ROOT_CGROUPS_Sample"}}
should fail.
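To see what the first run leaves behind, the test hierarchy path from the
error message can be inspected between the two runs. The following is a rough
sketch, not something from the Mesos test suite: {{hierarchy_state}} is a
hypothetical helper, and it assumes a cgroup v1 mount shows up with filesystem
type {{cgroup}} in {{/proc/mounts}}:

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos): classify a cgroup hierarchy path.
# $1: hierarchy path, e.g. /tmp/mesos_test_cgroup/perf_event
# $2: mounts table to read (defaults to /proc/mounts)
# Prints one of: mounted, stale-directory, absent
hierarchy_state() {
  path=$1
  mounts=${2:-/proc/mounts}
  # Field 2 of a mounts line is the mount point, field 3 the filesystem type.
  if awk -v p="$path" \
      '$2 == p && $3 == "cgroup" { found = 1 } END { exit !found }' \
      "$mounts"; then
    echo mounted
  elif [ -e "$path" ]; then
    # Path exists but nothing of type cgroup is mounted on it.
    echo stale-directory
  else
    echo absent
  fi
}
```

Calling {{hierarchy_state /tmp/mesos_test_cgroup/perf_event}} in a fresh VM
and again after the {{*ROOT_CGROUPS_Sample}} run should show whether the state
changed. Which of the three states the failed run actually leaves is not
something I have verified.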
> UserCgroupIsolatorTest.ROOT_CGROUPS_UserCgroup fails on CentOS 6.6
> ------------------------------------------------------------------
>
> Key: MESOS-4035
> URL: https://issues.apache.org/jira/browse/MESOS-4035
> Project: Mesos
> Issue Type: Bug
> Environment: CentOS6.6
> Reporter: Gilbert Song
> Assignee: Jan Schlicht
>
> `ROOT_CGROUPS_UserCgroup` fails on CentOS6.6 with 0.26rc3. The environment
> setup on CentOS6.6 is based on the latest update of /docs/getting-started.md.
> Using either devtoolset-2 or devtoolset-3 produces the same failure.
> Running `sudo ./bin/mesos-tests.sh
> --gtest_filter="*ROOT_CGROUPS_UserCgroup*"` fails with the following
> log:
> {noformat}
> [==========] Running 3 tests from 3 test cases.
> [----------] Global test environment set-up.
> [----------] 1 test from UserCgroupIsolatorTest/0, where TypeParam =
> mesos::internal::slave::CgroupsMemIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
> ../../src/tests/mesos.cpp:722: Failure
> cgroups::mount(hierarchy, subsystem): '/tmp/mesos_test_cgroup/perf_event'
> already exists in the file system
> -------------------------------------------------------------
> We cannot run any cgroups tests that require
> a hierarchy with subsystem 'perf_event'
> because we failed to find an existing hierarchy
> or create a new one (tried '/tmp/mesos_test_cgroup/perf_event').
> You can either remove all existing
> hierarchies, or disable this test case
> (i.e., --gtest_filter=-UserCgroupIsolatorTest/0.*).
> -------------------------------------------------------------
> ../../src/tests/mesos.cpp:776: Failure
> cgroups: '/tmp/mesos_test_cgroup/perf_event' is not a valid hierarchy
> [ FAILED ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup, where
> TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess (1 ms)
> [----------] 1 test from UserCgroupIsolatorTest/0 (1 ms total)
> [----------] 1 test from UserCgroupIsolatorTest/1, where TypeParam =
> mesos::internal::slave::CgroupsCpushareIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
> ../../src/tests/mesos.cpp:722: Failure
> cgroups::mount(hierarchy, subsystem): '/tmp/mesos_test_cgroup/perf_event'
> already exists in the file system
> -------------------------------------------------------------
> We cannot run any cgroups tests that require
> a hierarchy with subsystem 'perf_event'
> because we failed to find an existing hierarchy
> or create a new one (tried '/tmp/mesos_test_cgroup/perf_event').
> You can either remove all existing
> hierarchies, or disable this test case
> (i.e., --gtest_filter=-UserCgroupIsolatorTest/1.*).
> -------------------------------------------------------------
> ../../src/tests/mesos.cpp:776: Failure
> cgroups: '/tmp/mesos_test_cgroup/perf_event' is not a valid hierarchy
> [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where
> TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess (4 ms)
> [----------] 1 test from UserCgroupIsolatorTest/1 (5 ms total)
> [----------] 1 test from UserCgroupIsolatorTest/2, where TypeParam =
> mesos::internal::slave::CgroupsPerfEventIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
> ../../src/tests/mesos.cpp:722: Failure
> cgroups::mount(hierarchy, subsystem): '/tmp/mesos_test_cgroup/perf_event'
> already exists in the file system
> -------------------------------------------------------------
> We cannot run any cgroups tests that require
> a hierarchy with subsystem 'perf_event'
> because we failed to find an existing hierarchy
> or create a new one (tried '/tmp/mesos_test_cgroup/perf_event').
> You can either remove all existing
> hierarchies, or disable this test case
> (i.e., --gtest_filter=-UserCgroupIsolatorTest/2.*).
> -------------------------------------------------------------
> ../../src/tests/mesos.cpp:776: Failure
> cgroups: '/tmp/mesos_test_cgroup/perf_event' is not a valid hierarchy
> [ FAILED ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where
> TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess (2 ms)
> [----------] 1 test from UserCgroupIsolatorTest/2 (2 ms total)
> [----------] Global test environment tear-down
> [==========] 3 tests from 3 test cases ran. (349 ms total)
> [ PASSED ] 0 tests.
> [ FAILED ] 3 tests, listed below:
> [ FAILED ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup, where
> TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess
> [ FAILED ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where
> TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess
> [ FAILED ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where
> TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess
> 3 FAILED TESTS
> {noformat}
> Running it with `sudo ./bin/mesos-tests.sh
> --gtest_filter="*ROOT_CGROUPS_UserCgroup*" --gtest_repeat=-1
> --gtest_break_on_failure` results in a segmentation fault (at iteration 1),
> as shown in the following log:
> {noformat}
> [==========] Running 3 tests from 3 test cases.
> [----------] Global test environment set-up.
> [----------] 1 test from UserCgroupIsolatorTest/0, where TypeParam =
> mesos::internal::slave::CgroupsMemIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
> ../../src/tests/mesos.cpp:722: Failure
> cgroups::mount(hierarchy, subsystem): '/tmp/mesos_test_cgroup/perf_event'
> already exists in the file system
> -------------------------------------------------------------
> We cannot run any cgroups tests that require
> a hierarchy with subsystem 'perf_event'
> because we failed to find an existing hierarchy
> or create a new one (tried '/tmp/mesos_test_cgroup/perf_event').
> You can either remove all existing
> hierarchies, or disable this test case
> (i.e., --gtest_filter=-UserCgroupIsolatorTest/0.*).
> -------------------------------------------------------------
> *** Aborted at 1449018895 (unix time) try "date -d @1449018895" if you are
> using GNU date ***
> PC: @ 0x152039f testing::UnitTest::AddTestPartResult()
> *** SIGSEGV (@0x0) received by PID 2930 (TID 0x7f9a07d92840) from PID 0;
> stack trace: ***
> @ 0x7f9a0194d790 (unknown)
> @ 0x152039f testing::UnitTest::AddTestPartResult()
> @ 0x151ff0e testing::internal::AssertHelper::operator=()
> @ 0xed245f mesos::internal::tests::ContainerizerTest<>::SetUp()
> @ 0x155a9a3
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0x1547f51
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0x15281d3 testing::Test::Run()
> @ 0x1528e9b testing::TestInfo::Run()
> @ 0x15295e7 testing::TestCase::Run()
> @ 0x1530d42 testing::internal::UnitTestImpl::RunAllTests()
> @ 0x1558163
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0x1549fc1
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0x15309fb testing::UnitTest::Run()
> @ 0xc808d1 RUN_ALL_TESTS()
> @ 0xc7f306 main
> @ 0x7f9a007b5d5d __libc_start_main
> @ 0x783e79 (unknown)
> {noformat}
> Both failure cases occur 100% of the time, not just occasionally.