[jira] [Commented] (MESOS-3160) CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
[ https://issues.apache.org/jira/browse/MESOS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405417#comment-16405417 ]

Alexander Rukletsov commented on MESOS-3160:

Disabled this test for now.

> CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
>
> Key: MESOS-3160
> URL: https://issues.apache.org/jira/browse/MESOS-3160
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.24.0, 0.26.0
> Environment: Ubuntu 14.04
>              CentOS 7
> Reporter: Paul Brett
> Assignee: Greg Mann
> Priority: Major
> Labels: cgroups, flaky-test, mesosphere
>
> Test will occasionally fail with:
> [ RUN ] CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseUnlockedRSS
> ../../src/tests/containerizer/cgroups_tests.cpp:1103: Failure
> helper.increaseRSS(getpagesize()): Failed to sync with the subprocess
> ../../src/tests/containerizer/cgroups_tests.cpp:1103: Failure
> helper.increaseRSS(getpagesize()): The subprocess has not been spawned yet
> [ FAILED ] CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseUnlockedRSS (223 ms)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (MESOS-3160) CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
[ https://issues.apache.org/jira/browse/MESOS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341866#comment-16341866 ]

Greg Mann commented on MESOS-3160:

In the testing I've done today, the most common cause of this failure is the {{MemoryTestHelper}} receiving EOF from the subprocess's output FD, [at this line|https://github.com/apache/mesos/blob/15fc434e47e026790a6f6dc8e974a8440d0b1bdf/src/tests/containerizer/memory_test_helper.cpp#L156].

Another failure mode I observed occurred at [this line|https://github.com/apache/mesos/blob/15fc434e47e026790a6f6dc8e974a8440d0b1bdf/src/tests/containerizer/cgroups_tests.cpp#L1163], with {{critical == 1}}.
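As a generic illustration of the EOF failure mode described in the comment above, here is a minimal POSIX sketch. It is not the {{MemoryTestHelper}} implementation; the names SyncResult and syncWithChild are hypothetical and exist only for this example. It shows how a parent that waits for an acknowledgement byte on a pipe sees a zero-length read (EOF) when the child exits without writing anything.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>

// Hypothetical outcome type for this sketch; not a Mesos type.
enum class SyncResult { SYNCED, END_OF_FILE, READ_ERROR };

// Forks a child that either writes one acknowledgement byte or exits
// without writing, then reports what the parent sees on the read end.
SyncResult syncWithChild(bool childAcknowledges)
{
  int fds[2];
  if (::pipe(fds) != 0) {
    return SyncResult::READ_ERROR;
  }

  pid_t pid = ::fork();
  if (pid < 0) {
    ::close(fds[0]);
    ::close(fds[1]);
    return SyncResult::READ_ERROR;
  }

  if (pid == 0) {
    // Child: optionally acknowledge, then exit immediately.
    ::close(fds[0]);
    if (childAcknowledges) {
      char ack = 'k';
      ssize_t ignored = ::write(fds[1], &ack, 1);
      (void) ignored;
    }
    ::_exit(0);
  }

  // Parent: close our copy of the write end, so that once the child
  // exits there is no writer left and read() can report EOF.
  ::close(fds[1]);

  char byte = 0;
  ssize_t n;
  do {
    n = ::read(fds[0], &byte, 1);
  } while (n < 0 && errno == EINTR);

  ::close(fds[0]);
  ::waitpid(pid, nullptr, 0);

  if (n == 1) {
    return SyncResult::SYNCED;
  }
  if (n == 0) {
    return SyncResult::END_OF_FILE; // The child went away without replying.
  }
  return SyncResult::READ_ERROR;
}
```

Calling syncWithChild(false) models a subprocess that dies before replying. A caller that treats END_OF_FILE as a sync failure would then report an error of the kind quoted in this issue, although whether the real test fails for this reason is not established here.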
[jira] [Commented] (MESOS-3160) CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
[ https://issues.apache.org/jira/browse/MESOS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16278640#comment-16278640 ]

Alexander Rukletsov commented on MESOS-3160:

At the time of writing, the segfault has not been observed for some time (and is probably fixed by ). However, the test still fails frequently with the following error:
{noformat}
../../src/tests/containerizer/cgroups_tests.cpp:1132
helper.increaseRSS(os::pagesize()): Failed to sync with the subprocess
{noformat}
[jira] [Commented] (MESOS-3160) CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
[ https://issues.apache.org/jira/browse/MESOS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188114#comment-16188114 ]

Till Toenshoff commented on MESOS-3160:

Just saw it crashing on our internal CI (ubuntu 14.04):
{noformat}
00:39:21 [ RUN ] CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS
00:39:21 *** Aborted at 1506731961 (unix time) try "date -d @1506731961" if you are using GNU date ***
00:39:21 PC: @ 0x7fa16bc17b91 process::ProcessManager::resume()
00:39:21 *** SIGSEGV (@0x8) received by PID 31454 (TID 0x7fa15ea32700) from PID 8; stack trace: ***
00:39:21 @ 0x7fa1367483fd (unknown)
00:39:21 @ 0x7fa13674d419 (unknown)
00:39:21 @ 0x7fa136741918 (unknown)
00:39:21 @ 0x7fa169011330 (unknown)
00:39:21 @ 0x7fa16bc17b91 process::ProcessManager::resume()
00:39:21 @ 0x7fa16bc1d6e6 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv
00:39:21 @ 0x7fa1697eca60 (unknown)
00:39:21 @ 0x7fa169009184 start_thread
00:39:21 @ 0x7fa168d35ffd (unknown)
{noformat}
[jira] [Commented] (MESOS-3160) CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
[ https://issues.apache.org/jira/browse/MESOS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15585762#comment-15585762 ]

Benjamin Bannier commented on MESOS-3160:

[~tillt]: This test is "disabled" by an {{ASSERT}} on systems with swap enabled:
{code}
// TODO(vinod): Instead of asserting here dynamically disable
// the test if swap is enabled on the host.
ASSERT_EQ(memory.get().totalSwap, Bytes(0))
{code}
For the time being, you should either disable swap on your host or filter out that test yourself.
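The TODO quoted above suggests checking for swap at runtime instead of asserting. Below is a minimal sketch of such a check, assuming a Linux host where /proc/meminfo contains a line of the form "SwapTotal: <n> kB". The helpers swapTotalKb and swapDisabled are hypothetical names for this example and are not part of Mesos or stout.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: parses /proc/meminfo-style text and returns the
// SwapTotal value in kB, or -1 if no SwapTotal line is present.
std::int64_t swapTotalKb(std::istream& in)
{
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string key;
    std::int64_t value = 0;
    // Each line looks like "<Key>: <number> kB"; the trailing unit is
    // not needed here and is left unread.
    if ((fields >> key >> value) && key == "SwapTotal:") {
      return value;
    }
  }
  return -1;
}

// Hypothetical helper: true only when swap is known to be disabled.
// A missing SwapTotal line is treated as "unknown", not as "disabled".
bool swapDisabled(std::istream& in)
{
  return swapTotalKb(in) == 0;
}
```

A test fixture could open /proc/meminfo with std::ifstream, pass it to swapDisabled, and skip the test body when the result is false. How the skip itself is expressed depends on the test framework version in use, so that part is left out of the sketch.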
[jira] [Commented] (MESOS-3160) CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
[ https://issues.apache.org/jira/browse/MESOS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15585618#comment-15585618 ]

Till Toenshoff commented on MESOS-3160:

Just saw it failing on CentOS 6 in an SSL build as well.
[jira] [Commented] (MESOS-3160) CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS Flaky
[ https://issues.apache.org/jira/browse/MESOS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125235#comment-15125235 ]

Greg Mann commented on MESOS-3160:

I just noticed this failure on a CentOS 7.1 VM, with libevent and SSL enabled:
{code}
[ RUN ] CgroupsAnyHierarchyMemoryPressureTest.ROOT_IncreaseRSS
../../src/tests/containerizer/cgroups_tests.cpp:1127: Failure
helper.increaseRSS(getpagesize()): Failed to sync with the subprocess
*** Aborted at 1454219819 (unix time) try "date -d @1454219819" if you are using GNU date ***
PC: @ 0x1623fac testing::UnitTest::AddTestPartResult()
*** SIGSEGV (@0x0) received by PID 22745 (TID 0x7efd13acd8c0) from PID 0; stack trace: ***
@ 0x7efd0db2a130 (unknown)
@ 0x1623fac testing::UnitTest::AddTestPartResult()
@ 0x1618b23 testing::internal::AssertHelper::operator=()
@ 0x1588780 mesos::internal::tests::CgroupsAnyHierarchyMemoryPressureTest_ROOT_IncreaseRSS_Test::TestBody()
@ 0x16418a6 testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x163c7fc testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x161dde1 testing::Test::Run()
@ 0x161e564 testing::TestInfo::Run()
@ 0x161ebaa testing::TestCase::Run()
@ 0x1625484 testing::internal::UnitTestImpl::RunAllTests()
@ 0x16424cb testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x163d372 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x16241ca testing::UnitTest::Run()
@ 0xdea5ae RUN_ALL_TESTS()
@ 0xdea1c4 main
@ 0x7efd0cb50af5 __libc_start_main
@ 0x9930a9 (unknown)
{code}
The test fails most (but not all) of the time, in about 4 out of 5 repetitions. It's worth noting that when I set {{GLOG_v=2}}, the failure disappears.