[
https://issues.apache.org/jira/browse/MESOS-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhitao Li reassigned MESOS-7381:
--------------------------------
Assignee: Zhitao Li
> Flaky tests in NestedMesosContainerizerTest
> -------------------------------------------
>
> Key: MESOS-7381
> URL: https://issues.apache.org/jira/browse/MESOS-7381
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 1.1.1, 1.2.0
> Reporter: Zhitao Li
> Assignee: Zhitao Li
> Priority: Blocker
>
> There are about 12-13 tests in that class which are very flaky in my
> environment (Debian/jessie running 4.4.38 kernel).
> It seems like the primary cause is that root container "failed" and caused
> `reap` to be called on itself, which cascalade and cause both containers to
> be actively killed by containerizer instead of terminate by themselves.
> This happens on master branch at commit
> 6c1e20c0f2777d9bb831be3ff43c885b253af7bb
> Some log:
> {panel}
> [ RUN ] NestedMesosContainerizerTest.ROOT_CGROUPS_LaunchNested
> I0412 16:06:29.698456 16110 containerizer.cpp:221] Using isolation:
> cgroups/cpu,filesystem/linux,namespaces/pid,network/cni,volume/image
> I0412 16:06:29.703445 16110 linux_launcher.cpp:150] Using
> /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> I0412 16:06:29.704658 16110 provisioner.cpp:249] Using default backend
> 'overlay'
> I0412 16:06:29.714778 16125 containerizer.cpp:608] Recovering containerizer
> I0412 16:06:29.718374 16125 provisioner.cpp:410] Provisioner recovery complete
> I0412 16:06:29.719195 16127 containerizer.cpp:1001] Starting container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d for executor 'executor' of framework
> I0412 16:06:29.721225 16128 cgroups.cpp:410] Creating cgroup at
> '/sys/fs/cgroup/cpu,cpuacct/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d'
> for container 7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:29.723470 16128 cpu.cpp:101] Updated 'cpu.shares' to 1024 (cpus
> 1) for container 7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:29.727627 16126 containerizer.cpp:1499] Launching
> 'mesos-containerizer' with flags '--help="false"
> --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"sleep
>
> 1000"},"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/uber\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount
> -n -t proc proc \/proc -o
> nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb"}"
> --pipe_read="7" --pipe_write="8"
> --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_CfIMuj/containers/7cd8794a-4c4f-43e2-8824-2459dc57753d"
> --unshare_namespace_mnt="false"'
> I0412 16:06:29.728123 16131 linux_launcher.cpp:429] Launching container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d and cloning with namespaces CLONE_NEWNS
> | CLONE_NEWPID
> I0412 16:06:29.750948 16126 containerizer.cpp:1598] Checkpointing container's
> forked pid 16164 to
> '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_SX8Ner/meta/slaves/frameworks/executors/executor/runs/7cd8794a-4c4f-43e2-8824-2459dc57753d/pids/forked.pid'
> I0412 16:06:29.755005 16130 fetcher.cpp:353] Starting to fetch URIs for
> container: 7cd8794a-4c4f-43e2-8824-2459dc57753d, directory:
> /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb
> I0412 16:06:29.758663 16124 containerizer.cpp:1766] Starting nested container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.762702 16126 containerizer.cpp:1499] Launching
> 'mesos-containerizer' with flags '--help="false"
> --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"exit
>
> 42"},"enter_namespaces":[536870912],"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb\/containers\/fc7f3523-e348-43f3-b809-3be02e35315c"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/uber\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount
> -n -t proc proc \/proc -o
> nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb\/containers\/fc7f3523-e348-43f3-b809-3be02e35315c"}"
> --pipe_read="7" --pipe_write="8"
> --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_CfIMuj/containers/7cd8794a-4c4f-43e2-8824-2459dc57753d/containers/fc7f3523-e348-43f3-b809-3be02e35315c"
> --unshare_namespace_mnt="false"'
> I0412 16:06:29.763293 16124 linux_launcher.cpp:429] Launching nested
> container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c and
> cloning with namespaces CLONE_NEWNS | CLONE_NEWPID
> I0412 16:06:29.771055 16131 fetcher.cpp:353] Starting to fetch URIs for
> container:
> 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c,
> directory:
> /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_AtLcOb/containers/fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.888044 16130 containerizer.cpp:2483] Container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d has exited
> I0412 16:06:29.888092 16130 containerizer.cpp:2077] Destroying container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d in RUNNING state
> I0412 16:06:29.888116 16130 containerizer.cpp:2077] Destroying container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c in
> RUNNING state
> I0412 16:06:29.888913 16130 containerizer.cpp:2483] Container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c has
> exited
> I0412 16:06:29.889135 16129 linux_launcher.cpp:505] Asked to destroy
> container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.889752 16129 linux_launcher.cpp:548] Using freezer to destroy
> cgroup
> mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.891119 16124 cgroups.cpp:2692] Freezing cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.892491 16130 cgroups.cpp:1405] Successfully froze cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c
> after 1.32096ms
> I0412 16:06:29.894062 16127 cgroups.cpp:2710] Thawing cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.895290 16127 cgroups.cpp:1434] Successfully thawed cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos/fc7f3523-e348-43f3-b809-3be02e35315c
> after 1.184768ms
> I0412 16:06:29.900456 16130 provisioner.cpp:484] Ignoring destroy request for
> unknown container
> 7cd8794a-4c4f-43e2-8824-2459dc57753d.fc7f3523-e348-43f3-b809-3be02e35315c
> I0412 16:06:29.900617 16128 containerizer.cpp:2356] Checkpointing termination
> state to nested container's runtime directory
> '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_LaunchNested_CfIMuj/containers/7cd8794a-4c4f-43e2-8824-2459dc57753d/containers/fc7f3523-e348-43f3-b809-3be02e35315c/termination'
> ../../src/tests/containerizer/nested_mesos_containerizer_tests.cpp:230:
> Failure
> Expecting WIFEXITED(wait.get()->status()) but
> WIFSIGNALED(wait.get()->status()) is true and WTERMSIG(wait.get()->status())
> is Killed
> I0412 16:06:29.901587 16125 linux_launcher.cpp:505] Asked to destroy
> container 7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:29.902186 16125 linux_launcher.cpp:548] Using freezer to destroy
> cgroup
> mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:29.903249 16125 cgroups.cpp:2692] Freezing cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos
> I0412 16:06:29.903316 16124 cgroups.cpp:2692] Freezing cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:30.006909 16130 cgroups.cpp:1405] Successfully froze cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d
> after 103.539968ms
> I0412 16:06:30.007313 16124 cgroups.cpp:1405] Successfully froze cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos
> after 104.015872ms
> I0412 16:06:30.008755 16128 cgroups.cpp:2710] Thawing cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos
> I0412 16:06:30.011144 16125 cgroups.cpp:2710] Thawing cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:30.012442 16125 cgroups.cpp:1434] Successfully thawed cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d
> after 1.254912ms
> I0412 16:06:30.111548 16131 cgroups.cpp:1434] Successfully thawed cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8/7cd8794a-4c4f-43e2-8824-2459dc57753d/mesos
> after 102.735872ms
> I0412 16:06:30.118083 16126 provisioner.cpp:484] Ignoring destroy request for
> unknown container 7cd8794a-4c4f-43e2-8824-2459dc57753d
> I0412 16:06:30.142882 16130 cgroups.cpp:2692] Freezing cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8
> I0412 16:06:30.144379 16126 cgroups.cpp:1405] Successfully froze cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8 after
> 1.382144ms
> I0412 16:06:30.145800 16131 cgroups.cpp:2710] Thawing cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8
> I0412 16:06:30.147094 16131 cgroups.cpp:1434] Successfully thawed cgroup
> /sys/fs/cgroup/freezer/mesos_test_723562d9-904b-4108-aaab-973184996ab8 after
> 1.203968ms
> [ FAILED ] NestedMesosContainerizerTest.ROOT_CGROUPS_LaunchNested (500 ms)
> {panel}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)