-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68257/
-----------------------------------------------------------
(Updated Aug. 17, 2018, 11:44 a.m.)
Review request for mesos, Alexander Rukletsov, Gilbert Song, Jie Yu, and Kevin
Klues.
Bugs: MESOS-9116
https://issues.apache.org/jira/browse/MESOS-9116
Repository: mesos
Description (updated)
-------
Previously, we were walking the process tree from the container's
`init` process to find the first process along the way whose `mnt`
namespace differs from the `init` process. We expected this algorithm
to always return the PID of the command executor's task.
However, if someone launches multiple nested containers within the
process tree, the aforementioned algorithm might detect the PID of
one of those nested container instead of the command executor's task.
Even though the `mnt` namespace will be the same across all these
candidates, the detected PID might belong to a short-lived container,
which might terminate before the containerizer launcher (aka `nanny`
process) tries to enter its `mnt` namespace.
This patch fixes the detection algorithm so that it always returns
the PID of the command executor's task.
Diffs
-----
src/slave/containerizer/mesos/utils.cpp
30e76d1d91651975033078f5450e45f5f2fd8ba0
Diff: https://reviews.apache.org/r/68257/diff/2/
Testing
-------
1) Internal CI with disabled
`ROOT_CGROUPS_LaunchNestedContainerSessionsInParallel` test (see previous
patch).
2) Fedora 25: `./src/mesos-tests
--gtest_filter=*AgentAPITest.LaunchNestedContainerSessionInParallel*
--gtest_break_on_failure --gtest_repeat=100 --verbose`
Thanks,
Andrei Budnik