[mesos] branch master updated: Disabled 3 windows failure tests DockerFetcherPluginTest.INTERNET_CURL_.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new af7a876  Disabled 3 windows failure tests DockerFetcherPluginTest.INTERNET_CURL_.
af7a876 is described below

commit af7a87608179fc07284d3827f6722d77faf70a4e
Author: Gilbert Song
AuthorDate: Tue Apr 30 20:10:17 2019 -0700

    Disabled 3 windows failure tests DockerFetcherPluginTest.INTERNET_CURL_.
---
 src/tests/uri_fetcher_tests.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/tests/uri_fetcher_tests.cpp b/src/tests/uri_fetcher_tests.cpp
index 55e75be..c727cc5 100644
--- a/src/tests/uri_fetcher_tests.cpp
+++ b/src/tests/uri_fetcher_tests.cpp
@@ -309,7 +309,7 @@ static constexpr char TEST_DIGEST[] = "sha256:a3ed95caeb02ffe68cdd9fd844066"
 class DockerFetcherPluginTest : public TemporaryDirectoryTest {};

-TEST_F(DockerFetcherPluginTest, INTERNET_CURL_FetchManifest)
+TEST_F(DockerFetcherPluginTest, DISABLED_INTERNET_CURL_FetchManifest)
 {
   URI uri = uri::docker::manifest(
       TEST_REPOSITORY, "latest", DOCKER_REGISTRY_HOST);
@@ -352,7 +352,7 @@ TEST_F(DockerFetcherPluginTest, INTERNET_CURL_FetchBlob)

 // Fetches the image manifest and all blobs in that image.
-TEST_F(DockerFetcherPluginTest, INTERNET_CURL_FetchImage)
+TEST_F(DockerFetcherPluginTest, DISABLED_INTERNET_CURL_FetchImage)
 {
   URI uri = uri::docker::image(
       TEST_REPOSITORY, "latest", DOCKER_REGISTRY_HOST);
@@ -388,7 +388,7 @@ TEST_F(DockerFetcherPluginTest, INTERNET_CURL_FetchImage)

 // This test verifies invoking 'fetch' by plugin name.
-TEST_F(DockerFetcherPluginTest, INTERNET_CURL_InvokeFetchByName)
+TEST_F(DockerFetcherPluginTest, DISABLED_INTERNET_CURL_InvokeFetchByName)
 {
   URI uri = uri::docker::image(
       TEST_REPOSITORY, "latest", DOCKER_REGISTRY_HOST);
[mesos] branch master updated: Added a test to verify the sort correctness of the random sorter.
mzhu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new 89c3dd9  Added a test to verify the sort correctness of the random sorter.
89c3dd9 is described below

commit 89c3dd95a421e14044bc91ceb1998ff4ae3883b4
Author: Meng Zhu
AuthorDate: Sun Apr 7 15:55:42 2019 -0700

    Added a test to verify the sort correctness of the random sorter.

    Review: https://reviews.apache.org/r/70418
---
 src/tests/sorter_tests.cpp | 53 ++
 1 file changed, 53 insertions(+)

diff --git a/src/tests/sorter_tests.cpp b/src/tests/sorter_tests.cpp
index c9a0bda..9aee2b4 100644
--- a/src/tests/sorter_tests.cpp
+++ b/src/tests/sorter_tests.cpp
@@ -170,6 +170,59 @@ TEST(RandomSorterTest, HierarchicalProbabilityDistribution)
 }


+TEST(RandomSorterTest, ProbabilityDistribution)
+{
+  // Test the behavior of the random sorter by ensuring that the
+  // probability distribution after a number of runs is within
+  // a particular error bound.
+
+  RandomSorter sorter;
+
+  vector<string> clients = {"0", "1", "2", "3", "4"};
+  vector<double> weights = {1.0, 2.0, 3.0, 4.0, 5.0};
+
+  for (size_t i = 0; i < 5; ++i) {
+    sorter.add(clients.at(i));
+    sorter.activate(clients.at(i));
+    sorter.updateWeight(clients.at(i), weights.at(i));
+  }
+
+  // Count of how many times client i returned as the jth client
+  // in the sort result.
+  size_t totalRuns = 1000u;
+  size_t counts[5][5] = {};
+
+  for (size_t run = 0; run < totalRuns; ++run) {
+    vector<string> candidates = sorter.sort();
+    for (size_t i = 0; i < candidates.size(); ++i) {
+      ++counts[std::stoi(candidates.at(i))][i];
+    }
+  }
+
+  // This table was generated by running a weighted shuffle algorithm
+  // for a large number of iterations.
+  double expectedProbabilities[5][5] = {
+    {0.07, 0.08, 0.12, 0.20, 0.54},
+    {0.13, 0.16, 0.20, 0.28, 0.23},
+    {0.20, 0.22, 0.24, 0.22, 0.12},
+    {0.27, 0.26, 0.23, 0.17, 0.07},
+    {0.33, 0.28, 0.21, 0.13, 0.04},
+  };
+
+  double actualProbabilities[5][5];
+
+  for (int i = 0; i < 5; ++i) {
+    for (int j = 0; j < 5; ++j) {
+      actualProbabilities[i][j] = counts[i][j] / (1.0 * totalRuns);
+
+      // Assert that the actual probabilities differ less than
+      // an absolute 5%.
+      ASSERT_NEAR(expectedProbabilities[i][j], actualProbabilities[i][j], 0.05);
+    }
+  }
+}
+
+
 template <typename T>
 class CommonSorterTest : public ::testing::Test {};
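The test above compares observed position frequencies against a table produced by a weighted shuffle. The following is a minimal, self-contained sketch of that idea. It is not the Mesos `weightedShuffle` implementation: `weightedOrder` and `positionFrequencies` are hypothetical helper names, and the sketch assumes one common scheme, namely drawing the next element with probability proportional to its weight among the elements not yet chosen. All weights are assumed to be strictly positive.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not the Mesos implementation): returns the indices
// 0..n-1 in a random order. At each step the next index is drawn with
// probability proportional to its weight among the indices not yet chosen.
// Assumes every weight is strictly positive.
std::vector<std::size_t> weightedOrder(
    std::vector<double> weights, std::mt19937& rng)
{
  std::vector<std::size_t> order;
  order.reserve(weights.size());

  for (std::size_t step = 0; step < weights.size(); ++step) {
    std::discrete_distribution<std::size_t> pick(
        weights.begin(), weights.end());

    const std::size_t chosen = pick(rng);
    order.push_back(chosen);

    // A zero weight can never be drawn again, which removes the
    // chosen index from all later steps.
    weights[chosen] = 0.0;
  }

  return order;
}

// Hypothetical helper: estimates frequencies[i][j], the fraction of runs
// in which client i ended up at position j, mirroring `counts[i][j]`
// divided by `totalRuns` in the test above.
std::vector<std::vector<double>> positionFrequencies(
    const std::vector<double>& weights, std::size_t runs, unsigned seed)
{
  std::mt19937 rng(seed);
  const std::size_t n = weights.size();

  std::vector<std::vector<double>> frequencies(
      n, std::vector<double>(n, 0.0));

  for (std::size_t run = 0; run < runs; ++run) {
    const std::vector<std::size_t> order = weightedOrder(weights, rng);
    for (std::size_t position = 0; position < n; ++position) {
      frequencies[order[position]][position] += 1.0;
    }
  }

  for (std::vector<double>& row : frequencies) {
    for (double& cell : row) {
      cell /= static_cast<double>(runs);
    }
  }

  return frequencies;
}
```

Under this scheme the probability that a client is sorted first is its weight divided by the total weight, so with weights 1 through 5 the heaviest client should come first in roughly 5/15 of the runs, which is in line with the 0.33 entry in the expected table.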
[mesos] branch master updated: Added debug logging when framework is missing during agent removal.
grag pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new fa39fe2  Added debug logging when framework is missing during agent removal.
fa39fe2 is described below

commit fa39fe2a932de6a6ddccf65e9322738e48c7b39e
Author: Greg Mann
AuthorDate: Tue Apr 30 10:27:15 2019 -0700

    Added debug logging when framework is missing during agent removal.

    This patch adds extra debug logging to `Master::__removeSlave()`
    in order to help triage MESOS-9609 if that issue is observed
    again in the future.

    Review: https://reviews.apache.org/r/70559/
---
 src/master/master.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index 7dcdc9a..9f0a976 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -11460,7 +11460,9 @@ void Master::__removeSlave(
   // the framework has opted in to the PARTITION_AWARE capability.
   foreachkey (const FrameworkID& frameworkId, utils::copy(slave->tasks)) {
     Framework* framework = getFramework(frameworkId);
-    CHECK_NOTNULL(framework);
+    CHECK(framework != nullptr)
+      << "Framework " << frameworkId << " not found while removing agent "
+      << *slave << "; agent tasks: " << slave->tasks;

     TaskState newTaskState = TASK_UNREACHABLE;
     TaskStatus::Reason newTaskReason = TaskStatus::REASON_SLAVE_REMOVED;
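The point of the patch above is that a failed lookup should report which framework and which agent were involved, rather than only the fact that a pointer was null. The sketch below illustrates that pattern with the standard library alone. It is a stand-in, not the glog `CHECK` macro used in the patch: `Framework` and `requireFramework` are hypothetical names, and the sketch throws an exception where `CHECK` would abort the process.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the master's framework record.
struct Framework
{
  std::string name;
};

// Hypothetical helper: looks up a framework and, if it is missing, fails
// with a message that names both the framework and the agent being
// removed. The real patch streams this context into a glog CHECK, which
// aborts; this sketch throws so that the message can be inspected.
const Framework& requireFramework(
    const std::map<std::string, Framework>& frameworks,
    const std::string& frameworkId,
    const std::string& agent)
{
  const auto it = frameworks.find(frameworkId);

  if (it == frameworks.end()) {
    std::ostringstream message;
    message << "Framework " << frameworkId
            << " not found while removing agent " << agent;
    throw std::logic_error(message.str());
  }

  return it->second;
}
```

Building the message only on the failure path keeps the common case cheap, while a failure still carries enough context to triage.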
[mesos] branch 1.8.x updated: Fixed a performance issue in the random sorter.
mzhu pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/1.8.x by this push:
     new 855d1e7  Fixed a performance issue in the random sorter.
855d1e7 is described below

commit 855d1e79a401828176fe36b2cc1182d6856817b0
Author: Meng Zhu
AuthorDate: Sun Apr 28 15:53:17 2019 -0700

    Fixed a performance issue in the random sorter.

    Review: https://reviews.apache.org/r/70564
---
 src/master/allocator/sorter/random/sorter.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/master/allocator/sorter/random/sorter.cpp b/src/master/allocator/sorter/random/sorter.cpp
index f4132bb..813f5b5 100644
--- a/src/master/allocator/sorter/random/sorter.cpp
+++ b/src/master/allocator/sorter/random/sorter.cpp
@@ -472,7 +472,7 @@ void RandomSorter::remove(const SlaveID& slaveId, const Resources& resources)
 vector<string> RandomSorter::sort()
 {
   pair<vector<string>, vector<double>> clientsAndWeights =
-    SortInfo(this).getClientsAndWeights();
+    sortInfo.getClientsAndWeights();

   weightedShuffle(
       clientsAndWeights.first.begin(),
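The one-line change above swaps a temporary `SortInfo(this)` built on every `sort()` call for a long-lived `sortInfo` member. The diff does not show how `SortInfo` works internally, so the sketch below is only an assumption about why reusing a member can help: a hypothetical `WeightCache` class that rebuilds its client and weight vectors only after a weight changes, and counts how often it rebuilds.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class, not the Mesos SortInfo: caches the flattened
// (clients, weights) pair and recomputes it only when marked dirty.
class WeightCache
{
public:
  using Snapshot = std::pair<std::vector<std::string>, std::vector<double>>;

  void setWeight(const std::string& client, double weight)
  {
    weights_[client] = weight;
    dirty_ = true; // Any update invalidates the cached snapshot.
  }

  const Snapshot& clientsAndWeights()
  {
    if (dirty_) {
      snapshot_.first.clear();
      snapshot_.second.clear();

      for (const auto& entry : weights_) {
        snapshot_.first.push_back(entry.first);
        snapshot_.second.push_back(entry.second);
      }

      dirty_ = false;
      ++rebuilds_;
    }

    return snapshot_;
  }

  // Number of times the snapshot was recomputed.
  std::size_t rebuilds() const { return rebuilds_; }

private:
  std::map<std::string, double> weights_;
  Snapshot snapshot_;
  bool dirty_ = true;
  std::size_t rebuilds_ = 0;
};
```

A fresh object created per call would start dirty every time and pay the rebuild cost on each sort, whereas a member instance pays it once per change.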
[mesos] branch master updated: Fixed a performance issue in the random sorter.
mzhu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new 2d38c3e  Fixed a performance issue in the random sorter.
2d38c3e is described below

commit 2d38c3ee8d1dcf2c7aacff3a3a18d017b2e0c907
Author: Meng Zhu
AuthorDate: Sun Apr 28 15:53:17 2019 -0700

    Fixed a performance issue in the random sorter.

    Review: https://reviews.apache.org/r/70564
---
 src/master/allocator/sorter/random/sorter.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/master/allocator/sorter/random/sorter.cpp b/src/master/allocator/sorter/random/sorter.cpp
index f4132bb..813f5b5 100644
--- a/src/master/allocator/sorter/random/sorter.cpp
+++ b/src/master/allocator/sorter/random/sorter.cpp
@@ -472,7 +472,7 @@ void RandomSorter::remove(const SlaveID& slaveId, const Resources& resources)
 vector<string> RandomSorter::sort()
 {
   pair<vector<string>, vector<double>> clientsAndWeights =
-    SortInfo(this).getClientsAndWeights();
+    sortInfo.getClientsAndWeights();

   weightedShuffle(
       clientsAndWeights.first.begin(),
[mesos] 03/04: Added MESOS-9695 to the 1.6.3 CHANGELOG.
abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 138bfe41c822823afe6d2a0532e3c95e4f7d3bfe
Author: Andrei Budnik
AuthorDate: Tue Apr 30 14:36:47 2019 +0200

    Added MESOS-9695 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index e7ec1a5..425df0e 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -886,6 +886,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
   * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9692] - Quota may be under allocated for disk resources.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error

 ** Improvement
[mesos] 04/04: Added MESOS-9695 to the 1.5.4 CHANGELOG.
abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 4eec48e17d08575c18458f713d2e8280faab99d6
Author: Andrei Budnik
AuthorDate: Tue Apr 30 14:45:01 2019 +0200

    Added MESOS-9695 to the 1.5.4 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 425df0e..1a4d782 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1350,6 +1350,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP)
 ** Bug
   * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error

 ** Improvement
[mesos] branch master updated (c8004ee -> 4eec48e)
abudnik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from c8004ee  Removed the duplicate pid check in Docker containerizer.
     new a8fe548  Added MESOS-9695 to the 1.8.1 CHANGELOG.
     new 592f7c4  Added MESOS-9695 to the 1.7.3 CHANGELOG.
     new 138bfe4  Added MESOS-9695 to the 1.6.3 CHANGELOG.
     new 4eec48e  Added MESOS-9695 to the 1.5.4 CHANGELOG.

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)
[mesos] 01/04: Added MESOS-9695 to the 1.8.1 CHANGELOG.
abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit a8fe548b091732c2b46f5e6a7d7392de92644f5a
Author: Andrei Budnik
AuthorDate: Tue Apr 30 14:08:58 2019 +0200

    Added MESOS-9695 to the 1.8.1 CHANGELOG.
---
 CHANGELOG | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG b/CHANGELOG
index 9c01040..89452a0 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,7 +4,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP)

 ** Bug
   * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
-
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer

 Release Notes - Mesos - Version 1.8.0
 -
[mesos] 02/04: Added MESOS-9695 to the 1.7.3 CHANGELOG.
abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 592f7c44a304f5c02804a0de98df07ec295ce070
Author: Andrei Budnik
AuthorDate: Tue Apr 30 14:30:32 2019 +0200

    Added MESOS-9695 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 89452a0..e7ec1a5 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -417,6 +417,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations.
   * [MESOS-9692] - Quota may be under allocated for disk resources.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error

 ** Improvements
[mesos] branch 1.5.x updated (f8e0e41 -> 791ac63)
abudnik pushed a change to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from f8e0e41  Added MESOS-9619 to the 1.5.4 CHANGELOG.
     new f4a6453  Removed the duplicate pid check in Docker containerizer.
     new 791ac63  Added MESOS-9695 to the 1.5.4 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG                          | 1 +
 src/slave/containerizer/docker.cpp | 27 ++-
 2 files changed, 7 insertions(+), 21 deletions(-)
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
abudnik pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f4a6453c77719ed531a9287b6c9cdeb7ad268865
Author: Qian Zhang
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

    Removed the duplicate pid check in Docker containerizer.

    Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 9dbb286..81f72d4 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -934,10 +934,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
       }
     }

-    // Collection of pids that we've started reaping in order to
-    // detect very unlikely duplicate scenario (see below).
-    hashmap<ContainerID, pid_t> pids;
-
     foreachvalue (const FrameworkState& framework, state->frameworks) {
       foreachvalue (const ExecutorState& executor, framework.executors) {
         if (executor.info.isNone()) {
@@ -1016,9 +1012,12 @@ Future<Nothing> DockerContainerizerProcess::_recover(

         // Only reap the executor process if the executor can be connected
         // otherwise just set `container->status` to `None()`. This is to
-        // avoid reaping an irrelevant process, e.g., after the agent host is
-        // rebooted, the executor pid happens to be reused by another process.
-        // See MESOS-8125 for details.
+        // avoid reaping an irrelevant process, e.g., agent process is stopped
+        // for a long time, and during this time executor terminates and its
+        // pid happens to be reused by another irrelevant process. When agent
+        // is restarted, it still considers this executor not complete (i.e.,
+        // `run->completed` is false), so we would reap the irrelevant process
+        // if we do not check whether that process can be connected.
         // Note that if both the pid and the port of the executor are reused
         // by another process or two processes respectively after the agent
         // host reboots we will still reap an irrelevant process, but that
@@ -1054,20 +1053,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
           container->status.future().get()
             .onAny(defer(self(), &Self::reaped, containerId));

-        if (pids.containsValue(pid)) {
-          // This should (almost) never occur. There is the
-          // possibility that a new executor is launched with the same
-          // pid as one that just exited (highly unlikely) and the
-          // slave dies after the new executor is launched but before
-          // it hears about the termination of the earlier executor
-          // (also unlikely).
-          return Failure(
-              "Detected duplicate pid " + stringify(pid) +
-              " for container " + stringify(containerId));
-        }
-
-        pids.put(containerId, pid);
-
         const string sandboxDirectory = paths::getExecutorRunPath(
             flags.work_dir,
             state->id,
[mesos] 02/02: Added MESOS-9695 to the 1.5.4 CHANGELOG.
abudnik pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 791ac63c72c1c4b868c2b3a126c04075575757c1
Author: Andrei Budnik
AuthorDate: Tue Apr 30 14:45:01 2019 +0200

    Added MESOS-9695 to the 1.5.4 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 6e175cf..fd85213 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -5,6 +5,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP)
 ** Bug
   * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error

 ** Improvement
[mesos] branch 1.6.x updated (45bfa2a -> 13fdaa4)
abudnik pushed a change to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 45bfa2a  Added MESOS-9536 to the 1.6.3 CHANGELOG.
     new d627a91  Removed the duplicate pid check in Docker containerizer.
     new 13fdaa4  Added MESOS-9695 to the 1.6.3 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG                          | 1 +
 src/slave/containerizer/docker.cpp | 27 ++-
 2 files changed, 7 insertions(+), 21 deletions(-)
[mesos] 02/02: Added MESOS-9695 to the 1.6.3 CHANGELOG.
abudnik pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 13fdaa4e7be41f5129d2b96944569abca2b60905
Author: Andrei Budnik
AuthorDate: Tue Apr 30 14:36:47 2019 +0200

    Added MESOS-9695 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 58fda2e..55b74d1 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -9,6 +9,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
   * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9692] - Quota may be under allocated for disk resources.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error

 ** Improvement
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
abudnik pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit d627a919c651e8dacd14569c027fe4422ef87828
Author: Qian Zhang
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

    Removed the duplicate pid check in Docker containerizer.

    Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index ef468ed..85b22f4 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -937,10 +937,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
       }
     }

-    // Collection of pids that we've started reaping in order to
-    // detect very unlikely duplicate scenario (see below).
-    hashmap<ContainerID, pid_t> pids;
-
     foreachvalue (const FrameworkState& framework, state->frameworks) {
       foreachvalue (const ExecutorState& executor, framework.executors) {
         if (executor.info.isNone()) {
@@ -1019,9 +1015,12 @@ Future<Nothing> DockerContainerizerProcess::_recover(

         // Only reap the executor process if the executor can be connected
         // otherwise just set `container->status` to `None()`. This is to
-        // avoid reaping an irrelevant process, e.g., after the agent host is
-        // rebooted, the executor pid happens to be reused by another process.
-        // See MESOS-8125 for details.
+        // avoid reaping an irrelevant process, e.g., agent process is stopped
+        // for a long time, and during this time executor terminates and its
+        // pid happens to be reused by another irrelevant process. When agent
+        // is restarted, it still considers this executor not complete (i.e.,
+        // `run->completed` is false), so we would reap the irrelevant process
+        // if we do not check whether that process can be connected.
         // Note that if both the pid and the port of the executor are reused
         // by another process or two processes respectively after the agent
         // host reboots we will still reap an irrelevant process, but that
@@ -1057,20 +1056,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
           container->status.future()
             ->onAny(defer(self(), &Self::reaped, containerId));

-        if (pids.containsValue(pid)) {
-          // This should (almost) never occur. There is the
-          // possibility that a new executor is launched with the same
-          // pid as one that just exited (highly unlikely) and the
-          // slave dies after the new executor is launched but before
-          // it hears about the termination of the earlier executor
-          // (also unlikely).
-          return Failure(
-              "Detected duplicate pid " + stringify(pid) +
-              " for container " + stringify(containerId));
-        }
-
-        pids.put(containerId, pid);
-
         const string sandboxDirectory = paths::getExecutorRunPath(
             flags.work_dir,
             state->id,
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
abudnik pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit bd438b797a15724945a60ee8d57cec656e23ae2d
Author: Qian Zhang
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

    Removed the duplicate pid check in Docker containerizer.

    Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 192dc29..dacf4de 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -945,10 +945,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
       }
     }

-    // Collection of pids that we've started reaping in order to
-    // detect very unlikely duplicate scenario (see below).
-    hashmap<ContainerID, pid_t> pids;
-
     foreachvalue (const FrameworkState& framework, state->frameworks) {
       foreachvalue (const ExecutorState& executor, framework.executors) {
         if (executor.info.isNone()) {
@@ -1027,9 +1023,12 @@ Future<Nothing> DockerContainerizerProcess::_recover(

         // Only reap the executor process if the executor can be connected
         // otherwise just set `container->status` to `None()`. This is to
-        // avoid reaping an irrelevant process, e.g., after the agent host is
-        // rebooted, the executor pid happens to be reused by another process.
-        // See MESOS-8125 for details.
+        // avoid reaping an irrelevant process, e.g., agent process is stopped
+        // for a long time, and during this time executor terminates and its
+        // pid happens to be reused by another irrelevant process. When agent
+        // is restarted, it still considers this executor not complete (i.e.,
+        // `run->completed` is false), so we would reap the irrelevant process
+        // if we do not check whether that process can be connected.
         // Note that if both the pid and the port of the executor are reused
         // by another process or two processes respectively after the agent
         // host reboots we will still reap an irrelevant process, but that
@@ -1065,20 +1064,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
           container->status.future()
             ->onAny(defer(self(), &Self::reaped, containerId));

-        if (pids.containsValue(pid)) {
-          // This should (almost) never occur. There is the
-          // possibility that a new executor is launched with the same
-          // pid as one that just exited (highly unlikely) and the
-          // slave dies after the new executor is launched but before
-          // it hears about the termination of the earlier executor
-          // (also unlikely).
-          return Failure(
-              "Detected duplicate pid " + stringify(pid) +
-              " for container " + stringify(containerId));
-        }
-
-        pids.put(containerId, pid);
-
         const string sandboxDirectory = paths::getExecutorRunPath(
             flags.work_dir,
             state->id,
[mesos] 02/02: Added MESOS-9695 to the 1.7.3 CHANGELOG.
abudnik pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 80c9fd79448b1ea6f6367da8def911b75ababafd
Author: Andrei Budnik
AuthorDate: Tue Apr 30 14:30:32 2019 +0200

    Added MESOS-9695 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index ae40637..369a2c8 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -15,6 +15,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations.
   * [MESOS-9692] - Quota may be under allocated for disk resources.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error

 ** Improvements
[mesos] branch 1.7.x updated (f0cbafd -> 80c9fd7)
abudnik pushed a change to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from f0cbafd  Added MESOS-9536 to the 1.7.3 CHANGELOG.
     new bd438b7  Removed the duplicate pid check in Docker containerizer.
     new 80c9fd7  Added MESOS-9695 to the 1.7.3 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG                          | 1 +
 src/slave/containerizer/docker.cpp | 27 ++-
 2 files changed, 7 insertions(+), 21 deletions(-)
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
abudnik pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 02f532ae876196c0c8abad9d6effb75d3ffa5db7
Author: Qian Zhang
AuthorDate: Tue Apr 30 13:59:54 2019 +0200

    Removed the duplicate pid check in Docker containerizer.

    Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 7f1d471..e4ad945 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -936,10 +936,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
       }
     }

-    // Collection of pids that we've started reaping in order to
-    // detect very unlikely duplicate scenario (see below).
-    hashmap<ContainerID, pid_t> pids;
-
     foreachvalue (const FrameworkState& framework, state->frameworks) {
       foreachvalue (const ExecutorState& executor, framework.executors) {
         if (executor.info.isNone()) {
@@ -1018,9 +1014,12 @@ Future<Nothing> DockerContainerizerProcess::_recover(

         // Only reap the executor process if the executor can be connected
         // otherwise just set `container->status` to `None()`. This is to
-        // avoid reaping an irrelevant process, e.g., after the agent host is
-        // rebooted, the executor pid happens to be reused by another process.
-        // See MESOS-8125 for details.
+        // avoid reaping an irrelevant process, e.g., agent process is stopped
+        // for a long time, and during this time executor terminates and its
+        // pid happens to be reused by another irrelevant process. When agent
+        // is restarted, it still considers this executor not complete (i.e.,
+        // `run->completed` is false), so we would reap the irrelevant process
+        // if we do not check whether that process can be connected.
         // Note that if both the pid and the port of the executor are reused
         // by another process or two processes respectively after the agent
         // host reboots we will still reap an irrelevant process, but that
@@ -1056,20 +1055,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
           container->status.future()
             ->onAny(defer(self(), &Self::reaped, containerId));

-        if (pids.contains_value(pid)) {
-          // This should (almost) never occur. There is the
-          // possibility that a new executor is launched with the same
-          // pid as one that just exited (highly unlikely) and the
-          // slave dies after the new executor is launched but before
-          // it hears about the termination of the earlier executor
-          // (also unlikely).
-          return Failure(
-              "Detected duplicate pid " + stringify(pid) +
-              " for container " + stringify(containerId));
-        }
-
-        pids.put(containerId, pid);
-
         const string sandboxDirectory = paths::getExecutorRunPath(
             flags.work_dir,
             state->id,
[mesos] 02/02: Added MESOS-9695 to the 1.8.1 CHANGELOG.
abudnik pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit b032d4ec81c76277b274150f8027766f0e3d2275
Author: Andrei Budnik
AuthorDate: Tue Apr 30 14:08:58 2019 +0200

    Added MESOS-9695 to the 1.8.1 CHANGELOG.
---
 CHANGELOG | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG b/CHANGELOG
index d19085d..c99523c 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,7 +4,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP)

 ** Bug
   * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
-
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer

 Release Notes - Mesos - Version 1.8.0
 -
[mesos] branch 1.8.x updated (6160315 -> b032d4e)
abudnik pushed a change to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 6160315  Added MESOS-9536 to the 1.8.1 CHANGELOG.
     new 02f532a  Removed the duplicate pid check in Docker containerizer.
     new b032d4e  Added MESOS-9695 to the 1.8.1 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG                          | 2 +-
 src/slave/containerizer/docker.cpp | 27 ++-
 2 files changed, 7 insertions(+), 22 deletions(-)
[mesos] branch master updated: Removed the duplicate pid check in Docker containerizer.
abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

The following commit(s) were added to refs/heads/master by this push:
     new c8004ee  Removed the duplicate pid check in Docker containerizer.
c8004ee is described below

commit c8004ee8a0962d0e0f9147718853160bb708f5bc
Author: Qian Zhang
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

    Removed the duplicate pid check in Docker containerizer.

    Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 7f1d471..e4ad945 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -936,10 +936,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
       }
     }

-    // Collection of pids that we've started reaping in order to
-    // detect very unlikely duplicate scenario (see below).
-    hashmap<ContainerID, pid_t> pids;
-
     foreachvalue (const FrameworkState& framework, state->frameworks) {
       foreachvalue (const ExecutorState& executor, framework.executors) {
         if (executor.info.isNone()) {
@@ -1018,9 +1014,12 @@ Future<Nothing> DockerContainerizerProcess::_recover(

         // Only reap the executor process if the executor can be connected
         // otherwise just set `container->status` to `None()`. This is to
-        // avoid reaping an irrelevant process, e.g., after the agent host is
-        // rebooted, the executor pid happens to be reused by another process.
-        // See MESOS-8125 for details.
+        // avoid reaping an irrelevant process, e.g., agent process is stopped
+        // for a long time, and during this time executor terminates and its
+        // pid happens to be reused by another irrelevant process. When agent
+        // is restarted, it still considers this executor not complete (i.e.,
+        // `run->completed` is false), so we would reap the irrelevant process
+        // if we do not check whether that process can be connected.
         // Note that if both the pid and the port of the executor are reused
         // by another process or two processes respectively after the agent
         // host reboots we will still reap an irrelevant process, but that
@@ -1056,20 +1055,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
           container->status.future()
             ->onAny(defer(self(), &Self::reaped, containerId));

-        if (pids.contains_value(pid)) {
-          // This should (almost) never occur. There is the
-          // possibility that a new executor is launched with the same
-          // pid as one that just exited (highly unlikely) and the
-          // slave dies after the new executor is launched but before
-          // it hears about the termination of the earlier executor
-          // (also unlikely).
-          return Failure(
-              "Detected duplicate pid " + stringify(pid) +
-              " for container " + stringify(containerId));
-        }
-
-        pids.put(containerId, pid);
-
         const string sandboxDirectory = paths::getExecutorRunPath(
             flags.work_dir,
             state->id,
[mesos] 01/03: Added MESOS-9536 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 63391fe89d7581cc2a42a8f5095630a5aa4bd502
Author: Qian Zhang
AuthorDate: Tue Apr 30 09:46:12 2019 +0800

    Added MESOS-9536 to the 1.8.1 CHANGELOG.
---
 CHANGELOG | 8
 1 file changed, 8 insertions(+)

diff --git a/CHANGELOG b/CHANGELOG
index 799da78..83f7fca 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,11 @@
+Release Notes - Mesos - Version 1.8.1 (WIP)
+---
+* This is a bug fix release.
+
+** Bug
+ * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
+
+
 Release Notes - Mesos - Version 1.8.0
 -
 This release contains the following highlights:
[mesos] branch master updated (4fa4f77 -> 977af9b)
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from 4fa4f77  Documented LIBPROCESS_SSL_ENABLE_TLS_V1_3.
     new 63391fe  Added MESOS-9536 to the 1.8.1 CHANGELOG.
     new 9c20d4e  Added MESOS-9536 to the 1.7.3 CHANGELOG.
     new 977af9b  Added MESOS-9536 to the 1.6.3 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG | 10 ++
 1 file changed, 10 insertions(+)
[mesos] 02/03: Added MESOS-9536 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 9c20d4ec869a4fb3eb90fda975afc10c1bdb49c3
Author: Qian Zhang
AuthorDate: Tue Apr 30 09:47:24 2019 +0800

    Added MESOS-9536 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 83f7fca..0870089 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -408,6 +408,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
  * [MESOS-8467] - Destroyed executors might be used after `Slave::publishResource()`.
  * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
  * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
+ * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
  * [MESOS-9549] - nvidia/cuda 10 does not work on GPU isolator.
  * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
  * [MESOS-9568] - SLRP does not clean up mount directories for destroyed MOUNT disks.
[mesos] 03/03: Added MESOS-9536 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 977af9b87f582d6301083c730046d5be32c5fea6
Author: Qian Zhang
AuthorDate: Tue Apr 30 09:48:33 2019 +0800

    Added MESOS-9536 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 0870089..9c01040 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -881,6 +881,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
 ** Bug
  * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
  * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
+ * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
  * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
  * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
  * [MESOS-9692] - Quota may be under allocated for disk resources.
[mesos] 02/02: Added MESOS-9536 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f0cbafdc82bfa9b93df01053c9b1bd5855dc0dba
Author: Qian Zhang
AuthorDate: Tue Apr 30 09:47:24 2019 +0800

    Added MESOS-9536 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 1fbeb1b..ae40637 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -6,6 +6,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
  * [MESOS-8467] - Destroyed executors might be used after `Slave::publishResource()`.
  * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
  * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
+ * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
  * [MESOS-9549] - nvidia/cuda 10 does not work on GPU isolator.
  * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
  * [MESOS-9568] - SLRP does not clean up mount directories for destroyed MOUNT disks.
[mesos] 01/02: Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit beaae8df702e51102069b2b0502e924697ae36a2
Author: Qian Zhang
AuthorDate: Fri Apr 19 17:22:45 2019 +0800

    Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.

    Previously in MESOS-8332 we narrowed task sandbox permissions from 0755
    to 0750 which will cause nested container may not has permission to
    access its sandbox via the environment variable `MESOS_SANDBOX`. Now in
    this patch, for nested container which does not have its own rootfs, we
    bind mount its sandbox to the directory specified via the agent flag
    `--sandbox_directory` and set `MESOS_SANDBOX` to `--sandbox_directory`
    as well, in this way such nested container will have the permission to
    access its sandbox via `MESOS_SANDBOX`.

    Review: https://reviews.apache.org/r/70514
---
 src/slave/containerizer/mesos/containerizer.cpp    | 24 +++--
 .../mesos/isolators/filesystem/linux.cpp           | 25 ++
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/src/slave/containerizer/mesos/containerizer.cpp b/src/slave/containerizer/mesos/containerizer.cpp
index e8a4ab3..1867f3b 100644
--- a/src/slave/containerizer/mesos/containerizer.cpp
+++ b/src/slave/containerizer/mesos/containerizer.cpp
@@ -1779,15 +1779,25 @@ Future MesosContainerizerProcess::_launch(
 if (container->containerClass() == ContainerClass::DEFAULT) {
 // TODO(jieyu): Consider moving this to filesystem isolator.
 //
-// NOTE: For the command executor case, although it uses the host
-// filesystem for itself, we still set 'MESOS_SANDBOX' according to
-// the root filesystem of the task (if specified). Command executor
-// itself does not use this environment variable.
+// NOTE: For the command executor case, although it uses the host filesystem
+// for itself, we still set `MESOS_SANDBOX` according to the root filesystem
+// of the task (if specified). Command executor itself does not use this
+// environment variable. For nested container which does not have its own
+// rootfs, if the `filesystem/linux` isolator is enabled, we will also set
+// `MESOS_SANDBOX` to `flags.sandbox_directory` since in `prepare` method
+// of the `filesystem/linux` isolator we bind mount such nested container's
+// sandbox to `flags.sandbox_directory`. Since such bind mount is only done
+// by the `filesystem/linux` isolator, if another filesystem isolator (e.g.,
+// `filesystem/posix`) is enabled instead, nested container may still have
+// no permission to access its sandbox via `MESOS_SANDBOX`.
 Environment::Variable* variable = containerEnvironment.add_variables();
 variable->set_name("MESOS_SANDBOX");
-variable->set_value(container->config->has_rootfs()
- ? flags.sandbox_directory
- : container->config->directory());
+variable->set_value(
+(container->config->has_rootfs() ||
+ (strings::contains(flags.isolation, "filesystem/linux") &&
+ containerId.has_parent()))
+ ? flags.sandbox_directory
+ : container->config->directory());
 }

 // `launchInfo.environment` contains the environment returned by
diff --git a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
index a47899c..93a88a0 100644
--- a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
+++ b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
@@ -202,6 +202,16 @@ Try LinuxFilesystemIsolatorProcess::create(const Flags& flags)
 }
 }

+ // Create sandbox directory. We will bind mount the sandbox of nested
+ // container which does not have its own rootfs to this directory. See
+ // `prepare` for details.
+ Try mkdir = os::mkdir(flags.sandbox_directory);
+ if (mkdir.isError()) {
+return Error(
+"Failed to create sandbox directory at '" +
+flags.sandbox_directory + "': " + mkdir.error());
+ }
+
 Owned process(
 new LinuxFilesystemIsolatorProcess(flags));

@@ -395,6 +405,21 @@ Future> LinuxFilesystemIsolatorProcess::prepare(
 mount->set_source(containerConfig.directory());
 mount->set_target(sandbox);
 mount->set_flags(MS_BIND | MS_REC);
+ } else if (containerId.has_parent()) {
+// For nested container which does not have its own rootfs, bind mount its
+// sandbox to the directory specified via `flags.sandbox_directory` (e.g.,
+// `/mnt/mesos/sandbox`) in its own mount namespace and set the environment
+// variable `MESOS_SANDBOX` to `flags.sandbox_directory` (see the `_launch`
+// method of `MesosContainerizerProcess` for details). The reason that we do
+// this is, in MESOS-8332 we narrowed task sandbox permissions from 0755 to
+// 0750, since nested
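The `MESOS_SANDBOX` selection rule that this patch introduces in `_launch` can be read in isolation as a small pure function. The sketch below is only an illustration under stated assumptions: `SandboxInputs` and `mesosSandbox` are hypothetical names that do not exist in Mesos, the struct fields stand in for the values the patch reads (`container->config->has_rootfs()`, `containerId.has_parent()`, `flags.isolation`, `flags.sandbox_directory`, `container->config->directory()`), and `std::string::find` stands in for `strings::contains`.

```cpp
#include <string>

// Hypothetical stand-in for the values the patch consults in `_launch`.
// None of these names are part of Mesos.
struct SandboxInputs {
  bool hasRootfs;                // container->config->has_rootfs()
  bool isNested;                 // containerId.has_parent()
  std::string isolation;         // flags.isolation
  std::string sandboxDirectory;  // flags.sandbox_directory
  std::string hostDirectory;     // container->config->directory()
};

// Returns the value that `MESOS_SANDBOX` would be set to under the rule
// described in the commit message: use the in-container sandbox path when
// the container has its own rootfs, or when it is a nested container and
// the `filesystem/linux` isolator is enabled (that isolator performs the
// bind mount); otherwise fall back to the host sandbox directory.
std::string mesosSandbox(const SandboxInputs& in) {
  const bool linuxFilesystemIsolator =
      in.isolation.find("filesystem/linux") != std::string::npos;

  const bool sandboxIsBindMounted =
      in.hasRootfs || (linuxFilesystemIsolator && in.isNested);

  return sandboxIsBindMounted ? in.sandboxDirectory : in.hostDirectory;
}
```

With `filesystem/posix` in place of `filesystem/linux`, a nested container without a rootfs still gets the host directory, which matches the caveat in the patch's comment that such a container may remain unable to access its sandbox through `MESOS_SANDBOX`.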
[mesos] 02/02: Added MESOS-9536 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 45bfa2aa42c119da6f83b865d4929ee6064c2697
Author: Qian Zhang
AuthorDate: Tue Apr 30 09:48:33 2019 +0800

    Added MESOS-9536 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index a46b93f..58fda2e 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -5,6 +5,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
 ** Bug
  * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
  * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
+ * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
  * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
  * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
  * [MESOS-9692] - Quota may be under allocated for disk resources.
[mesos] 01/02: Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit e5149a4a00625845995e38eaf96c35ef6817be37
Author: Qian Zhang
AuthorDate: Fri Apr 19 17:22:45 2019 +0800

    Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.

    Previously in MESOS-8332 we narrowed task sandbox permissions from 0755
    to 0750 which will cause nested container may not has permission to
    access its sandbox via the environment variable `MESOS_SANDBOX`. Now in
    this patch, for nested container which does not have its own rootfs, we
    bind mount its sandbox to the directory specified via the agent flag
    `--sandbox_directory` and set `MESOS_SANDBOX` to `--sandbox_directory`
    as well, in this way such nested container will have the permission to
    access its sandbox via `MESOS_SANDBOX`.

    Review: https://reviews.apache.org/r/70514
---
 src/slave/containerizer/mesos/containerizer.cpp    | 24 +++--
 .../mesos/isolators/filesystem/linux.cpp           | 25 ++
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/src/slave/containerizer/mesos/containerizer.cpp b/src/slave/containerizer/mesos/containerizer.cpp
index 6e635d8..a34978a 100644
--- a/src/slave/containerizer/mesos/containerizer.cpp
+++ b/src/slave/containerizer/mesos/containerizer.cpp
@@ -1747,15 +1747,25 @@ Future MesosContainerizerProcess::_launch(
 if (container->containerClass() == ContainerClass::DEFAULT) {
 // TODO(jieyu): Consider moving this to filesystem isolator.
 //
-// NOTE: For the command executor case, although it uses the host
-// filesystem for itself, we still set 'MESOS_SANDBOX' according to
-// the root filesystem of the task (if specified). Command executor
-// itself does not use this environment variable.
+// NOTE: For the command executor case, although it uses the host filesystem
+// for itself, we still set `MESOS_SANDBOX` according to the root filesystem
+// of the task (if specified). Command executor itself does not use this
+// environment variable. For nested container which does not have its own
+// rootfs, if the `filesystem/linux` isolator is enabled, we will also set
+// `MESOS_SANDBOX` to `flags.sandbox_directory` since in `prepare` method
+// of the `filesystem/linux` isolator we bind mount such nested container's
+// sandbox to `flags.sandbox_directory`. Since such bind mount is only done
+// by the `filesystem/linux` isolator, if another filesystem isolator (e.g.,
+// `filesystem/posix`) is enabled instead, nested container may still have
+// no permission to access its sandbox via `MESOS_SANDBOX`.
 Environment::Variable* variable = containerEnvironment.add_variables();
 variable->set_name("MESOS_SANDBOX");
-variable->set_value(container->config->has_rootfs()
- ? flags.sandbox_directory
- : container->config->directory());
+variable->set_value(
+(container->config->has_rootfs() ||
+ (strings::contains(flags.isolation, "filesystem/linux") &&
+ containerId.has_parent()))
+ ? flags.sandbox_directory
+ : container->config->directory());
 }

 // `launchInfo.environment` contains the environment returned by
diff --git a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
index 2844327..b3d1d4e 100644
--- a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
+++ b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
@@ -203,6 +203,16 @@ Try LinuxFilesystemIsolatorProcess::create(const Flags& flags)
 }
 }

+ // Create sandbox directory. We will bind mount the sandbox of nested
+ // container which does not have its own rootfs to this directory. See
+ // `prepare` for details.
+ Try mkdir = os::mkdir(flags.sandbox_directory);
+ if (mkdir.isError()) {
+return Error(
+"Failed to create sandbox directory at '" +
+flags.sandbox_directory + "': " + mkdir.error());
+ }
+
 Owned process(
 new LinuxFilesystemIsolatorProcess(flags));

@@ -396,6 +406,21 @@ Future> LinuxFilesystemIsolatorProcess::prepare(
 mount->set_source(containerConfig.directory());
 mount->set_target(sandbox);
 mount->set_flags(MS_BIND | MS_REC);
+ } else if (containerId.has_parent()) {
+// For nested container which does not have its own rootfs, bind mount its
+// sandbox to the directory specified via `flags.sandbox_directory` (e.g.,
+// `/mnt/mesos/sandbox`) in its own mount namespace and set the environment
+// variable `MESOS_SANDBOX` to `flags.sandbox_directory` (see the `_launch`
+// method of `MesosContainerizerProcess` for details). The reason that we do
+// this is, in MESOS-8332 we narrowed task sandbox permissions from 0755 to
+// 0750, since nested
[mesos] branch 1.6.x updated (ebf1478 -> 45bfa2a)
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a change to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.

    from ebf1478  Added MESOS-9619 to the 1.6.3 CHANGELOG.
     new e5149a4  Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.
     new 45bfa2a  Added MESOS-9536 to the 1.6.3 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 CHANGELOG                                          |  1 +
 src/slave/containerizer/mesos/containerizer.cpp    | 24 +++--
 .../mesos/isolators/filesystem/linux.cpp           | 25 ++
 3 files changed, 43 insertions(+), 7 deletions(-)