[mesos] branch master updated: Disabled 3 windows failure tests DockerFetcherPluginTest.INTERNET_CURL_.

2019-04-30 Thread gilbert
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/master by this push:
 new af7a876  Disabled 3 windows failure tests DockerFetcherPluginTest.INTERNET_CURL_.
af7a876 is described below

commit af7a87608179fc07284d3827f6722d77faf70a4e
Author: Gilbert Song 
AuthorDate: Tue Apr 30 20:10:17 2019 -0700

Disabled 3 windows failure tests DockerFetcherPluginTest.INTERNET_CURL_.
---
 src/tests/uri_fetcher_tests.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/tests/uri_fetcher_tests.cpp b/src/tests/uri_fetcher_tests.cpp
index 55e75be..c727cc5 100644
--- a/src/tests/uri_fetcher_tests.cpp
+++ b/src/tests/uri_fetcher_tests.cpp
@@ -309,7 +309,7 @@ static constexpr char TEST_DIGEST[] = "sha256:a3ed95caeb02ffe68cdd9fd844066"
 class DockerFetcherPluginTest : public TemporaryDirectoryTest {};
 
 
-TEST_F(DockerFetcherPluginTest, INTERNET_CURL_FetchManifest)
+TEST_F(DockerFetcherPluginTest, DISABLED_INTERNET_CURL_FetchManifest)
 {
   URI uri = uri::docker::manifest(
   TEST_REPOSITORY, "latest", DOCKER_REGISTRY_HOST);
@@ -352,7 +352,7 @@ TEST_F(DockerFetcherPluginTest, INTERNET_CURL_FetchBlob)
 
 
 // Fetches the image manifest and all blobs in that image.
-TEST_F(DockerFetcherPluginTest, INTERNET_CURL_FetchImage)
+TEST_F(DockerFetcherPluginTest, DISABLED_INTERNET_CURL_FetchImage)
 {
   URI uri = uri::docker::image(
   TEST_REPOSITORY, "latest", DOCKER_REGISTRY_HOST);
@@ -388,7 +388,7 @@ TEST_F(DockerFetcherPluginTest, INTERNET_CURL_FetchImage)
 
 
 // This test verifies invoking 'fetch' by plugin name.
-TEST_F(DockerFetcherPluginTest, INTERNET_CURL_InvokeFetchByName)
+TEST_F(DockerFetcherPluginTest, DISABLED_INTERNET_CURL_InvokeFetchByName)
 {
   URI uri = uri::docker::image(
   TEST_REPOSITORY, "latest", DOCKER_REGISTRY_HOST);
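
The rename works because GoogleTest compiles, but by default does not run, any test whose name begins with `DISABLED_`; such tests can still be run with `--gtest_also_run_disabled_tests`. The sketch below only mimics that naming rule in standard C++. `isDisabled` and `runnableTests` are hypothetical helpers written for illustration, not GoogleTest APIs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: true when a test name carries the `DISABLED_`
// prefix that GoogleTest uses to skip a test by default.
bool isDisabled(const std::string& testName)
{
  const std::string prefix = "DISABLED_";
  return testName.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: returns the names that would run by default.
std::vector<std::string> runnableTests(const std::vector<std::string>& names)
{
  std::vector<std::string> result;
  for (const std::string& name : names) {
    if (!isDisabled(name)) {
      result.push_back(name);
    }
  }
  return result;
}
```

Applied to the four `DockerFetcherPluginTest` names visible in this diff, only `INTERNET_CURL_FetchBlob` would remain runnable by default.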



[mesos] branch master updated: Added a test to verify the sort correctness of the random sorter.

2019-04-30 Thread mzhu

mzhu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/master by this push:
 new 89c3dd9  Added a test to verify the sort correctness of the random sorter.
89c3dd9 is described below

commit 89c3dd95a421e14044bc91ceb1998ff4ae3883b4
Author: Meng Zhu 
AuthorDate: Sun Apr 7 15:55:42 2019 -0700

Added a test to verify the sort correctness of the random sorter.

Review: https://reviews.apache.org/r/70418
---
 src/tests/sorter_tests.cpp | 53 ++
 1 file changed, 53 insertions(+)

diff --git a/src/tests/sorter_tests.cpp b/src/tests/sorter_tests.cpp
index c9a0bda..9aee2b4 100644
--- a/src/tests/sorter_tests.cpp
+++ b/src/tests/sorter_tests.cpp
@@ -170,6 +170,59 @@ TEST(RandomSorterTest, HierarchicalProbabilityDistribution)
 }
 
 
+TEST(RandomSorterTest, ProbabilityDistribution)
+{
+  // Test the behavior of the random sorter by ensuring that the
+  // probability distribution after a number of runs is within
+  // a particular error bound.
+
+  RandomSorter sorter;
+
+  vector<string> clients = {"0", "1", "2", "3", "4"};
+  vector<double> weights = {1.0, 2.0, 3.0, 4.0, 5.0};
+
+  for (size_t i = 0; i < 5; ++i) {
+    sorter.add(clients.at(i));
+    sorter.activate(clients.at(i));
+    sorter.updateWeight(clients.at(i), weights.at(i));
+  }
+
+  // Count of how many times client i returned as the jth client
+  // in the sort result.
+  size_t totalRuns = 1000u;
+  size_t counts[5][5] = {};
+
+  for (size_t run = 0; run < totalRuns; ++run) {
+    vector<string> candidates = sorter.sort();
+    for (size_t i = 0; i < candidates.size(); ++i) {
+      ++counts[std::stoi(candidates.at(i))][i];
+    }
+  }
+
+  // This table was generated by running a weighted shuffle algorithm
+  // for a large number of iterations.
+  double expectedProbabilities[5][5] = {
+    {0.07, 0.08, 0.12, 0.20, 0.54},
+    {0.13, 0.16, 0.20, 0.28, 0.23},
+    {0.20, 0.22, 0.24, 0.22, 0.12},
+    {0.27, 0.26, 0.23, 0.17, 0.07},
+    {0.33, 0.28, 0.21, 0.13, 0.04},
+  };
+
+  double actualProbabilities[5][5];
+
+  for (int i = 0; i < 5; ++i) {
+    for (int j = 0; j < 5; ++j) {
+      actualProbabilities[i][j] = counts[i][j] / (1.0 * totalRuns);
+
+      // Assert that the actual probabilities differ less than
+      // an absolute 5%.
+      ASSERT_NEAR(expectedProbabilities[i][j], actualProbabilities[i][j], 0.05);
+    }
+  }
+}
+
+
 template <typename T>
 class CommonSorterTest : public ::testing::Test {};
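
The expected table in this test can be cross-checked analytically. The sketch below assumes that the weighted shuffle is distributed like successive sampling without replacement, where each next client is drawn with probability proportional to its weight among the clients still remaining. That is an assumption about `weightedShuffle`, not something this diff shows. `positionProbabilities` and `accumulate` are hypothetical helpers; entry `[i][j]` is the probability that client `i` ends up at position `j`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Walks every ordering and accumulates, for each client `i`, the
// probability that it lands at `position`.
void accumulate(
    const std::vector<double>& weights,
    std::vector<bool>& used,
    std::size_t position,
    double remainingWeight,
    double prefixProbability,
    Matrix& result)
{
  for (std::size_t i = 0; i < weights.size(); ++i) {
    if (used[i]) {
      continue;
    }

    // Probability of all orderings that share the current prefix and
    // place client `i` next.
    const double p = prefixProbability * weights[i] / remainingWeight;
    result[i][position] += p;

    used[i] = true;
    accumulate(
        weights, used, position + 1, remainingWeight - weights[i], p, result);
    used[i] = false;
  }
}

// Exact position probabilities under the successive-sampling assumption.
Matrix positionProbabilities(const std::vector<double>& weights)
{
  double total = 0.0;
  for (double weight : weights) {
    assert(weight > 0.0);
    total += weight;
  }

  Matrix result(weights.size(), std::vector<double>(weights.size(), 0.0));
  std::vector<bool> used(weights.size(), false);
  accumulate(weights, used, 0, total, 1.0, result);
  return result;
}
```

For weights 1 through 5 this gives 1/15 (about 0.07) for client "0" in first place and 5/15 (about 0.33) for client "4" in first place, in line with the first column of the table.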
 



[mesos] branch master updated: Added debug logging when framework is missing during agent removal.

2019-04-30 Thread grag

grag pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/master by this push:
 new fa39fe2  Added debug logging when framework is missing during agent removal.
fa39fe2 is described below

commit fa39fe2a932de6a6ddccf65e9322738e48c7b39e
Author: Greg Mann 
AuthorDate: Tue Apr 30 10:27:15 2019 -0700

Added debug logging when framework is missing during agent removal.

This patch adds extra debug logging to `Master::__removeSlave()`
in order to help triage MESOS-9609 if that issue is observed
again in the future.

Review: https://reviews.apache.org/r/70559/
---
 src/master/master.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/master/master.cpp b/src/master/master.cpp
index 7dcdc9a..9f0a976 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -11460,7 +11460,9 @@ void Master::__removeSlave(
   // the framework has opted in to the PARTITION_AWARE capability.
   foreachkey (const FrameworkID& frameworkId, utils::copy(slave->tasks)) {
 Framework* framework = getFramework(frameworkId);
-CHECK_NOTNULL(framework);
+CHECK(framework != nullptr)
+  << "Framework " << frameworkId << " not found while removing agent "
+  << *slave << "; agent tasks: " << slave->tasks;
 
 TaskState newTaskState = TASK_UNREACHABLE;
 TaskStatus::Reason newTaskReason = TaskStatus::REASON_SLAVE_REMOVED;
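
`CHECK_NOTNULL(framework)` cannot carry extra context, whereas `CHECK(framework != nullptr) << ...` lets the failure message include the framework ID and the agent's tasks. The sketch below illustrates the same idea, attaching context to a failed invariant, using only the standard library. It throws instead of aborting so that the message can be inspected, and `Framework`, `requireFramework`, and the string IDs are hypothetical stand-ins rather than Mesos or glog types.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a framework record.
struct Framework
{
  std::string id;
};

// Hypothetical lookup: fails with a message that names the missing
// framework, the agent being removed, and the agent's tasks.
const Framework& requireFramework(
    const std::map<std::string, Framework>& frameworks,
    const std::string& frameworkId,
    const std::string& agentId,
    const std::vector<std::string>& agentTaskIds)
{
  auto it = frameworks.find(frameworkId);
  if (it == frameworks.end()) {
    std::ostringstream message;
    message << "Framework " << frameworkId
            << " not found while removing agent " << agentId
            << "; agent tasks:";
    for (const std::string& taskId : agentTaskIds) {
      message << " " << taskId;
    }
    throw std::runtime_error(message.str());
  }
  return it->second;
}
```

The design point is that the diagnostic is assembled at the failure site, where the relevant identifiers are still in scope.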



[mesos] branch 1.8.x updated: Fixed a performance issue in the random sorter.

2019-04-30 Thread mzhu

mzhu pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/1.8.x by this push:
 new 855d1e7  Fixed a performance issue in the random sorter.
855d1e7 is described below

commit 855d1e79a401828176fe36b2cc1182d6856817b0
Author: Meng Zhu 
AuthorDate: Sun Apr 28 15:53:17 2019 -0700

Fixed a performance issue in the random sorter.

Review: https://reviews.apache.org/r/70564
---
 src/master/allocator/sorter/random/sorter.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/master/allocator/sorter/random/sorter.cpp b/src/master/allocator/sorter/random/sorter.cpp
index f4132bb..813f5b5 100644
--- a/src/master/allocator/sorter/random/sorter.cpp
+++ b/src/master/allocator/sorter/random/sorter.cpp
@@ -472,7 +472,7 @@ void RandomSorter::remove(const SlaveID& slaveId, const Resources& resources)
 vector<string> RandomSorter::sort()
 {
   pair<vector<string>, vector<double>> clientsAndWeights =
-    SortInfo(this).getClientsAndWeights();
+    sortInfo.getClientsAndWeights();
 
   weightedShuffle(
   clientsAndWeights.first.begin(),
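
The diff replaces a `SortInfo` temporary constructed on every `sort()` call with the sorter's existing `sortInfo` member. Judging only from that change, the saving presumably comes from not rebuilding the client and weight lists on each call. A minimal sketch of that caching pattern follows; `CachedWeights` and its members are hypothetical names, not Mesos code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

class CachedWeights
{
public:
  void setWeight(const std::string& client, double weight)
  {
    weights_[client] = weight;
    dirty_ = true; // Invalidate the cached lists.
  }

  // Returns the cached (clients, weights) pair, rebuilding it only
  // when a weight has changed since the last call.
  const std::pair<std::vector<std::string>, std::vector<double>>&
  getClientsAndWeights()
  {
    if (dirty_) {
      cache_.first.clear();
      cache_.second.clear();
      for (const auto& entry : weights_) {
        cache_.first.push_back(entry.first);
        cache_.second.push_back(entry.second);
      }
      dirty_ = false;
      ++rebuilds_;
    }
    return cache_;
  }

  // Number of times the lists were actually rebuilt.
  std::size_t rebuilds() const { return rebuilds_; }

private:
  std::map<std::string, double> weights_;
  std::pair<std::vector<std::string>, std::vector<double>> cache_;
  bool dirty_ = true;
  std::size_t rebuilds_ = 0;
};
```

Repeated calls between updates reuse the cached pair, so the rebuild cost is paid once per change rather than once per sort.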



[mesos] branch master updated: Fixed a performance issue in the random sorter.

2019-04-30 Thread mzhu

mzhu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/master by this push:
 new 2d38c3e  Fixed a performance issue in the random sorter.
2d38c3e is described below

commit 2d38c3ee8d1dcf2c7aacff3a3a18d017b2e0c907
Author: Meng Zhu 
AuthorDate: Sun Apr 28 15:53:17 2019 -0700

Fixed a performance issue in the random sorter.

Review: https://reviews.apache.org/r/70564
---
 src/master/allocator/sorter/random/sorter.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/master/allocator/sorter/random/sorter.cpp b/src/master/allocator/sorter/random/sorter.cpp
index f4132bb..813f5b5 100644
--- a/src/master/allocator/sorter/random/sorter.cpp
+++ b/src/master/allocator/sorter/random/sorter.cpp
@@ -472,7 +472,7 @@ void RandomSorter::remove(const SlaveID& slaveId, const Resources& resources)
 vector<string> RandomSorter::sort()
 {
   pair<vector<string>, vector<double>> clientsAndWeights =
-    SortInfo(this).getClientsAndWeights();
+    sortInfo.getClientsAndWeights();
 
   weightedShuffle(
   clientsAndWeights.first.begin(),



[mesos] 03/04: Added MESOS-9695 to the 1.6.3 CHANGELOG.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 138bfe41c822823afe6d2a0532e3c95e4f7d3bfe
Author: Andrei Budnik 
AuthorDate: Tue Apr 30 14:36:47 2019 +0200

Added MESOS-9695 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index e7ec1a5..425df0e 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -886,6 +886,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
   * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9692] - Quota may be under allocated for disk resources.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error
 
 ** Improvement



[mesos] 04/04: Added MESOS-9695 to the 1.5.4 CHANGELOG.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 4eec48e17d08575c18458f713d2e8280faab99d6
Author: Andrei Budnik 
AuthorDate: Tue Apr 30 14:45:01 2019 +0200

Added MESOS-9695 to the 1.5.4 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 425df0e..1a4d782 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1350,6 +1350,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP)
 ** Bug
   * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error
 
 ** Improvement



[mesos] branch master updated (c8004ee -> 4eec48e)

2019-04-30 Thread abudnik

abudnik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from c8004ee  Removed the duplicate pid check in Docker containerizer.
 new a8fe548  Added MESOS-9695 to the 1.8.1 CHANGELOG.
 new 592f7c4  Added MESOS-9695 to the 1.7.3 CHANGELOG.
 new 138bfe4  Added MESOS-9695 to the 1.6.3 CHANGELOG.
 new 4eec48e  Added MESOS-9695 to the 1.5.4 CHANGELOG.

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)



[mesos] 01/04: Added MESOS-9695 to the 1.8.1 CHANGELOG.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit a8fe548b091732c2b46f5e6a7d7392de92644f5a
Author: Andrei Budnik 
AuthorDate: Tue Apr 30 14:08:58 2019 +0200

Added MESOS-9695 to the 1.8.1 CHANGELOG.
---
 CHANGELOG | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG b/CHANGELOG
index 9c01040..89452a0 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,7 +4,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP)
 
 ** Bug
   * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
-
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
 
 Release Notes - Mesos - Version 1.8.0
 -



[mesos] 02/04: Added MESOS-9695 to the 1.7.3 CHANGELOG.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 592f7c44a304f5c02804a0de98df07ec295ce070
Author: Andrei Budnik 
AuthorDate: Tue Apr 30 14:30:32 2019 +0200

Added MESOS-9695 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 89452a0..e7ec1a5 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -417,6 +417,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations.
   * [MESOS-9692] - Quota may be under allocated for disk resources.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error
 
 ** Improvements



[mesos] branch 1.5.x updated (f8e0e41 -> 791ac63)

2019-04-30 Thread abudnik

abudnik pushed a change to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from f8e0e41  Added MESOS-9619 to the 1.5.4 CHANGELOG.
 new f4a6453  Removed the duplicate pid check in Docker containerizer.
 new 791ac63  Added MESOS-9695 to the 1.5.4 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG  |  1 +
 src/slave/containerizer/docker.cpp | 27 ++-
 2 files changed, 7 insertions(+), 21 deletions(-)



[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f4a6453c77719ed531a9287b6c9cdeb7ad268865
Author: Qian Zhang 
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

Removed the duplicate pid check in Docker containerizer.

Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 9dbb286..81f72d4 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -934,10 +934,6 @@ Future DockerContainerizerProcess::_recover(
   }
 }
 
-// Collection of pids that we've started reaping in order to
-// detect very unlikely duplicate scenario (see below).
-hashmap<ContainerID, pid_t> pids;
-
 foreachvalue (const FrameworkState& framework, state->frameworks) {
   foreachvalue (const ExecutorState& executor, framework.executors) {
 if (executor.info.isNone()) {
@@ -1016,9 +1012,12 @@ Future DockerContainerizerProcess::_recover(
 
 // Only reap the executor process if the executor can be connected
 // otherwise just set `container->status` to `None()`. This is to
-// avoid reaping an irrelevant process, e.g., after the agent host is
-// rebooted, the executor pid happens to be reused by another process.
-// See MESOS-8125 for details.
+// avoid reaping an irrelevant process, e.g., agent process is stopped
+// for a long time, and during this time executor terminates and its
+// pid happens to be reused by another irrelevant process. When agent
+// is restarted, it still considers this executor not complete (i.e.,
+// `run->completed` is false), so we would reap the irrelevant process
+// if we do not check whether that process can be connected.
 // Note that if both the pid and the port of the executor are reused
 // by another process or two processes respectively after the agent
 // host reboots we will still reap an irrelevant process, but that
@@ -1054,20 +1053,6 @@ Future DockerContainerizerProcess::_recover(
 container->status.future().get()
   .onAny(defer(self(), &Self::reaped, containerId));
 
-if (pids.containsValue(pid)) {
-  // This should (almost) never occur. There is the
-  // possibility that a new executor is launched with the same
-  // pid as one that just exited (highly unlikely) and the
-  // slave dies after the new executor is launched but before
-  // it hears about the termination of the earlier executor
-  // (also unlikely).
-  return Failure(
-  "Detected duplicate pid " + stringify(pid) +
-  " for container " + stringify(containerId));
-}
-
-pids.put(containerId, pid);
-
 const string sandboxDirectory = paths::getExecutorRunPath(
 flags.work_dir,
 state->id,
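
The removed check failed the whole recovery as soon as two recovered containers shared a checkpointed pid, while the updated comment describes reaping an executor only if it can be connected. The sketch below contrasts those two policies in standard C++. `RecoveredExecutor`, its `connectable` flag, and both functions are hypothetical simplifications for illustration; they do not reproduce the containerizer's actual recovery code.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of a recovered executor.
struct RecoveredExecutor
{
  std::string containerId;
  int pid;          // Checkpointed pid, possibly stale or reused.
  bool connectable; // Whether the executor could still be reached.
};

// Simplified old policy: any repeated pid fails the recovery.
bool recoverWithDuplicateCheck(
    const std::vector<RecoveredExecutor>& executors)
{
  std::set<int> seen;
  for (const RecoveredExecutor& executor : executors) {
    if (!seen.insert(executor.pid).second) {
      return false; // Corresponds to "Detected duplicate pid ...".
    }
  }
  return true;
}

// Simplified new policy: reap only executors that can be connected,
// so a stale pid that now belongs to another process is ignored.
std::vector<std::string> containersToReap(
    const std::vector<RecoveredExecutor>& executors)
{
  std::vector<std::string> result;
  for (const RecoveredExecutor& executor : executors) {
    if (executor.connectable) {
      result.push_back(executor.containerId);
    }
  }
  return result;
}
```

With one terminated executor whose pid was later reused by a live one, the first function rejects the whole recovery while the second simply selects the live container.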



[mesos] 02/02: Added MESOS-9695 to the 1.5.4 CHANGELOG.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 791ac63c72c1c4b868c2b3a126c04075575757c1
Author: Andrei Budnik 
AuthorDate: Tue Apr 30 14:45:01 2019 +0200

Added MESOS-9695 to the 1.5.4 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 6e175cf..fd85213 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -5,6 +5,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP)
 ** Bug
   * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error
 
 ** Improvement



[mesos] branch 1.6.x updated (45bfa2a -> 13fdaa4)

2019-04-30 Thread abudnik

abudnik pushed a change to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from 45bfa2a  Added MESOS-9536 to the 1.6.3 CHANGELOG.
 new d627a91  Removed the duplicate pid check in Docker containerizer.
 new 13fdaa4  Added MESOS-9695 to the 1.6.3 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG  |  1 +
 src/slave/containerizer/docker.cpp | 27 ++-
 2 files changed, 7 insertions(+), 21 deletions(-)



[mesos] 02/02: Added MESOS-9695 to the 1.6.3 CHANGELOG.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 13fdaa4e7be41f5129d2b96944569abca2b60905
Author: Andrei Budnik 
AuthorDate: Tue Apr 30 14:36:47 2019 +0200

Added MESOS-9695 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 58fda2e..55b74d1 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -9,6 +9,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
   * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9692] - Quota may be under allocated for disk resources.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error
 
 ** Improvement



[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit d627a919c651e8dacd14569c027fe4422ef87828
Author: Qian Zhang 
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

Removed the duplicate pid check in Docker containerizer.

Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index ef468ed..85b22f4 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -937,10 +937,6 @@ Future DockerContainerizerProcess::_recover(
   }
 }
 
-// Collection of pids that we've started reaping in order to
-// detect very unlikely duplicate scenario (see below).
-hashmap<ContainerID, pid_t> pids;
-
 foreachvalue (const FrameworkState& framework, state->frameworks) {
   foreachvalue (const ExecutorState& executor, framework.executors) {
 if (executor.info.isNone()) {
@@ -1019,9 +1015,12 @@ Future DockerContainerizerProcess::_recover(
 
 // Only reap the executor process if the executor can be connected
 // otherwise just set `container->status` to `None()`. This is to
-// avoid reaping an irrelevant process, e.g., after the agent host is
-// rebooted, the executor pid happens to be reused by another process.
-// See MESOS-8125 for details.
+// avoid reaping an irrelevant process, e.g., agent process is stopped
+// for a long time, and during this time executor terminates and its
+// pid happens to be reused by another irrelevant process. When agent
+// is restarted, it still considers this executor not complete (i.e.,
+// `run->completed` is false), so we would reap the irrelevant process
+// if we do not check whether that process can be connected.
 // Note that if both the pid and the port of the executor are reused
 // by another process or two processes respectively after the agent
 // host reboots we will still reap an irrelevant process, but that
@@ -1057,20 +1056,6 @@ Future DockerContainerizerProcess::_recover(
 container->status.future()
   ->onAny(defer(self(), &Self::reaped, containerId));
 
-if (pids.containsValue(pid)) {
-  // This should (almost) never occur. There is the
-  // possibility that a new executor is launched with the same
-  // pid as one that just exited (highly unlikely) and the
-  // slave dies after the new executor is launched but before
-  // it hears about the termination of the earlier executor
-  // (also unlikely).
-  return Failure(
-  "Detected duplicate pid " + stringify(pid) +
-  " for container " + stringify(containerId));
-}
-
-pids.put(containerId, pid);
-
 const string sandboxDirectory = paths::getExecutorRunPath(
 flags.work_dir,
 state->id,



[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit bd438b797a15724945a60ee8d57cec656e23ae2d
Author: Qian Zhang 
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

Removed the duplicate pid check in Docker containerizer.

Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 192dc29..dacf4de 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -945,10 +945,6 @@ Future DockerContainerizerProcess::_recover(
   }
 }
 
-// Collection of pids that we've started reaping in order to
-// detect very unlikely duplicate scenario (see below).
-hashmap<ContainerID, pid_t> pids;
-
 foreachvalue (const FrameworkState& framework, state->frameworks) {
   foreachvalue (const ExecutorState& executor, framework.executors) {
 if (executor.info.isNone()) {
@@ -1027,9 +1023,12 @@ Future DockerContainerizerProcess::_recover(
 
 // Only reap the executor process if the executor can be connected
 // otherwise just set `container->status` to `None()`. This is to
-// avoid reaping an irrelevant process, e.g., after the agent host is
-// rebooted, the executor pid happens to be reused by another process.
-// See MESOS-8125 for details.
+// avoid reaping an irrelevant process, e.g., agent process is stopped
+// for a long time, and during this time executor terminates and its
+// pid happens to be reused by another irrelevant process. When agent
+// is restarted, it still considers this executor not complete (i.e.,
+// `run->completed` is false), so we would reap the irrelevant process
+// if we do not check whether that process can be connected.
 // Note that if both the pid and the port of the executor are reused
 // by another process or two processes respectively after the agent
 // host reboots we will still reap an irrelevant process, but that
@@ -1065,20 +1064,6 @@ Future DockerContainerizerProcess::_recover(
 container->status.future()
   ->onAny(defer(self(), &Self::reaped, containerId));
 
-if (pids.containsValue(pid)) {
-  // This should (almost) never occur. There is the
-  // possibility that a new executor is launched with the same
-  // pid as one that just exited (highly unlikely) and the
-  // slave dies after the new executor is launched but before
-  // it hears about the termination of the earlier executor
-  // (also unlikely).
-  return Failure(
-  "Detected duplicate pid " + stringify(pid) +
-  " for container " + stringify(containerId));
-}
-
-pids.put(containerId, pid);
-
 const string sandboxDirectory = paths::getExecutorRunPath(
 flags.work_dir,
 state->id,



[mesos] 02/02: Added MESOS-9695 to the 1.7.3 CHANGELOG.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 80c9fd79448b1ea6f6367da8def911b75ababafd
Author: Andrei Budnik 
AuthorDate: Tue Apr 30 14:30:32 2019 +0200

Added MESOS-9695 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index ae40637..369a2c8 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -15,6 +15,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations.
   * [MESOS-9692] - Quota may be under allocated for disk resources.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
   * [MESOS-9707] - Calling link::lo() may cause runtime error
 
 ** Improvements



[mesos] branch 1.7.x updated (f0cbafd -> 80c9fd7)

2019-04-30 Thread abudnik

abudnik pushed a change to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from f0cbafd  Added MESOS-9536 to the 1.7.3 CHANGELOG.
 new bd438b7  Removed the duplicate pid check in Docker containerizer.
 new 80c9fd7  Added MESOS-9695 to the 1.7.3 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG  |  1 +
 src/slave/containerizer/docker.cpp | 27 ++-
 2 files changed, 7 insertions(+), 21 deletions(-)



[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 02f532ae876196c0c8abad9d6effb75d3ffa5db7
Author: Qian Zhang 
AuthorDate: Tue Apr 30 13:59:54 2019 +0200

Removed the duplicate pid check in Docker containerizer.

Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 7f1d471..e4ad945 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -936,10 +936,6 @@ Future DockerContainerizerProcess::_recover(
   }
 }
 
-// Collection of pids that we've started reaping in order to
-// detect very unlikely duplicate scenario (see below).
-hashmap<ContainerID, pid_t> pids;
-
 foreachvalue (const FrameworkState& framework, state->frameworks) {
   foreachvalue (const ExecutorState& executor, framework.executors) {
 if (executor.info.isNone()) {
@@ -1018,9 +1014,12 @@ Future DockerContainerizerProcess::_recover(
 
 // Only reap the executor process if the executor can be connected
 // otherwise just set `container->status` to `None()`. This is to
-// avoid reaping an irrelevant process, e.g., after the agent host is
-// rebooted, the executor pid happens to be reused by another process.
-// See MESOS-8125 for details.
+// avoid reaping an irrelevant process, e.g., agent process is stopped
+// for a long time, and during this time executor terminates and its
+// pid happens to be reused by another irrelevant process. When agent
+// is restarted, it still considers this executor not complete (i.e.,
+// `run->completed` is false), so we would reap the irrelevant process
+// if we do not check whether that process can be connected.
 // Note that if both the pid and the port of the executor are reused
 // by another process or two processes respectively after the agent
 // host reboots we will still reap an irrelevant process, but that
@@ -1056,20 +1055,6 @@ Future DockerContainerizerProcess::_recover(
 container->status.future()
   ->onAny(defer(self(), &Self::reaped, containerId));
 
-if (pids.contains_value(pid)) {
-  // This should (almost) never occur. There is the
-  // possibility that a new executor is launched with the same
-  // pid as one that just exited (highly unlikely) and the
-  // slave dies after the new executor is launched but before
-  // it hears about the termination of the earlier executor
-  // (also unlikely).
-  return Failure(
-  "Detected duplicate pid " + stringify(pid) +
-  " for container " + stringify(containerId));
-}
-
-pids.put(containerId, pid);
-
 const string sandboxDirectory = paths::getExecutorRunPath(
 flags.work_dir,
 state->id,



[mesos] 02/02: Added MESOS-9695 to the 1.8.1 CHANGELOG.

2019-04-30 Thread abudnik

abudnik pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit b032d4ec81c76277b274150f8027766f0e3d2275
Author: Andrei Budnik 
AuthorDate: Tue Apr 30 14:08:58 2019 +0200

Added MESOS-9695 to the 1.8.1 CHANGELOG.
---
 CHANGELOG | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG b/CHANGELOG
index d19085d..c99523c 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,7 +4,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP)
 
 ** Bug
   * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
-
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
 
 Release Notes - Mesos - Version 1.8.0
 -



[mesos] branch 1.8.x updated (6160315 -> b032d4e)

2019-04-30 Thread abudnik

abudnik pushed a change to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from 6160315  Added MESOS-9536 to the 1.8.1 CHANGELOG.
 new 02f532a  Removed the duplicate pid check in Docker containerizer.
 new b032d4e  Added MESOS-9695 to the 1.8.1 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG  |  2 +-
 src/slave/containerizer/docker.cpp | 27 ++-
 2 files changed, 7 insertions(+), 22 deletions(-)



[mesos] branch master updated: Removed the duplicate pid check in Docker containerizer.

2019-04-30 Thread abudnik
This is an automated email from the ASF dual-hosted git repository.

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/master by this push:
 new c8004ee  Removed the duplicate pid check in Docker containerizer.
c8004ee is described below

commit c8004ee8a0962d0e0f9147718853160bb708f5bc
Author: Qian Zhang 
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

Removed the duplicate pid check in Docker containerizer.

Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++-
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 7f1d471..e4ad945 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -936,10 +936,6 @@ Future DockerContainerizerProcess::_recover(
   }
 }
 
-// Collection of pids that we've started reaping in order to
-// detect very unlikely duplicate scenario (see below).
-hashmap<ContainerID, pid_t> pids;
-
 foreachvalue (const FrameworkState& framework, state->frameworks) {
   foreachvalue (const ExecutorState& executor, framework.executors) {
 if (executor.info.isNone()) {
@@ -1018,9 +1014,12 @@ Future DockerContainerizerProcess::_recover(
 
 // Only reap the executor process if the executor can be connected
 // otherwise just set `container->status` to `None()`. This is to
-// avoid reaping an irrelevant process, e.g., after the agent host is
-// rebooted, the executor pid happens to be reused by another process.
-// See MESOS-8125 for details.
+// avoid reaping an irrelevant process, e.g., agent process is stopped
+// for a long time, and during this time executor terminates and its
+// pid happens to be reused by another irrelevant process. When agent
+// is restarted, it still considers this executor not complete (i.e.,
+// `run->completed` is false), so we would reap the irrelevant process
+// if we do not check whether that process can be connected.
 // Note that if both the pid and the port of the executor are reused
 // by another process or two processes respectively after the agent
 // host reboots we will still reap an irrelevant process, but that
@@ -1056,20 +1055,6 @@ Future DockerContainerizerProcess::_recover(
 container->status.future()
   ->onAny(defer(self(), ::reaped, containerId));
 
-if (pids.contains_value(pid)) {
-  // This should (almost) never occur. There is the
-  // possibility that a new executor is launched with the same
-  // pid as one that just exited (highly unlikely) and the
-  // slave dies after the new executor is launched but before
-  // it hears about the termination of the earlier executor
-  // (also unlikely).
-  return Failure(
-  "Detected duplicate pid " + stringify(pid) +
-  " for container " + stringify(containerId));
-}
-
-pids.put(containerId, pid);
-
 const string sandboxDirectory = paths::getExecutorRunPath(
 flags.work_dir,
 state->id,
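
A minimal sketch of how the removed check behaved, not the Mesos code itself. `recordPid` is a hypothetical helper, `std::map` stands in for the `pids` hashmap, and plain strings and ints stand in for container IDs and pids. It shows that a second container recorded with an already-seen pid was rejected, which in `_recover` meant returning a `Failure`:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the removed duplicate pid check.
// Returns false (the point where `_recover` returned a Failure) if `pid`
// is already recorded for some container; otherwise records it and
// returns true.
bool recordPid(
    std::map<std::string, int>& pids,
    const std::string& containerId,
    int pid)
{
  // Equivalent of `pids.contains_value(pid)`: scan the recorded values.
  for (const auto& entry : pids) {
    if (entry.second == pid) {
      return false;
    }
  }

  // Equivalent of `pids.put(containerId, pid)`.
  pids[containerId] = pid;
  return true;
}
```

With this check gone, the diff leaves the connectivity test described in the updated comment as the guard against reaping an irrelevant process.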



[mesos] 01/03: Added MESOS-9536 to the 1.8.1 CHANGELOG.

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 63391fe89d7581cc2a42a8f5095630a5aa4bd502
Author: Qian Zhang 
AuthorDate: Tue Apr 30 09:46:12 2019 +0800

Added MESOS-9536 to the 1.8.1 CHANGELOG.
---
 CHANGELOG | 8 
 1 file changed, 8 insertions(+)

diff --git a/CHANGELOG b/CHANGELOG
index 799da78..83f7fca 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,3 +1,11 @@
+Release Notes - Mesos - Version 1.8.1 (WIP)
+---
+* This is a bug fix release.
+
+** Bug
+  * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
+
+
 Release Notes - Mesos - Version 1.8.0
 -
 This release contains the following highlights:



[mesos] branch master updated (4fa4f77 -> 977af9b)

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from 4fa4f77  Documented LIBPROCESS_SSL_ENABLE_TLS_V1_3.
 new 63391fe  Added MESOS-9536 to the 1.8.1 CHANGELOG.
 new 9c20d4e  Added MESOS-9536 to the 1.7.3 CHANGELOG.
 new 977af9b  Added MESOS-9536 to the 1.6.3 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG | 10 ++
 1 file changed, 10 insertions(+)



[mesos] 02/03: Added MESOS-9536 to the 1.7.3 CHANGELOG.

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 9c20d4ec869a4fb3eb90fda975afc10c1bdb49c3
Author: Qian Zhang 
AuthorDate: Tue Apr 30 09:47:24 2019 +0800

Added MESOS-9536 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 83f7fca..0870089 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -408,6 +408,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-8467] - Destroyed executors might be used after `Slave::publishResource()`.
   * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
   * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
+  * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
   * [MESOS-9549] - nvidia/cuda 10 does not work on GPU isolator.
   * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
   * [MESOS-9568] - SLRP does not clean up mount directories for destroyed MOUNT disks.



[mesos] 03/03: Added MESOS-9536 to the 1.6.3 CHANGELOG.

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 977af9b87f582d6301083c730046d5be32c5fea6
Author: Qian Zhang 
AuthorDate: Tue Apr 30 09:48:33 2019 +0800

Added MESOS-9536 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 0870089..9c01040 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -881,6 +881,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
 ** Bug
   * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
   * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
+  * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
   * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9692] - Quota may be under allocated for disk resources.



[mesos] 02/02: Added MESOS-9536 to the 1.7.3 CHANGELOG.

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f0cbafdc82bfa9b93df01053c9b1bd5855dc0dba
Author: Qian Zhang 
AuthorDate: Tue Apr 30 09:47:24 2019 +0800

Added MESOS-9536 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 1fbeb1b..ae40637 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -6,6 +6,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-8467] - Destroyed executors might be used after `Slave::publishResource()`.
   * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
   * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
+  * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
   * [MESOS-9549] - nvidia/cuda 10 does not work on GPU isolator.
   * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
   * [MESOS-9568] - SLRP does not clean up mount directories for destroyed MOUNT disks.



[mesos] 01/02: Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit beaae8df702e51102069b2b0502e924697ae36a2
Author: Qian Zhang 
AuthorDate: Fri Apr 19 17:22:45 2019 +0800

Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.

Previously in MESOS-8332 we narrowed task sandbox permissions from 0755
to 0750 which will cause nested container may not has permission to
access its sandbox via the environment variable `MESOS_SANDBOX`. Now in
this patch, for nested container which does not have its own rootfs, we
bind mount its sandbox to the directory specified via the agent flag
`--sandbox_directory` and set `MESOS_SANDBOX` to `--sandbox_directory`
as well, in this way such nested container will have the permission
to access its sandbox via `MESOS_SANDBOX`.

Review: https://reviews.apache.org/r/70514
---
 src/slave/containerizer/mesos/containerizer.cpp| 24 +++--
 .../mesos/isolators/filesystem/linux.cpp   | 25 ++
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/src/slave/containerizer/mesos/containerizer.cpp b/src/slave/containerizer/mesos/containerizer.cpp
index e8a4ab3..1867f3b 100644
--- a/src/slave/containerizer/mesos/containerizer.cpp
+++ b/src/slave/containerizer/mesos/containerizer.cpp
@@ -1779,15 +1779,25 @@ Future MesosContainerizerProcess::_launch(
   if (container->containerClass() == ContainerClass::DEFAULT) {
 // TODO(jieyu): Consider moving this to filesystem isolator.
 //
-// NOTE: For the command executor case, although it uses the host
-// filesystem for itself, we still set 'MESOS_SANDBOX' according to
-// the root filesystem of the task (if specified). Command executor
-// itself does not use this environment variable.
+// NOTE: For the command executor case, although it uses the host filesystem
+// for itself, we still set `MESOS_SANDBOX` according to the root filesystem
+// of the task (if specified). Command executor itself does not use this
+// environment variable. For nested container which does not have its own
+// rootfs, if the `filesystem/linux` isolator is enabled, we will also set
+// `MESOS_SANDBOX` to `flags.sandbox_directory` since in `prepare` method
+// of the `filesystem/linux` isolator we bind mount such nested container's
+// sandbox to `flags.sandbox_directory`. Since such bind mount is only done
+// by the `filesystem/linux` isolator, if another filesystem isolator (e.g.,
+// `filesystem/posix`) is enabled instead, nested container may still have
+// no permission to access its sandbox via `MESOS_SANDBOX`.
 Environment::Variable* variable = containerEnvironment.add_variables();
 variable->set_name("MESOS_SANDBOX");
-variable->set_value(container->config->has_rootfs()
-  ? flags.sandbox_directory
-  : container->config->directory());
+variable->set_value(
+(container->config->has_rootfs() ||
+ (strings::contains(flags.isolation, "filesystem/linux") &&
+  containerId.has_parent()))
+  ? flags.sandbox_directory
+  : container->config->directory());
   }
 
   // `launchInfo.environment` contains the environment returned by
diff --git a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
index a47899c..93a88a0 100644
--- a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
+++ b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
@@ -202,6 +202,16 @@ Try<Isolator*> LinuxFilesystemIsolatorProcess::create(const Flags& flags)
 }
   }
 
+  // Create sandbox directory. We will bind mount the sandbox of nested
+  // container which does not have its own rootfs to this directory. See
+  // `prepare` for details.
+  Try<Nothing> mkdir = os::mkdir(flags.sandbox_directory);
+  if (mkdir.isError()) {
+return Error(
+"Failed to create sandbox directory at '" +
+flags.sandbox_directory + "': " + mkdir.error());
+  }
+
   Owned<MesosIsolatorProcess> process(
   new LinuxFilesystemIsolatorProcess(flags));
 
@@ -395,6 +405,21 @@ Future<Option<ContainerLaunchInfo>> LinuxFilesystemIsolatorProcess::prepare(
 mount->set_source(containerConfig.directory());
 mount->set_target(sandbox);
 mount->set_flags(MS_BIND | MS_REC);
+  } else if (containerId.has_parent()) {
+// For nested container which does not have its own rootfs, bind mount its
+// sandbox to the directory specified via `flags.sandbox_directory` (e.g.,
+// `/mnt/mesos/sandbox`) in its own mount namespace and set the environment
+// variable `MESOS_SANDBOX` to `flags.sandbox_directory` (see the `_launch`
+// method of `MesosContainerizerProcess` for details). The reason that we do
+// this is, in MESOS-8332 we narrowed task sandbox permissions from 0755 to
+// 0750, since nested 

[mesos] 02/02: Added MESOS-9536 to the 1.6.3 CHANGELOG.

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 45bfa2aa42c119da6f83b865d4929ee6064c2697
Author: Qian Zhang 
AuthorDate: Tue Apr 30 09:48:33 2019 +0800

Added MESOS-9536 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index a46b93f..58fda2e 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -5,6 +5,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
 ** Bug
   * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
   * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
+  * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`.
   * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
   * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
   * [MESOS-9692] - Quota may be under allocated for disk resources.



[mesos] 01/02: Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit e5149a4a00625845995e38eaf96c35ef6817be37
Author: Qian Zhang 
AuthorDate: Fri Apr 19 17:22:45 2019 +0800

Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.

Previously in MESOS-8332 we narrowed task sandbox permissions from 0755
to 0750 which will cause nested container may not has permission to
access its sandbox via the environment variable `MESOS_SANDBOX`. Now in
this patch, for nested container which does not have its own rootfs, we
bind mount its sandbox to the directory specified via the agent flag
`--sandbox_directory` and set `MESOS_SANDBOX` to `--sandbox_directory`
as well, in this way such nested container will have the permission
to access its sandbox via `MESOS_SANDBOX`.

Review: https://reviews.apache.org/r/70514
---
 src/slave/containerizer/mesos/containerizer.cpp| 24 +++--
 .../mesos/isolators/filesystem/linux.cpp   | 25 ++
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/src/slave/containerizer/mesos/containerizer.cpp b/src/slave/containerizer/mesos/containerizer.cpp
index 6e635d8..a34978a 100644
--- a/src/slave/containerizer/mesos/containerizer.cpp
+++ b/src/slave/containerizer/mesos/containerizer.cpp
@@ -1747,15 +1747,25 @@ Future MesosContainerizerProcess::_launch(
   if (container->containerClass() == ContainerClass::DEFAULT) {
 // TODO(jieyu): Consider moving this to filesystem isolator.
 //
-// NOTE: For the command executor case, although it uses the host
-// filesystem for itself, we still set 'MESOS_SANDBOX' according to
-// the root filesystem of the task (if specified). Command executor
-// itself does not use this environment variable.
+// NOTE: For the command executor case, although it uses the host filesystem
+// for itself, we still set `MESOS_SANDBOX` according to the root filesystem
+// of the task (if specified). Command executor itself does not use this
+// environment variable. For nested container which does not have its own
+// rootfs, if the `filesystem/linux` isolator is enabled, we will also set
+// `MESOS_SANDBOX` to `flags.sandbox_directory` since in `prepare` method
+// of the `filesystem/linux` isolator we bind mount such nested container's
+// sandbox to `flags.sandbox_directory`. Since such bind mount is only done
+// by the `filesystem/linux` isolator, if another filesystem isolator (e.g.,
+// `filesystem/posix`) is enabled instead, nested container may still have
+// no permission to access its sandbox via `MESOS_SANDBOX`.
 Environment::Variable* variable = containerEnvironment.add_variables();
 variable->set_name("MESOS_SANDBOX");
-variable->set_value(container->config->has_rootfs()
-  ? flags.sandbox_directory
-  : container->config->directory());
+variable->set_value(
+(container->config->has_rootfs() ||
+ (strings::contains(flags.isolation, "filesystem/linux") &&
+  containerId.has_parent()))
+  ? flags.sandbox_directory
+  : container->config->directory());
   }
 
   // `launchInfo.environment` contains the environment returned by
diff --git a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
index 2844327..b3d1d4e 100644
--- a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
+++ b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
@@ -203,6 +203,16 @@ Try<Isolator*> LinuxFilesystemIsolatorProcess::create(const Flags& flags)
 }
   }
 
+  // Create sandbox directory. We will bind mount the sandbox of nested
+  // container which does not have its own rootfs to this directory. See
+  // `prepare` for details.
+  Try<Nothing> mkdir = os::mkdir(flags.sandbox_directory);
+  if (mkdir.isError()) {
+return Error(
+"Failed to create sandbox directory at '" +
+flags.sandbox_directory + "': " + mkdir.error());
+  }
+
   Owned<MesosIsolatorProcess> process(
   new LinuxFilesystemIsolatorProcess(flags));
 
@@ -396,6 +406,21 @@ Future<Option<ContainerLaunchInfo>> LinuxFilesystemIsolatorProcess::prepare(
 mount->set_source(containerConfig.directory());
 mount->set_target(sandbox);
 mount->set_flags(MS_BIND | MS_REC);
+  } else if (containerId.has_parent()) {
+// For nested container which does not have its own rootfs, bind mount its
+// sandbox to the directory specified via `flags.sandbox_directory` (e.g.,
+// `/mnt/mesos/sandbox`) in its own mount namespace and set the environment
+// variable `MESOS_SANDBOX` to `flags.sandbox_directory` (see the `_launch`
+// method of `MesosContainerizerProcess` for details). The reason that we do
+// this is, in MESOS-8332 we narrowed task sandbox permissions from 0755 to
+// 0750, since nested 

[mesos] branch 1.6.x updated (ebf1478 -> 45bfa2a)

2019-04-30 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a change to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from ebf1478  Added MESOS-9619 to the 1.6.3 CHANGELOG.
 new e5149a4  Made nested contaienr can access its sandbox via `MESOS_SANDBOX`.
 new 45bfa2a  Added MESOS-9536 to the 1.6.3 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG  |  1 +
 src/slave/containerizer/mesos/containerizer.cpp| 24 +++--
 .../mesos/isolators/filesystem/linux.cpp   | 25 ++
 3 files changed, 43 insertions(+), 7 deletions(-)
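
The `MESOS_SANDBOX` selection that commit e5149a4 adds to `_launch` can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the Mesos code: the `SandboxInputs` struct and the `mesosSandbox` function are hypothetical, and each field stands in for the expression named in its comment.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for the values `_launch` consults.
struct SandboxInputs {
  bool hasRootfs;             // container->config->has_rootfs()
  bool isNested;              // containerId.has_parent()
  std::string isolation;      // flags.isolation
  std::string sandboxFlag;    // flags.sandbox_directory
  std::string hostDirectory;  // container->config->directory()
};

// Mirrors the ternary in the diff: use the in-container sandbox path when
// the container has its own rootfs, or when it is a nested container and
// the `filesystem/linux` isolator is enabled (that isolator performs the
// bind mount); otherwise fall back to the host sandbox directory.
std::string mesosSandbox(const SandboxInputs& in) {
  const bool linuxFilesystem =
      in.isolation.find("filesystem/linux") != std::string::npos;

  return (in.hasRootfs || (linuxFilesystem && in.isNested))
      ? in.sandboxFlag
      : in.hostDirectory;
}
```

As the comment in the diff notes, a nested container without its own rootfs still gets the host directory when a different filesystem isolator such as `filesystem/posix` is in use, so it may still lack permission to reach its sandbox through `MESOS_SANDBOX`.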