[mesos] branch master updated: Added Andrei Budnik to the committers list.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 995b243 Added Andrei Budnik to the committers list. 995b243 is described below commit 995b24307a6c679d89af11cb1185fe8b27b5ef86 Author: Andrei Budnik AuthorDate: Thu Apr 4 16:09:02 2019 +0200 Added Andrei Budnik to the committers list. --- docs/committers.md | 7 +++ 1 file changed, 7 insertions(+) diff --git a/docs/committers.md b/docs/committers.md index f3c6220..f7b2d03 100644 --- a/docs/committers.md +++ b/docs/committers.md @@ -95,6 +95,13 @@ canonical list is located on +1 + Andrei Budnik + Mesosphere + + abud...@apache.org + + + +1 Benno Evers Mesosphere
[mesos] branch master updated: Fixed use-after-free bug in Docker provisioner store.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new db917f6 Fixed use-after-free bug in Docker provisioner store. db917f6 is described below commit db917f639e2d05fcf493e87649f42ddd2abfeae0 Author: Andrei Budnik AuthorDate: Fri Apr 5 13:06:49 2019 +0200 Fixed use-after-free bug in Docker provisioner store. Deferred lambda callback of the `moveLayers()` to the `StoreProcess` to prevent use-after-free of the process object since the callback refers to the `StoreProcess` class variable `flags`. Review: https://reviews.apache.org/r/70405 --- src/slave/containerizer/mesos/provisioner/docker/store.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/slave/containerizer/mesos/provisioner/docker/store.cpp b/src/slave/containerizer/mesos/provisioner/docker/store.cpp index 909364d..36b2c7d 100644 --- a/src/slave/containerizer/mesos/provisioner/docker/store.cpp +++ b/src/slave/containerizer/mesos/provisioner/docker/store.cpp @@ -446,7 +446,7 @@ Future StoreProcess::moveLayers( } return collect(futures) -.then([=]() -> Future { +.then(defer(self(), [=]() -> Future { if (image.has_config_digest()) { const string configSource = path::join(staging, image.config_digest()); const string configTarget = paths::getImageLayerPath( @@ -464,7 +464,7 @@ Future StoreProcess::moveLayers( } return image; -}); +})); }
[mesos] branch master updated: Clarified a comment for `SeccompInfo` in mesos.proto.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 5ca5625 Clarified a comment for `SeccompInfo` in mesos.proto. 5ca5625 is described below commit 5ca56257e9377c36c595cc83ce2eeb98a166945e Author: Andrei Budnik AuthorDate: Fri Apr 5 16:14:04 2019 +0200 Clarified a comment for `SeccompInfo` in mesos.proto. Review: https://reviews.apache.org/r/70407 --- include/mesos/mesos.proto| 5 - include/mesos/v1/mesos.proto | 5 - 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/include/mesos/mesos.proto b/include/mesos/mesos.proto index 7aa7d5b..dc6a87f 100644 --- a/include/mesos/mesos.proto +++ b/include/mesos/mesos.proto @@ -3228,7 +3228,9 @@ message SeccompInfo { // If not set or set to `false`, the container is launched with // the profile specified in the `profile_name` field. // - // NOTE: `profile_name` should not be specified if `unconfined` set to `true`. + // NOTE: `profile_name` must not be specified if `unconfined` set to `true`. + // `profile_name` must be specified if `unconfined` is not set or + // is set to `false`. optional bool unconfined = 2; } @@ -3258,6 +3260,7 @@ message LinuxInfo { optional bool share_pid_namespace = 4; // Represents Seccomp configuration, which is used for syscall filtering. + // This field is used to override the agent's default Seccomp configuration. optional SeccompInfo seccomp = 5; } diff --git a/include/mesos/v1/mesos.proto b/include/mesos/v1/mesos.proto index d66f29c..3507909 100644 --- a/include/mesos/v1/mesos.proto +++ b/include/mesos/v1/mesos.proto @@ -3221,7 +3221,9 @@ message SeccompInfo { // If not set or set to `false`, the container is launched with // the profile specified in the `profile_name` field. // - // NOTE: `profile_name` should not be specified if `unconfined` set to `true`. + // NOTE: `profile_name` must not be specified if `unconfined` set to `true`. + // `profile_name` must be specified if `unconfined` is not set or + // is set to `false`. optional bool unconfined = 2; } @@ -3251,6 +3253,7 @@ message LinuxInfo { optional bool share_pid_namespace = 4; // Represents Seccomp configuration, which is used for syscall filtering. + // This field is used to override the agent's default Seccomp configuration. optional SeccompInfo seccomp = 5; }
[mesos] branch master updated: Fixed compilation error on Mac OS.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new f2bedda Fixed compilation error on Mac OS. f2bedda is described below commit f2beddaf04b82c7bf89b4abc6a8f76ab7e41705b Author: Andrei Budnik AuthorDate: Fri Jun 7 18:02:01 2019 +0200 Fixed compilation error on Mac OS. This patch adds missing switch case to fix compilation error introduced in `bc5a57122635`. Review: https://reviews.apache.org/r/70811 --- src/slave/containerizer/mesos/launch.cpp | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/slave/containerizer/mesos/launch.cpp b/src/slave/containerizer/mesos/launch.cpp index a69a688..b29ec55 100644 --- a/src/slave/containerizer/mesos/launch.cpp +++ b/src/slave/containerizer/mesos/launch.cpp @@ -496,8 +496,8 @@ static Try executeFileOperation(const ContainerFileOperation& op) return Nothing(); } -#ifdef __linux__ case ContainerFileOperation::MOUNT: { +#ifdef __linux__ Try result = mountContainerFilesystem(op.mount()); if (result.isError()) { return Error( @@ -506,8 +506,10 @@ static Try executeFileOperation(const ContainerFileOperation& op) } return Nothing(); -} +#else + return Error("Container mount is not supported on non-Linux systems"); #endif // __linux__ +} case ContainerFileOperation::RENAME: { Try result =
[mesos] branch master updated: Added MESOS-9695 to the 1.4.4 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new fc8847d Added MESOS-9695 to the 1.4.4 CHANGELOG. fc8847d is described below commit fc8847d0002a6c937e59341aace12e739102449d Author: Andrei Budnik AuthorDate: Wed May 1 12:25:30 2019 +0200 Added MESOS-9695 to the 1.4.4 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 1a4d782..f3f6f92 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1979,6 +1979,7 @@ Release Notes - Mesos - Version 1.4.4 (WIP) ** Bug * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files. + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer ** Improvement: * [MESOS-9159] - Support Foreign URLs in docker registry puller.
[mesos] branch master updated: Allowed compiling Seccomp isolator on older kernel versions.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 1954bea Allowed compiling Seccomp isolator on older kernel versions. 1954bea is described below commit 1954beae2133fa55efe53038a020de4309b7ae91 Author: Andrei Budnik AuthorDate: Wed May 1 11:52:49 2019 +0200 Allowed compiling Seccomp isolator on older kernel versions. This patch removes dependency on `linux/seccomp.h` header, which may be missing on some Linux distributions. Review: https://reviews.apache.org/r/70567/ --- configure.ac | 8 src/slave/containerizer/mesos/isolators/linux/seccomp.cpp | 10 -- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/configure.ac b/configure.ac index e5afde4..b4bad57 100644 --- a/configure.ac +++ b/configure.ac @@ -1627,14 +1627,6 @@ The Seccomp isolator is only supported on Linux. --- ])]) - AC_CHECK_HEADERS([linux/seccomp.h], [], - [AC_MSG_ERROR([Cannot find seccomp system headers -Please install the Linux kernel headers and make sure that you have -Linux kernel 3.5+ installed. - ])]) - # Check if libseccomp prefix path was supplied and if so, add it to # CPPFLAGS while extending it by /include and to LDFLAGS while # extending it by /lib. diff --git a/src/slave/containerizer/mesos/isolators/linux/seccomp.cpp b/src/slave/containerizer/mesos/isolators/linux/seccomp.cpp index 5624c24..bb35c6e 100644 --- a/src/slave/containerizer/mesos/isolators/linux/seccomp.cpp +++ b/src/slave/containerizer/mesos/isolators/linux/seccomp.cpp @@ -14,8 +14,6 @@ // See the License for the specific language governing permissions and // limitations under the License. -#include - #include #include @@ -34,6 +32,14 @@ using mesos::slave::ContainerConfig; using mesos::slave::ContainerLaunchInfo; using mesos::slave::Isolator; +// NOTE: The definition below was taken from the Linux Kernel sources. +// +// TODO(abudnik): This definition should be removed in favor of using +// `linux/seccomp.h` once we drop support for kernels older than 3.5. +#if !defined(SECCOMP_MODE_FILTER) +#define SECCOMP_MODE_FILTER 2 +#endif + namespace mesos { namespace internal { namespace slave {
[mesos] branch 1.8.x updated: Allowed compiling Seccomp isolator on older kernel versions.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/1.8.x by this push: new a684f07 Allowed compiling Seccomp isolator on older kernel versions. a684f07 is described below commit a684f073db683d83965fccbc7e5a308e35a7da9c Author: Andrei Budnik AuthorDate: Wed May 1 11:52:49 2019 +0200 Allowed compiling Seccomp isolator on older kernel versions. This patch removes dependency on `linux/seccomp.h` header, which may be missing on some Linux distributions. Review: https://reviews.apache.org/r/70567/ --- configure.ac | 8 src/slave/containerizer/mesos/isolators/linux/seccomp.cpp | 10 -- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/configure.ac b/configure.ac index a367c28..8f3a36e 100644 --- a/configure.ac +++ b/configure.ac @@ -1627,14 +1627,6 @@ The Seccomp isolator is only supported on Linux. --- ])]) - AC_CHECK_HEADERS([linux/seccomp.h], [], - [AC_MSG_ERROR([Cannot find seccomp system headers -Please install the Linux kernel headers and make sure that you have -Linux kernel 3.5+ installed. - ])]) - # Check if libseccomp prefix path was supplied and if so, add it to # CPPFLAGS while extending it by /include and to LDFLAGS while # extending it by /lib. diff --git a/src/slave/containerizer/mesos/isolators/linux/seccomp.cpp b/src/slave/containerizer/mesos/isolators/linux/seccomp.cpp index 5624c24..bb35c6e 100644 --- a/src/slave/containerizer/mesos/isolators/linux/seccomp.cpp +++ b/src/slave/containerizer/mesos/isolators/linux/seccomp.cpp @@ -14,8 +14,6 @@ // See the License for the specific language governing permissions and // limitations under the License. -#include - #include #include @@ -34,6 +32,14 @@ using mesos::slave::ContainerConfig; using mesos::slave::ContainerLaunchInfo; using mesos::slave::Isolator; +// NOTE: The definition below was taken from the Linux Kernel sources. +// +// TODO(abudnik): This definition should be removed in favor of using +// `linux/seccomp.h` once we drop support for kernels older than 3.5. +#if !defined(SECCOMP_MODE_FILTER) +#define SECCOMP_MODE_FILTER 2 +#endif + namespace mesos { namespace internal { namespace slave {
[mesos] branch 1.4.x updated (1a76202 -> f05058d)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.4.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 1a76202 Added MESOS-9159 and MESOS-9675 to the 1.4.4 CHANGELOG. new ca65f99 Removed the duplicate pid check in Docker containerizer. new f05058d Added MESOS-9695 to the 1.4.4 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/slave/containerizer/docker.cpp | 27 ++- 2 files changed, 7 insertions(+), 21 deletions(-)
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.4.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit ca65f991727494b9b78e64a19306231ac004289f Author: Qian Zhang AuthorDate: Tue Apr 30 13:23:26 2019 +0200 Removed the duplicate pid check in Docker containerizer. Review: https://reviews.apache.org/r/70561/ --- src/slave/containerizer/docker.cpp | 27 ++- 1 file changed, 6 insertions(+), 21 deletions(-) diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp index 31a47f0..97cd75e 100644 --- a/src/slave/containerizer/docker.cpp +++ b/src/slave/containerizer/docker.cpp @@ -925,10 +925,6 @@ Future DockerContainerizerProcess::_recover( } } -// Collection of pids that we've started reaping in order to -// detect very unlikely duplicate scenario (see below). -hashmap pids; - foreachvalue (const FrameworkState& framework, state->frameworks) { foreachvalue (const ExecutorState& executor, framework.executors) { if (executor.info.isNone()) { @@ -1007,9 +1003,12 @@ Future DockerContainerizerProcess::_recover( // Only reap the executor process if the executor can be connected // otherwise just set `container->status` to `None()`. This is to -// avoid reaping an irrelevant process, e.g., after the agent host is -// rebooted, the executor pid happens to be reused by another process. -// See MESOS-8125 for details. +// avoid reaping an irrelevant process, e.g., agent process is stopped +// for a long time, and during this time executor terminates and its +// pid happens to be reused by another irrelevant process. When agent +// is restarted, it still considers this executor not complete (i.e., +// `run->completed` is false), so we would reap the irrelevant process +// if we do not check whether that process can be connected. // Note that if both the pid and the port of the executor are reused // by another process or two processes respectively after the agent // host reboots we will still reap an irrelevant process, but that @@ -1045,20 +1044,6 @@ Future DockerContainerizerProcess::_recover( container->status.future().get() .onAny(defer(self(), ::reaped, containerId)); -if (pids.containsValue(pid)) { - // This should (almost) never occur. There is the - // possibility that a new executor is launched with the same - // pid as one that just exited (highly unlikely) and the - // slave dies after the new executor is launched but before - // it hears about the termination of the earlier executor - // (also unlikely). - return Failure( - "Detected duplicate pid " + stringify(pid) + - " for container " + stringify(containerId)); -} - -pids.put(containerId, pid); - const string sandboxDirectory = paths::getExecutorRunPath( flags.work_dir, state->id,
[mesos] 02/02: Added MESOS-9695 to the 1.4.4 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.4.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit f05058dad7219950fd9bdc2748ab9c9a79d6e7f1 Author: Andrei Budnik AuthorDate: Wed May 1 12:25:30 2019 +0200 Added MESOS-9695 to the 1.4.4 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 0f32baa..0ce1715 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -4,6 +4,7 @@ Release Notes - Mesos - Version 1.4.4 (WIP) ** Bug * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files. + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer ** Improvement: * [MESOS-9159] - Support Foreign URLs in docker registry puller.
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit bd438b797a15724945a60ee8d57cec656e23ae2d Author: Qian Zhang AuthorDate: Tue Apr 30 13:23:26 2019 +0200 Removed the duplicate pid check in Docker containerizer. Review: https://reviews.apache.org/r/70561/ --- src/slave/containerizer/docker.cpp | 27 ++- 1 file changed, 6 insertions(+), 21 deletions(-) diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp index 192dc29..dacf4de 100644 --- a/src/slave/containerizer/docker.cpp +++ b/src/slave/containerizer/docker.cpp @@ -945,10 +945,6 @@ Future DockerContainerizerProcess::_recover( } } -// Collection of pids that we've started reaping in order to -// detect very unlikely duplicate scenario (see below). -hashmap pids; - foreachvalue (const FrameworkState& framework, state->frameworks) { foreachvalue (const ExecutorState& executor, framework.executors) { if (executor.info.isNone()) { @@ -1027,9 +1023,12 @@ Future DockerContainerizerProcess::_recover( // Only reap the executor process if the executor can be connected // otherwise just set `container->status` to `None()`. This is to -// avoid reaping an irrelevant process, e.g., after the agent host is -// rebooted, the executor pid happens to be reused by another process. -// See MESOS-8125 for details. +// avoid reaping an irrelevant process, e.g., agent process is stopped +// for a long time, and during this time executor terminates and its +// pid happens to be reused by another irrelevant process. When agent +// is restarted, it still considers this executor not complete (i.e., +// `run->completed` is false), so we would reap the irrelevant process +// if we do not check whether that process can be connected. // Note that if both the pid and the port of the executor are reused // by another process or two processes respectively after the agent // host reboots we will still reap an irrelevant process, but that @@ -1065,20 +1064,6 @@ Future DockerContainerizerProcess::_recover( container->status.future() ->onAny(defer(self(), ::reaped, containerId)); -if (pids.containsValue(pid)) { - // This should (almost) never occur. There is the - // possibility that a new executor is launched with the same - // pid as one that just exited (highly unlikely) and the - // slave dies after the new executor is launched but before - // it hears about the termination of the earlier executor - // (also unlikely). - return Failure( - "Detected duplicate pid " + stringify(pid) + - " for container " + stringify(containerId)); -} - -pids.put(containerId, pid); - const string sandboxDirectory = paths::getExecutorRunPath( flags.work_dir, state->id,
[mesos] 02/02: Added MESOS-9695 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 80c9fd79448b1ea6f6367da8def911b75ababafd Author: Andrei Budnik AuthorDate: Tue Apr 30 14:30:32 2019 +0200 Added MESOS-9695 to the 1.7.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index ae40637..369a2c8 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -15,6 +15,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP) * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations. * [MESOS-9692] - Quota may be under allocated for disk resources. + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvements
[mesos] branch 1.7.x updated (f0cbafd -> 80c9fd7)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from f0cbafd Added MESOS-9536 to the 1.7.3 CHANGELOG. new bd438b7 Removed the duplicate pid check in Docker containerizer. new 80c9fd7 Added MESOS-9695 to the 1.7.3 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/slave/containerizer/docker.cpp | 27 ++- 2 files changed, 7 insertions(+), 21 deletions(-)
[mesos] branch 1.6.x updated (45bfa2a -> 13fdaa4)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 45bfa2a Added MESOS-9536 to the 1.6.3 CHANGELOG. new d627a91 Removed the duplicate pid check in Docker containerizer. new 13fdaa4 Added MESOS-9695 to the 1.6.3 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/slave/containerizer/docker.cpp | 27 ++- 2 files changed, 7 insertions(+), 21 deletions(-)
[mesos] 02/02: Added MESOS-9695 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 13fdaa4e7be41f5129d2b96944569abca2b60905 Author: Andrei Budnik AuthorDate: Tue Apr 30 14:36:47 2019 +0200 Added MESOS-9695 to the 1.6.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 58fda2e..55b74d1 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -9,6 +9,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP) * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9692] - Quota may be under allocated for disk resources. + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvement
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit d627a919c651e8dacd14569c027fe4422ef87828 Author: Qian Zhang AuthorDate: Tue Apr 30 13:23:26 2019 +0200 Removed the duplicate pid check in Docker containerizer. Review: https://reviews.apache.org/r/70561/ --- src/slave/containerizer/docker.cpp | 27 ++- 1 file changed, 6 insertions(+), 21 deletions(-) diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp index ef468ed..85b22f4 100644 --- a/src/slave/containerizer/docker.cpp +++ b/src/slave/containerizer/docker.cpp @@ -937,10 +937,6 @@ Future DockerContainerizerProcess::_recover( } } -// Collection of pids that we've started reaping in order to -// detect very unlikely duplicate scenario (see below). -hashmap pids; - foreachvalue (const FrameworkState& framework, state->frameworks) { foreachvalue (const ExecutorState& executor, framework.executors) { if (executor.info.isNone()) { @@ -1019,9 +1015,12 @@ Future DockerContainerizerProcess::_recover( // Only reap the executor process if the executor can be connected // otherwise just set `container->status` to `None()`. This is to -// avoid reaping an irrelevant process, e.g., after the agent host is -// rebooted, the executor pid happens to be reused by another process. -// See MESOS-8125 for details. +// avoid reaping an irrelevant process, e.g., agent process is stopped +// for a long time, and during this time executor terminates and its +// pid happens to be reused by another irrelevant process. When agent +// is restarted, it still considers this executor not complete (i.e., +// `run->completed` is false), so we would reap the irrelevant process +// if we do not check whether that process can be connected. // Note that if both the pid and the port of the executor are reused // by another process or two processes respectively after the agent // host reboots we will still reap an irrelevant process, but that @@ -1057,20 +1056,6 @@ Future DockerContainerizerProcess::_recover( container->status.future() ->onAny(defer(self(), ::reaped, containerId)); -if (pids.containsValue(pid)) { - // This should (almost) never occur. There is the - // possibility that a new executor is launched with the same - // pid as one that just exited (highly unlikely) and the - // slave dies after the new executor is launched but before - // it hears about the termination of the earlier executor - // (also unlikely). - return Failure( - "Detected duplicate pid " + stringify(pid) + - " for container " + stringify(containerId)); -} - -pids.put(containerId, pid); - const string sandboxDirectory = paths::getExecutorRunPath( flags.work_dir, state->id,
[mesos] 02/02: Added MESOS-9695 to the 1.5.4 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 791ac63c72c1c4b868c2b3a126c04075575757c1 Author: Andrei Budnik AuthorDate: Tue Apr 30 14:45:01 2019 +0200 Added MESOS-9695 to the 1.5.4 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 6e175cf..fd85213 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -5,6 +5,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP) ** Bug * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvement
[mesos] branch 1.5.x updated (f8e0e41 -> 791ac63)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from f8e0e41 Added MESOS-9619 to the 1.5.4 CHANGELOG. new f4a6453 Removed the duplicate pid check in Docker containerizer. new 791ac63 Added MESOS-9695 to the 1.5.4 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/slave/containerizer/docker.cpp | 27 ++- 2 files changed, 7 insertions(+), 21 deletions(-)
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit f4a6453c77719ed531a9287b6c9cdeb7ad268865 Author: Qian Zhang AuthorDate: Tue Apr 30 13:23:26 2019 +0200 Removed the duplicate pid check in Docker containerizer. Review: https://reviews.apache.org/r/70561/ --- src/slave/containerizer/docker.cpp | 27 ++- 1 file changed, 6 insertions(+), 21 deletions(-) diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp index 9dbb286..81f72d4 100644 --- a/src/slave/containerizer/docker.cpp +++ b/src/slave/containerizer/docker.cpp @@ -934,10 +934,6 @@ Future DockerContainerizerProcess::_recover( } } -// Collection of pids that we've started reaping in order to -// detect very unlikely duplicate scenario (see below). -hashmap pids; - foreachvalue (const FrameworkState& framework, state->frameworks) { foreachvalue (const ExecutorState& executor, framework.executors) { if (executor.info.isNone()) { @@ -1016,9 +1012,12 @@ Future DockerContainerizerProcess::_recover( // Only reap the executor process if the executor can be connected // otherwise just set `container->status` to `None()`. This is to -// avoid reaping an irrelevant process, e.g., after the agent host is -// rebooted, the executor pid happens to be reused by another process. -// See MESOS-8125 for details. +// avoid reaping an irrelevant process, e.g., agent process is stopped +// for a long time, and during this time executor terminates and its +// pid happens to be reused by another irrelevant process. When agent +// is restarted, it still considers this executor not complete (i.e., +// `run->completed` is false), so we would reap the irrelevant process +// if we do not check whether that process can be connected. // Note that if both the pid and the port of the executor are reused // by another process or two processes respectively after the agent // host reboots we will still reap an irrelevant process, but that @@ -1054,20 +1053,6 @@ Future DockerContainerizerProcess::_recover( container->status.future().get() .onAny(defer(self(), ::reaped, containerId)); -if (pids.containsValue(pid)) { - // This should (almost) never occur. There is the - // possibility that a new executor is launched with the same - // pid as one that just exited (highly unlikely) and the - // slave dies after the new executor is launched but before - // it hears about the termination of the earlier executor - // (also unlikely). - return Failure( - "Detected duplicate pid " + stringify(pid) + - " for container " + stringify(containerId)); -} - -pids.put(containerId, pid); - const string sandboxDirectory = paths::getExecutorRunPath( flags.work_dir, state->id,
[mesos] 03/04: Added MESOS-9695 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 138bfe41c822823afe6d2a0532e3c95e4f7d3bfe Author: Andrei Budnik AuthorDate: Tue Apr 30 14:36:47 2019 +0200 Added MESOS-9695 to the 1.6.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index e7ec1a5..425df0e 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -886,6 +886,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP) * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9692] - Quota may be under allocated for disk resources. + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvement
[mesos] 04/04: Added MESOS-9695 to the 1.5.4 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 4eec48e17d08575c18458f713d2e8280faab99d6 Author: Andrei Budnik AuthorDate: Tue Apr 30 14:45:01 2019 +0200 Added MESOS-9695 to the 1.5.4 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 425df0e..1a4d782 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1350,6 +1350,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP) ** Bug * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvement
[mesos] branch master updated (c8004ee -> 4eec48e)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from c8004ee Removed the duplicate pid check in Docker containerizer. new a8fe548 Added MESOS-9695 to the 1.8.1 CHANGELOG. new 592f7c4 Added MESOS-9695 to the 1.7.3 CHANGELOG. new 138bfe4 Added MESOS-9695 to the 1.6.3 CHANGELOG. new 4eec48e Added MESOS-9695 to the 1.5.4 CHANGELOG. The 4 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 5 - 1 file changed, 4 insertions(+), 1 deletion(-)
[mesos] 01/04: Added MESOS-9695 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit a8fe548b091732c2b46f5e6a7d7392de92644f5a Author: Andrei Budnik AuthorDate: Tue Apr 30 14:08:58 2019 +0200 Added MESOS-9695 to the 1.8.1 CHANGELOG. --- CHANGELOG | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG b/CHANGELOG index 9c01040..89452a0 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -4,7 +4,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) ** Bug * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`. - + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer Release Notes - Mesos - Version 1.8.0 -
[mesos] 02/04: Added MESOS-9695 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 592f7c44a304f5c02804a0de98df07ec295ce070 Author: Andrei Budnik AuthorDate: Tue Apr 30 14:30:32 2019 +0200 Added MESOS-9695 to the 1.7.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 89452a0..e7ec1a5 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -417,6 +417,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP) * [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations. * [MESOS-9692] - Quota may be under allocated for disk resources. + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvements
[mesos] branch master updated: Removed the duplicate pid check in Docker containerizer.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new c8004ee Removed the duplicate pid check in Docker containerizer. c8004ee is described below commit c8004ee8a0962d0e0f9147718853160bb708f5bc Author: Qian Zhang AuthorDate: Tue Apr 30 13:23:26 2019 +0200 Removed the duplicate pid check in Docker containerizer. Review: https://reviews.apache.org/r/70561/ --- src/slave/containerizer/docker.cpp | 27 ++- 1 file changed, 6 insertions(+), 21 deletions(-) diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp index 7f1d471..e4ad945 100644 --- a/src/slave/containerizer/docker.cpp +++ b/src/slave/containerizer/docker.cpp @@ -936,10 +936,6 @@ Future DockerContainerizerProcess::_recover( } } -// Collection of pids that we've started reaping in order to -// detect very unlikely duplicate scenario (see below). -hashmap pids; - foreachvalue (const FrameworkState& framework, state->frameworks) { foreachvalue (const ExecutorState& executor, framework.executors) { if (executor.info.isNone()) { @@ -1018,9 +1014,12 @@ Future DockerContainerizerProcess::_recover( // Only reap the executor process if the executor can be connected // otherwise just set `container->status` to `None()`. This is to -// avoid reaping an irrelevant process, e.g., after the agent host is -// rebooted, the executor pid happens to be reused by another process. -// See MESOS-8125 for details. +// avoid reaping an irrelevant process, e.g., agent process is stopped +// for a long time, and during this time executor terminates and its +// pid happens to be reused by another irrelevant process. When agent +// is restarted, it still considers this executor not complete (i.e., +// `run->completed` is false), so we would reap the irrelevant process +// if we do not check whether that process can be connected. // Note that if both the pid and the port of the executor are reused // by another process or two processes respectively after the agent // host reboots we will still reap an irrelevant process, but that @@ -1056,20 +1055,6 @@ Future DockerContainerizerProcess::_recover( container->status.future() ->onAny(defer(self(), ::reaped, containerId)); -if (pids.contains_value(pid)) { - // This should (almost) never occur. There is the - // possibility that a new executor is launched with the same - // pid as one that just exited (highly unlikely) and the - // slave dies after the new executor is launched but before - // it hears about the termination of the earlier executor - // (also unlikely). - return Failure( - "Detected duplicate pid " + stringify(pid) + - " for container " + stringify(containerId)); -} - -pids.put(containerId, pid); - const string sandboxDirectory = paths::getExecutorRunPath( flags.work_dir, state->id,
[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 02f532ae876196c0c8abad9d6effb75d3ffa5db7 Author: Qian Zhang AuthorDate: Tue Apr 30 13:59:54 2019 +0200 Removed the duplicate pid check in Docker containerizer. Review: https://reviews.apache.org/r/70561/ --- src/slave/containerizer/docker.cpp | 27 ++- 1 file changed, 6 insertions(+), 21 deletions(-) diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp index 7f1d471..e4ad945 100644 --- a/src/slave/containerizer/docker.cpp +++ b/src/slave/containerizer/docker.cpp @@ -936,10 +936,6 @@ Future DockerContainerizerProcess::_recover( } } -// Collection of pids that we've started reaping in order to -// detect very unlikely duplicate scenario (see below). -hashmap pids; - foreachvalue (const FrameworkState& framework, state->frameworks) { foreachvalue (const ExecutorState& executor, framework.executors) { if (executor.info.isNone()) { @@ -1018,9 +1014,12 @@ Future DockerContainerizerProcess::_recover( // Only reap the executor process if the executor can be connected // otherwise just set `container->status` to `None()`. This is to -// avoid reaping an irrelevant process, e.g., after the agent host is -// rebooted, the executor pid happens to be reused by another process. -// See MESOS-8125 for details. +// avoid reaping an irrelevant process, e.g., agent process is stopped +// for a long time, and during this time executor terminates and its +// pid happens to be reused by another irrelevant process. When agent +// is restarted, it still considers this executor not complete (i.e., +// `run->completed` is false), so we would reap the irrelevant process +// if we do not check whether that process can be connected. // Note that if both the pid and the port of the executor are reused // by another process or two processes respectively after the agent // host reboots we will still reap an irrelevant process, but that @@ -1056,20 +1055,6 @@ Future DockerContainerizerProcess::_recover( container->status.future() ->onAny(defer(self(), ::reaped, containerId)); -if (pids.contains_value(pid)) { - // This should (almost) never occur. There is the - // possibility that a new executor is launched with the same - // pid as one that just exited (highly unlikely) and the - // slave dies after the new executor is launched but before - // it hears about the termination of the earlier executor - // (also unlikely). - return Failure( - "Detected duplicate pid " + stringify(pid) + - " for container " + stringify(containerId)); -} - -pids.put(containerId, pid); - const string sandboxDirectory = paths::getExecutorRunPath( flags.work_dir, state->id,
[mesos] branch 1.8.x updated (6160315 -> b032d4e)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 6160315 Added MESOS-9536 to the 1.8.1 CHANGELOG. new 02f532a Removed the duplicate pid check in Docker containerizer. new b032d4e Added MESOS-9695 to the 1.8.1 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 2 +- src/slave/containerizer/docker.cpp | 27 ++- 2 files changed, 7 insertions(+), 22 deletions(-)
[mesos] 02/02: Added MESOS-9695 to the 1.8.1 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit b032d4ec81c76277b274150f8027766f0e3d2275 Author: Andrei Budnik AuthorDate: Tue Apr 30 14:08:58 2019 +0200 Added MESOS-9695 to the 1.8.1 CHANGELOG. --- CHANGELOG | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG b/CHANGELOG index d19085d..c99523c 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -4,7 +4,7 @@ Release Notes - Mesos - Version 1.8.1 (WIP) ** Bug * [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`. - + * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer Release Notes - Mesos - Version 1.8.0 -
[mesos] branch master updated: Added license for libseccomp to toplevel LICENSE file.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new d3d83ee Added license for libseccomp to toplevel LICENSE file. d3d83ee is described below commit d3d83eed0df88582885fe10fc9d1f8c6b3a3b41e Author: Benjamin Bannier AuthorDate: Wed May 8 15:47:27 2019 +0200 Added license for libseccomp to toplevel LICENSE file. Review: https://reviews.apache.org/r/70610/ --- LICENSE | 462 1 file changed, 462 insertions(+) diff --git a/LICENSE b/LICENSE index 1d9958d..2c5919d 100644 --- a/LICENSE +++ b/LICENSE @@ -833,3 +833,465 @@ HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +== +For the libseccomp library +(3rdparty/libseccomp-2.3.3.tar.gz): +== + + GNU LESSER GENERAL PUBLIC LICENSE + Version 2.1, February 1999 + + Copyright (C) 1991, 1999 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] + +Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Lesser General Public License, applies to some +specially designated software packages--typically libraries--of the +Free Software Foundation and other authors who decide to use it. You +can use it too, but we suggest you first think carefully about whether +this license or the ordinary General Public License is the better +strategy to use in any particular case, based on the explanations below. + + When we speak of free software, we are referring to freedom of use, +not price. Our General Public Licenses are designed to make sure that +you have the freedom to distribute copies of free software (and charge +for this service if you wish); that you receive source code or can get +it if you want it; that you can change the software and use pieces of +it in new free programs; and that you are informed that you can do +these things. + + To protect your rights, we need to make restrictions that forbid +distributors to deny you these rights or to ask you to surrender these +rights. These restrictions translate to certain responsibilities for +you if you distribute copies of the library or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link other code with the library, you must provide +complete object files to the recipients, so that they can relink them +with the library after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + We protect your rights with a two-step method: (1) we copyright the +library, and (2) we offer you this license, which gives you legal +permission to copy, distribute and/or modify the library. + + To protect each distributor, we want to make it very clear that +there is no warranty for the free library. Also, if the library is +modified by someone else and passed on, the recipients should know +that what they have is not the original version, so that the original +author's reputation will not be affected by problems that might be +introduced by others. + + Finally, software patents pose a constant threat to the existence of +any free program. We wish to make sure that a company cannot +effectively restrict the users of a free program by obtaining a +restrictive license from a patent holder. Therefore, we insist that +any patent license obtained for a version of the library must be +consistent with the full freedom of use specified in this license. + + Most GNU software, including some libraries, is covered by the +ordinary GNU General Public License. This license, the GNU Lesser +General Public License, applies to certain designated libraries, and +is quite different from the ordinary General Public License. We use +this license for certain libraries
[mesos] branch 1.8.x updated: Added license for libseccomp to toplevel LICENSE file.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/1.8.x by this push: new ea28002 Added license for libseccomp to toplevel LICENSE file. ea28002 is described below commit ea2800276ceff4154c8c13add01ce79c4d97554a Author: Benjamin Bannier AuthorDate: Wed May 8 15:50:05 2019 +0200 Added license for libseccomp to toplevel LICENSE file. Review: https://reviews.apache.org/r/70610/ --- LICENSE | 462 1 file changed, 462 insertions(+) diff --git a/LICENSE b/LICENSE index 8632389..e5eb300 100644 --- a/LICENSE +++ b/LICENSE @@ -833,3 +833,465 @@ HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +== +For the libseccomp library +(3rdparty/libseccomp-2.3.3.tar.gz): +== + + GNU LESSER GENERAL PUBLIC LICENSE + Version 2.1, February 1999 + + Copyright (C) 1991, 1999 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] + +Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Lesser General Public License, applies to some +specially designated software packages--typically libraries--of the +Free Software Foundation and other authors who decide to use it. You +can use it too, but we suggest you first think carefully about whether +this license or the ordinary General Public License is the better +strategy to use in any particular case, based on the explanations below. + + When we speak of free software, we are referring to freedom of use, +not price. Our General Public Licenses are designed to make sure that +you have the freedom to distribute copies of free software (and charge +for this service if you wish); that you receive source code or can get +it if you want it; that you can change the software and use pieces of +it in new free programs; and that you are informed that you can do +these things. + + To protect your rights, we need to make restrictions that forbid +distributors to deny you these rights or to ask you to surrender these +rights. These restrictions translate to certain responsibilities for +you if you distribute copies of the library or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link other code with the library, you must provide +complete object files to the recipients, so that they can relink them +with the library after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + We protect your rights with a two-step method: (1) we copyright the +library, and (2) we offer you this license, which gives you legal +permission to copy, distribute and/or modify the library. + + To protect each distributor, we want to make it very clear that +there is no warranty for the free library. Also, if the library is +modified by someone else and passed on, the recipients should know +that what they have is not the original version, so that the original +author's reputation will not be affected by problems that might be +introduced by others. + + Finally, software patents pose a constant threat to the existence of +any free program. We wish to make sure that a company cannot +effectively restrict the users of a free program by obtaining a +restrictive license from a patent holder. Therefore, we insist that +any patent license obtained for a version of the library must be +consistent with the full freedom of use specified in this license. + + Most GNU software, including some libraries, is covered by the +ordinary GNU General Public License. This license, the GNU Lesser +General Public License, applies to certain designated libraries, and +is quite different from the ordinary General Public License. We use +this license for certain libraries
[mesos] branch master updated: Fix `link::lo()` to deal with `None` value.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 0dbb535 Fix `link::lo()` to deal with `None` value. 0dbb535 is described below commit 0dbb535fa7a6790d824c75df1b79917343403115 Author: Pavel Kirillov <49415588+pashakiril...@users.noreply.github.com> AuthorDate: Tue Apr 9 15:44:24 2019 +0200 Fix `link::lo()` to deal with `None` value. Some network links may cause `link::internal::test(link, IFF_LOOPBACK)` to be `None`. `Link::lo` should deal with `None` values of test var. This closes #330 --- src/linux/routing/link/link.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/linux/routing/link/link.cpp b/src/linux/routing/link/link.cpp index 9dc30c5..bff172d 100644 --- a/src/linux/routing/link/link.cpp +++ b/src/linux/routing/link/link.cpp @@ -99,7 +99,7 @@ Result lo() Result test = link::internal::test(link, IFF_LOOPBACK); if (test.isError()) { return Error("Failed to check the flag on link: " + link); -} else if (test.get()) { +} else if (test.isSome() && test.get()) { return link; } }
[mesos] 02/02: Added MESOS-9707 to the 1.8.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 61764bde84577090c8eab504c4b1f823634b36cc Author: Andrei Budnik AuthorDate: Tue Apr 9 17:05:11 2019 +0200 Added MESOS-9707 to the 1.8.0 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index a02cb7f..526be71 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -193,6 +193,7 @@ All Resolved Issues: * [MESOS-9691] - Quota headroom calculation is off when subroles are involved. * [MESOS-9692] - Quota may be under allocated for disk resources. * [MESOS-9696] - Test MasterQuotaTest.AvailableResourcesSingleDisconnectedAgent is flaky + * [MESOS-9707] - Calling link::lo() may cause runtime error ** Epic * [MESOS-8054] - Feedback for operations
[mesos] 01/02: Fix `link::lo()` to deal with `None` value.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 7d7d8db9b9963e9a079bee54f06e3aac97f7910c Author: Pavel Kirillov <49415588+pashakiril...@users.noreply.github.com> AuthorDate: Tue Apr 9 15:44:24 2019 +0200 Fix `link::lo()` to deal with `None` value. Some network links may cause `link::internal::test(link, IFF_LOOPBACK)` to be `None`. `Link::lo` should deal with `None` values of test var. This closes #330 --- src/linux/routing/link/link.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/linux/routing/link/link.cpp b/src/linux/routing/link/link.cpp index 9dc30c5..bff172d 100644 --- a/src/linux/routing/link/link.cpp +++ b/src/linux/routing/link/link.cpp @@ -99,7 +99,7 @@ Result lo() Result test = link::internal::test(link, IFF_LOOPBACK); if (test.isError()) { return Error("Failed to check the flag on link: " + link); -} else if (test.get()) { +} else if (test.isSome() && test.get()) { return link; } }
[mesos] 02/02: Added MESOS-9707 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 9f22a3b8cad9a7e171752c898a0bb65f2a9021a9 Author: Andrei Budnik AuthorDate: Tue Apr 9 17:19:11 2019 +0200 Added MESOS-9707 to the 1.7.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 293a6a9..6dee2da 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -12,6 +12,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP) * [MESOS-9610] - Fetcher vulnerability - escaping from sandbox. * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations. * [MESOS-9692] - Quota may be under allocated for disk resources. + * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvements * [MESOS-8880] - Add minimum capabilities in the master.
[mesos] branch 1.7.x updated (fc12da5 -> 9f22a3b)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from fc12da5 Fixed compile error with a built-in logrotate module. new 8ac30a5 Fix `link::lo()` to deal with `None` value. new 9f22a3b Added MESOS-9707 to the 1.7.3 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/linux/routing/link/link.cpp | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
[mesos] 01/02: Fix `link::lo()` to deal with `None` value.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 8ac30a58373b4575892cd8647bd5f731cd31ffd6 Author: Pavel Kirillov <49415588+pashakiril...@users.noreply.github.com> AuthorDate: Tue Apr 9 15:44:24 2019 +0200 Fix `link::lo()` to deal with `None` value. Some network links may cause `link::internal::test(link, IFF_LOOPBACK)` to be `None`. `Link::lo` should deal with `None` values of test var. This closes #330 --- src/linux/routing/link/link.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/linux/routing/link/link.cpp b/src/linux/routing/link/link.cpp index 9dc30c5..bff172d 100644 --- a/src/linux/routing/link/link.cpp +++ b/src/linux/routing/link/link.cpp @@ -99,7 +99,7 @@ Result lo() Result test = link::internal::test(link, IFF_LOOPBACK); if (test.isError()) { return Error("Failed to check the flag on link: " + link); -} else if (test.get()) { +} else if (test.isSome() && test.get()) { return link; } }
[mesos] branch 1.6.x updated (34ffae5 -> 80c0673)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 34ffae5 Added MESOS-9529 to 1.6.3 CHANGELOG. new ee630c7 Fix `link::lo()` to deal with `None` value. new 80c0673 Added MESOS-9707 to the 1.6.3 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/linux/routing/link/link.cpp | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
[mesos] 01/02: Fix `link::lo()` to deal with `None` value.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit ee630c7f8db2e59ca92f286f246bd1b5b4f39d21 Author: Pavel Kirillov <49415588+pashakiril...@users.noreply.github.com> AuthorDate: Tue Apr 9 15:44:24 2019 +0200 Fix `link::lo()` to deal with `None` value. Some network links may cause `link::internal::test(link, IFF_LOOPBACK)` to be `None`. `Link::lo` should deal with `None` values of test var. This closes #330 --- src/linux/routing/link/link.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/linux/routing/link/link.cpp b/src/linux/routing/link/link.cpp index 9dc30c5..bff172d 100644 --- a/src/linux/routing/link/link.cpp +++ b/src/linux/routing/link/link.cpp @@ -99,7 +99,7 @@ Result lo() Result test = link::internal::test(link, IFF_LOOPBACK); if (test.isError()) { return Error("Failed to check the flag on link: " + link); -} else if (test.get()) { +} else if (test.isSome() && test.get()) { return link; } }
[mesos] 02/02: Added MESOS-9707 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 80c0673f1844771b6f4baf0433ba9ac27479e722 Author: Andrei Budnik AuthorDate: Tue Apr 9 17:22:35 2019 +0200 Added MESOS-9707 to the 1.6.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 8d22141..392fd25 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -7,6 +7,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP) * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace. * [MESOS-9692] - Quota may be under allocated for disk resources. + * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvement * [MESOS-8880] - Add minimum capabilities in the master.
[mesos] 04/04: Added MESOS-9707 to the 1.8.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 1d4858e991a60d0742c317008ff59f151dcb9d58 Author: Andrei Budnik AuthorDate: Tue Apr 9 17:05:11 2019 +0200 Added MESOS-9707 to the 1.8.0 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index a3d6ff4..0c0e30f 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -193,6 +193,7 @@ All Resolved Issues: * [MESOS-9691] - Quota headroom calculation is off when subroles are involved. * [MESOS-9692] - Quota may be under allocated for disk resources. * [MESOS-9696] - Test MasterQuotaTest.AvailableResourcesSingleDisconnectedAgent is flaky + * [MESOS-9707] - Calling link::lo() may cause runtime error ** Epic * [MESOS-8054] - Feedback for operations
[mesos] 03/04: Added MESOS-9707 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit b997c97a7d16770b012cd29d9a549f3e677a2c99 Author: Andrei Budnik AuthorDate: Tue Apr 9 17:19:11 2019 +0200 Added MESOS-9707 to the 1.7.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 51c5423..a3d6ff4 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -320,6 +320,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP) * [MESOS-9610] - Fetcher vulnerability - escaping from sandbox. * [MESOS-9661] - Agent crashes when SLRP recovers dropped operations. * [MESOS-9692] - Quota may be under allocated for disk resources. + * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvements * [MESOS-8880] - Add minimum capabilities in the master.
[mesos] 01/04: Added MESOS-9707 to the 1.5.4 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 39be632063d650f0017451dd2287f84f9f49ab82 Author: Andrei Budnik AuthorDate: Tue Apr 9 17:26:16 2019 +0200 Added MESOS-9707 to the 1.5.4 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index a02cb7f..88909b9 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -1240,6 +1240,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP) ** Bug * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. + * [MESOS-9707] - Calling link::lo() may cause runtime error Release Notes - Mesos - Version 1.5.3
[mesos] 02/04: Added MESOS-9707 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 806cc6c89a1547bcaf21bef695bb20bbedeba7d3 Author: Andrei Budnik AuthorDate: Tue Apr 9 17:22:35 2019 +0200 Added MESOS-9707 to the 1.6.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 88909b9..51c5423 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -782,6 +782,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP) * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. * [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace. * [MESOS-9692] - Quota may be under allocated for disk resources. + * [MESOS-9707] - Calling link::lo() may cause runtime error ** Improvement * [MESOS-8880] - Add minimum capabilities in the master.
[mesos] branch master updated (0dbb535 -> 1d4858e)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from 0dbb535 Fix `link::lo()` to deal with `None` value. new 39be632 Added MESOS-9707 to the 1.5.4 CHANGELOG. new 806cc6c Added MESOS-9707 to the 1.6.3 CHANGELOG. new b997c97 Added MESOS-9707 to the 1.7.3 CHANGELOG. new 1d4858e Added MESOS-9707 to the 1.8.0 CHANGELOG. The 4 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 4 1 file changed, 4 insertions(+)
[mesos] branch 1.8.x updated (b5fa022 -> 61764bd)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from b5fa022 Added MESOS-9623 to the 1.8.0 CHANGELOG. new 7d7d8db Fix `link::lo()` to deal with `None` value. new 61764bd Added MESOS-9707 to the 1.8.0 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/linux/routing/link/link.cpp | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
[mesos] branch 1.5.x updated (136178e -> 89e9505)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 136178e Added MESOS-9529 to 1.5.4 CHANGELOG. new d24d0c7 Fix `link::lo()` to deal with `None` value. new 89e9505 Added MESOS-9707 to the 1.5.4 CHANGELOG. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/linux/routing/link/link.cpp | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
[mesos] 01/02: Fix `link::lo()` to deal with `None` value.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit d24d0c71f6e4b6f5c8b98c0f90ec2728fc0372b2 Author: Pavel Kirillov <49415588+pashakiril...@users.noreply.github.com> AuthorDate: Tue Apr 9 15:44:24 2019 +0200 Fix `link::lo()` to deal with `None` value. Some network links may cause `link::internal::test(link, IFF_LOOPBACK)` to be `None`. `Link::lo` should deal with `None` values of test var. This closes #330 --- src/linux/routing/link/link.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/linux/routing/link/link.cpp b/src/linux/routing/link/link.cpp index 5388a3d..ef83bc8 100644 --- a/src/linux/routing/link/link.cpp +++ b/src/linux/routing/link/link.cpp @@ -99,7 +99,7 @@ Result lo() Result test = link::internal::test(link, IFF_LOOPBACK); if (test.isError()) { return Error("Failed to check the flag on link: " + link); -} else if (test.get()) { +} else if (test.isSome() && test.get()) { return link; } }
[mesos] 02/02: Added MESOS-9707 to the 1.5.4 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 89e9505637726f38cf115c5ed52dec9f2a4a17e0 Author: Andrei Budnik AuthorDate: Tue Apr 9 17:26:16 2019 +0200 Added MESOS-9707 to the 1.5.4 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index ba2f739..023c927 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -4,6 +4,7 @@ Release Notes - Mesos - Version 1.5.4 (WIP) ** Bug * [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true. + * [MESOS-9707] - Calling link::lo() may cause runtime error Release Notes - Mesos - Version 1.5.3
[mesos] branch 1.9.x updated: Added /containerizer/debug endpoint to 1.9.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.9.x in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/1.9.x by this push: new bfcd446 Added /containerizer/debug endpoint to 1.9.0 CHANGELOG. bfcd446 is described below commit bfcd4461aaf51444398783a511f973a3a1d9a101 Author: Andrei Budnik AuthorDate: Wed Aug 28 12:10:34 2019 +0200 Added /containerizer/debug endpoint to 1.9.0 CHANGELOG. Review: https://reviews.apache.org/r/71387 --- CHANGELOG | 5 + 1 file changed, 5 insertions(+) diff --git a/CHANGELOG b/CHANGELOG index aaad58f..307f116 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -36,6 +36,11 @@ This release contains the following highlights: * The Mesos containerizer now includes ephemeral overlayfs storage in the task disk quota as well as sandbox storage. (MESOS-9900) +* A new `/containerizer/debug` HTTP endpoint has been added. This endpoint + exposes debug information for the Mesos containerizer. At the moment, it + returns a list of pending operations related to Isolators and Launchers. + (MESOS-9756) + Additional API Changes: * Mesos components will now forego TLS certificate validation for incoming
[mesos] branch master updated: Added /containerizer/debug endpoint to 1.9.0 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 279af61 Added /containerizer/debug endpoint to 1.9.0 CHANGELOG. 279af61 is described below commit 279af615b3413f2ff42f96c180189f37d05d1f5b Author: Andrei Budnik AuthorDate: Wed Aug 28 12:10:34 2019 +0200 Added /containerizer/debug endpoint to 1.9.0 CHANGELOG. Review: https://reviews.apache.org/r/71387 --- CHANGELOG | 5 + 1 file changed, 5 insertions(+) diff --git a/CHANGELOG b/CHANGELOG index aaad58f..307f116 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -36,6 +36,11 @@ This release contains the following highlights: * The Mesos containerizer now includes ephemeral overlayfs storage in the task disk quota as well as sandbox storage. (MESOS-9900) +* A new `/containerizer/debug` HTTP endpoint has been added. This endpoint + exposes debug information for the Mesos containerizer. At the moment, it + returns a list of pending operations related to Isolators and Launchers. + (MESOS-9756) + Additional API Changes: * Mesos components will now forego TLS certificate validation for incoming
[mesos] branch master updated (48c20bf -> f0be237)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from 48c20bf Updated site's dependencies. new 8aae23e Added missing `return` statement in `Slave::statusUpdate`. new f0be237 Fixed out-of-order processing of terminal status updates in agent. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/slave/slave.cpp | 64 ++--- src/slave/slave.hpp | 6 + 2 files changed, 67 insertions(+), 3 deletions(-)
[mesos] 01/02: Added missing `return` statement in `Slave::statusUpdate`.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 8aae23ec7cd4bc50532df0b1d1ea6ec23ce078f8 Author: Andrei Budnik AuthorDate: Fri Aug 23 14:36:18 2019 +0200 Added missing `return` statement in `Slave::statusUpdate`. Previously, if `statusUpdate` was called for a pending task, it would forward the status update and then continue executing `statusUpdate`, which then checks if there is an executor that is aware of this task. Given that a pending task is not known to any executor, it would always handle it by forwarding status update one more time. This patch adds missing `return` statement, which fixes the issue. Review: https://reviews.apache.org/r/71361 --- src/slave/slave.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp index 882040d..45f1584 100644 --- a/src/slave/slave.cpp +++ b/src/slave/slave.cpp @@ -5879,6 +5879,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option& pid) taskStatusUpdateManager->update(update, info.id()) .onAny(defer(self(), ::___statusUpdate, lambda::_1, update, pid)); + +return; } Executor* executor = framework->getExecutor(status.task_id());
[mesos] 02/02: Fixed out-of-order processing of terminal status updates in agent.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit f0be23765531b05661ed7f1b124faf96744aa80b Author: Andrei Budnik AuthorDate: Tue Aug 20 19:24:44 2019 +0200 Fixed out-of-order processing of terminal status updates in agent. Previously, Mesos agent could send TASK_FAILED status update on executor termination while processing of TASK_FINISHED status update was in progress. Processing of task status updates involves sending requests to the containerizer, which might finish processing of these requests out-of-order, e.g. `MesosContainerizer::status`. Also, the agent does not overwrite status of the terminal status update once it's stored in the `terminatedTasks`. Hence, there was a race condition between two terminal status updates. Note that V1 Executors are not affected by this problem because they wait for an acknowledgement of the terminal status update by the agent before terminating. This patch introduces a new data structure `pendingStatusUpdates`, which holds a list of status updates that are being processed. This data structure allows validating the order of processing of status updates by the agent. Review: https://reviews.apache.org/r/71343 --- src/slave/slave.cpp | 62 ++--- src/slave/slave.hpp | 6 ++ 2 files changed, 65 insertions(+), 3 deletions(-) diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp index 45f1584..4e93656 100644 --- a/src/slave/slave.cpp +++ b/src/slave/slave.cpp @@ -5947,6 +5947,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option& pid) metrics.valid_status_updates++; + executor->addPendingTaskStatus(status); + // Before sending update, we need to retrieve the container status // if the task reached the executor. For tasks that are queued, we // do not need to send the container status and we must @@ -6158,6 +6160,17 @@ void Slave::___statusUpdate( VLOG(1) << "Task status update manager successfully handled status update " << update; + const TaskStatus& status = update.status(); + + Executor* executor = nullptr; + Framework* framework = getFramework(update.framework_id()); + if (framework != nullptr) { +executor = framework->getExecutor(status.task_id()); +if (executor != nullptr) { + executor->removePendingTaskStatus(status); +} + } + if (pid == UPID()) { return; } @@ -6165,7 +6178,7 @@ void Slave::___statusUpdate( StatusUpdateAcknowledgementMessage message; message.mutable_framework_id()->MergeFrom(update.framework_id()); message.mutable_slave_id()->MergeFrom(update.slave_id()); - message.mutable_task_id()->MergeFrom(update.status().task_id()); + message.mutable_task_id()->MergeFrom(status.task_id()); message.set_uuid(update.uuid()); // Task status update manager successfully handled the status update. @@ -6177,14 +6190,12 @@ void Slave::___statusUpdate( send(pid.get(), message); } else { // Acknowledge the HTTP based executor. -Framework* framework = getFramework(update.framework_id()); if (framework == nullptr) { LOG(WARNING) << "Ignoring sending acknowledgement for status update " << update << " of unknown framework"; return; } -Executor* executor = framework->getExecutor(update.status().task_id()); if (executor == nullptr) { // Refer to the comments in 'statusUpdate()' on when this can // happen. @@ -10795,6 +10806,33 @@ void Executor::recoverTask(const TaskState& state, bool recheckpointTask) } +void Executor::addPendingTaskStatus(const TaskStatus& status) +{ + auto uuid = id::UUID::fromBytes(status.uuid()).get(); + pendingStatusUpdates[status.task_id()][uuid] = status; +} + + +void Executor::removePendingTaskStatus(const TaskStatus& status) +{ + const TaskID& taskId = status.task_id(); + + auto uuid = id::UUID::fromBytes(status.uuid()).get(); + + if (!pendingStatusUpdates.contains(taskId) || + !pendingStatusUpdates[taskId].contains(uuid)) { +LOG(WARNING) << "Unknown pending status update (uuid: " << uuid << ")"; +return; + } + + pendingStatusUpdates[taskId].erase(uuid); + + if (pendingStatusUpdates[taskId].empty()) { +pendingStatusUpdates.erase(taskId); + } +} + + Try Executor::updateTaskState(const TaskStatus& status) { bool terminal = protobuf::isTerminalState(status.state()); @@ -10818,6 +10856,24 @@ Try Executor::updateTaskState(const TaskStatus& status) task = launchedTasks.at(status.task_id()); if (terminal) { + if (pendingStatusUpdates.contains(status.task_id())) { +auto statusUpdates = pending
[mesos] 01/03: Added missing `return` statement in `Slave::statusUpdate`.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit cc79f22fb07cfad8f248150d5a3040f846998c3a Author: Andrei Budnik AuthorDate: Fri Aug 23 14:36:18 2019 +0200 Added missing `return` statement in `Slave::statusUpdate`. Previously, if `statusUpdate` was called for a pending task, it would forward the status update and then continue executing `statusUpdate`, which then checks if there is an executor that is aware of this task. Given that a pending task is not known to any executor, it would always handle it by forwarding status update one more time. This patch adds missing `return` statement, which fixes the issue. Review: https://reviews.apache.org/r/71361 --- src/slave/slave.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp index 2a90e96..176d3fb 100644 --- a/src/slave/slave.cpp +++ b/src/slave/slave.cpp @@ -5388,6 +5388,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option& pid) taskStatusUpdateManager->update(update, info.id()) .onAny(defer(self(), ::___statusUpdate, lambda::_1, update, pid)); + +return; } Executor* executor = framework->getExecutor(status.task_id());
[mesos] 02/03: Fixed out-of-order processing of terminal status updates in agent.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 3ad802ebbe34565a2fa995d834ba4928c20e5e62 Author: Andrei Budnik AuthorDate: Tue Aug 20 19:24:44 2019 +0200 Fixed out-of-order processing of terminal status updates in agent. Previously, Mesos agent could send TASK_FAILED status update on executor termination while processing of TASK_FINISHED status update was in progress. Processing of task status updates involves sending requests to the containerizer, which might finish processing of these requests out-of-order, e.g. `MesosContainerizer::status`. Also, the agent does not overwrite status of the terminal status update once it's stored in the `terminatedTasks`. Hence, there was a race condition between two terminal status updates. Note that V1 Executors are not affected by this problem because they wait for an acknowledgement of the terminal status update by the agent before terminating. This patch introduces a new data structure `pendingStatusUpdates`, which holds a list of status updates that are being processed. This data structure allows validating the order of processing of status updates by the agent. Review: https://reviews.apache.org/r/71343 --- src/slave/slave.cpp | 62 ++--- src/slave/slave.hpp | 6 ++ 2 files changed, 65 insertions(+), 3 deletions(-) diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp index 176d3fb..0861ac2 100644 --- a/src/slave/slave.cpp +++ b/src/slave/slave.cpp @@ -5456,6 +5456,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option& pid) metrics.valid_status_updates++; + executor->addPendingTaskStatus(status); + // Before sending update, we need to retrieve the container status // if the task reached the executor. For tasks that are queued, we // do not need to send the container status and we must @@ -5667,6 +5669,17 @@ void Slave::___statusUpdate( VLOG(1) << "Task status update manager successfully handled status update " << update; + const TaskStatus& status = update.status(); + + Executor* executor = nullptr; + Framework* framework = getFramework(update.framework_id()); + if (framework != nullptr) { +executor = framework->getExecutor(status.task_id()); +if (executor != nullptr) { + executor->removePendingTaskStatus(status); +} + } + if (pid == UPID()) { return; } @@ -5674,7 +5687,7 @@ void Slave::___statusUpdate( StatusUpdateAcknowledgementMessage message; message.mutable_framework_id()->MergeFrom(update.framework_id()); message.mutable_slave_id()->MergeFrom(update.slave_id()); - message.mutable_task_id()->MergeFrom(update.status().task_id()); + message.mutable_task_id()->MergeFrom(status.task_id()); message.set_uuid(update.uuid()); // Task status update manager successfully handled the status update. @@ -5686,14 +5699,12 @@ void Slave::___statusUpdate( send(pid.get(), message); } else { // Acknowledge the HTTP based executor. -Framework* framework = getFramework(update.framework_id()); if (framework == nullptr) { LOG(WARNING) << "Ignoring sending acknowledgement for status update " << update << " of unknown framework"; return; } -Executor* executor = framework->getExecutor(update.status().task_id()); if (executor == nullptr) { // Refer to the comments in 'statusUpdate()' on when this can // happen. @@ -9759,6 +9770,33 @@ void Executor::recoverTask(const TaskState& state, bool recheckpointTask) } +void Executor::addPendingTaskStatus(const TaskStatus& status) +{ + auto uuid = id::UUID::fromBytes(status.uuid()).get(); + pendingStatusUpdates[status.task_id()][uuid] = status; +} + + +void Executor::removePendingTaskStatus(const TaskStatus& status) +{ + const TaskID& taskId = status.task_id(); + + auto uuid = id::UUID::fromBytes(status.uuid()).get(); + + if (!pendingStatusUpdates.contains(taskId) || + !pendingStatusUpdates[taskId].contains(uuid)) { +LOG(WARNING) << "Unknown pending status update (uuid: " << uuid << ")"; +return; + } + + pendingStatusUpdates[taskId].erase(uuid); + + if (pendingStatusUpdates[taskId].empty()) { +pendingStatusUpdates.erase(taskId); + } +} + + Try Executor::updateTaskState(const TaskStatus& status) { bool terminal = protobuf::isTerminalState(status.state()); @@ -9782,6 +9820,24 @@ Try Executor::updateTaskState(const TaskStatus& status) task = launchedTasks.at(status.task_id()); if (terminal) { + if (pendingStatusUpdates.contains(status.task_id())) { +auto statusUpdates = pending
[mesos] branch 1.6.x updated (9badb3b -> d77029f)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 9badb3b Added MESOS-9836 to the 1.6.3 CHANGELOG. new cc79f22 Added missing `return` statement in `Slave::statusUpdate`. new 3ad802e Fixed out-of-order processing of terminal status updates in agent. new d77029f Added MESOS-9887 to the 1.6.3 CHANGELOG. The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/slave/slave.cpp | 64 ++--- src/slave/slave.hpp | 6 + 3 files changed, 68 insertions(+), 3 deletions(-)
[mesos] branch master updated (f0be237 -> 24e989e)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from f0be237 Fixed out-of-order processing of terminal status updates in agent. new 967f105 Added MESOS-9887 to the 1.8.2 CHANGELOG. new 6b2d101 Added MESOS-9887 to the 1.7.3 CHANGELOG. new 24e989e Added MESOS-9887 to the 1.6.3 CHANGELOG. The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 3 +++ 1 file changed, 3 insertions(+)
[mesos] 02/03: Added MESOS-9887 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 6b2d101770ae8853d021b8cc5d0f5ae587302a54 Author: Andrei Budnik AuthorDate: Mon Aug 26 14:58:45 2019 +0200 Added MESOS-9887 to the 1.7.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index a215e5c..3cd7661 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -504,6 +504,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP) * [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework. * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct. * [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master. + * [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor. * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed. * [MESOS-9925] - Default executor takes a couple of seconds to start and subscribe Mesos agent.
[mesos] 03/03: Added MESOS-9887 to the 1.6.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 24e989e66507932809c7d852a4a62720de7cb27b Author: Andrei Budnik AuthorDate: Mon Aug 26 14:44:54 2019 +0200 Added MESOS-9887 to the 1.6.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 3cd7661..58cf418 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -986,6 +986,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP) * [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework. * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct. * [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master. + * [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor. * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed. ** Improvement
[mesos] 01/03: Added MESOS-9887 to the 1.8.2 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 967f105cea4bc31780bfe76bd2d62ad71ffae221 Author: Andrei Budnik AuthorDate: Mon Aug 26 15:02:40 2019 +0200 Added MESOS-9887 to the 1.8.2 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index fe08b76..a215e5c 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -49,6 +49,7 @@ Release Notes - Mesos - Version 1.8.2 (WIP) * [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers. * [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups. * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct. + * [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor. * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed. * [MESOS-9925] - Default executor takes a couple of seconds to start and subscribe Mesos agent.
[mesos] 02/03: Fixed out-of-order processing of terminal status updates in agent.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit b7dcc984476904d6d17f7bf699295dfa9ac8a66e Author: Andrei Budnik AuthorDate: Tue Aug 20 19:24:44 2019 +0200 Fixed out-of-order processing of terminal status updates in agent. Previously, Mesos agent could send TASK_FAILED status update on executor termination while processing of TASK_FINISHED status update was in progress. Processing of task status updates involves sending requests to the containerizer, which might finish processing of these requests out-of-order, e.g. `MesosContainerizer::status`. Also, the agent does not overwrite status of the terminal status update once it's stored in the `terminatedTasks`. Hence, there was a race condition between two terminal status updates. Note that V1 Executors are not affected by this problem because they wait for an acknowledgement of the terminal status update by the agent before terminating. This patch introduces a new data structure `pendingStatusUpdates`, which holds a list of status updates that are being processed. This data structure allows validating the order of processing of status updates by the agent. Review: https://reviews.apache.org/r/71343 --- src/slave/slave.cpp | 62 ++--- src/slave/slave.hpp | 6 ++ 2 files changed, 65 insertions(+), 3 deletions(-) diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp index edfe3d0..f10aac2 100644 --- a/src/slave/slave.cpp +++ b/src/slave/slave.cpp @@ -5486,6 +5486,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option& pid) metrics.valid_status_updates++; + executor->addPendingTaskStatus(status); + // Before sending update, we need to retrieve the container status // if the task reached the executor. For tasks that are queued, we // do not need to send the container status and we must @@ -5697,6 +5699,17 @@ void Slave::___statusUpdate( VLOG(1) << "Task status update manager successfully handled status update " << update; + const TaskStatus& status = update.status(); + + Executor* executor = nullptr; + Framework* framework = getFramework(update.framework_id()); + if (framework != nullptr) { +executor = framework->getExecutor(status.task_id()); +if (executor != nullptr) { + executor->removePendingTaskStatus(status); +} + } + if (pid == UPID()) { return; } @@ -5704,7 +5717,7 @@ void Slave::___statusUpdate( StatusUpdateAcknowledgementMessage message; message.mutable_framework_id()->MergeFrom(update.framework_id()); message.mutable_slave_id()->MergeFrom(update.slave_id()); - message.mutable_task_id()->MergeFrom(update.status().task_id()); + message.mutable_task_id()->MergeFrom(status.task_id()); message.set_uuid(update.uuid()); // Task status update manager successfully handled the status update. @@ -5716,14 +5729,12 @@ void Slave::___statusUpdate( send(pid.get(), message); } else { // Acknowledge the HTTP based executor. -Framework* framework = getFramework(update.framework_id()); if (framework == nullptr) { LOG(WARNING) << "Ignoring sending acknowledgement for status update " << update << " of unknown framework"; return; } -Executor* executor = framework->getExecutor(update.status().task_id()); if (executor == nullptr) { // Refer to the comments in 'statusUpdate()' on when this can // happen. @@ -9861,6 +9872,33 @@ void Executor::recoverTask(const TaskState& state, bool recheckpointTask) } +void Executor::addPendingTaskStatus(const TaskStatus& status) +{ + auto uuid = id::UUID::fromBytes(status.uuid()).get(); + pendingStatusUpdates[status.task_id()][uuid] = status; +} + + +void Executor::removePendingTaskStatus(const TaskStatus& status) +{ + const TaskID& taskId = status.task_id(); + + auto uuid = id::UUID::fromBytes(status.uuid()).get(); + + if (!pendingStatusUpdates.contains(taskId) || + !pendingStatusUpdates[taskId].contains(uuid)) { +LOG(WARNING) << "Unknown pending status update (uuid: " << uuid << ")"; +return; + } + + pendingStatusUpdates[taskId].erase(uuid); + + if (pendingStatusUpdates[taskId].empty()) { +pendingStatusUpdates.erase(taskId); + } +} + + Try Executor::updateTaskState(const TaskStatus& status) { bool terminal = protobuf::isTerminalState(status.state()); @@ -9884,6 +9922,24 @@ Try Executor::updateTaskState(const TaskStatus& status) task = launchedTasks.at(status.task_id()); if (terminal) { + if (pendingStatusUpdates.contains(status.task_id())) { +auto statusUpdates = pending
[mesos] 01/03: Added missing `return` statement in `Slave::statusUpdate`.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 2d62e8ae0ef94f78c9b32be258a08d1e6e2382df Author: Andrei Budnik AuthorDate: Fri Aug 23 14:36:18 2019 +0200 Added missing `return` statement in `Slave::statusUpdate`. Previously, if `statusUpdate` was called for a pending task, it would forward the status update and then continue executing `statusUpdate`, which then checks if there is an executor that is aware of this task. Given that a pending task is not known to any executor, it would always handle it by forwarding status update one more time. This patch adds missing `return` statement, which fixes the issue. Review: https://reviews.apache.org/r/71361 --- src/slave/slave.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp index 1c33579..edfe3d0 100644 --- a/src/slave/slave.cpp +++ b/src/slave/slave.cpp @@ -5418,6 +5418,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option& pid) taskStatusUpdateManager->update(update, info.id()) .onAny(defer(self(), ::___statusUpdate, lambda::_1, update, pid)); + +return; } Executor* executor = framework->getExecutor(status.task_id());
[mesos] 03/03: Added MESOS-9887 to the 1.7.3 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 80d42b9a2c9223665a82bbaaf3cbc222a094e2ef Author: Andrei Budnik AuthorDate: Mon Aug 26 14:58:45 2019 +0200 Added MESOS-9887 to the 1.7.3 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index 06c88db..1178228 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -29,6 +29,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP) * [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework. * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct. * [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master. + * [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor. * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed. * [MESOS-9925] - Default executor takes a couple of seconds to start and subscribe Mesos agent.
[mesos] 03/03: Added MESOS-9887 to the 1.8.2 CHANGELOG.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit adc958f553c3728aab5529de56b0ddc30c0f9b68 Author: Andrei Budnik AuthorDate: Mon Aug 26 15:02:40 2019 +0200 Added MESOS-9887 to the 1.8.2 CHANGELOG. --- CHANGELOG | 1 + 1 file changed, 1 insertion(+) diff --git a/CHANGELOG b/CHANGELOG index b3fca25..ff89605 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -6,6 +6,7 @@ Release Notes - Mesos - Version 1.8.2 (WIP) * [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers. * [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups. * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct. + * [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor. * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed. * [MESOS-9925] - Default executor takes a couple of seconds to start and subscribe Mesos agent.
[mesos] 01/03: Added missing `return` statement in `Slave::statusUpdate`.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 14abb82925cdbce746238bc20dc7b8c279a96a67 Author: Andrei Budnik AuthorDate: Fri Aug 23 14:36:18 2019 +0200 Added missing `return` statement in `Slave::statusUpdate`. Previously, if `statusUpdate` was called for a pending task, it would forward the status update and then continue executing `statusUpdate`, which then checks if there is an executor that is aware of this task. Given that a pending task is not known to any executor, it would always handle it by forwarding status update one more time. This patch adds missing `return` statement, which fixes the issue. Review: https://reviews.apache.org/r/71361 --- src/slave/slave.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp index bf87be0..50a7d68 100644 --- a/src/slave/slave.cpp +++ b/src/slave/slave.cpp @@ -5659,6 +5659,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option& pid) taskStatusUpdateManager->update(update, info.id()) .onAny(defer(self(), ::___statusUpdate, lambda::_1, update, pid)); + +return; } Executor* executor = framework->getExecutor(status.task_id());
[mesos] branch 1.8.x updated (f3aa802 -> adc958f)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from f3aa802 Added MESOS-9836 to the 1.8.2 CHANGELOG. new 14abb82 Added missing `return` statement in `Slave::statusUpdate`. new 4bbb037 Fixed out-of-order processing of terminal status updates in agent. new adc958f Added MESOS-9887 to the 1.8.2 CHANGELOG. The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGELOG | 1 + src/slave/slave.cpp | 64 ++--- src/slave/slave.hpp | 6 + 3 files changed, 68 insertions(+), 3 deletions(-)
[mesos] 02/03: Fixed out-of-order processing of terminal status updates in agent.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 4bbb0376cd584a4160a2c5c2f0ac4f3ecaa5e622 Author: Andrei Budnik AuthorDate: Tue Aug 20 19:24:44 2019 +0200 Fixed out-of-order processing of terminal status updates in agent. Previously, Mesos agent could send TASK_FAILED status update on executor termination while processing of TASK_FINISHED status update was in progress. Processing of task status updates involves sending requests to the containerizer, which might finish processing of these requests out-of-order, e.g. `MesosContainerizer::status`. Also, the agent does not overwrite status of the terminal status update once it's stored in the `terminatedTasks`. Hence, there was a race condition between two terminal status updates. Note that V1 Executors are not affected by this problem because they wait for an acknowledgement of the terminal status update by the agent before terminating. This patch introduces a new data structure `pendingStatusUpdates`, which holds a list of status updates that are being processed. This data structure allows validating the order of processing of status updates by the agent. Review: https://reviews.apache.org/r/71343 --- src/slave/slave.cpp | 62 ++--- src/slave/slave.hpp | 6 ++ 2 files changed, 65 insertions(+), 3 deletions(-) diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp index 50a7d68..8d8cef3 100644 --- a/src/slave/slave.cpp +++ b/src/slave/slave.cpp @@ -5727,6 +5727,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option& pid) metrics.valid_status_updates++; + executor->addPendingTaskStatus(status); + // Before sending update, we need to retrieve the container status // if the task reached the executor. For tasks that are queued, we // do not need to send the container status and we must @@ -5938,6 +5940,17 @@ void Slave::___statusUpdate( VLOG(1) << "Task status update manager successfully handled status update " << update; + const TaskStatus& status = update.status(); + + Executor* executor = nullptr; + Framework* framework = getFramework(update.framework_id()); + if (framework != nullptr) { +executor = framework->getExecutor(status.task_id()); +if (executor != nullptr) { + executor->removePendingTaskStatus(status); +} + } + if (pid == UPID()) { return; } @@ -5945,7 +5958,7 @@ void Slave::___statusUpdate( StatusUpdateAcknowledgementMessage message; message.mutable_framework_id()->MergeFrom(update.framework_id()); message.mutable_slave_id()->MergeFrom(update.slave_id()); - message.mutable_task_id()->MergeFrom(update.status().task_id()); + message.mutable_task_id()->MergeFrom(status.task_id()); message.set_uuid(update.uuid()); // Task status update manager successfully handled the status update. @@ -5957,14 +5970,12 @@ void Slave::___statusUpdate( send(pid.get(), message); } else { // Acknowledge the HTTP based executor. -Framework* framework = getFramework(update.framework_id()); if (framework == nullptr) { LOG(WARNING) << "Ignoring sending acknowledgement for status update " << update << " of unknown framework"; return; } -Executor* executor = framework->getExecutor(update.status().task_id()); if (executor == nullptr) { // Refer to the comments in 'statusUpdate()' on when this can // happen. @@ -10520,6 +10531,33 @@ void Executor::recoverTask(const TaskState& state, bool recheckpointTask) } +void Executor::addPendingTaskStatus(const TaskStatus& status) +{ + auto uuid = id::UUID::fromBytes(status.uuid()).get(); + pendingStatusUpdates[status.task_id()][uuid] = status; +} + + +void Executor::removePendingTaskStatus(const TaskStatus& status) +{ + const TaskID& taskId = status.task_id(); + + auto uuid = id::UUID::fromBytes(status.uuid()).get(); + + if (!pendingStatusUpdates.contains(taskId) || + !pendingStatusUpdates[taskId].contains(uuid)) { +LOG(WARNING) << "Unknown pending status update (uuid: " << uuid << ")"; +return; + } + + pendingStatusUpdates[taskId].erase(uuid); + + if (pendingStatusUpdates[taskId].empty()) { +pendingStatusUpdates.erase(taskId); + } +} + + Try Executor::updateTaskState(const TaskStatus& status) { bool terminal = protobuf::isTerminalState(status.state()); @@ -10543,6 +10581,24 @@ Try Executor::updateTaskState(const TaskStatus& status) task = launchedTasks.at(status.task_id()); if (terminal) { + if (pendingStatusUpdates.contains(status.task_id())) { +auto statusUpdates = pending
[mesos] branch master updated: Added `SlaveOptions` for wrapping all parameters of `StartSlave`.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new 6c2a94c Added `SlaveOptions` for wrapping all parameters of `StartSlave`. 6c2a94c is described below commit 6c2a94ca0eca90e6d3517e4400f4529ddce0b84c Author: Andrei Budnik AuthorDate: Mon Sep 2 17:15:52 2019 +0200 Added `SlaveOptions` for wrapping all parameters of `StartSlave`. This patch introduces a `SlaveOptions` struct which holds optional parameters accepted by `cluster::Slave::create`. Added an overload of `StartSlave` that accepts `SlaveOptions`. It's a preferred way of creating and starting an instance of `cluster::Slave` in tests, since underlying `cluster::Slave::create` accepts a long list of optional arguments, which might be extended in the future. Review: https://reviews.apache.org/r/71424 --- src/tests/mesos.cpp | 281 +--- src/tests/mesos.hpp | 104 +++ 2 files changed, 153 insertions(+), 232 deletions(-) diff --git a/src/tests/mesos.cpp b/src/tests/mesos.cpp index 0396ce7..e77db22 100644 --- a/src/tests/mesos.cpp +++ b/src/tests/mesos.cpp @@ -334,25 +334,22 @@ Try> MesosTest::StartMaster( } -Try> MesosTest::StartSlave( -MasterDetector* detector, -const Option& flags, -bool mock) +Try> MesosTest::StartSlave(const SlaveOptions& options) { Try> slave = cluster::Slave::create( - detector, - flags.isNone() ? CreateSlaveFlags() : flags.get(), - None(), - None(), - None(), - None(), - None(), - None(), - None(), - None(), - mock); - - if (slave.isSome() && !mock) { + options.detector, + options.flags.isNone() ? CreateSlaveFlags() : options.flags.get(), + options.id, + options.containerizer, + options.gc, + options.taskStatusUpdateManager, + options.resourceEstimator, + options.qosController, + options.secretGenerator, + options.authorizer, + options.mock); + + if (slave.isSome() && !options.mock) { slave.get()->start(); } @@ -362,28 +359,23 @@ Try> MesosTest::StartSlave( Try> MesosTest::StartSlave( MasterDetector* detector, -slave::Containerizer* containerizer, const Option& flags, bool mock) { - Try> slave = cluster::Slave::create( - detector, - flags.isNone() ? CreateSlaveFlags() : flags.get(), - None(), - containerizer, - None(), - None(), - None(), - None(), - None(), - None(), - mock); + return StartSlave(SlaveOptions(detector, mock) +.withFlags(flags)); +} - if (slave.isSome() && !mock) { -slave.get()->start(); - } - return slave; +Try> MesosTest::StartSlave( +MasterDetector* detector, +slave::Containerizer* containerizer, +const Option& flags, +bool mock) +{ + return StartSlave(SlaveOptions(detector, mock) +.withFlags(flags) +.withContainerizer(containerizer)); } @@ -393,24 +385,9 @@ Try> MesosTest::StartSlave( const Option& flags, bool mock) { - Try> slave = cluster::Slave::create( - detector, - flags.isNone() ? CreateSlaveFlags() : flags.get(), - id, - None(), - None(), - None(), - None(), - None(), - None(), - None(), - mock); - - if (slave.isSome() && !mock) { -slave.get()->start(); - } - - return slave; + return StartSlave(SlaveOptions(detector, mock) +.withFlags(flags) +.withId(id)); } @@ -420,17 +397,10 @@ Try> MesosTest::StartSlave( const string& id, const Option& flags) { - Try> slave = cluster::Slave::create( - detector, - flags.isNone() ? CreateSlaveFlags() : flags.get(), - id, - containerizer); - - if (slave.isSome()) { -slave.get()->start(); - } - - return slave; + return StartSlave(SlaveOptions(detector) +.withFlags(flags) +.withId(id) +.withContainerizer(containerizer)); } @@ -440,24 +410,9 @@ Try> MesosTest::StartSlave( const Option& flags, bool mock) { - Try> slave = cluster::Slave::create( - detector, - flags.isNone() ? CreateSlaveFlags() : flags.get(), - None(), - None(), - gc, - None(), - None(), - None(), - None(), - None(), - mock); - - if (slave.isSome() && !mock) { -slave.get()->start(); - } - - return slave; + return StartSlave(SlaveOptions(detector, mock) +.withFlags(flags) +.withGc(gc)); } @@ -466,20 +421,9 @@ Try> MesosTest::StartSlave( mesos::slave::ResourceEstimator* resourceEstimator, const Option& flags) { - Try> slave = cluster::Slave::
[mesos] 01/02: Added `futureTracker` to the `SlaveOptions` in tests.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit dee4b849c8179ea46947c8ea4dd031f6eb37b659 Author: Andrei Budnik AuthorDate: Fri Sep 6 17:01:56 2019 +0200 Added `futureTracker` to the `SlaveOptions` in tests. `PendingFutureTracker` is shared across both Mesos containerizer and the agent, so we need to add an option to be able to start a slave in tests with an instance of the `futureTrack` as a parameter. Review: https://reviews.apache.org/r/71454 --- src/tests/cluster.cpp | 20 +--- src/tests/cluster.hpp | 5 + src/tests/mesos.cpp | 1 + src/tests/mesos.hpp | 8 4 files changed, 27 insertions(+), 7 deletions(-) diff --git a/src/tests/cluster.cpp b/src/tests/cluster.cpp index 1646516..f7bc882 100644 --- a/src/tests/cluster.cpp +++ b/src/tests/cluster.cpp @@ -416,6 +416,7 @@ Try> Slave::create( const Option& qosController, const Option& secretGenerator, const Option& providedAuthorizer, +const Option& futureTracker, bool mock) { process::Owned slave(new Slave()); @@ -447,10 +448,15 @@ Try> Slave::create( } #endif // __WINDOWS__ - Try futureTracker = PendingFutureTracker::create(); - if (futureTracker.isError()) { -return Error( -"Failed to create pending future tracker: " + futureTracker.error()); + // If the future tracker is not provided, create a default one. + if (futureTracker.isNone()) { +Try _futureTracker = PendingFutureTracker::create(); +if (_futureTracker.isError()) { + return Error( + "Failed to create pending future tracker: " + _futureTracker.error()); +} + +slave->futureTracker.reset(_futureTracker.get()); } // If the containerizer is not provided, create a default one. @@ -468,7 +474,7 @@ Try> Slave::create( gc.getOrElse(slave->gc.get()), nullptr, volumeGidManager, - futureTracker.get()); + futureTracker.getOrElse(slave->futureTracker.get())); if (_containerizer.isError()) { return Error("Failed to create containerizer: " + _containerizer.error()); @@ -621,7 +627,7 @@ Try> Slave::create( qosController.getOrElse(slave->qosController.get()), secretGenerator.getOrElse(slave->secretGenerator.get()), volumeGidManager, -futureTracker.get(), +futureTracker.getOrElse(slave->futureTracker.get()), authorizer)); } else { slave->slave.reset(new slave::Slave( @@ -636,7 +642,7 @@ Try> Slave::create( qosController.getOrElse(slave->qosController.get()), secretGenerator.getOrElse(slave->secretGenerator.get()), volumeGidManager, -futureTracker.get(), +futureTracker.getOrElse(slave->futureTracker.get()), authorizer)); } diff --git a/src/tests/cluster.hpp b/src/tests/cluster.hpp index c04ee14..415a60f 100644 --- a/src/tests/cluster.hpp +++ b/src/tests/cluster.hpp @@ -170,6 +170,7 @@ public: const Option& qosController = None(), const Option& secretGenerator = None(), const Option& authorizer = None(), + const Option& futureTracker = None(), bool mock = false); ~Slave(); @@ -227,6 +228,10 @@ private: // of who created it). slave::Containerizer* containerizer = nullptr; + // Pending future tracker must be destroyed last since there may be + // pending requests related to the dependant objects declared below. + process::Owned futureTracker; + // Dependencies that are created by the factory method. process::Owned authorizer; process::Owned ownedContainerizer; diff --git a/src/tests/mesos.cpp b/src/tests/mesos.cpp index e77db22..664c302 100644 --- a/src/tests/mesos.cpp +++ b/src/tests/mesos.cpp @@ -347,6 +347,7 @@ Try> MesosTest::StartSlave(const SlaveOptions& options) options.qosController, options.secretGenerator, options.authorizer, + options.futureTracker, options.mock); if (slave.isSome() && !options.mock) { diff --git a/src/tests/mesos.hpp b/src/tests/mesos.hpp index ecde518..73b1866 100644 --- a/src/tests/mesos.hpp +++ b/src/tests/mesos.hpp @@ -186,6 +186,13 @@ struct SlaveOptions return *this; } + SlaveOptions& withFutureTracker( + const Option& futureTracker) + { +this->futureTracker = futureTracker; +return *this; + } + mesos::master::detector::MasterDetector* detector; bool mock; Option flags; @@ -197,6 +204,7 @@ struct SlaveOptions Option qosController; Option secretGenerator; Option authorizer; + Option futureTracker; };
[mesos] branch master updated (ce801bd -> 1122674)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from ce801bd Fixed documentation on master fault domains. new dee4b84 Added `futureTracker` to the `SlaveOptions` in tests. new 1122674 Implemented an integration test for /containerizer/debug endpoint. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/tests/cluster.cpp | 20 -- src/tests/cluster.hpp | 5 ++ src/tests/mesos.cpp | 1 + src/tests/mesos.hpp | 8 +++ src/tests/slave_tests.cpp | 158 ++ 5 files changed, 185 insertions(+), 7 deletions(-)
[mesos] 02/02: Implemented an integration test for /containerizer/debug endpoint.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 1122674a5c03894e4552d46cfa26dca0557a8f68 Author: Andrei Budnik AuthorDate: Fri Sep 6 13:25:35 2019 +0200 Implemented an integration test for /containerizer/debug endpoint. This test starts an agent with the MockIsolator to intercept calls to its `prepare` method, then it launches a task, which gets stuck. We check that the /containerizer/debug endpoint returns a non-empty list of pending futures including `MockIsolator::prepare`. After setting the promise for the `prepare`, the task successfully starts and we expect for the /containerizer/debug endpoint to return an empty list of pending operations. Review: https://reviews.apache.org/r/71455 --- src/tests/slave_tests.cpp | 158 ++ 1 file changed, 158 insertions(+) diff --git a/src/tests/slave_tests.cpp b/src/tests/slave_tests.cpp index c147bfc..fd4fd6b 100644 --- a/src/tests/slave_tests.cpp +++ b/src/tests/slave_tests.cpp @@ -63,6 +63,7 @@ #endif // USE_SSL_SOCKET #include "common/build.hpp" +#include "common/future_tracker.hpp" #include "common/http.hpp" #include "common/protobuf_utils.hpp" @@ -83,6 +84,8 @@ #include "slave/containerizer/fetcher_process.hpp" #include "slave/containerizer/mesos/containerizer.hpp" +#include "slave/containerizer/mesos/isolator_tracker.hpp" +#include "slave/containerizer/mesos/launcher.hpp" #include "tests/active_user_test_helper.hpp" #include "tests/containerizer.hpp" @@ -94,6 +97,7 @@ #include "tests/resources_utils.hpp" #include "tests/utils.hpp" +#include "tests/containerizer/isolator.hpp" #include "tests/containerizer/mock_containerizer.hpp" using namespace mesos::internal::slave; @@ -112,7 +116,9 @@ using mesos::master::detector::StandaloneMasterDetector; using mesos::v1::resource_provider::Event; using mesos::slave::ContainerConfig; +using mesos::slave::ContainerLaunchInfo; using mesos::slave::ContainerTermination; +using mesos::slave::Isolator; using mesos::v1::scheduler::Call; using mesos::v1::scheduler::Mesos; @@ -2787,6 +2793,158 @@ TEST_F(SlaveTest, ContainersEndpoint) } +// This test ensures that `/containerizer/debug` endpoint returns a non-empty +// list of pending futures when an isolator becomes unresponsive during +// container launch. +TEST_F(SlaveTest, ROOT_ContainerizerDebugEndpoint) +{ + Try> master = StartMaster(); + ASSERT_SOME(master); + + slave::Flags flags = CreateSlaveFlags(); + + Try _launcher = SubprocessLauncher::create(flags); + ASSERT_SOME(_launcher); + + Owned launcher(_launcher.get()); + + MockIsolator* mockIsolator = new MockIsolator(); + + Future prepare; + Promise> promise; + + EXPECT_CALL(*mockIsolator, recover(_, _)) +.WillOnce(Return(Nothing())); + + // Simulate a long prepare from the isolator. + EXPECT_CALL(*mockIsolator, prepare(_, _)) +.WillOnce(DoAll(FutureSatisfy(), +Return(promise.future(; + + EXPECT_CALL(*mockIsolator, update(_, _)) +.WillOnce(Return(Nothing())); + + // Wrap `mockIsolator` in `PendingFutureTracker`. + Try _futureTracker = PendingFutureTracker::create(); + ASSERT_SOME(_futureTracker); + + Owned futureTracker(_futureTracker.get()); + + Owned isolator = Owned(new IsolatorTracker( + Owned(mockIsolator), "MockIsolator", futureTracker.get())); + + Fetcher fetcher(flags); + + Try> provisioner = Provisioner::create(flags); + ASSERT_SOME(provisioner); + + Try create = MesosContainerizer::create( + flags, + true, + , + nullptr, + launcher, + provisioner->share(), + {isolator}); + + ASSERT_SOME(create); + + Owned containerizer(create.get()); + + Owned detector = master.get()->createDetector(); + + Try> slave = +StartSlave(SlaveOptions(detector.get()) + .withContainerizer(containerizer.get()) + .withFutureTracker(futureTracker.get())); + + ASSERT_SOME(slave); + + MockScheduler sched; + MesosSchedulerDriver driver( + , DEFAULT_FRAMEWORK_INFO, master.get()->pid, DEFAULT_CREDENTIAL); + + EXPECT_CALL(sched, registered(, _, _)); + + Future> offers; + EXPECT_CALL(sched, resourceOffers(, _)) +.WillOnce(FutureArg<1>()) +.WillRepeatedly(Return());// Ignore subsequent offers. + + driver.start(); + + AWAIT_READY(offers); + ASSERT_FALSE(offers->empty()); + + const Offer& offer = offers.get()[0]; + + // Launch a task and wait until it is in RUNNING status. + TaskInfo task = createTask( + offer.slave_id(), + Resources::parse("cpus:1;mem:32").get(), + SLEEP_COMMAND(1000)); + + Future statusStarting; +
[mesos] branch master updated: Fixed broken links to Linux scheduler documentation.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git The following commit(s) were added to refs/heads/master by this push: new afd86db Fixed broken links to Linux scheduler documentation. afd86db is described below commit afd86db2ef651b33188586a5cafe3e532439534d Author: Andrei Budnik AuthorDate: Wed Sep 25 13:55:15 2019 +0200 Fixed broken links to Linux scheduler documentation. --- docs/isolators/cgroups-cpu.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/isolators/cgroups-cpu.md b/docs/isolators/cgroups-cpu.md index 5392132..6c08b1d 100644 --- a/docs/isolators/cgroups-cpu.md +++ b/docs/isolators/cgroups-cpu.md @@ -17,8 +17,8 @@ and assigns `cpu` and `cpuacct` cgroups to each container launched by Mesos Containerizer. Cgroups `cpu` subsystem provides 2 mechanisms of limiting the amount -of CPU time used by cgroups: [CFS shares](https://github.com/torvalds/linux/blob/master/Documentation/scheduler/sched-design-CFS.txt) -and [CFS bandwidth](https://github.com/torvalds/linux/blob/master/Documentation/scheduler/sched-bwc.txt) +of CPU time used by cgroups: [CFS shares](https://github.com/torvalds/linux/blob/master/Documentation/scheduler/sched-design-CFS.rst) +and [CFS bandwidth](https://github.com/torvalds/linux/blob/master/Documentation/scheduler/sched-bwc.rst) control. The first one can guarantee some minimum number of CPU "shares" to a cgroup when the system is under heavy load. It, however, does not limit the amount of CPU time available to a cgroup when the
[mesos] 01/02: Removed reimplementation of `cloexec` from systemd activation code.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit 360e8838d08eb07496b2c3e6bcd511348ce488b0 Author: Andrei Budnik AuthorDate: Fri Feb 21 12:08:47 2020 +0100 Removed reimplementation of `cloexec` from systemd activation code. Review: https://reviews.apache.org/r/72157 --- src/linux/systemd.cpp | 26 +- 1 file changed, 1 insertion(+), 25 deletions(-) diff --git a/src/linux/systemd.cpp b/src/linux/systemd.cpp index 9897473..c7815a0 100644 --- a/src/linux/systemd.cpp +++ b/src/linux/systemd.cpp @@ -337,30 +337,6 @@ Try start(const string& name) namespace socket_activation { -static Try setCloexecFlag(int fd, bool cloexec) -{ - CHECK(fd >= 0) << "Invalid file desciptor was passed"; - - int flags = ::fcntl(fd, F_GETFD, 0); - if (flags < 0) { -return ErrnoError(); - } - - int nflags = cloexec == true ? flags | FD_CLOEXEC - : flags & ~FD_CLOEXEC; - - if (nflags == flags) { -return Nothing(); - } - - if (::fcntl(fd, F_SETFD, nflags) < 0) { -return ErrnoError(); - } - - return Nothing(); -} - - // See `src/libsystemd/sd-daemon/sd-daemon.c` in the systemd source tree // for the reference implementation. We follow that implementation to // decide which conditions should result in errors and which should return @@ -404,7 +380,7 @@ Try> listenFds() } for (int fd = SD_LISTEN_FDS_START; fd < SD_LISTEN_FDS_START + n; ++fd) { -Try cloexec = setCloexecFlag(fd, true); +Try cloexec = os::cloexec(fd); if (cloexec.isError()) { return Error( "Could not set CLOEXEC flag for file descriptor " + stringify(fd) +
[mesos] branch master updated (cb8106f -> ea670ec)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from cb8106f Made `http`, `pid` and `heartbeater` of `Framework` private. new 360e883 Removed reimplementation of `cloexec` from systemd activation code. new ea670ec Fixed systemd socket activation on old systemd versions. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/linux/systemd.cpp | 32 src/linux/systemd.hpp | 3 ++- src/slave/main.cpp| 5 - 3 files changed, 10 insertions(+), 30 deletions(-)
[mesos] 02/02: Fixed systemd socket activation on old systemd versions.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git commit ea670eca55af847bd7fcba1510726dd0059edf32 Author: Andrei Budnik AuthorDate: Thu Feb 20 17:30:18 2020 +0100 Fixed systemd socket activation on old systemd versions. This patch fixes the bug when `listenFdsWithName` function returns an empty set of file descriptors on pre-227 systemd versions when `domain_socket_location` value is not equals to the "systemd:unknown". This happens when a user expects a newer version of systemd and specifies a "systemd/", but the actual systemd version does not support `FileDescriptorName`. In this case, `LISTEN_FDNAMES` env variable is not specified, so all socket FDs, which are specified by the `LISTEN_FDS`, must be used to locate the path to the domain socket. Review: https://reviews.apache.org/r/72158 --- src/linux/systemd.cpp | 6 +++--- src/linux/systemd.hpp | 3 ++- src/slave/main.cpp| 5 - 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/src/linux/systemd.cpp b/src/linux/systemd.cpp index c7815a0..75d3c2b 100644 --- a/src/linux/systemd.cpp +++ b/src/linux/systemd.cpp @@ -395,7 +395,7 @@ Try> listenFds() } -Try> listenFdsWithName(const std::string& name) +Try> listenFdsWithNames(const hashset& names) { Try> fds = listenFds(); if (fds.isError()) { @@ -420,8 +420,8 @@ Try> listenFdsWithName(const std::string& name) } std::vector result; - for (size_t i=0; i < listenFdnames.size(); ++i) { -if (listenFdnames[i] == name) { + for (size_t i = 0; i < listenFdnames.size(); ++i) { +if (names.contains(listenFdnames[i])) { result.push_back(fds->at(i)); } } diff --git a/src/linux/systemd.hpp b/src/linux/systemd.hpp index ba96000..e3644c0 100644 --- a/src/linux/systemd.hpp +++ b/src/linux/systemd.hpp @@ -20,6 +20,7 @@ #include #include +#include #include #include #include @@ -101,7 +102,7 @@ Try> listenFds(); // The names are set by the `FileDescriptorName=` directive in the unit file. // This requires systemd 227 or newer. Since any number of unit files can // specify the same name, this can return more than one file descriptor. -Try> listenFdsWithName(const std::string& name); +Try> listenFdsWithNames(const hashset& names); // Clear the `$LISTEN_PID`, `$LISTEN_FDS` and `$LISTEN_FDNAMES` environment // variables. diff --git a/src/slave/main.cpp b/src/slave/main.cpp index c1e6551..0aa2cc9 100644 --- a/src/slave/main.cpp +++ b/src/slave/main.cpp @@ -651,8 +651,11 @@ int main(int argc, char** argv) // Chop off `systemd:` prefix. std::string name = flags.domain_socket_location->substr(8); #ifdef __linux__ + // NOTE: Some systemd versions do not support FileDescriptorName, + // thus we also need to listen on descriptors with name "unknown", + // which is used for sockets where no name could be determined. Try> socketFds = -systemd::socket_activation::listenFdsWithName(name); +systemd::socket_activation::listenFdsWithNames({name, "unknown"}); #else Try> socketFds = Try>({}); // Dummy to avoid compile errors.
[mesos] branch 1.7.x updated (5b39908 -> 8af5a6d)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 5b39908 Changed termination logic of the default executor. new 0567b31 Moved the Docker executor declaration into a header. new 8af5a6d Fixed compiler error. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/CMakeLists.txt | 2 +- src/Makefile.am | 3 +- src/docker/CMakeLists.txt| 20 --- src/docker/executor.cpp | 348 +-- src/docker/executor.hpp | 53 ++ src/exec/exec.cpp| 2 + src/launcher/CMakeLists.txt | 5 + src/launcher/docker_executor.cpp | 266 ++ 8 files changed, 410 insertions(+), 289 deletions(-) delete mode 100644 src/docker/CMakeLists.txt create mode 100644 src/launcher/docker_executor.cpp
[mesos] 01/02: Moved the Docker executor declaration into a header.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 0567b31212105821d0b37ad049228dab6e98ed63 Author: Greg Mann AuthorDate: Thu Jul 25 12:17:41 2019 -0700 Moved the Docker executor declaration into a header. This moves the declaration of the Docker executor into the Docker executor header file and moves the code for the Docker executor binary into a new launcher implementation file. This change will enable the Mesos executor driver implementation to make use of the `DockerExecutor` symbol. Review: https://reviews.apache.org/r/71033/ --- src/CMakeLists.txt | 2 +- src/Makefile.am | 3 +- src/docker/CMakeLists.txt| 20 --- src/docker/executor.cpp | 348 +-- src/docker/executor.hpp | 53 ++ src/launcher/CMakeLists.txt | 5 + src/launcher/docker_executor.cpp | 266 ++ 7 files changed, 408 insertions(+), 289 deletions(-) diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index 5bfc8db..8b3e84a 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -242,6 +242,7 @@ set(CSI_SRC set(DOCKER_SRC docker/docker.cpp + docker/executor.cpp docker/spec.cpp) set(EXECUTOR_SRC @@ -599,7 +600,6 @@ endif () ## add_subdirectory(checks) add_subdirectory(cli) -add_subdirectory(docker) add_subdirectory(examples) add_subdirectory(launcher) add_subdirectory(local) diff --git a/src/Makefile.am b/src/Makefile.am index fb2a27e..4e99723 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -1006,6 +1006,7 @@ libmesos_no_3rdparty_la_SOURCES += \ common/validation.cpp \ common/values.cpp\ docker/docker.cpp\ + docker/executor.cpp \ docker/spec.cpp \ exec/exec.cpp \ executor/executor.cpp \ @@ -1743,7 +1744,7 @@ mesos_usage_CPPFLAGS = $(MESOS_CPPFLAGS) mesos_usage_LDADD = libmesos.la $(LDADD) pkglibexec_PROGRAMS += mesos-docker-executor -mesos_docker_executor_SOURCES = docker/executor.cpp +mesos_docker_executor_SOURCES = launcher/docker_executor.cpp mesos_docker_executor_CPPFLAGS = $(MESOS_CPPFLAGS) mesos_docker_executor_LDADD = libmesos.la $(LDADD) diff --git a/src/docker/CMakeLists.txt b/src/docker/CMakeLists.txt deleted file mode 100644 index 1196664..000 --- a/src/docker/CMakeLists.txt +++ /dev/null @@ -1,20 +0,0 @@ -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -# THE DOCKER EXECUTOR EXECUTABLE. -# -add_executable(mesos-docker-executor executor.cpp) -target_link_libraries(mesos-docker-executor PRIVATE mesos) diff --git a/src/docker/executor.cpp b/src/docker/executor.cpp index ed1b718..642ed65 100644 --- a/src/docker/executor.cpp +++ b/src/docker/executor.cpp @@ -854,291 +854,105 @@ private: }; -class DockerExecutor : public Executor +DockerExecutor::DockerExecutor( +const Owned& docker, +const string& container, +const string& sandboxDirectory, +const string& mappedDirectory, +const Duration& shutdownGracePeriod, +const string& launcherDir, +const map& taskEnvironment, +const Option& defaultContainerDNS, +bool cgroupsEnableCfs) { -public: - DockerExecutor( - const Owned& docker, - const string& container, - const string& sandboxDirectory, - const string& mappedDirectory, - const Duration& shutdownGracePeriod, - const string& launcherDir, - const map& taskEnvironment, - const Option& defaultContainerDNS, - bool cgroupsEnableCfs) - { -process = Owned(new DockerExecutorProcess( -
[mesos] 01/02: Moved the Docker executor declaration into a header.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 02eb0ceb87dadc0a5ac6f6cd9f141347e852fb80 Author: Greg Mann AuthorDate: Thu Jul 25 12:17:41 2019 -0700 Moved the Docker executor declaration into a header. This moves the declaration of the Docker executor into the Docker executor header file and moves the code for the Docker executor binary into a new launcher implementation file. This change will enable the Mesos executor driver implementation to make use of the `DockerExecutor` symbol. Review: https://reviews.apache.org/r/71033/ --- src/CMakeLists.txt | 2 +- src/Makefile.am | 3 +- src/docker/CMakeLists.txt| 20 --- src/docker/executor.cpp | 323 ++- src/docker/executor.hpp | 53 +++ src/launcher/CMakeLists.txt | 5 + src/launcher/docker_executor.cpp | 266 7 files changed, 408 insertions(+), 264 deletions(-) diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index 8cd56bb..232cf86 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -245,6 +245,7 @@ endif () set(DOCKER_SRC docker/docker.cpp + docker/executor.cpp docker/spec.cpp) set(EXECUTOR_SRC @@ -554,7 +555,6 @@ endif () ## add_subdirectory(checks) add_subdirectory(cli) -add_subdirectory(docker) add_subdirectory(examples) add_subdirectory(launcher) add_subdirectory(local) diff --git a/src/Makefile.am b/src/Makefile.am index 98ba106..6e18cde 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -992,6 +992,7 @@ libmesos_no_3rdparty_la_SOURCES += \ common/validation.cpp \ common/values.cpp\ docker/docker.cpp\ + docker/executor.cpp \ docker/spec.cpp \ exec/exec.cpp \ executor/executor.cpp \ @@ -1762,7 +1763,7 @@ mesos_usage_CPPFLAGS = $(MESOS_CPPFLAGS) mesos_usage_LDADD = libmesos.la $(LDADD) pkglibexec_PROGRAMS += mesos-docker-executor -mesos_docker_executor_SOURCES = docker/executor.cpp +mesos_docker_executor_SOURCES = launcher/docker_executor.cpp mesos_docker_executor_CPPFLAGS = $(MESOS_CPPFLAGS) mesos_docker_executor_LDADD = libmesos.la $(LDADD) diff --git a/src/docker/CMakeLists.txt b/src/docker/CMakeLists.txt deleted file mode 100644 index 1196664..000 --- a/src/docker/CMakeLists.txt +++ /dev/null @@ -1,20 +0,0 @@ -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -# THE DOCKER EXECUTOR EXECUTABLE. -# -add_executable(mesos-docker-executor executor.cpp) -target_link_libraries(mesos-docker-executor PRIVATE mesos) diff --git a/src/docker/executor.cpp b/src/docker/executor.cpp index d5c905d..714ed68 100644 --- a/src/docker/executor.cpp +++ b/src/docker/executor.cpp @@ -847,266 +847,105 @@ private: }; -class DockerExecutor : public Executor +DockerExecutor::DockerExecutor( +const Owned& docker, +const string& container, +const string& sandboxDirectory, +const string& mappedDirectory, +const Duration& shutdownGracePeriod, +const string& launcherDir, +const map& taskEnvironment, +const Option& defaultContainerDNS, +bool cgroupsEnableCfs) { -public: - DockerExecutor( - const Owned& docker, - const string& container, - const string& sandboxDirectory, - const string& mappedDirectory, - const Duration& shutdownGracePeriod, - const string& launcherDir, - const map& taskEnvironment, - const Option& defaultContainerDNS, - bool cgroupsEnableCfs) - { -process = Owned(new DockerExecutorProcess( -
[mesos] branch 1.6.x updated (205525e -> 11cdf04)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 205525e Changed termination logic of the default executor. new 02eb0ce Moved the Docker executor declaration into a header. new 11cdf04 Fixed compiler error. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/CMakeLists.txt | 2 +- src/Makefile.am | 3 +- src/docker/CMakeLists.txt| 20 --- src/docker/executor.cpp | 323 ++- src/docker/executor.hpp | 53 +++ src/exec/exec.cpp| 2 + src/launcher/CMakeLists.txt | 5 + src/launcher/docker_executor.cpp | 266 8 files changed, 410 insertions(+), 264 deletions(-) delete mode 100644 src/docker/CMakeLists.txt create mode 100644 src/launcher/docker_executor.cpp
[mesos] 02/02: Fixed compiler error.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 11cdf04e39f80a44d062627be39b53d05d76f88d Author: Andrei Budnik AuthorDate: Tue Feb 4 13:12:19 2020 +0100 Fixed compiler error. --- src/exec/exec.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/exec/exec.cpp b/src/exec/exec.cpp index adeb413..e5a03d9 100644 --- a/src/exec/exec.cpp +++ b/src/exec/exec.cpp @@ -47,6 +47,8 @@ #include "common/protobuf_utils.hpp" +#include "docker/executor.hpp" + #include "logging/flags.hpp" #include "logging/logging.hpp"
[mesos] 02/02: Fixed compiler error.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 8af5a6d50c64abb683b803f25bbc02e320fe70f6 Author: Andrei Budnik AuthorDate: Tue Feb 4 13:12:19 2020 +0100 Fixed compiler error. --- src/exec/exec.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/exec/exec.cpp b/src/exec/exec.cpp index dee2074..5838565 100644 --- a/src/exec/exec.cpp +++ b/src/exec/exec.cpp @@ -47,6 +47,8 @@ #include "common/protobuf_utils.hpp" +#include "docker/executor.hpp" + #include "logging/flags.hpp" #include "logging/logging.hpp"
[mesos] branch 1.8.x updated (a2ca451 -> b17ed79)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from a2ca451 Changed termination logic of the default executor. new 1995f63 Moved the Docker executor declaration into a header. new b17ed79 Fixed compiler error. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/CMakeLists.txt | 2 +- src/Makefile.am | 3 +- src/docker/CMakeLists.txt| 20 --- src/docker/executor.cpp | 348 +-- src/docker/executor.hpp | 53 ++ src/exec/exec.cpp| 2 + src/launcher/CMakeLists.txt | 5 + src/launcher/docker_executor.cpp | 266 ++ 8 files changed, 410 insertions(+), 289 deletions(-) delete mode 100644 src/docker/CMakeLists.txt create mode 100644 src/launcher/docker_executor.cpp
[mesos] 02/02: Fixed compiler error.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit b17ed79106b11c5f382006c453b4f885721e3f23 Author: Andrei Budnik AuthorDate: Tue Feb 4 13:12:19 2020 +0100 Fixed compiler error. --- src/exec/exec.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/exec/exec.cpp b/src/exec/exec.cpp index dee2074..5838565 100644 --- a/src/exec/exec.cpp +++ b/src/exec/exec.cpp @@ -47,6 +47,8 @@ #include "common/protobuf_utils.hpp" +#include "docker/executor.hpp" + #include "logging/flags.hpp" #include "logging/logging.hpp"
[mesos] 01/02: Moved the Docker executor declaration into a header.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 1995f63352a5a8c2c8e73adefed708a8620a5d47 Author: Greg Mann AuthorDate: Thu Jul 25 12:17:41 2019 -0700 Moved the Docker executor declaration into a header. This moves the declaration of the Docker executor into the Docker executor header file and moves the code for the Docker executor binary into a new launcher implementation file. This change will enable the Mesos executor driver implementation to make use of the `DockerExecutor` symbol. Review: https://reviews.apache.org/r/71033/ --- src/CMakeLists.txt | 2 +- src/Makefile.am | 3 +- src/docker/CMakeLists.txt| 20 --- src/docker/executor.cpp | 348 +-- src/docker/executor.hpp | 53 ++ src/launcher/CMakeLists.txt | 5 + src/launcher/docker_executor.cpp | 266 ++ 7 files changed, 408 insertions(+), 289 deletions(-) diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index 1d4f541..e1a5980 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -249,6 +249,7 @@ set(CSI_SRC set(DOCKER_SRC docker/docker.cpp + docker/executor.cpp docker/spec.cpp) set(EXECUTOR_SRC @@ -620,7 +621,6 @@ endif () ## add_subdirectory(checks) add_subdirectory(cli) -add_subdirectory(docker) add_subdirectory(examples) add_subdirectory(launcher) add_subdirectory(local) diff --git a/src/Makefile.am b/src/Makefile.am index 5f97523..ba4d453 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -1089,6 +1089,7 @@ libmesos_no_3rdparty_la_SOURCES += \ docker/docker.cpp\ docker/docker.hpp\ docker/executor.hpp \ + docker/executor.cpp \ docker/spec.cpp \ examples/flags.hpp \ examples/test_anonymous_module.hpp \ @@ -1809,7 +1810,7 @@ mesos_usage_CPPFLAGS = $(MESOS_CPPFLAGS) mesos_usage_LDADD = libmesos.la $(LDADD) pkglibexec_PROGRAMS += mesos-docker-executor -mesos_docker_executor_SOURCES = docker/executor.cpp +mesos_docker_executor_SOURCES = launcher/docker_executor.cpp mesos_docker_executor_CPPFLAGS = $(MESOS_CPPFLAGS) mesos_docker_executor_LDADD = libmesos.la $(LDADD) diff --git a/src/docker/CMakeLists.txt b/src/docker/CMakeLists.txt deleted file mode 100644 index 1196664..000 --- a/src/docker/CMakeLists.txt +++ /dev/null @@ -1,20 +0,0 @@ -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -# THE DOCKER EXECUTOR EXECUTABLE. -# -add_executable(mesos-docker-executor executor.cpp) -target_link_libraries(mesos-docker-executor PRIVATE mesos) diff --git a/src/docker/executor.cpp b/src/docker/executor.cpp index ed1b718..642ed65 100644 --- a/src/docker/executor.cpp +++ b/src/docker/executor.cpp @@ -854,291 +854,105 @@ private: }; -class DockerExecutor : public Executor +DockerExecutor::DockerExecutor( +const Owned& docker, +const string& container, +const string& sandboxDirectory, +const string& mappedDirectory, +const Duration& shutdownGracePeriod, +const string& launcherDir, +const map& taskEnvironment, +const Option& defaultContainerDNS, +bool cgroupsEnableCfs) { -public: - DockerExecutor( - const Owned& docker, - const string& container, - const string& sandboxDirectory, - const string& mappedDirectory, - const Duration& shutdownGracePeriod, - const string& launcherDir, - const map& taskEnvironment, - const Option& defaultContainerDNS, - bool cgroupsEnableCfs) - { -process = Owned(new DockerExecutorProcess( -docker, -containe
[mesos] branch 1.7.x updated (47f2a11 -> 5b39908)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 47f2a11 Remembered whether an executor was agent-generated. new 6a7da28 Changed termination logic of the Docker executor. new 5b39908 Changed termination logic of the default executor. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ src/launcher/default_executor.cpp | 25 +++-- 3 files changed, 44 insertions(+), 14 deletions(-)
[mesos] 02/02: Changed termination logic of the default executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 5b399080eee11ee03f4bc6c09b791c24670da6c1 Author: Andrei Budnik AuthorDate: Wed Jan 29 19:07:50 2020 +0100 Changed termination logic of the default executor. Previously, the default executor terminated itself after all containers had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgements before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the default executor if all status updates have been acknowledged by the agent and no running containers left. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72029 --- src/launcher/default_executor.cpp | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/src/launcher/default_executor.cpp b/src/launcher/default_executor.cpp index b89b036..6a049b3 100644 --- a/src/launcher/default_executor.cpp +++ b/src/launcher/default_executor.cpp @@ -278,6 +278,12 @@ public: containers.at(taskId)->acknowledged = true; } +// Terminate the executor if all status updates have been acknowledged +// by the agent and no running containers left. +if (containers.empty() && unacknowledgedUpdates.empty()) { + terminate(self()); +} + break; } @@ -1072,14 +1078,21 @@ protected: void _shutdown() { -const Duration duration = Seconds(1); +if (unacknowledgedUpdates.empty()) { + terminate(self()); +} else { + // This is a fail safe in case the agent doesn't send an ACK for + // a status update for some reason. + const Duration duration = Seconds(60); -LOG(INFO) << "Terminating after " << duration; + LOG(INFO) << "Terminating after " << duration; -// TODO(qianzhang): Remove this hack since the executor now receives -// acknowledgements for status updates. The executor can terminate -// after it receives an ACK for a terminal status update. -os::sleep(duration); + delay(duration, self(), ::__shutdown); +} + } + + void __shutdown() + { terminate(self()); }
[mesos] 02/02: Changed termination logic of the default executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 84b7af3409d8af343da0f0420e168a42de4b110f Author: Andrei Budnik AuthorDate: Wed Jan 29 19:07:50 2020 +0100 Changed termination logic of the default executor. Previously, the default executor terminated itself after all containers had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgements before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the default executor if all status updates have been acknowledged by the agent and no running containers left. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72029 --- src/launcher/default_executor.cpp | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/src/launcher/default_executor.cpp b/src/launcher/default_executor.cpp index 8720dad..c6b13e9 100644 --- a/src/launcher/default_executor.cpp +++ b/src/launcher/default_executor.cpp @@ -267,6 +267,12 @@ public: containers.at(taskId)->acknowledged = true; } +// Terminate the executor if all status updates have been acknowledged +// by the agent and no running containers left. +if (containers.empty() && unacknowledgedUpdates.empty()) { + terminate(self()); +} + break; } @@ -1060,14 +1066,21 @@ protected: void _shutdown() { -const Duration duration = Seconds(1); +if (unacknowledgedUpdates.empty()) { + terminate(self()); +} else { + // This is a fail safe in case the agent doesn't send an ACK for + // a status update for some reason. + const Duration duration = Seconds(60); -LOG(INFO) << "Terminating after " << duration; + LOG(INFO) << "Terminating after " << duration; -// TODO(qianzhang): Remove this hack since the executor now receives -// acknowledgements for status updates. The executor can terminate -// after it receives an ACK for a terminal status update. -os::sleep(duration); + delay(duration, self(), ::__shutdown); +} + } + + void __shutdown() + { terminate(self()); }
[mesos] branch 1.5.x updated (2f2146a -> 84b7af3)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 2f2146a Remembered whether an executor was agent-generated. new ff98f12 Changed termination logic of the Docker executor. new 84b7af3 Changed termination logic of the default executor. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ src/launcher/default_executor.cpp | 25 +++-- 3 files changed, 44 insertions(+), 14 deletions(-)
[mesos] 01/02: Changed termination logic of the Docker executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.5.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit ff98f12a50a56c13688b87068a116d1d08142f49 Author: Andrei Budnik AuthorDate: Wed Jan 29 13:35:02 2020 +0100 Changed termination logic of the Docker executor. Previously, the Docker executor terminated itself after a task's container had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgments before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the Docker executor after receiving a terminal status update acknowledgment. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72055 --- src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ 2 files changed, 25 insertions(+), 8 deletions(-) diff --git a/src/docker/executor.cpp b/src/docker/executor.cpp index 7b48f84..0d75801 100644 --- a/src/docker/executor.cpp +++ b/src/docker/executor.cpp @@ -199,7 +199,9 @@ public: driver->sendStatusUpdate(status); - _stop(); + // This is a fail safe in case the agent doesn't send an ACK for + // the terminal update for some reason. + delay(Seconds(60), self(), ::_stop); return; } @@ -691,17 +693,13 @@ private: CHECK_SOME(driver); driver.get()->sendStatusUpdate(taskStatus); -_stop(); +// This is a fail safe in case the agent doesn't send an ACK for +// the terminal update for some reason. +delay(Seconds(60), self(), ::_stop); } void _stop() { -// A hack for now ... but we need to wait until the status update -// is sent to the slave before we shut ourselves down. -// TODO(tnachen): Remove this hack and also the same hack in the -// command executor when we have the new HTTP APIs to wait until -// an ack. -os::sleep(Seconds(1)); driver.get()->stop(); } diff --git a/src/exec/exec.cpp b/src/exec/exec.cpp index 0c76a3f..e964a11 100644 --- a/src/exec/exec.cpp +++ b/src/exec/exec.cpp @@ -395,10 +395,29 @@ protected: return; } +if (!updates.contains(uuid_.get())) { + LOG(WARNING) << "Ignoring unknown status update acknowledgement " + << uuid_.get() << " for task " << taskId + << " of framework " << frameworkId; + return; +} + VLOG(1) << "Executor received status update acknowledgement " << uuid_.get() << " for task " << taskId << " of framework " << frameworkId; +// If this is a terminal status update acknowledgment for the Docker +// executor, stop the driver to terminate the executor. +// +// TODO(abudnik): This is a workaround for MESOS-9847. A better solution +// is to update supported API for the Docker executor from V0 to V1. It +// will allow the executor to handle status update acknowledgments itself. +if (mesos::internal::protobuf::isTerminalState( +updates[uuid_.get()].status().state()) && +dynamic_cast(executor)) { + driver->stop(); +} + // Remove the corresponding update. updates.erase(uuid_.get());
[mesos] 01/02: Changed termination logic of the Docker executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit f511f25be9d850ee9b65fc3ec5f54d149beb2f19 Author: Andrei Budnik AuthorDate: Wed Jan 29 13:35:02 2020 +0100 Changed termination logic of the Docker executor. Previously, the Docker executor terminated itself after a task's container had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgments before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the Docker executor after receiving a terminal status update acknowledgment. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72055 --- src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ 2 files changed, 25 insertions(+), 8 deletions(-) diff --git a/src/docker/executor.cpp b/src/docker/executor.cpp index 57cfbcd..d5c905d 100644 --- a/src/docker/executor.cpp +++ b/src/docker/executor.cpp @@ -217,7 +217,9 @@ public: driver->sendStatusUpdate(status); - _stop(); + // This is a fail safe in case the agent doesn't send an ACK for + // the terminal update for some reason. + delay(Seconds(60), self(), ::_stop); return; } @@ -750,17 +752,13 @@ private: CHECK_SOME(driver); driver.get()->sendStatusUpdate(taskStatus); -_stop(); +// This is a fail safe in case the agent doesn't send an ACK for +// the terminal update for some reason. +delay(Seconds(60), self(), ::_stop); } void _stop() { -// A hack for now ... but we need to wait until the status update -// is sent to the slave before we shut ourselves down. -// TODO(tnachen): Remove this hack and also the same hack in the -// command executor when we have the new HTTP APIs to wait until -// an ack. -os::sleep(Seconds(1)); driver.get()->stop(); } diff --git a/src/exec/exec.cpp b/src/exec/exec.cpp index 65a671d..adeb413 100644 --- a/src/exec/exec.cpp +++ b/src/exec/exec.cpp @@ -395,10 +395,29 @@ protected: return; } +if (!updates.contains(uuid_.get())) { + LOG(WARNING) << "Ignoring unknown status update acknowledgement " + << uuid_.get() << " for task " << taskId + << " of framework " << frameworkId; + return; +} + VLOG(1) << "Executor received status update acknowledgement " << uuid_.get() << " for task " << taskId << " of framework " << frameworkId; +// If this is a terminal status update acknowledgment for the Docker +// executor, stop the driver to terminate the executor. +// +// TODO(abudnik): This is a workaround for MESOS-9847. A better solution +// is to update supported API for the Docker executor from V0 to V1. It +// will allow the executor to handle status update acknowledgments itself. +if (mesos::internal::protobuf::isTerminalState( +updates[uuid_.get()].status().state()) && +dynamic_cast(executor)) { + driver->stop(); +} + // Remove the corresponding update. updates.erase(uuid_.get());
[mesos] branch 1.6.x updated (7e4d380 -> 205525e)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 7e4d380 Remembered whether an executor was agent-generated. new f511f25 Changed termination logic of the Docker executor. new 205525e Changed termination logic of the default executor. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ src/launcher/default_executor.cpp | 25 +++-- 3 files changed, 44 insertions(+), 14 deletions(-)
[mesos] 02/02: Changed termination logic of the default executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.6.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 205525eb56a33e58bed1fc38e0b32189b19d3fbc Author: Andrei Budnik AuthorDate: Wed Jan 29 19:07:50 2020 +0100 Changed termination logic of the default executor. Previously, the default executor terminated itself after all containers had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgements before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the default executor if all status updates have been acknowledged by the agent and no running containers left. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72029 --- src/launcher/default_executor.cpp | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/src/launcher/default_executor.cpp b/src/launcher/default_executor.cpp index 76c6106..ca17c70 100644 --- a/src/launcher/default_executor.cpp +++ b/src/launcher/default_executor.cpp @@ -276,6 +276,12 @@ public: containers.at(taskId)->acknowledged = true; } +// Terminate the executor if all status updates have been acknowledged +// by the agent and no running containers left. +if (containers.empty() && unacknowledgedUpdates.empty()) { + terminate(self()); +} + break; } @@ -1070,14 +1076,21 @@ protected: void _shutdown() { -const Duration duration = Seconds(1); +if (unacknowledgedUpdates.empty()) { + terminate(self()); +} else { + // This is a fail safe in case the agent doesn't send an ACK for + // a status update for some reason. + const Duration duration = Seconds(60); -LOG(INFO) << "Terminating after " << duration; + LOG(INFO) << "Terminating after " << duration; -// TODO(qianzhang): Remove this hack since the executor now receives -// acknowledgements for status updates. The executor can terminate -// after it receives an ACK for a terminal status update. -os::sleep(duration); + delay(duration, self(), ::__shutdown); +} + } + + void __shutdown() + { terminate(self()); }
[mesos] 01/02: Changed termination logic of the Docker executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.7.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 6a7da284d1b89f8a144ed2f896f005a5ee9d4aea Author: Andrei Budnik AuthorDate: Wed Jan 29 13:35:02 2020 +0100 Changed termination logic of the Docker executor. Previously, the Docker executor terminated itself after a task's container had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgments before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the Docker executor after receiving a terminal status update acknowledgment. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72055 --- src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ 2 files changed, 25 insertions(+), 8 deletions(-) diff --git a/src/docker/executor.cpp b/src/docker/executor.cpp index f638e4b..ed1b718 100644 --- a/src/docker/executor.cpp +++ b/src/docker/executor.cpp @@ -224,7 +224,9 @@ public: driver->sendStatusUpdate(status); - _stop(); + // This is a fail safe in case the agent doesn't send an ACK for + // the terminal update for some reason. + delay(Seconds(60), self(), ::_stop); return; } @@ -757,17 +759,13 @@ private: CHECK_SOME(driver); driver.get()->sendStatusUpdate(taskStatus); -_stop(); +// This is a fail safe in case the agent doesn't send an ACK for +// the terminal update for some reason. +delay(Seconds(60), self(), ::_stop); } void _stop() { -// A hack for now ... but we need to wait until the status update -// is sent to the slave before we shut ourselves down. -// TODO(tnachen): Remove this hack and also the same hack in the -// command executor when we have the new HTTP APIs to wait until -// an ack. -os::sleep(Seconds(1)); driver.get()->stop(); } diff --git a/src/exec/exec.cpp b/src/exec/exec.cpp index c0fa3b6..dee2074 100644 --- a/src/exec/exec.cpp +++ b/src/exec/exec.cpp @@ -395,10 +395,29 @@ protected: return; } +if (!updates.contains(uuid_.get())) { + LOG(WARNING) << "Ignoring unknown status update acknowledgement " + << uuid_.get() << " for task " << taskId + << " of framework " << frameworkId; + return; +} + VLOG(1) << "Executor received status update acknowledgement " << uuid_.get() << " for task " << taskId << " of framework " << frameworkId; +// If this is a terminal status update acknowledgment for the Docker +// executor, stop the driver to terminate the executor. +// +// TODO(abudnik): This is a workaround for MESOS-9847. A better solution +// is to update supported API for the Docker executor from V0 to V1. It +// will allow the executor to handle status update acknowledgments itself. +if (mesos::internal::protobuf::isTerminalState( +updates[uuid_.get()].status().state()) && +dynamic_cast(executor)) { + driver->stop(); +} + // Remove the corresponding update. updates.erase(uuid_.get());
[mesos] 02/02: Changed termination logic of the default executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.9.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit f37ae68a8f0d23a2e0f31812b8fe4494109769c6 Author: Andrei Budnik AuthorDate: Wed Jan 29 19:07:50 2020 +0100 Changed termination logic of the default executor. Previously, the default executor terminated itself after all containers had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgements before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the default executor if all status updates have been acknowledged by the agent and no running containers left. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72029 --- src/launcher/default_executor.cpp | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/src/launcher/default_executor.cpp b/src/launcher/default_executor.cpp index 521494a..7781382 100644 --- a/src/launcher/default_executor.cpp +++ b/src/launcher/default_executor.cpp @@ -279,6 +279,12 @@ public: containers.at(taskId)->acknowledged = true; } +// Terminate the executor if all status updates have been acknowledged +// by the agent and no running containers left. +if (containers.empty() && unacknowledgedUpdates.empty()) { + terminate(self()); +} + break; } @@ -1088,14 +1094,21 @@ protected: void _shutdown() { -const Duration duration = Seconds(1); +if (unacknowledgedUpdates.empty()) { + terminate(self()); +} else { + // This is a fail safe in case the agent doesn't send an ACK for + // a status update for some reason. + const Duration duration = Seconds(60); -LOG(INFO) << "Terminating after " << duration; + LOG(INFO) << "Terminating after " << duration; -// TODO(qianzhang): Remove this hack since the executor now receives -// acknowledgements for status updates. The executor can terminate -// after it receives an ACK for a terminal status update. -os::sleep(duration); + delay(duration, self(), ::__shutdown); +} + } + + void __shutdown() + { terminate(self()); }
[mesos] branch 1.9.x updated (298d1d7 -> f37ae68)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.9.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from 298d1d7 Remembered whether an executor was agent-generated. new 3d60cba Changed termination logic of the Docker executor. new f37ae68 Changed termination logic of the default executor. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ src/launcher/default_executor.cpp | 25 +++-- 3 files changed, 44 insertions(+), 14 deletions(-)
[mesos] 01/02: Changed termination logic of the Docker executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.9.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 3d60cba39d0377a7dc19b4c47f3bb0807418fe50 Author: Andrei Budnik AuthorDate: Wed Jan 29 13:35:02 2020 +0100 Changed termination logic of the Docker executor. Previously, the Docker executor terminated itself after a task's container had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgments before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the Docker executor after receiving a terminal status update acknowledgment. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72055 --- src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ 2 files changed, 25 insertions(+), 8 deletions(-) diff --git a/src/docker/executor.cpp b/src/docker/executor.cpp index 132f42b..ebbbc0d 100644 --- a/src/docker/executor.cpp +++ b/src/docker/executor.cpp @@ -224,7 +224,9 @@ public: driver->sendStatusUpdate(status); - _stop(); + // This is a fail safe in case the agent doesn't send an ACK for + // the terminal update for some reason. + delay(Seconds(60), self(), ::_stop); return; } @@ -773,17 +775,13 @@ private: CHECK_SOME(driver); driver.get()->sendStatusUpdate(taskStatus); -_stop(); +// This is a fail safe in case the agent doesn't send an ACK for +// the terminal update for some reason. +delay(Seconds(60), self(), ::_stop); } void _stop() { -// A hack for now ... but we need to wait until the status update -// is sent to the slave before we shut ourselves down. -// TODO(tnachen): Remove this hack and also the same hack in the -// command executor when we have the new HTTP APIs to wait until -// an ack. -os::sleep(Seconds(1)); driver.get()->stop(); } diff --git a/src/exec/exec.cpp b/src/exec/exec.cpp index 67e082e..8ed77f2 100644 --- a/src/exec/exec.cpp +++ b/src/exec/exec.cpp @@ -409,10 +409,29 @@ protected: return; } +if (!updates.contains(uuid_.get())) { + LOG(WARNING) << "Ignoring unknown status update acknowledgement " + << uuid_.get() << " for task " << taskId + << " of framework " << frameworkId; + return; +} + VLOG(1) << "Executor received status update acknowledgement " << uuid_.get() << " for task " << taskId << " of framework " << frameworkId; +// If this is a terminal status update acknowledgment for the Docker +// executor, stop the driver to terminate the executor. +// +// TODO(abudnik): This is a workaround for MESOS-9847. A better solution +// is to update supported API for the Docker executor from V0 to V1. It +// will allow the executor to handle status update acknowledgments itself. +if (mesos::internal::protobuf::isTerminalState( +updates[uuid_.get()].status().state()) && +dynamic_cast(executor)) { + driver->stop(); +} + // Remove the corresponding update. updates.erase(uuid_.get());
[mesos] 01/02: Changed termination logic of the Docker executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit 1bd0b37a7e522d63319db426dae7068b901eaea6 Author: Andrei Budnik AuthorDate: Wed Jan 29 13:35:02 2020 +0100 Changed termination logic of the Docker executor. Previously, the Docker executor terminated itself after a task's container had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgments before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the Docker executor after receiving a terminal status update acknowledgment. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72055 --- src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ 2 files changed, 25 insertions(+), 8 deletions(-) diff --git a/src/docker/executor.cpp b/src/docker/executor.cpp index f638e4b..ed1b718 100644 --- a/src/docker/executor.cpp +++ b/src/docker/executor.cpp @@ -224,7 +224,9 @@ public: driver->sendStatusUpdate(status); - _stop(); + // This is a fail safe in case the agent doesn't send an ACK for + // the terminal update for some reason. + delay(Seconds(60), self(), ::_stop); return; } @@ -757,17 +759,13 @@ private: CHECK_SOME(driver); driver.get()->sendStatusUpdate(taskStatus); -_stop(); +// This is a fail safe in case the agent doesn't send an ACK for +// the terminal update for some reason. +delay(Seconds(60), self(), ::_stop); } void _stop() { -// A hack for now ... but we need to wait until the status update -// is sent to the slave before we shut ourselves down. -// TODO(tnachen): Remove this hack and also the same hack in the -// command executor when we have the new HTTP APIs to wait until -// an ack. -os::sleep(Seconds(1)); driver.get()->stop(); } diff --git a/src/exec/exec.cpp b/src/exec/exec.cpp index c0fa3b6..dee2074 100644 --- a/src/exec/exec.cpp +++ b/src/exec/exec.cpp @@ -395,10 +395,29 @@ protected: return; } +if (!updates.contains(uuid_.get())) { + LOG(WARNING) << "Ignoring unknown status update acknowledgement " + << uuid_.get() << " for task " << taskId + << " of framework " << frameworkId; + return; +} + VLOG(1) << "Executor received status update acknowledgement " << uuid_.get() << " for task " << taskId << " of framework " << frameworkId; +// If this is a terminal status update acknowledgment for the Docker +// executor, stop the driver to terminate the executor. +// +// TODO(abudnik): This is a workaround for MESOS-9847. A better solution +// is to update supported API for the Docker executor from V0 to V1. It +// will allow the executor to handle status update acknowledgments itself. +if (mesos::internal::protobuf::isTerminalState( +updates[uuid_.get()].status().state()) && +dynamic_cast(executor)) { + driver->stop(); +} + // Remove the corresponding update. updates.erase(uuid_.get());
[mesos] branch 1.8.x updated (a66419b -> a2ca451)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git. from a66419b Remembered whether an executor was agent-generated. new 1bd0b37 Changed termination logic of the Docker executor. new a2ca451 Changed termination logic of the default executor. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ src/launcher/default_executor.cpp | 25 +++-- 3 files changed, 44 insertions(+), 14 deletions(-)
[mesos] 02/02: Changed termination logic of the default executor.
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a commit to branch 1.8.x in repository https://gitbox.apache.org/repos/asf/mesos.git commit a2ca451aab4625e126b9e7b470eb9f7c232dd746 Author: Andrei Budnik AuthorDate: Wed Jan 29 19:07:50 2020 +0100 Changed termination logic of the default executor. Previously, the default executor terminated itself after all containers had terminated. This could lead to termination of the executor before processing of a terminal status update by the agent. In order to mitigate this issue, the executor slept for one second to give a chance to send all status updates and receive all status update acknowledgements before terminating itself. This might have led to various race conditions in some circumstances (e.g., on a slow host). This patch terminates the default executor if all status updates have been acknowledged by the agent and no running containers left. Also, this patch increases the timeout from one second to one minute for fail-safety. Review: https://reviews.apache.org/r/72029 --- src/launcher/default_executor.cpp | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/src/launcher/default_executor.cpp b/src/launcher/default_executor.cpp index 5837cfa..af520f7 100644 --- a/src/launcher/default_executor.cpp +++ b/src/launcher/default_executor.cpp @@ -278,6 +278,12 @@ public: containers.at(taskId)->acknowledged = true; } +// Terminate the executor if all status updates have been acknowledged +// by the agent and no running containers left. +if (containers.empty() && unacknowledgedUpdates.empty()) { + terminate(self()); +} + break; } @@ -1076,14 +1082,21 @@ protected: void _shutdown() { -const Duration duration = Seconds(1); +if (unacknowledgedUpdates.empty()) { + terminate(self()); +} else { + // This is a fail safe in case the agent doesn't send an ACK for + // a status update for some reason. + const Duration duration = Seconds(60); -LOG(INFO) << "Terminating after " << duration; + LOG(INFO) << "Terminating after " << duration; -// TODO(qianzhang): Remove this hack since the executor now receives -// acknowledgements for status updates. The executor can terminate -// after it receives an ACK for a terminal status update. -os::sleep(duration); + delay(duration, self(), ::__shutdown); +} + } + + void __shutdown() + { terminate(self()); }
[mesos] branch master updated (3d197d8 -> 683dfc1)
This is an automated email from the ASF dual-hosted git repository. abudnik pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/mesos.git. from 3d197d8 Fixed the build on Windows. new 457c389 Changed termination logic of the Docker executor. new 683dfc1 Changed termination logic of the default executor. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: src/docker/executor.cpp | 14 ++ src/exec/exec.cpp | 19 +++ src/launcher/default_executor.cpp | 25 +++-- 3 files changed, 44 insertions(+), 14 deletions(-)