[mesos] 02/02: Updated `upgrades.md` for the configurable shared memory project.

2019-08-26 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 5dfa256ac63775b3942b68fdc99f6a58345f1ab8
Author: Qian Zhang 
AuthorDate: Tue Aug 27 10:16:52 2019 +0800

Updated `upgrades.md` for the configurable shared memory project.
---
 docs/upgrades.md | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/docs/upgrades.md b/docs/upgrades.md
index 2be13fb..63eb1bb 100644
--- a/docs/upgrades.md
+++ b/docs/upgrades.md
@@ -51,17 +51,21 @@ We categorize the changes as follows:
   A Linux NNP isolator
   A hostname_validation_scheme
  C TLS certificate verification behaviour
+  C Configurable IPC namespace and /dev/shm
 
  
 
   
 
   A docker_ignore_runtime
+  A disallow_sharing_agent_ipc_namespace
+  A default_container_shm_size
 
   
 
   
 
+  A LinuxInfo.ipc_mode and LinuxInfo.shm_size
 
   
 
@@ -532,6 +536,8 @@ We categorize the changes as follows:
 would have been successful. Users that rely on incoming connection
 requests presenting valid TLS certificates should make sure that
 the `LIBPROCESS_SSL_REQUIRE_CERT` option is set to true.
 
+
+* The Mesos containerizer now supports a configurable IPC namespace and
+  /dev/shm. A container can be configured to have a private IPC namespace
+  and /dev/shm, or to share them from its parent, via the field
+  `LinuxInfo.ipc_mode`; the size of its private /dev/shm is also
+  configurable via the field `LinuxInfo.shm_size`. Operators can control
+  whether it is allowed to share the host's IPC namespace and /dev/shm
+  with top-level containers via the agent flag
+  `--disallow_sharing_agent_ipc_namespace`, and s [...]
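As an illustration of the `LinuxInfo` fields described above, a task's container info might carry something like the following JSON fragment. This is a sketch only: the `PRIVATE` mode value and the megabyte-sized `shm_size` value are assumptions inferred from this change, not quoted from the upgrade guide.

```json
{
  "linux_info": {
    "ipc_mode": "PRIVATE",
    "shm_size": 128
  }
}
```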
 
 ## Upgrading from 1.7.x to 1.8.x ##
 



[mesos] branch master updated (50dcd56 -> 5dfa256)

2019-08-26 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from 50dcd56  Added agent reactivations to the existing agent draining 
tests.
 new 9a5b298  Added MESOS-9795 to the 1.9.0 release highlights.
 new 5dfa256  Updated `upgrades.md` for the configurable shared memory 
project.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG| 23 +--
 docs/upgrades.md |  6 ++
 2 files changed, 19 insertions(+), 10 deletions(-)



[mesos] 01/02: Added MESOS-9795 to the 1.9.0 release highlights.

2019-08-26 Thread qianzhang
This is an automated email from the ASF dual-hosted git repository.

qianzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 9a5b2986a74006cb68e2262b4b2d5f7e22058a27
Author: Qian Zhang 
AuthorDate: Tue Aug 27 09:29:21 2019 +0800

Added MESOS-9795 to the 1.9.0 release highlights.

The style of the Containerization section in the 1.9.0 release
highlights was also updated to be consistent with other sections.
---
 CHANGELOG | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/CHANGELOG b/CHANGELOG
index 58cf418..a5bb8d5 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -14,19 +14,22 @@ This release contains the following highlights:
 
   * Containerization:
 
-* [MESOS-9760] - A new `--docker_ignore_runtime` flag has been
-  added. This causes the agent to ignore any runtime configuration
-  present in Docker images.
+* A new `--docker_ignore_runtime` flag has been added. This causes the
+  agent to ignore any runtime configuration present in Docker images.
+  (MESOS-9760)
 
-* [MESOS-9770] - Add no-new-privileges isolator. An additional
-  Linux isolator has been added to support enabling the no_new_privs
-  process control flag.
+* Add no-new-privileges isolator. A new Linux isolator has been added to
+  support enabling the no_new_privs process control flag. (MESOS-9770)
 
-* [MESOS-9771] - The Mesos containerizer now masks sensitive paths
-  in `/proc` for containers that do not share the host's PID namespace.
+* The Mesos containerizer now masks sensitive paths in `/proc` for
+  containers that do not share the host's PID namespace. (MESOS-9771)
 
-* [MESOS-9900] - The Mesos containerizer now includes ephemeral
-  overlayfs storage in the task disk quota as well as sandbox storage.
+* The Mesos containerizer now supports a configurable IPC namespace and
+  /dev/shm. A container can be configured to have a private IPC namespace
+  and /dev/shm or share them from its parent, and the size of its private
+  /dev/shm is also configurable. (MESOS-9795)
+
+* The Mesos containerizer now includes ephemeral overlayfs storage in the
+  task disk quota as well as sandbox storage. (MESOS-9900)
 
 Additional API Changes:
 



[mesos] 02/05: Refactored master draining test setup.

2019-08-26 Thread josephwu
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 4f078398f7010d982a1c4ee95a1e3f628813e6fe
Author: Joseph Wu 
AuthorDate: Mon Jul 29 19:43:31 2019 -0700

Refactored master draining test setup.

Tests of this feature will generally require a master, agent, framework,
and a single task to be launched at the beginning of the test.
This moves this common code into the test SetUp.

This also changes the `post(...)` helper to return the http::Response
object instead of parsing it.  The response for DRAIN_AGENT calls
does not return an object, so the tests were not checking the
response before.

Review: https://reviews.apache.org/r/71315
---
 src/tests/master_draining_tests.cpp | 494 +---
 1 file changed, 175 insertions(+), 319 deletions(-)

diff --git a/src/tests/master_draining_tests.cpp 
b/src/tests/master_draining_tests.cpp
index 16d0c85..eae809f 100644
--- a/src/tests/master_draining_tests.cpp
+++ b/src/tests/master_draining_tests.cpp
@@ -14,6 +14,7 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.
 
+#include 
 #include 
 
 #include 
@@ -73,6 +74,130 @@ class MasterDrainingTest
     public WithParamInterface<ContentType>
 {
 public:
+  // Creates a master, agent, framework, and launches one sleep task.
+  void SetUp() override
+  {
+    MesosTest::SetUp();
+
+    Clock::pause();
+
+    // Create the master.
+    masterFlags = CreateMasterFlags();
+    Try<Owned<cluster::Master>> _master = StartMaster(masterFlags);
+    ASSERT_SOME(_master);
+    master = _master.get();
+
+    Future<SlaveRegisteredMessage> slaveRegisteredMessage =
+      FUTURE_PROTOBUF(SlaveRegisteredMessage(), _, _);
+
+    // Create the agent.
+    agentFlags = CreateSlaveFlags();
+    detector = master.get()->createDetector();
+    Try<Owned<cluster::Slave>> _slave = StartSlave(detector.get(), agentFlags);
+    ASSERT_SOME(_slave);
+    slave = _slave.get();
+
+    Clock::advance(agentFlags.registration_backoff_factor);
+    AWAIT_READY(slaveRegisteredMessage);
+
+    // Create the framework.
+    scheduler = std::make_shared<v1::MockHTTPScheduler>();
+
+    frameworkInfo = v1::DEFAULT_FRAMEWORK_INFO;
+    frameworkInfo.set_checkpoint(true);
+    frameworkInfo.add_capabilities()->set_type(
+        v1::FrameworkInfo::Capability::PARTITION_AWARE);
+
+    EXPECT_CALL(*scheduler, connected(_))
+      .WillOnce(v1::scheduler::SendSubscribe(frameworkInfo));
+
+    Future<v1::scheduler::Event::Subscribed> subscribed;
+    EXPECT_CALL(*scheduler, subscribed(_, _))
+      .WillOnce(FutureArg<1>(&subscribed));
+
+    EXPECT_CALL(*scheduler, heartbeat(_))
+      .WillRepeatedly(Return()); // Ignore heartbeats.
+
+    Future<v1::scheduler::Event::Offers> offers;
+    EXPECT_CALL(*scheduler, offers(_, _))
+      .WillOnce(FutureArg<1>(&offers))
+      .WillRepeatedly(Return());
+
+    mesos = std::make_shared<v1::scheduler::TestMesos>(
+        master.get()->pid, ContentType::PROTOBUF, scheduler);
+
+    AWAIT_READY(subscribed);
+    frameworkId = subscribed->framework_id();
+
+    // Launch a sleep task.
+    AWAIT_READY(offers);
+    ASSERT_FALSE(offers->offers().empty());
+
+    const v1::Offer& offer = offers->offers(0);
+    agentId = offer.agent_id();
+
+    Try<v1::Resources> resources =
+      v1::Resources::parse("cpus:0.1;mem:64;disk:64");
+
+    ASSERT_SOME(resources);
+
+    taskInfo = v1::createTask(agentId, resources.get(), SLEEP_COMMAND(1000));
+
+    testing::Sequence updateSequence;
+    Future<v1::scheduler::Event::Update> startingUpdate;
+    Future<v1::scheduler::Event::Update> runningUpdate;
+
+    // Make sure the agent receives these two acknowledgements.
+    Future<StatusUpdateAcknowledgementMessage> startingAck =
+      FUTURE_PROTOBUF(StatusUpdateAcknowledgementMessage(), _, _);
+    Future<StatusUpdateAcknowledgementMessage> runningAck =
+      FUTURE_PROTOBUF(StatusUpdateAcknowledgementMessage(), _, _);
+
+    EXPECT_CALL(
+        *scheduler,
+        update(_, AllOf(
+            TaskStatusUpdateTaskIdEq(taskInfo.task_id()),
+            TaskStatusUpdateStateEq(v1::TASK_STARTING))))
+      .InSequence(updateSequence)
+      .WillOnce(DoAll(
+          FutureArg<1>(&startingUpdate),
+          v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+    EXPECT_CALL(
+        *scheduler,
+        update(_, AllOf(
+            TaskStatusUpdateTaskIdEq(taskInfo.task_id()),
+            TaskStatusUpdateStateEq(v1::TASK_RUNNING))))
+      .InSequence(updateSequence)
+      .WillOnce(DoAll(
+          FutureArg<1>(&runningUpdate),
+          v1::scheduler::SendAcknowledge(frameworkId, agentId)));
+
+    mesos->send(
+        v1::createCallAccept(
+            frameworkId,
+            offer,
+            {v1::LAUNCH({taskInfo})}));
+
+    AWAIT_READY(startingUpdate);
+    AWAIT_READY(startingAck);
+    AWAIT_READY(runningUpdate);
+    AWAIT_READY(runningAck);
+  }
+
+  void TearDown() override
+  {
+    mesos.reset();
+    scheduler.reset();
+    slave.reset();
+    detector.reset();
+    master.reset();
+
+    Clock::resume();
+
+    MesosTest::TearDown();
+  }
+
   master::Flags CreateMasterFlags() override
   {
 // Turn off 

[mesos] branch master updated (c104977 -> 50dcd56)

2019-08-26 Thread josephwu
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from c104977  Updated site's middleman versions.
 new 5124b29  Moved master-side agent draining tests into a separate file.
 new 4f07839  Refactored master draining test setup.
 new 1e36619  Added draining tests for empty agents.
 new 5c57128  Added draining test for momentarily disconnected agents.
 new 50dcd56  Added agent reactivations to the existing agent draining 
tests.

The 5 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 src/Makefile.am |3 +-
 src/tests/CMakeLists.txt|1 +
 src/tests/api_tests.cpp |  541 ---
 src/tests/master_draining_tests.cpp | 1018 +++
 4 files changed, 1021 insertions(+), 542 deletions(-)
 create mode 100644 src/tests/master_draining_tests.cpp



[mesos] 04/05: Added draining test for momentarily disconnected agents.

2019-08-26 Thread josephwu
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 5c5712869876cad50a34af29cdcbfac9b1e9eb45
Author: Joseph Wu 
AuthorDate: Mon Aug 19 12:11:18 2019 -0700

Added draining test for momentarily disconnected agents.

This exercises the agent draining code when the agent is disconnected
from the master at the time of starting draining.  Draining is expected
to proceed once the agent reregisters.

Review: https://reviews.apache.org/r/71317
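For reference, the DRAIN_AGENT call exercised by this test is a v1 operator API call posted to the master's `/api/v1` endpoint. A minimal JSON request body might look like the sketch below; the agent ID value is a placeholder, and optional fields such as the grace period are omitted.

```json
{
  "type": "DRAIN_AGENT",
  "drain_agent": {
    "agent_id": {"value": "<agent-id>"}
  }
}
```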
---
 src/tests/master_draining_tests.cpp | 201 
 1 file changed, 201 insertions(+)

diff --git a/src/tests/master_draining_tests.cpp 
b/src/tests/master_draining_tests.cpp
index 674f5b5..235bf1b 100644
--- a/src/tests/master_draining_tests.cpp
+++ b/src/tests/master_draining_tests.cpp
@@ -254,6 +254,99 @@ TEST_P(MasterAlreadyDrainedTest, DrainAgentMarkGone)
 }
 
 
+// When an operator submits a DRAIN_AGENT call with an agent that has
+// momentarily disconnected, the call should succeed, and the agent should
+// be drained when it returns to the cluster.
+TEST_P(MasterAlreadyDrainedTest, DrainAgentDisconnected)
+{
+  // Simulate an agent crash, so that it disconnects from the master.
+  slave->terminate();
+  slave.reset();
+
+  ContentType contentType = GetParam();
+
+  // Ensure that the agent is disconnected (not active).
+  {
+    v1::master::Call call;
+    call.set_type(v1::master::Call::GET_AGENTS);
+
+    Future<http::Response> response =
+      post(master->pid, call, contentType);
+    AWAIT_ASSERT_RESPONSE_STATUS_EQ(http::OK().status, response);
+
+    Try<v1::master::Response> getAgents =
+      deserialize<v1::master::Response>(contentType, response->body);
+    ASSERT_SOME(getAgents);
+
+    ASSERT_EQ(v1::master::Response::GET_AGENTS, getAgents->type());
+    ASSERT_EQ(getAgents->get_agents().agents_size(), 1);
+
+    const v1::master::Response::GetAgents::Agent& agent =
+        getAgents->get_agents().agents(0);
+
+    EXPECT_EQ(agent.active(), false);
+    EXPECT_EQ(agent.deactivated(), false);
+  }
+
+  // Start draining the disconnected agent.
+  {
+    v1::master::Call::DrainAgent drainAgent;
+    drainAgent.mutable_agent_id()->CopyFrom(agentId);
+
+    v1::master::Call call;
+    call.set_type(v1::master::Call::DRAIN_AGENT);
+    call.mutable_drain_agent()->CopyFrom(drainAgent);
+
+    AWAIT_EXPECT_RESPONSE_STATUS_EQ(
+        http::OK().status,
+        post(master->pid, call, contentType));
+  }
+
+  // Bring the agent back.
+  Future<SlaveReregisteredMessage> slaveReregisteredMessage =
+    FUTURE_PROTOBUF(SlaveReregisteredMessage(), _, _);
+
+  Future<DrainSlaveMessage> drainSlaveMesage =
+    FUTURE_PROTOBUF(DrainSlaveMessage(), _, _);
+
+  Try<Owned<cluster::Slave>> recoveredSlave =
+    StartSlave(detector.get(), agentFlags);
+  ASSERT_SOME(recoveredSlave);
+
+  Clock::advance(agentFlags.executor_reregistration_timeout);
+  Clock::settle();
+  Clock::advance(agentFlags.registration_backoff_factor);
+  Clock::settle();
+  AWAIT_READY(slaveReregisteredMessage);
+
+  // The agent should be told to drain once it reregisters.
+  AWAIT_READY(drainSlaveMesage);
+
+  // Ensure that the agent is marked as DRAINED in the master now.
+  {
+    v1::master::Call call;
+    call.set_type(v1::master::Call::GET_AGENTS);
+
+    Future<http::Response> response =
+      post(master->pid, call, contentType);
+    AWAIT_ASSERT_RESPONSE_STATUS_EQ(http::OK().status, response);
+
+    Try<v1::master::Response> getAgents =
+      deserialize<v1::master::Response>(contentType, response->body);
+    ASSERT_SOME(getAgents);
+
+    ASSERT_EQ(v1::master::Response::GET_AGENTS, getAgents->type());
+    ASSERT_EQ(getAgents->get_agents().agents_size(), 1);
+
+    const v1::master::Response::GetAgents::Agent& agent =
+        getAgents->get_agents().agents(0);
+
+    EXPECT_EQ(agent.deactivated(), true);
+    EXPECT_EQ(mesos::v1::DRAINED, agent.drain_info().state());
+  }
+}
+
+
 // When an operator submits a DRAIN_AGENT call for an agent that has gone
 // unreachable, the call should succeed, and the agent should be drained
 // if/when it returns to the cluster.
@@ -627,6 +720,114 @@ TEST_P(MasterDrainingTest, DrainAgentMarkGone)
 }
 
 
+// When an operator submits a DRAIN_AGENT call with an agent that has
+// momentarily disconnected, the call should succeed, and the agent should
+// be drained when it returns to the cluster.
+TEST_P(MasterDrainingTest, DrainAgentDisconnected)
+{
+  // Simulate an agent crash, so that it disconnects from the master.
+  slave->terminate();
+  slave.reset();
+
+  ContentType contentType = GetParam();
+
+  // Ensure that the agent is disconnected (not active).
+  {
+    v1::master::Call call;
+    call.set_type(v1::master::Call::GET_AGENTS);
+
+    Future<http::Response> response =
+      post(master->pid, call, contentType);
+    AWAIT_ASSERT_RESPONSE_STATUS_EQ(http::OK().status, response);
+
+    Try<v1::master::Response> getAgents =
+      deserialize<v1::master::Response>(contentType, response->body);
+    ASSERT_SOME(getAgents);
+
+    ASSERT_EQ(v1::master::Response::GET_AGENTS, getAgents->type());

[mesos] 01/05: Moved master-side agent draining tests into a separate file.

2019-08-26 Thread josephwu
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 5124b290ddc368e2e7cc3d56173fb4b3137af620
Author: Joseph Wu 
AuthorDate: Wed Jul 24 15:45:22 2019 -0700

Moved master-side agent draining tests into a separate file.

The test bodies were not changed, besides renaming the test class.

Review: https://reviews.apache.org/r/71314
---
 src/Makefile.am |   3 +-
 src/tests/CMakeLists.txt|   1 +
 src/tests/api_tests.cpp | 541 -
 src/tests/master_draining_tests.cpp | 662 
 4 files changed, 665 insertions(+), 542 deletions(-)

diff --git a/src/Makefile.am b/src/Makefile.am
index a89cd61..577acfd 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -2608,7 +2608,8 @@ mesos_tests_SOURCES = 
\
   tests/master_allocator_tests.cpp \
   tests/master_authorization_tests.cpp \
   tests/master_benchmarks.cpp  \
-  tests/master_contender_detector_tests.cpp \
+  tests/master_contender_detector_tests.cpp\
+  tests/master_draining_tests.cpp  \
   tests/master_load_tests.cpp  \
   tests/master_maintenance_tests.cpp   \
   tests/master_quota_tests.cpp \
diff --git a/src/tests/CMakeLists.txt b/src/tests/CMakeLists.txt
index 04c552a..1e53b39 100644
--- a/src/tests/CMakeLists.txt
+++ b/src/tests/CMakeLists.txt
@@ -105,6 +105,7 @@ set(MESOS_TESTS_SRC
   hook_tests.cpp
   http_authentication_tests.cpp
   http_fault_tolerance_tests.cpp
+  master_draining_tests.cpp
   master_load_tests.cpp
   master_maintenance_tests.cpp
   master_slave_reconciliation_tests.cpp
diff --git a/src/tests/api_tests.cpp b/src/tests/api_tests.cpp
index a735a20..bd207ea 100644
--- a/src/tests/api_tests.cpp
+++ b/src/tests/api_tests.cpp
@@ -5470,547 +5470,6 @@ TEST_P(MasterAPITest, OperationUpdatesUponUnreachable)
 }
 
 
-// When an operator submits a DRAIN_AGENT call, the agent should kill all
-// running tasks.
-TEST_P(MasterAPITest, DrainAgent)
-{
-  Clock::pause();
-
-  master::Flags masterFlags = CreateMasterFlags();
-  Try<Owned<cluster::Master>> master = StartMaster(masterFlags);
-  ASSERT_SOME(master);
-
-  Future<SlaveRegisteredMessage> slaveRegisteredMessage =
-    FUTURE_PROTOBUF(SlaveRegisteredMessage(), _, _);
-
-  slave::Flags agentFlags = CreateSlaveFlags();
-  Owned<MasterDetector> detector = master.get()->createDetector();
-  Try<Owned<cluster::Slave>> slave = StartSlave(detector.get(), agentFlags);
-  ASSERT_SOME(slave);
-
-  Clock::advance(agentFlags.registration_backoff_factor);
-
-  AWAIT_READY(slaveRegisteredMessage);
-
-  auto scheduler = std::make_shared<v1::MockHTTPScheduler>();
-
-  v1::FrameworkInfo frameworkInfo = v1::DEFAULT_FRAMEWORK_INFO;
-  frameworkInfo.add_capabilities()->set_type(
-      v1::FrameworkInfo::Capability::PARTITION_AWARE);
-
-  EXPECT_CALL(*scheduler, connected(_))
-    .WillOnce(v1::scheduler::SendSubscribe(frameworkInfo));
-
-  Future<v1::scheduler::Event::Subscribed> subscribed;
-  EXPECT_CALL(*scheduler, subscribed(_, _))
-    .WillOnce(FutureArg<1>(&subscribed));
-
-  EXPECT_CALL(*scheduler, heartbeat(_))
-    .WillRepeatedly(Return()); // Ignore heartbeats.
-
-  Future<v1::scheduler::Event::Offers> offers;
-  EXPECT_CALL(*scheduler, offers(_, _))
-    .WillOnce(FutureArg<1>(&offers))
-    .WillRepeatedly(Return());
-
-  auto mesos = std::make_shared<v1::scheduler::TestMesos>(
-      master.get()->pid, ContentType::PROTOBUF, scheduler);
-
-  AWAIT_READY(subscribed);
-  v1::FrameworkID frameworkId(subscribed->framework_id());
-
-  AWAIT_READY(offers);
-  ASSERT_FALSE(offers->offers().empty());
-
-  const v1::Offer& offer = offers->offers(0);
-  const v1::AgentID& agentId = offer.agent_id();
-
-  Try<v1::Resources> resources =
-    v1::Resources::parse("cpus:0.1;mem:64;disk:64");
-
-  ASSERT_SOME(resources);
-
-  v1::TaskInfo taskInfo =
-    v1::createTask(agentId, resources.get(), SLEEP_COMMAND(1000));
-
-  testing::Sequence updateSequence;
-  Future<v1::scheduler::Event::Update> startingUpdate;
-  Future<v1::scheduler::Event::Update> runningUpdate;
-
-  EXPECT_CALL(
-      *scheduler,
-      update(_, AllOf(
-          TaskStatusUpdateTaskIdEq(taskInfo.task_id()),
-          TaskStatusUpdateStateEq(v1::TASK_STARTING))))
-    .InSequence(updateSequence)
-    .WillOnce(DoAll(
-        FutureArg<1>(&startingUpdate),
-        v1::scheduler::SendAcknowledge(frameworkId, agentId)));
-
-  EXPECT_CALL(
-      *scheduler,
-      update(_, AllOf(
-          TaskStatusUpdateTaskIdEq(taskInfo.task_id()),
-          TaskStatusUpdateStateEq(v1::TASK_RUNNING))))
-    .InSequence(updateSequence)
-    .WillOnce(DoAll(
-        FutureArg<1>(&runningUpdate),
-        v1::scheduler::SendAcknowledge(frameworkId, agentId)))
-    .WillRepeatedly(Return());
-
-  mesos->send(
-      v1::createCallAccept(
-          frameworkId,
-          offer,
-          {v1::LAUNCH({taskInfo})}));
-
-  AWAIT_READY(startingUpdate);
-  

[mesos] 05/05: Added agent reactivations to the existing agent draining tests.

2019-08-26 Thread josephwu
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 50dcd56a42ee03d354f39cb029befe9e60e7f0bf
Author: Joseph Wu 
AuthorDate: Mon Aug 19 14:35:34 2019 -0700

Added agent reactivations to the existing agent draining tests.

This adds an extra step to a couple of the agent draining tests,
which calls REACTIVATE_AGENT at the end.

Review: https://reviews.apache.org/r/71318
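The REACTIVATE_AGENT call added by this commit follows the same v1 operator API shape as DRAIN_AGENT: it is posted to the master's `/api/v1` endpoint. A minimal JSON body might look like the sketch below, with the agent ID value as a placeholder.

```json
{
  "type": "REACTIVATE_AGENT",
  "reactivate_agent": {
    "agent_id": {"value": "<agent-id>"}
  }
}
```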
---
 src/tests/master_draining_tests.cpp | 93 +
 1 file changed, 93 insertions(+)

diff --git a/src/tests/master_draining_tests.cpp 
b/src/tests/master_draining_tests.cpp
index 235bf1b..f1a00df 100644
--- a/src/tests/master_draining_tests.cpp
+++ b/src/tests/master_draining_tests.cpp
@@ -563,11 +563,18 @@ TEST_P(MasterDrainingTest, DrainAgent)
         FutureArg<1>(&killedUpdate),
         v1::scheduler::SendAcknowledge(frameworkId, agentId)));
 
+  Future<StatusUpdateAcknowledgementMessage> killedAck =
+    FUTURE_PROTOBUF(StatusUpdateAcknowledgementMessage(), _, _);
+
   Future<Nothing> registrarApplyDrained;
+  Future<Nothing> registrarApplyReactivated;
   EXPECT_CALL(*master->registrar, apply(_))
     .WillOnce(DoDefault())
     .WillOnce(DoAll(
         FutureSatisfy(&registrarApplyDrained),
+        Invoke(master->registrar.get(), &MockRegistrar::unmocked_apply)))
+    .WillOnce(DoAll(
+        FutureSatisfy(&registrarApplyReactivated),
         Invoke(master->registrar.get(), &MockRegistrar::unmocked_apply)));
 
   ContentType contentType = GetParam();
@@ -587,6 +594,7 @@ TEST_P(MasterDrainingTest, DrainAgent)
   }
 
   AWAIT_READY(killedUpdate);
+  AWAIT_READY(killedAck);
   AWAIT_READY(registrarApplyDrained);
 
   // Ensure that the update acknowledgement has been processed.
@@ -676,6 +684,33 @@ TEST_P(MasterDrainingTest, DrainAgent)
 ASSERT_SOME(stateDrainStartTime);
 EXPECT_LT(0, stateDrainStartTime->as());
   }
+
+  // Reactivate the agent and expect to get the agent in an offer.
+  Future<v1::scheduler::Event::Offers> offers;
+  EXPECT_CALL(*scheduler, offers(_, _))
+    .WillOnce(FutureArg<1>(&offers));
+
+  {
+    v1::master::Call::ReactivateAgent reactivateAgent;
+    reactivateAgent.mutable_agent_id()->CopyFrom(agentId);
+
+    v1::master::Call call;
+    call.set_type(v1::master::Call::REACTIVATE_AGENT);
+    call.mutable_reactivate_agent()->CopyFrom(reactivateAgent);
+
+    AWAIT_EXPECT_RESPONSE_STATUS_EQ(
+        http::OK().status,
+        post(master->pid, call, contentType));
+  }
+
+  AWAIT_READY(registrarApplyReactivated);
+
+  Clock::advance(masterFlags.allocation_interval);
+  Clock::settle();
+
+  AWAIT_READY(offers);
+  ASSERT_FALSE(offers->offers().empty());
+  EXPECT_EQ(agentId, offers->offers(0).agent_id());
 }
 
 
@@ -788,6 +823,9 @@ TEST_P(MasterDrainingTest, DrainAgentDisconnected)
         FutureArg<1>(&killedUpdate),
         v1::scheduler::SendAcknowledge(frameworkId, agentId)));
 
+  Future<StatusUpdateAcknowledgementMessage> killedAck =
+    FUTURE_PROTOBUF(StatusUpdateAcknowledgementMessage(), _, _);
+
   Try<Owned<cluster::Slave>> recoveredSlave =
     StartSlave(detector.get(), agentFlags);
   ASSERT_SOME(recoveredSlave);
@@ -802,6 +840,7 @@ TEST_P(MasterDrainingTest, DrainAgentDisconnected)
   // The agent should be told to drain once it reregisters.
   AWAIT_READY(drainSlaveMesage);
   AWAIT_READY(killedUpdate);
+  AWAIT_READY(killedAck);
 
   // Ensure that the agent is marked as DRAINED in the master now.
   {
@@ -825,6 +864,31 @@ TEST_P(MasterDrainingTest, DrainAgentDisconnected)
 EXPECT_EQ(agent.deactivated(), true);
 EXPECT_EQ(mesos::v1::DRAINED, agent.drain_info().state());
   }
+
+  // Reactivate the agent and expect to get the agent in an offer.
+  Future<v1::scheduler::Event::Offers> offers;
+  EXPECT_CALL(*scheduler, offers(_, _))
+    .WillOnce(FutureArg<1>(&offers));
+
+  {
+    v1::master::Call::ReactivateAgent reactivateAgent;
+    reactivateAgent.mutable_agent_id()->CopyFrom(agentId);
+
+    v1::master::Call call;
+    call.set_type(v1::master::Call::REACTIVATE_AGENT);
+    call.mutable_reactivate_agent()->CopyFrom(reactivateAgent);
+
+    AWAIT_EXPECT_RESPONSE_STATUS_EQ(
+        http::OK().status,
+        post(master->pid, call, contentType));
+  }
+
+  Clock::advance(masterFlags.allocation_interval);
+  Clock::settle();
+
+  AWAIT_READY(offers);
+  ASSERT_FALSE(offers->offers().empty());
+  EXPECT_EQ(agentId, offers->offers(0).agent_id());
 }
 
 
@@ -870,6 +934,9 @@ TEST_P(MasterDrainingTest, DrainAgentUnreachable)
         FutureArg<1>(&killedUpdate),
         v1::scheduler::SendAcknowledge(frameworkId, agentId)));
 
+  Future<StatusUpdateAcknowledgementMessage> killedAck =
+    FUTURE_PROTOBUF(StatusUpdateAcknowledgementMessage(), _, _);
+
   // Simulate an agent crash, so that it disconnects from the master.
   slave->terminate();
   slave.reset();
@@ -918,6 +985,32 @@ TEST_P(MasterDrainingTest, DrainAgentUnreachable)
   AWAIT_READY(drainSlaveMesage);
   AWAIT_READY(runningUpdate);
   AWAIT_READY(killedUpdate);
+  AWAIT_READY(killedAck);
+
+  // Reactivate the agent and expect to get the agent in an offer.
+  Future<v1::scheduler::Event::Offers> offers;
+  EXPECT_CALL(*scheduler, offers(_, _))
+    .WillOnce(FutureArg<1>(&offers));
+
+  {
+   

[mesos] 03/05: Added draining tests for empty agents.

2019-08-26 Thread josephwu
This is an automated email from the ASF dual-hosted git repository.

josephwu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 1e3661982eba6da71a5ca8178472ef762d9fc780
Author: Joseph Wu 
AuthorDate: Wed Aug 7 09:02:01 2019 -0700

Added draining tests for empty agents.

This splits the existing agent draining tests into two variants:
1) where the agent has nothing running, and
2) where the agent has one task running.

Review: https://reviews.apache.org/r/71316
---
 src/tests/master_draining_tests.cpp | 294 ++--
 1 file changed, 250 insertions(+), 44 deletions(-)

diff --git a/src/tests/master_draining_tests.cpp 
b/src/tests/master_draining_tests.cpp
index eae809f..674f5b5 100644
--- a/src/tests/master_draining_tests.cpp
+++ b/src/tests/master_draining_tests.cpp
@@ -42,6 +42,8 @@
 #include "common/protobuf_utils.hpp"
 #include "common/resources_utils.hpp"
 
+#include "master/registry_operations.hpp"
+
 #include "messages/messages.hpp"
 
 #include "tests/cluster.hpp"
@@ -69,12 +71,12 @@ namespace mesos {
 namespace internal {
 namespace tests {
 
-class MasterDrainingTest
+class MasterAlreadyDrainedTest
   : public MesosTest,
     public WithParamInterface<ContentType>
 {
 public:
-  // Creates a master, agent, framework, and launches one sleep task.
+  // Creates a master and agent.
   void SetUp() override
   {
 MesosTest::SetUp();
@@ -99,6 +101,251 @@ public:
 
     Clock::advance(agentFlags.registration_backoff_factor);
     AWAIT_READY(slaveRegisteredMessage);
+    agentId = evolve(slaveRegisteredMessage->slave_id());
+  }
+
+  void TearDown() override
+  {
+    slave.reset();
+    detector.reset();
+    master.reset();
+
+    Clock::resume();
+
+    MesosTest::TearDown();
+  }
+
+  master::Flags CreateMasterFlags() override
+  {
+    // Turn off periodic allocations to avoid the race between
+    // `HierarchicalAllocator::updateAvailable()` and periodic allocations.
+    master::Flags flags = MesosTest::CreateMasterFlags();
+    flags.allocation_interval = Seconds(1000);
+    return flags;
+  }
+
+  // Helper function to post a request to "/api/v1" master endpoint and return
+  // the response.
+  Future<http::Response> post(
+      const process::PID<master::Master>& pid,
+      const v1::master::Call& call,
+      const ContentType& contentType,
+      const Credential& credential = DEFAULT_CREDENTIAL)
+  {
+    http::Headers headers = createBasicAuthHeaders(credential);
+    headers["Accept"] = stringify(contentType);
+
+    return http::post(
+        pid,
+        "api/v1",
+        headers,
+        serialize(contentType, call),
+        stringify(contentType));
+  }
+
+protected:
+  master::Flags masterFlags;
+  Owned<cluster::Master> master;
+  Owned<MasterDetector> detector;
+
+  slave::Flags agentFlags;
+  Owned<cluster::Slave> slave;
+  v1::AgentID agentId;
+};
+
+
+// These tests are parameterized by the content type of the HTTP request.
+INSTANTIATE_TEST_CASE_P(
+ContentType,
+MasterAlreadyDrainedTest,
+::testing::Values(ContentType::PROTOBUF, ContentType::JSON));
+
+
+// When an operator submits a DRAIN_AGENT call, the agent with nothing running
+// should be immediately transitioned to the DRAINED state.
+TEST_P(MasterAlreadyDrainedTest, DrainAgent)
+{
+  Future<Nothing> registrarApplyDrained;
+  EXPECT_CALL(*master->registrar, apply(_))
+    .WillOnce(DoDefault())
+    .WillOnce(DoAll(
+        FutureSatisfy(&registrarApplyDrained),
+        Invoke(master->registrar.get(), &MockRegistrar::unmocked_apply)));
+
+  ContentType contentType = GetParam();
+
+  {
+    v1::master::Call::DrainAgent drainAgent;
+    drainAgent.mutable_agent_id()->CopyFrom(agentId);
+    drainAgent.mutable_max_grace_period()->set_seconds(10);
+
+    v1::master::Call call;
+    call.set_type(v1::master::Call::DRAIN_AGENT);
+    call.mutable_drain_agent()->CopyFrom(drainAgent);
+
+    AWAIT_EXPECT_RESPONSE_STATUS_EQ(
+        http::OK().status,
+        post(master->pid, call, contentType));
+  }
+
+  AWAIT_READY(registrarApplyDrained);
+
+  mesos::v1::DrainInfo drainInfo;
+  drainInfo.set_state(mesos::v1::DRAINED);
+  drainInfo.mutable_config()->set_mark_gone(false);
+  drainInfo.mutable_config()->mutable_max_grace_period()
+->set_nanoseconds(Seconds(10).ns());
+
+  // Ensure that the agent's drain info is reflected in the master's
+  // GET_AGENTS response.
+  {
+v1::master::Call call;
+call.set_type(v1::master::Call::GET_AGENTS);
+
+Future response =
+  post(master->pid, call, contentType);
+AWAIT_ASSERT_RESPONSE_STATUS_EQ(http::OK().status, response);
+
+Try getAgents =
+  deserialize(contentType, response->body);
+ASSERT_SOME(getAgents);
+
+ASSERT_EQ(v1::master::Response::GET_AGENTS, getAgents->type());
+ASSERT_EQ(getAgents->get_agents().agents_size(), 1);
+
+const v1::master::Response::GetAgents::Agent& agent =
+getAgents->get_agents().agents(0);
+
+EXPECT_EQ(agent.deactivated(), true);
+
+EXPECT_EQ(agent.drain_info(), drainInfo);
+EXPECT_LT(0, 

[mesos] branch master updated: Updated site's middleman versions.

2019-08-26 Thread bbannier
This is an automated email from the ASF dual-hosted git repository.

bbannier pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/master by this push:
 new c104977  Updated site's middleman versions.
c104977 is described below

commit c104977894e2abb36aa0a78456c54fb74a20543e
Author: Benjamin Bannier 
AuthorDate: Mon Aug 26 18:18:07 2019 +0200

Updated site's middleman versions.

Review: https://reviews.apache.org/r/71368/
---
 site/Gemfile  |  8 
 site/Gemfile.lock | 20 ++--
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/site/Gemfile b/site/Gemfile
index c492030..c0df4e1 100644
--- a/site/Gemfile
+++ b/site/Gemfile
@@ -1,9 +1,9 @@
 source 'https://rubygems.org'
 
-gem 'middleman', '3.4.0'
-gem 'middleman-livereload', '3.4.6'
-gem 'middleman-syntax', '3.0.0'
-gem 'middleman-blog', '3.5.3'
+gem 'middleman', '~>3'
+gem 'middleman-livereload', '~>3'
+gem 'middleman-syntax', '~>3'
+gem 'middleman-blog', '~>3'
 
 # Middleman has an undeclared dependency on `tzinfo-data` for
 # generating timestamps.
diff --git a/site/Gemfile.lock b/site/Gemfile.lock
index 63c48e7..87d825c 100644
--- a/site/Gemfile.lock
+++ b/site/Gemfile.lock
@@ -52,14 +52,14 @@ GEM
 listen (3.0.8)
   rb-fsevent (~> 0.9, >= 0.9.4)
   rb-inotify (~> 0.9, >= 0.9.7)
-middleman (3.4.0)
+middleman (3.4.1)
   coffee-script (~> 2.2)
   compass (>= 1.0.0, < 2.0.0)
   compass-import-once (= 1.0.5)
   execjs (~> 2.0)
   haml (>= 4.0.5)
   kramdown (~> 1.2)
-  middleman-core (= 3.4.0)
+  middleman-core (= 3.4.1)
   middleman-sprockets (>= 3.1.2)
   sass (>= 3.4.0, < 4.0)
   uglifier (~> 2.5)
@@ -67,7 +67,7 @@ GEM
   addressable (~> 2.3.5)
   middleman-core (~> 3.2)
   tzinfo (>= 0.3.0)
-middleman-core (3.4.0)
+middleman-core (3.4.1)
   activesupport (~> 4.1)
   bundler (~> 1.1)
   capybara (~> 2.4.4)
@@ -88,9 +88,9 @@ GEM
   sprockets (~> 2.12.1)
   sprockets-helpers (~> 1.1.0)
   sprockets-sass (~> 1.3.0)
-middleman-syntax (3.0.0)
+middleman-syntax (3.2.0)
   middleman-core (>= 3.2)
-  rouge (~> 2.0)
+  rouge (~> 3.2)
 mime-types (3.2.2)
   mime-types-data (~> 3.2015)
 mime-types-data (3.2019.0331)
@@ -116,7 +116,7 @@ GEM
   ffi (~> 1.0)
 rdiscount (2.2.0.1)
 ref (2.0.0)
-rouge (2.2.1)
+rouge (3.9.0)
 sass (3.4.25)
 sprockets (2.12.5)
   hike (~> 1.2)
@@ -151,10 +151,10 @@ PLATFORMS
 
 DEPENDENCIES
   htmlentities
-  middleman (= 3.4.0)
-  middleman-blog (= 3.5.3)
-  middleman-livereload (= 3.4.6)
-  middleman-syntax (= 3.0.0)
+  middleman (~> 3)
+  middleman-blog (~> 3)
+  middleman-livereload (~> 3)
+  middleman-syntax (~> 3)
   rake
   rdiscount (= 2.2.0.1)
   therubyracer



[mesos] 01/03: Added MESOS-9887 to the 1.8.2 CHANGELOG.

2019-08-26 Thread abudnik
This is an automated email from the ASF dual-hosted git repository.

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 967f105cea4bc31780bfe76bd2d62ad71ffae221
Author: Andrei Budnik 
AuthorDate: Mon Aug 26 15:02:40 2019 +0200

Added MESOS-9887 to the 1.8.2 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index fe08b76..a215e5c 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -49,6 +49,7 @@ Release Notes - Mesos - Version 1.8.2 (WIP)
   * [MESOS-9785] - Frameworks recovered from reregistered agents are not 
reported to master `/api/v1` subscribers.
   * [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups.
   * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
+  * [MESOS-9887] - Race condition between two terminal task status updates for 
Docker/Command executor.
   * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret 
from runtime directory when the container is destroyed.
   * [MESOS-9925] - Default executor takes a couple of seconds to start and 
subscribe Mesos agent.
 



[mesos] 03/03: Added MESOS-9887 to the 1.6.3 CHANGELOG.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 24e989e66507932809c7d852a4a62720de7cb27b
Author: Andrei Budnik 
AuthorDate: Mon Aug 26 14:44:54 2019 +0200

Added MESOS-9887 to the 1.6.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 3cd7661..58cf418 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -986,6 +986,7 @@ Release Notes - Mesos - Version 1.6.3 (WIP)
   * [MESOS-9856] - REVIVE call with specified role(s) clears filters for all 
roles of a framework.
   * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
   * [MESOS-9870] - Simultaneous adding/removal of a role from framework's 
roles and its suppressed roles crashes the master.
+  * [MESOS-9887] - Race condition between two terminal task status updates for 
Docker/Command executor.
   * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret 
from runtime directory when the container is destroyed.
 
 ** Improvement



[mesos] branch master updated (f0be237 -> 24e989e)

2019-08-26 Thread abudnik

abudnik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from f0be237  Fixed out-of-order processing of terminal status updates in 
agent.
 new 967f105  Added MESOS-9887 to the 1.8.2 CHANGELOG.
 new 6b2d101  Added MESOS-9887 to the 1.7.3 CHANGELOG.
 new 24e989e  Added MESOS-9887 to the 1.6.3 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG | 3 +++
 1 file changed, 3 insertions(+)



[mesos] 02/03: Added MESOS-9887 to the 1.7.3 CHANGELOG.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 6b2d101770ae8853d021b8cc5d0f5ae587302a54
Author: Andrei Budnik 
AuthorDate: Mon Aug 26 14:58:45 2019 +0200

Added MESOS-9887 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index a215e5c..3cd7661 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -504,6 +504,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9856] - REVIVE call with specified role(s) clears filters for all 
roles of a framework.
   * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
   * [MESOS-9870] - Simultaneous adding/removal of a role from framework's 
roles and its suppressed roles crashes the master.
+  * [MESOS-9887] - Race condition between two terminal task status updates for 
Docker/Command executor.
   * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret 
from runtime directory when the container is destroyed.
   * [MESOS-9925] - Default executor takes a couple of seconds to start and 
subscribe Mesos agent.
 



[mesos] 02/03: Fixed out-of-order processing of terminal status updates in agent.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 4bbb0376cd584a4160a2c5c2f0ac4f3ecaa5e622
Author: Andrei Budnik 
AuthorDate: Tue Aug 20 19:24:44 2019 +0200

Fixed out-of-order processing of terminal status updates in agent.

Previously, the Mesos agent could send a TASK_FAILED status update on
executor termination while a TASK_FINISHED status update was still
being processed. Processing a task status update involves sending
requests to the containerizer (e.g. `MesosContainerizer::status`),
which might complete these requests out of order. Also, the agent does
not overwrite the status of a terminal status update once it is stored
in `terminatedTasks`. Hence, there was a race condition between two
terminal status updates.

Note that V1 executors are not affected by this problem because they
wait for an acknowledgement of the terminal status update by the agent
before terminating.

This patch introduces a new data structure, `pendingStatusUpdates`,
which holds the status updates that are currently being processed.
It allows the agent to validate the order in which status updates
are processed.

Review: https://reviews.apache.org/r/71343
---
 src/slave/slave.cpp | 62 ++---
 src/slave/slave.hpp |  6 ++
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 50a7d68..8d8cef3 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5727,6 +5727,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option<UPID>& pid)
 
   metrics.valid_status_updates++;
 
+  executor->addPendingTaskStatus(status);
+
   // Before sending update, we need to retrieve the container status
   // if the task reached the executor. For tasks that are queued, we
   // do not need to send the container status and we must
@@ -5938,6 +5940,17 @@ void Slave::___statusUpdate(
   VLOG(1) << "Task status update manager successfully handled status update "
   << update;
 
+  const TaskStatus& status = update.status();
+
+  Executor* executor = nullptr;
+  Framework* framework = getFramework(update.framework_id());
+  if (framework != nullptr) {
+executor = framework->getExecutor(status.task_id());
+if (executor != nullptr) {
+  executor->removePendingTaskStatus(status);
+}
+  }
+
   if (pid == UPID()) {
 return;
   }
@@ -5945,7 +5958,7 @@ void Slave::___statusUpdate(
   StatusUpdateAcknowledgementMessage message;
   message.mutable_framework_id()->MergeFrom(update.framework_id());
   message.mutable_slave_id()->MergeFrom(update.slave_id());
-  message.mutable_task_id()->MergeFrom(update.status().task_id());
+  message.mutable_task_id()->MergeFrom(status.task_id());
   message.set_uuid(update.uuid());
 
   // Task status update manager successfully handled the status update.
@@ -5957,14 +5970,12 @@ void Slave::___statusUpdate(
 send(pid.get(), message);
   } else {
 // Acknowledge the HTTP based executor.
-Framework* framework = getFramework(update.framework_id());
 if (framework == nullptr) {
   LOG(WARNING) << "Ignoring sending acknowledgement for status update "
<< update << " of unknown framework";
   return;
 }
 
-Executor* executor = framework->getExecutor(update.status().task_id());
 if (executor == nullptr) {
   // Refer to the comments in 'statusUpdate()' on when this can
   // happen.
@@ -10520,6 +10531,33 @@ void Executor::recoverTask(const TaskState& state, bool recheckpointTask)
 }
 
 
+void Executor::addPendingTaskStatus(const TaskStatus& status)
+{
+  auto uuid = id::UUID::fromBytes(status.uuid()).get();
+  pendingStatusUpdates[status.task_id()][uuid] = status;
+}
+
+
+void Executor::removePendingTaskStatus(const TaskStatus& status)
+{
+  const TaskID& taskId = status.task_id();
+
+  auto uuid = id::UUID::fromBytes(status.uuid()).get();
+
+  if (!pendingStatusUpdates.contains(taskId) ||
+  !pendingStatusUpdates[taskId].contains(uuid)) {
+LOG(WARNING) << "Unknown pending status update (uuid: " << uuid << ")";
+return;
+  }
+
+  pendingStatusUpdates[taskId].erase(uuid);
+
+  if (pendingStatusUpdates[taskId].empty()) {
+pendingStatusUpdates.erase(taskId);
+  }
+}
+
+
 Try<Nothing> Executor::updateTaskState(const TaskStatus& status)
 {
   bool terminal = protobuf::isTerminalState(status.state());
@@ -10543,6 +10581,24 @@ Try<Nothing> Executor::updateTaskState(const TaskStatus& status)
 task = launchedTasks.at(status.task_id());
 
 if (terminal) {
+  if (pendingStatusUpdates.contains(status.task_id())) {
+auto statusUpdates = pendingStatusUpdates[status.task_id()].values();
+
+auto firstTerminal = std::find_if(
+statusUpdates.begin(),
+statusUpdates.end(),
+   

[mesos] 01/03: Added missing `return` statement in `Slave::statusUpdate`.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 14abb82925cdbce746238bc20dc7b8c279a96a67
Author: Andrei Budnik 
AuthorDate: Fri Aug 23 14:36:18 2019 +0200

Added missing `return` statement in `Slave::statusUpdate`.

Previously, if `statusUpdate` was called for a pending task, it would
forward the status update and then continue executing, checking
whether any executor is aware of the task. Since a pending task is not
known to any executor, the update would always be forwarded a second
time. This patch adds the missing `return` statement, which fixes the
issue.

Review: https://reviews.apache.org/r/71361
---
 src/slave/slave.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index bf87be0..50a7d68 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5659,6 +5659,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option<UPID>& pid)
 
 taskStatusUpdateManager->update(update, info.id())
   .onAny(defer(self(), ::___statusUpdate, lambda::_1, update, pid));
+
+return;
   }
 
   Executor* executor = framework->getExecutor(status.task_id());



[mesos] branch 1.8.x updated (f3aa802 -> adc958f)

2019-08-26 Thread abudnik

abudnik pushed a change to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from f3aa802  Added MESOS-9836 to the 1.8.2 CHANGELOG.
 new 14abb82  Added missing `return` statement in `Slave::statusUpdate`.
 new 4bbb037  Fixed out-of-order processing of terminal status updates in 
agent.
 new adc958f  Added MESOS-9887 to the 1.8.2 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG   |  1 +
 src/slave/slave.cpp | 64 ++---
 src/slave/slave.hpp |  6 +
 3 files changed, 68 insertions(+), 3 deletions(-)



[mesos] 03/03: Added MESOS-9887 to the 1.8.2 CHANGELOG.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch 1.8.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit adc958f553c3728aab5529de56b0ddc30c0f9b68
Author: Andrei Budnik 
AuthorDate: Mon Aug 26 15:02:40 2019 +0200

Added MESOS-9887 to the 1.8.2 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index b3fca25..ff89605 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -6,6 +6,7 @@ Release Notes - Mesos - Version 1.8.2 (WIP)
   * [MESOS-9785] - Frameworks recovered from reregistered agents are not 
reported to master `/api/v1` subscribers.
   * [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups.
   * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
+  * [MESOS-9887] - Race condition between two terminal task status updates for 
Docker/Command executor.
   * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret 
from runtime directory when the container is destroyed.
   * [MESOS-9925] - Default executor takes a couple of seconds to start and 
subscribe Mesos agent.
 



[mesos] 02/03: Fixed out-of-order processing of terminal status updates in agent.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit b7dcc984476904d6d17f7bf699295dfa9ac8a66e
Author: Andrei Budnik 
AuthorDate: Tue Aug 20 19:24:44 2019 +0200

Fixed out-of-order processing of terminal status updates in agent.

Previously, the Mesos agent could send a TASK_FAILED status update on
executor termination while a TASK_FINISHED status update was still
being processed. Processing a task status update involves sending
requests to the containerizer (e.g. `MesosContainerizer::status`),
which might complete these requests out of order. Also, the agent does
not overwrite the status of a terminal status update once it is stored
in `terminatedTasks`. Hence, there was a race condition between two
terminal status updates.

Note that V1 executors are not affected by this problem because they
wait for an acknowledgement of the terminal status update by the agent
before terminating.

This patch introduces a new data structure, `pendingStatusUpdates`,
which holds the status updates that are currently being processed.
It allows the agent to validate the order in which status updates
are processed.

Review: https://reviews.apache.org/r/71343
---
 src/slave/slave.cpp | 62 ++---
 src/slave/slave.hpp |  6 ++
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index edfe3d0..f10aac2 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5486,6 +5486,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option<UPID>& pid)
 
   metrics.valid_status_updates++;
 
+  executor->addPendingTaskStatus(status);
+
   // Before sending update, we need to retrieve the container status
   // if the task reached the executor. For tasks that are queued, we
   // do not need to send the container status and we must
@@ -5697,6 +5699,17 @@ void Slave::___statusUpdate(
   VLOG(1) << "Task status update manager successfully handled status update "
   << update;
 
+  const TaskStatus& status = update.status();
+
+  Executor* executor = nullptr;
+  Framework* framework = getFramework(update.framework_id());
+  if (framework != nullptr) {
+executor = framework->getExecutor(status.task_id());
+if (executor != nullptr) {
+  executor->removePendingTaskStatus(status);
+}
+  }
+
   if (pid == UPID()) {
 return;
   }
@@ -5704,7 +5717,7 @@ void Slave::___statusUpdate(
   StatusUpdateAcknowledgementMessage message;
   message.mutable_framework_id()->MergeFrom(update.framework_id());
   message.mutable_slave_id()->MergeFrom(update.slave_id());
-  message.mutable_task_id()->MergeFrom(update.status().task_id());
+  message.mutable_task_id()->MergeFrom(status.task_id());
   message.set_uuid(update.uuid());
 
   // Task status update manager successfully handled the status update.
@@ -5716,14 +5729,12 @@ void Slave::___statusUpdate(
 send(pid.get(), message);
   } else {
 // Acknowledge the HTTP based executor.
-Framework* framework = getFramework(update.framework_id());
 if (framework == nullptr) {
   LOG(WARNING) << "Ignoring sending acknowledgement for status update "
<< update << " of unknown framework";
   return;
 }
 
-Executor* executor = framework->getExecutor(update.status().task_id());
 if (executor == nullptr) {
   // Refer to the comments in 'statusUpdate()' on when this can
   // happen.
@@ -9861,6 +9872,33 @@ void Executor::recoverTask(const TaskState& state, bool recheckpointTask)
 }
 
 
+void Executor::addPendingTaskStatus(const TaskStatus& status)
+{
+  auto uuid = id::UUID::fromBytes(status.uuid()).get();
+  pendingStatusUpdates[status.task_id()][uuid] = status;
+}
+
+
+void Executor::removePendingTaskStatus(const TaskStatus& status)
+{
+  const TaskID& taskId = status.task_id();
+
+  auto uuid = id::UUID::fromBytes(status.uuid()).get();
+
+  if (!pendingStatusUpdates.contains(taskId) ||
+  !pendingStatusUpdates[taskId].contains(uuid)) {
+LOG(WARNING) << "Unknown pending status update (uuid: " << uuid << ")";
+return;
+  }
+
+  pendingStatusUpdates[taskId].erase(uuid);
+
+  if (pendingStatusUpdates[taskId].empty()) {
+pendingStatusUpdates.erase(taskId);
+  }
+}
+
+
 Try<Nothing> Executor::updateTaskState(const TaskStatus& status)
 {
   bool terminal = protobuf::isTerminalState(status.state());
@@ -9884,6 +9922,24 @@ Try<Nothing> Executor::updateTaskState(const TaskStatus& status)
 task = launchedTasks.at(status.task_id());
 
 if (terminal) {
+  if (pendingStatusUpdates.contains(status.task_id())) {
+auto statusUpdates = pendingStatusUpdates[status.task_id()].values();
+
+auto firstTerminal = std::find_if(
+statusUpdates.begin(),
+statusUpdates.end(),
+   

[mesos] 01/03: Added missing `return` statement in `Slave::statusUpdate`.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 2d62e8ae0ef94f78c9b32be258a08d1e6e2382df
Author: Andrei Budnik 
AuthorDate: Fri Aug 23 14:36:18 2019 +0200

Added missing `return` statement in `Slave::statusUpdate`.

Previously, if `statusUpdate` was called for a pending task, it would
forward the status update and then continue executing, checking
whether any executor is aware of the task. Since a pending task is not
known to any executor, the update would always be forwarded a second
time. This patch adds the missing `return` statement, which fixes the
issue.

Review: https://reviews.apache.org/r/71361
---
 src/slave/slave.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 1c33579..edfe3d0 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5418,6 +5418,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option<UPID>& pid)
 
 taskStatusUpdateManager->update(update, info.id())
   .onAny(defer(self(), ::___statusUpdate, lambda::_1, update, pid));
+
+return;
   }
 
   Executor* executor = framework->getExecutor(status.task_id());



[mesos] 03/03: Added MESOS-9887 to the 1.7.3 CHANGELOG.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch 1.7.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 80d42b9a2c9223665a82bbaaf3cbc222a094e2ef
Author: Andrei Budnik 
AuthorDate: Mon Aug 26 14:58:45 2019 +0200

Added MESOS-9887 to the 1.7.3 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 06c88db..1178228 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -29,6 +29,7 @@ Release Notes - Mesos - Version 1.7.3 (WIP)
   * [MESOS-9856] - REVIVE call with specified role(s) clears filters for all 
roles of a framework.
   * [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
   * [MESOS-9870] - Simultaneous adding/removal of a role from framework's 
roles and its suppressed roles crashes the master.
+  * [MESOS-9887] - Race condition between two terminal task status updates for 
Docker/Command executor.
   * [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret 
from runtime directory when the container is destroyed.
   * [MESOS-9925] - Default executor takes a couple of seconds to start and 
subscribe Mesos agent.
 



[mesos] 01/03: Added missing `return` statement in `Slave::statusUpdate`.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit cc79f22fb07cfad8f248150d5a3040f846998c3a
Author: Andrei Budnik 
AuthorDate: Fri Aug 23 14:36:18 2019 +0200

Added missing `return` statement in `Slave::statusUpdate`.

Previously, if `statusUpdate` was called for a pending task, it would
forward the status update and then continue executing, checking
whether any executor is aware of the task. Since a pending task is not
known to any executor, the update would always be forwarded a second
time. This patch adds the missing `return` statement, which fixes the
issue.

Review: https://reviews.apache.org/r/71361
---
 src/slave/slave.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 2a90e96..176d3fb 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5388,6 +5388,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option<UPID>& pid)
 
 taskStatusUpdateManager->update(update, info.id())
   .onAny(defer(self(), ::___statusUpdate, lambda::_1, update, pid));
+
+return;
   }
 
   Executor* executor = framework->getExecutor(status.task_id());



[mesos] 02/03: Fixed out-of-order processing of terminal status updates in agent.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 3ad802ebbe34565a2fa995d834ba4928c20e5e62
Author: Andrei Budnik 
AuthorDate: Tue Aug 20 19:24:44 2019 +0200

Fixed out-of-order processing of terminal status updates in agent.

Previously, the Mesos agent could send a TASK_FAILED status update on
executor termination while a TASK_FINISHED status update was still
being processed. Processing a task status update involves sending
requests to the containerizer (e.g. `MesosContainerizer::status`),
which might complete these requests out of order. Also, the agent does
not overwrite the status of a terminal status update once it is stored
in `terminatedTasks`. Hence, there was a race condition between two
terminal status updates.

Note that V1 executors are not affected by this problem because they
wait for an acknowledgement of the terminal status update by the agent
before terminating.

This patch introduces a new data structure, `pendingStatusUpdates`,
which holds the status updates that are currently being processed.
It allows the agent to validate the order in which status updates
are processed.

Review: https://reviews.apache.org/r/71343
---
 src/slave/slave.cpp | 62 ++---
 src/slave/slave.hpp |  6 ++
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 176d3fb..0861ac2 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5456,6 +5456,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option<UPID>& pid)
 
   metrics.valid_status_updates++;
 
+  executor->addPendingTaskStatus(status);
+
   // Before sending update, we need to retrieve the container status
   // if the task reached the executor. For tasks that are queued, we
   // do not need to send the container status and we must
@@ -5667,6 +5669,17 @@ void Slave::___statusUpdate(
   VLOG(1) << "Task status update manager successfully handled status update "
   << update;
 
+  const TaskStatus& status = update.status();
+
+  Executor* executor = nullptr;
+  Framework* framework = getFramework(update.framework_id());
+  if (framework != nullptr) {
+executor = framework->getExecutor(status.task_id());
+if (executor != nullptr) {
+  executor->removePendingTaskStatus(status);
+}
+  }
+
   if (pid == UPID()) {
 return;
   }
@@ -5674,7 +5687,7 @@ void Slave::___statusUpdate(
   StatusUpdateAcknowledgementMessage message;
   message.mutable_framework_id()->MergeFrom(update.framework_id());
   message.mutable_slave_id()->MergeFrom(update.slave_id());
-  message.mutable_task_id()->MergeFrom(update.status().task_id());
+  message.mutable_task_id()->MergeFrom(status.task_id());
   message.set_uuid(update.uuid());
 
   // Task status update manager successfully handled the status update.
@@ -5686,14 +5699,12 @@ void Slave::___statusUpdate(
 send(pid.get(), message);
   } else {
 // Acknowledge the HTTP based executor.
-Framework* framework = getFramework(update.framework_id());
 if (framework == nullptr) {
   LOG(WARNING) << "Ignoring sending acknowledgement for status update "
<< update << " of unknown framework";
   return;
 }
 
-Executor* executor = framework->getExecutor(update.status().task_id());
 if (executor == nullptr) {
   // Refer to the comments in 'statusUpdate()' on when this can
   // happen.
@@ -9759,6 +9770,33 @@ void Executor::recoverTask(const TaskState& state, bool recheckpointTask)
 }
 
 
+void Executor::addPendingTaskStatus(const TaskStatus& status)
+{
+  auto uuid = id::UUID::fromBytes(status.uuid()).get();
+  pendingStatusUpdates[status.task_id()][uuid] = status;
+}
+
+
+void Executor::removePendingTaskStatus(const TaskStatus& status)
+{
+  const TaskID& taskId = status.task_id();
+
+  auto uuid = id::UUID::fromBytes(status.uuid()).get();
+
+  if (!pendingStatusUpdates.contains(taskId) ||
+  !pendingStatusUpdates[taskId].contains(uuid)) {
+LOG(WARNING) << "Unknown pending status update (uuid: " << uuid << ")";
+return;
+  }
+
+  pendingStatusUpdates[taskId].erase(uuid);
+
+  if (pendingStatusUpdates[taskId].empty()) {
+pendingStatusUpdates.erase(taskId);
+  }
+}
+
+
 Try<Nothing> Executor::updateTaskState(const TaskStatus& status)
 {
   bool terminal = protobuf::isTerminalState(status.state());
@@ -9782,6 +9820,24 @@ Try<Nothing> Executor::updateTaskState(const TaskStatus& status)
 task = launchedTasks.at(status.task_id());
 
 if (terminal) {
+  if (pendingStatusUpdates.contains(status.task_id())) {
+auto statusUpdates = pendingStatusUpdates[status.task_id()].values();
+
+auto firstTerminal = std::find_if(
+statusUpdates.begin(),
+statusUpdates.end(),
+   

[mesos] branch 1.6.x updated (9badb3b -> d77029f)

2019-08-26 Thread abudnik

abudnik pushed a change to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from 9badb3b  Added MESOS-9836 to the 1.6.3 CHANGELOG.
 new cc79f22  Added missing `return` statement in `Slave::statusUpdate`.
 new 3ad802e  Fixed out-of-order processing of terminal status updates in 
agent.
 new d77029f  Added MESOS-9887 to the 1.6.3 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG   |  1 +
 src/slave/slave.cpp | 64 ++---
 src/slave/slave.hpp |  6 +
 3 files changed, 68 insertions(+), 3 deletions(-)



[mesos] branch master updated (48c20bf -> f0be237)

2019-08-26 Thread abudnik

abudnik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.


from 48c20bf  Updated site's dependencies.
 new 8aae23e  Added missing `return` statement in `Slave::statusUpdate`.
 new f0be237  Fixed out-of-order processing of terminal status updates in 
agent.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 src/slave/slave.cpp | 64 ++---
 src/slave/slave.hpp |  6 +
 2 files changed, 67 insertions(+), 3 deletions(-)



[mesos] 01/02: Added missing `return` statement in `Slave::statusUpdate`.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 8aae23ec7cd4bc50532df0b1d1ea6ec23ce078f8
Author: Andrei Budnik 
AuthorDate: Fri Aug 23 14:36:18 2019 +0200

Added missing `return` statement in `Slave::statusUpdate`.

Previously, if `statusUpdate` was called for a pending task, it would
forward the status update and then continue executing, checking
whether any executor is aware of the task. Since a pending task is not
known to any executor, the update would always be forwarded a second
time. This patch adds the missing `return` statement, which fixes the
issue.

Review: https://reviews.apache.org/r/71361
---
 src/slave/slave.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 882040d..45f1584 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5879,6 +5879,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option<UPID>& pid)
 
 taskStatusUpdateManager->update(update, info.id())
   .onAny(defer(self(), ::___statusUpdate, lambda::_1, update, pid));
+
+return;
   }
 
   Executor* executor = framework->getExecutor(status.task_id());



[mesos] 02/02: Fixed out-of-order processing of terminal status updates in agent.

2019-08-26 Thread abudnik

abudnik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f0be23765531b05661ed7f1b124faf96744aa80b
Author: Andrei Budnik 
AuthorDate: Tue Aug 20 19:24:44 2019 +0200

Fixed out-of-order processing of terminal status updates in agent.

Previously, the Mesos agent could send a TASK_FAILED status update on
executor termination while a TASK_FINISHED status update was still
being processed. Processing a task status update involves sending
requests to the containerizer (e.g. `MesosContainerizer::status`),
which might complete these requests out of order. Also, the agent does
not overwrite the status of a terminal status update once it is stored
in `terminatedTasks`. Hence, there was a race condition between two
terminal status updates.

Note that V1 executors are not affected by this problem because they
wait for an acknowledgement of the terminal status update by the agent
before terminating.

This patch introduces a new data structure, `pendingStatusUpdates`,
which holds the status updates that are currently being processed.
It allows the agent to validate the order in which status updates
are processed.

Review: https://reviews.apache.org/r/71343
---
 src/slave/slave.cpp | 62 ++---
 src/slave/slave.hpp |  6 ++
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/src/slave/slave.cpp b/src/slave/slave.cpp
index 45f1584..4e93656 100644
--- a/src/slave/slave.cpp
+++ b/src/slave/slave.cpp
@@ -5947,6 +5947,8 @@ void Slave::statusUpdate(StatusUpdate update, const Option<UPID>& pid)
 
   metrics.valid_status_updates++;
 
+  executor->addPendingTaskStatus(status);
+
   // Before sending update, we need to retrieve the container status
   // if the task reached the executor. For tasks that are queued, we
   // do not need to send the container status and we must
@@ -6158,6 +6160,17 @@ void Slave::___statusUpdate(
   VLOG(1) << "Task status update manager successfully handled status update "
           << update;
 
+  const TaskStatus& status = update.status();
+
+  Executor* executor = nullptr;
+  Framework* framework = getFramework(update.framework_id());
+  if (framework != nullptr) {
+    executor = framework->getExecutor(status.task_id());
+    if (executor != nullptr) {
+      executor->removePendingTaskStatus(status);
+    }
+  }
+
   if (pid == UPID()) {
     return;
   }
@@ -6165,7 +6178,7 @@ void Slave::___statusUpdate(
   StatusUpdateAcknowledgementMessage message;
   message.mutable_framework_id()->MergeFrom(update.framework_id());
   message.mutable_slave_id()->MergeFrom(update.slave_id());
-  message.mutable_task_id()->MergeFrom(update.status().task_id());
+  message.mutable_task_id()->MergeFrom(status.task_id());
   message.set_uuid(update.uuid());
 
   // Task status update manager successfully handled the status update.
@@ -6177,14 +6190,12 @@ void Slave::___statusUpdate(
     send(pid.get(), message);
   } else {
     // Acknowledge the HTTP based executor.
-    Framework* framework = getFramework(update.framework_id());
     if (framework == nullptr) {
       LOG(WARNING) << "Ignoring sending acknowledgement for status update "
                    << update << " of unknown framework";
       return;
     }
 
-    Executor* executor = framework->getExecutor(update.status().task_id());
     if (executor == nullptr) {
       // Refer to the comments in 'statusUpdate()' on when this can
       // happen.
@@ -10795,6 +10806,33 @@ void Executor::recoverTask(const TaskState& state, bool recheckpointTask)
 }
 
 
+void Executor::addPendingTaskStatus(const TaskStatus& status)
+{
+  auto uuid = id::UUID::fromBytes(status.uuid()).get();
+  pendingStatusUpdates[status.task_id()][uuid] = status;
+}
+
+
+void Executor::removePendingTaskStatus(const TaskStatus& status)
+{
+  const TaskID& taskId = status.task_id();
+
+  auto uuid = id::UUID::fromBytes(status.uuid()).get();
+
+  if (!pendingStatusUpdates.contains(taskId) ||
+      !pendingStatusUpdates[taskId].contains(uuid)) {
+    LOG(WARNING) << "Unknown pending status update (uuid: " << uuid << ")";
+    return;
+  }
+
+  pendingStatusUpdates[taskId].erase(uuid);
+
+  if (pendingStatusUpdates[taskId].empty()) {
+    pendingStatusUpdates.erase(taskId);
+  }
+}
+
+
 Try<Nothing> Executor::updateTaskState(const TaskStatus& status)
 {
   bool terminal = protobuf::isTerminalState(status.state());
@@ -10818,6 +10856,24 @@ Try<Nothing> Executor::updateTaskState(const TaskStatus& status)
     task = launchedTasks.at(status.task_id());
 
     if (terminal) {
+      if (pendingStatusUpdates.contains(status.task_id())) {
+        auto statusUpdates = pendingStatusUpdates[status.task_id()].values();
+
+        auto firstTerminal = std::find_if(
+            statusUpdates.begin(),
+            statusUpdates.end(),
+

[mesos] branch master updated: Updated site's dependencies.

2019-08-26 Thread bbannier
This is an automated email from the ASF dual-hosted git repository.

bbannier pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/master by this push:
 new 48c20bf  Updated site's dependencies.
48c20bf is described below

commit 48c20bf257da60eaf714017efec0d4a80c203c04
Author: Benjamin Bannier 
AuthorDate: Mon Aug 26 10:06:18 2019 +0200

Updated site's dependencies.

This bumps e.g. `nokogiri` to a version that is no longer affected by
CVE-2019-5477 (not that the CVE had any impact on our use of it).

Review: https://reviews.apache.org/r/71367/
---
 site/Gemfile.lock | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/site/Gemfile.lock b/site/Gemfile.lock
index 343d3e6..63c48e7 100644
--- a/site/Gemfile.lock
+++ b/site/Gemfile.lock
@@ -1,7 +1,7 @@
 GEM
   remote: https://rubygems.org/
   specs:
-    activesupport (4.2.10)
+    activesupport (4.2.11.1)
       i18n (~> 0.7)
       minitest (~> 5.1)
       thread_safe (~> 0.3, >= 0.3.4)
@@ -13,7 +13,7 @@ GEM
       rack (>= 1.0.0)
       rack-test (>= 0.5.4)
       xpath (~> 2.0)
-    chunky_png (1.3.10)
+    chunky_png (1.3.11)
     coffee-script (2.4.1)
       coffee-script-source
       execjs
@@ -36,8 +36,8 @@ GEM
     erubis (2.7.0)
     eventmachine (1.2.7)
     execjs (2.7.0)
-    ffi (1.9.25)
-    haml (5.0.4)
+    ffi (1.11.1)
+    haml (5.1.2)
       temple (>= 0.8.0)
       tilt
     hike (1.2.3)
@@ -46,7 +46,7 @@ GEM
     htmlentities (4.3.4)
     http_parser.rb (0.6.0)
     i18n (0.7.0)
-    json (2.1.0)
+    json (2.2.0)
     kramdown (1.17.0)
     libv8 (3.16.14.19)
     listen (3.0.8)
@@ -93,11 +93,11 @@ GEM
       rouge (~> 2.0)
     mime-types (3.2.2)
       mime-types-data (~> 3.2015)
-    mime-types-data (3.2018.0812)
+    mime-types-data (3.2019.0331)
     mini_portile2 (2.4.0)
     minitest (5.11.3)
     multi_json (1.13.1)
-    nokogiri (1.10.2)
+    nokogiri (1.10.4)
       mini_portile2 (~> 2.4.0)
     padrino-helpers (0.12.9)
       i18n (~> 0.6, >= 0.6.7)
@@ -110,10 +110,10 @@ GEM
       rack
     rack-test (1.1.0)
       rack (>= 1.0, < 3)
-    rake (12.3.1)
+    rake (12.3.3)
     rb-fsevent (0.10.3)
-    rb-inotify (0.9.10)
-      ffi (>= 0.5.0, < 2)
+    rb-inotify (0.10.0)
+      ffi (~> 1.0)
     rdiscount (2.2.0.1)
     ref (2.0.0)
     rouge (2.2.1)
@@ -128,16 +128,16 @@ GEM
     sprockets-sass (1.3.1)
       sprockets (~> 2.0)
       tilt (~> 1.1)
-    temple (0.8.0)
+    temple (0.8.1)
     therubyracer (0.12.3)
       libv8 (~> 3.16.14.15)
       ref
-    thor (0.20.0)
+    thor (0.20.3)
     thread_safe (0.3.6)
     tilt (1.4.1)
     tzinfo (1.2.5)
       thread_safe (~> 0.1)
-    tzinfo-data (1.2018.5)
+    tzinfo-data (1.2019.2)
       tzinfo (>= 1.0.0)
     uber (0.0.15)
     uglifier (2.7.2)
@@ -161,4 +161,4 @@ DEPENDENCIES
   tzinfo-data
 
 BUNDLED WITH
-   1.16.1
+   1.17.2