Re: Review Request 72442: Reverted the changes about `REASON_CONTAINER_MEMORY_REQUEST_EXCEEDED`.

2020-05-05 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72442/
---

(Updated May 5, 2020, 3:53 p.m.)


Review request for mesos, Andrei Budnik and Greg Mann.


Changes
---

Updated description.


Bugs: MESOS-10049
https://issues.apache.org/jira/browse/MESOS-10049


Repository: mesos


Description (updated)
---

The method `MemorySubsystemProcess::oomWaited()` will only be invoked when the
container is OOM killed because it uses more memory than its hard memory limit
(i.e., the task status reason `REASON_CONTAINER_LIMITATION_MEMORY`), it will
NOT be invoked when a burstable container is OOM killed because the agent host
is running out of memory, i.e., we will NOT receive OOM killing notification
via cgroups notification API for this case. So it is not possible for Mesos to
provide a task status reason `REASON_CONTAINER_MEMORY_REQUEST_EXCEEDED` for this
case.


Diffs (updated)
-

  include/mesos/mesos.proto 9412ed736231547b22abc89188316b08d5445e78 
  include/mesos/v1/mesos.proto 194c42cf57e34d803a21cab03db17614855e8692 
  src/common/protobuf_utils.cpp 8d1d5c4cb0af911d8dc13e37a1adb62947513d0d 
  src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 
60c7a89fb809582723eb50d22f54f4c8ce697584 


Diff: https://reviews.apache.org/r/72442/diff/2/

Changes: https://reviews.apache.org/r/72442/diff/1-2/


Testing
---


Thanks,

Qian Zhang



Re: Review Request 72442: Reverted the changes about `REASON_CONTAINER_MEMORY_REQUEST_EXCEEDED`.

2020-05-05 Thread Qian Zhang


> On May 5, 2020, 12:27 a.m., Greg Mann wrote:
> > I would recommend updating the description so that instead of saying we 
> > "don't need to add" the new reason, say that "it is not possible for Mesos 
> > to provide" the reason, so we must remove it.

Done.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72442/#review220599
---


On May 5, 2020, 3:53 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72442/
> ---
> 
> (Updated May 5, 2020, 3:53 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10049
> https://issues.apache.org/jira/browse/MESOS-10049
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The method `MemorySubsystemProcess::oomWaited()` will only be invoked when the
> container is OOM killed because it uses more memory than its hard memory limit
> (i.e., the task status reason `REASON_CONTAINER_LIMITATION_MEMORY`), it will
> NOT be invoked when a burstable container is OOM killed because the agent host
> is running out of memory, i.e., we will NOT receive OOM killing notification
> via cgroups notification API for this case. So it is not possible for Mesos to
> provide a task status reason `REASON_CONTAINER_MEMORY_REQUEST_EXCEEDED` for 
> this
> case.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 9412ed736231547b22abc89188316b08d5445e78 
>   include/mesos/v1/mesos.proto 194c42cf57e34d803a21cab03db17614855e8692 
>   src/common/protobuf_utils.cpp 8d1d5c4cb0af911d8dc13e37a1adb62947513d0d 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 
> 60c7a89fb809582723eb50d22f54f4c8ce697584 
> 
> 
> Diff: https://reviews.apache.org/r/72442/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 72305: Sent appropriate task status reason when task over memory request.

2020-05-05 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72305/#review220626
---



The code changes in this patch has been reverted, so 
https://reviews.apache.org/r/72442/ for more details.

- Qian Zhang


On April 7, 2020, 6:02 a.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72305/
> ---
> 
> (Updated April 7, 2020, 6:02 a.m.)
> 
> 
> Review request for mesos and Qian Zhang.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Sent appropriate task status reason when task over memory request.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.cpp 723d85a8656e61f77ab99e5e63f844ec95303ff0 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 
> 15f87ba8c0a1b44fb3380beb0e739af566ab08fc 
> 
> 
> Diff: https://reviews.apache.org/r/72305/diff/5/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 72399: Updated UCR's `usage()` method to support resource limits.

2020-05-05 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72399/
---

(Updated May 5, 2020, 4:11 p.m.)


Review request for mesos, Andrei Budnik and Greg Mann.


Changes
---

Addressed review comments.


Bugs: MESOS-10117
https://issues.apache.org/jira/browse/MESOS-10117


Repository: mesos


Description
---

Updated UCR's `usage()` method to support resource limits.


Diffs (updated)
-

  src/slave/containerizer/mesos/containerizer.hpp 
2ea033aa1869f07644b10c036fda1698d08aa89b 
  src/slave/containerizer/mesos/containerizer.cpp 
6aa4f3fe0940575aeea6a63cdb1ca3c77cd2359b 
  src/tests/slave_recovery_tests.cpp 0efd3a6ac09ad06d9365b7bb2295157b5175e6b8 
  src/tests/slave_tests.cpp 6b264d067accb9800ba75ea8f5d27c1e4c5593db 


Diff: https://reviews.apache.org/r/72399/diff/7/

Changes: https://reviews.apache.org/r/72399/diff/6-7/


Testing
---


Thanks,

Qian Zhang



Re: Review Request 72364: Added test for reactivation of a disconnected drained agent.

2020-05-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72364/#review220627
---



Patch looks great!

Reviews applied: [72368, 72364]

Passed command: export OS='ubuntu:16.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/jenkins/buildbot.sh

- Mesos Reviewbot


On May 5, 2020, 12:12 a.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72364/
> ---
> 
> (Updated May 5, 2020, 12:12 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Greg Mann.
> 
> 
> Bugs: MESOS-10116
> https://issues.apache.org/jira/browse/MESOS-10116
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added test for reactivation of a disconnected drained agent.
> 
> 
> Diffs
> -
> 
>   src/tests/master_draining_tests.cpp 
> f6b974f5f88f8563bd9cf6cca087b60cb5784e77 
> 
> 
> Diff: https://reviews.apache.org/r/72364/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72368: Fixed a bug in the agent's draining handler.

2020-05-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72368/
---

(Updated May 5, 2020, 5:51 p.m.)


Review request for mesos, Andrei Sekretenko and Benjamin Mahler.


Bugs: MESOS-10118
https://issues.apache.org/jira/browse/MESOS-10118


Repository: mesos


Description (updated)
---

Previously, when the agent had no tasks or operations and
received a `DrainSlaveMessage`, it would checkpoint the
`DrainConfig` to disk, implicitly placing it into a "draining"
state indefinitely. This patch updates the agent's handler to
avoid checkpointing anything to disk in this case.


Diffs (updated)
-

  src/slave/slave.cpp 1a32c813eeacf21b903db3f9602d034e7fb085c0 
  src/tests/slave_tests.cpp 5ad04b2fc494cb1120d2dc68a8cd40e5a27c129e 


Diff: https://reviews.apache.org/r/72368/diff/4/

Changes: https://reviews.apache.org/r/72368/diff/3-4/


Testing
---

Ran the test in https://reviews.apache.org/r/72364/ with this patch applied and 
verified that the final task reached TASK_RUNNING.


Thanks,

Greg Mann



Re: Review Request 72368: Fixed a bug in the agent's draining handler.

2020-05-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72368/
---

(Updated May 5, 2020, 5:52 p.m.)


Review request for mesos, Andrei Sekretenko and Benjamin Mahler.


Bugs: MESOS-10118
https://issues.apache.org/jira/browse/MESOS-10118


Repository: mesos


Description (updated)
---

Previously, when the agent had no tasks or operations and
received a `DrainSlaveMessage`, it would checkpoint the
`DrainConfig` to disk, implicitly placing it into a "draining"
state indefinitely. This patch updates the agent's handler to
avoid checkpointing anything to disk in this case.

The `SlaveTest.DrainInfoInAPIOutputs` test is also removed
and its functionality is moved into the test
`SlaveTest.DrainAgentKillsRunningTask`. The running task in
the latter test allows us to verify agent API outputs both
before and after the task's terminal update is acknowleged.


Diffs
-

  src/slave/slave.cpp 1a32c813eeacf21b903db3f9602d034e7fb085c0 
  src/tests/slave_tests.cpp 5ad04b2fc494cb1120d2dc68a8cd40e5a27c129e 


Diff: https://reviews.apache.org/r/72368/diff/4/


Testing
---

Ran the test in https://reviews.apache.org/r/72364/ with this patch applied and 
verified that the final task reached TASK_RUNNING.


Thanks,

Greg Mann



Re: Review Request 72267: Added docs for UPDATE_FRAMEWORK call.

2020-05-05 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72267/
---

(Updated May 5, 2020, 7:06 p.m.)


Review request for mesos, Benjamin Mahler and Greg Mann.


Changes
---

Addressed issues and split UPDATE_FRAMEWORK into subsections.


Bugs: MESOS-9979
https://issues.apache.org/jira/browse/MESOS-9979


Repository: mesos


Description
---

Added docs for UPDATE_FRAMEWORK call.


Diffs (updated)
-

  docs/scheduler-http-api.md 9831d527cc1f832a6fb0d0d330ebdc2a0b0f3774 


Diff: https://reviews.apache.org/r/72267/diff/2/

Changes: https://reviews.apache.org/r/72267/diff/1-2/


Testing
---

Checked rendering in Github: 
https://github.com/asekretenko/mesos/blob/update_framework_doc/docs/scheduler-http-api.md


Thanks,

Andrei Sekretenko



Re: Review Request 72368: Fixed a bug in the agent's draining handler.

2020-05-05 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72368/#review220641
---


Ship it!




Ship It!

- Andrei Sekretenko


On May 5, 2020, 5:52 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72368/
> ---
> 
> (Updated May 5, 2020, 5:52 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-10118
> https://issues.apache.org/jira/browse/MESOS-10118
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, when the agent had no tasks or operations and
> received a `DrainSlaveMessage`, it would checkpoint the
> `DrainConfig` to disk, implicitly placing it into a "draining"
> state indefinitely. This patch updates the agent's handler to
> avoid checkpointing anything to disk in this case.
> 
> The `SlaveTest.DrainInfoInAPIOutputs` test is also removed
> and its functionality is moved into the test
> `SlaveTest.DrainAgentKillsRunningTask`. The running task in
> the latter test allows us to verify agent API outputs both
> before and after the task's terminal update is acknowleged.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.cpp 1a32c813eeacf21b903db3f9602d034e7fb085c0 
>   src/tests/slave_tests.cpp 5ad04b2fc494cb1120d2dc68a8cd40e5a27c129e 
> 
> 
> Diff: https://reviews.apache.org/r/72368/diff/4/
> 
> 
> Testing
> ---
> 
> Ran the test in https://reviews.apache.org/r/72364/ with this patch applied 
> and verified that the final task reached TASK_RUNNING.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 72364: Added test for reactivation of a disconnected drained agent.

2020-05-05 Thread Andrei Sekretenko


> On May 5, 2020, 5:41 a.m., Greg Mann wrote:
> > src/tests/master_draining_tests.cpp
> > Lines 368-370 (patched)
> > 
> >
> > I don't usually need to resume the clock to get a task running update 
> > to go through, do you know exactly what's going on here?
> 
> Andrei Sekretenko wrote:
> Weird... after r72364 this test works without resuming the clock.
> 
> That's actually a good question why the task status update manager in 
> this test was not being resumed before your fix for MESOS-10118. I'm 
> investigating that.

For some reason, now I cannot reproduce the update manager being stuck in 
paused state. Neither in the current configuration nor in the previous one.

Removed this `Clock::resume()`.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72364/#review220617
---


On May 5, 2020, 2:05 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72364/
> ---
> 
> (Updated May 5, 2020, 2:05 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Greg Mann.
> 
> 
> Bugs: MESOS-10116
> https://issues.apache.org/jira/browse/MESOS-10116
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added test for reactivation of a disconnected drained agent.
> 
> 
> Diffs
> -
> 
>   src/tests/master_draining_tests.cpp 
> f6b974f5f88f8563bd9cf6cca087b60cb5784e77 
> 
> 
> Diff: https://reviews.apache.org/r/72364/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Review Request 72473: Fixed src/Makefile.am to include Web UI roles tree files.

2020-05-05 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72473/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-10125
https://issues.apache.org/jira/browse/MESOS-10125


Repository: mesos


Description
---

Fixed src/Makefile.am to include Web UI roles tree files.


Diffs
-

  src/Makefile.am 22506f2b0e2d2d71d1dab8b848e4f52aa57083ee 


Diff: https://reviews.apache.org/r/72473/diff/1/


Testing
---


Thanks,

Andrei Sekretenko



Re: Review Request 72368: Fixed a bug in the agent's draining handler.

2020-05-05 Thread Greg Mann


> On May 5, 2020, 12:27 p.m., Andrei Sekretenko wrote:
> > src/tests/slave_tests.cpp
> > Lines 12227-12228 (patched)
> > 
> >
> > I 'm not sure if these assertions add any value: these properties of V1 
> > API are not specific to agent draining and are covered by corresponding 
> > tests. Maybe we can just remove them to improve readability?

Good call, removed.


> On May 5, 2020, 12:27 p.m., Andrei Sekretenko wrote:
> > src/tests/slave_tests.cpp
> > Lines 12229 (patched)
> > 
> >
> > Given that `drainConfig` in `EXPECT_EQ` below is different from default 
> > value of `DrainConfig`, isn't this assertion redundant as well?

I think it's a bit better to explicitly assert here that the field has been 
set, since not doing so would require the reader to know something about the 
field's default value in order to reason about this code.


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72368/#review220634
---


On May 5, 2020, 5:52 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72368/
> ---
> 
> (Updated May 5, 2020, 5:52 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-10118
> https://issues.apache.org/jira/browse/MESOS-10118
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, when the agent had no tasks or operations and
> received a `DrainSlaveMessage`, it would checkpoint the
> `DrainConfig` to disk, implicitly placing it into a "draining"
> state indefinitely. This patch updates the agent's handler to
> avoid checkpointing anything to disk in this case.
> 
> The `SlaveTest.DrainInfoInAPIOutputs` test is also removed
> and its functionality is moved into the test
> `SlaveTest.DrainAgentKillsRunningTask`. The running task in
> the latter test allows us to verify agent API outputs both
> before and after the task's terminal update is acknowleged.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.cpp 1a32c813eeacf21b903db3f9602d034e7fb085c0 
>   src/tests/slave_tests.cpp 5ad04b2fc494cb1120d2dc68a8cd40e5a27c129e 
> 
> 
> Diff: https://reviews.apache.org/r/72368/diff/4/
> 
> 
> Testing
> ---
> 
> Ran the test in https://reviews.apache.org/r/72364/ with this patch applied 
> and verified that the final task reached TASK_RUNNING.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 72364: Added test for reactivation of a disconnected drained agent.

2020-05-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72364/#review220640
---


Ship it!




Ship It!

- Greg Mann


On May 5, 2020, 2:05 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72364/
> ---
> 
> (Updated May 5, 2020, 2:05 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Greg Mann.
> 
> 
> Bugs: MESOS-10116
> https://issues.apache.org/jira/browse/MESOS-10116
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added test for reactivation of a disconnected drained agent.
> 
> 
> Diffs
> -
> 
>   src/tests/master_draining_tests.cpp 
> f6b974f5f88f8563bd9cf6cca087b60cb5784e77 
> 
> 
> Diff: https://reviews.apache.org/r/72364/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72364: Added test for reactivation of a disconnected drained agent.

2020-05-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72364/#review220637
---



Patch looks great!

Reviews applied: [72368, 72364]

Passed command: export OS='ubuntu:16.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/jenkins/buildbot.sh

- Mesos Reviewbot


On May 5, 2020, 2:05 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72364/
> ---
> 
> (Updated May 5, 2020, 2:05 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Greg Mann.
> 
> 
> Bugs: MESOS-10116
> https://issues.apache.org/jira/browse/MESOS-10116
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added test for reactivation of a disconnected drained agent.
> 
> 
> Diffs
> -
> 
>   src/tests/master_draining_tests.cpp 
> f6b974f5f88f8563bd9cf6cca087b60cb5784e77 
> 
> 
> Diff: https://reviews.apache.org/r/72364/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72364: Added test for reactivation of a disconnected drained agent.

2020-05-05 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72364/
---

(Updated May 5, 2020, 2:05 p.m.)


Review request for mesos, Benjamin Mahler and Greg Mann.


Bugs: MESOS-10116
https://issues.apache.org/jira/browse/MESOS-10116


Repository: mesos


Description
---

Added test for reactivation of a disconnected drained agent.


Diffs (updated)
-

  src/tests/master_draining_tests.cpp f6b974f5f88f8563bd9cf6cca087b60cb5784e77 


Diff: https://reviews.apache.org/r/72364/diff/3/

Changes: https://reviews.apache.org/r/72364/diff/2-3/


Testing
---


Thanks,

Andrei Sekretenko



Re: Review Request 72364: Added test for reactivation of a disconnected drained agent.

2020-05-05 Thread Andrei Sekretenko


> On May 5, 2020, 5:41 a.m., Greg Mann wrote:
> > src/tests/master_draining_tests.cpp
> > Lines 239 (patched)
> > 
> >
> > Nit: s/lead to crash/lead to a crash/

Rewrote the comment to reflect the fact that this also tests for regression in 
MESOS-10118.


> On May 5, 2020, 5:41 a.m., Greg Mann wrote:
> > src/tests/master_draining_tests.cpp
> > Lines 319-324 (patched)
> > 
> >
> > Nit: it's a little weird to have one update use 
> > `TaskStatusUpdateStateEq` but not the other? And both of these expectations 
> > use `WillRepeatedly`, but I think we could do `WillOnce`.

Changed the first one to use a negated expectation, and removed 
`WillRepeatedly` from the second one.


> On May 5, 2020, 5:41 a.m., Greg Mann wrote:
> > src/tests/master_draining_tests.cpp
> > Lines 368-370 (patched)
> > 
> >
> > I don't usually need to resume the clock to get a task running update 
> > to go through, do you know exactly what's going on here?

Weird... after r72364 this test works without resuming the clock.

That's actually a good question why the task status update manager in this test 
was not being resumed before your fix for MESOS-10118. I'm investigating that.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72364/#review220617
---


On May 5, 2020, 2:05 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72364/
> ---
> 
> (Updated May 5, 2020, 2:05 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Greg Mann.
> 
> 
> Bugs: MESOS-10116
> https://issues.apache.org/jira/browse/MESOS-10116
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added test for reactivation of a disconnected drained agent.
> 
> 
> Diffs
> -
> 
>   src/tests/master_draining_tests.cpp 
> f6b974f5f88f8563bd9cf6cca087b60cb5784e77 
> 
> 
> Diff: https://reviews.apache.org/r/72364/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72364: Added test for reactivation of a disconnected drained agent.

2020-05-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72364/#review220643
---



Patch looks great!

Reviews applied: [72368, 72364]

Passed command: export OS='ubuntu:16.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/jenkins/buildbot.sh

- Mesos Reviewbot


On May 5, 2020, 2:05 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72364/
> ---
> 
> (Updated May 5, 2020, 2:05 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Greg Mann.
> 
> 
> Bugs: MESOS-10116
> https://issues.apache.org/jira/browse/MESOS-10116
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added test for reactivation of a disconnected drained agent.
> 
> 
> Diffs
> -
> 
>   src/tests/master_draining_tests.cpp 
> f6b974f5f88f8563bd9cf6cca087b60cb5784e77 
> 
> 
> Diff: https://reviews.apache.org/r/72364/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72368: Fixed a bug in the agent's draining handler.

2020-05-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72368/
---

(Updated May 6, 2020, 12:24 a.m.)


Review request for mesos, Andrei Sekretenko and Benjamin Mahler.


Bugs: MESOS-10118
https://issues.apache.org/jira/browse/MESOS-10118


Repository: mesos


Description
---

Previously, when the agent had no tasks or operations and
received a `DrainSlaveMessage`, it would checkpoint the
`DrainConfig` to disk, implicitly placing it into a "draining"
state indefinitely. This patch updates the agent's handler to
avoid checkpointing anything to disk in this case.

The `SlaveTest.DrainInfoInAPIOutputs` test is also removed
and its functionality is moved into the test
`SlaveTest.DrainAgentKillsRunningTask`. The running task in
the latter test allows us to verify agent API outputs both
before and after the task's terminal update is acknowleged.


Diffs
-

  src/slave/slave.cpp 1a32c813eeacf21b903db3f9602d034e7fb085c0 
  src/tests/slave_tests.cpp 5ad04b2fc494cb1120d2dc68a8cd40e5a27c129e 


Diff: https://reviews.apache.org/r/72368/diff/4/


Testing (updated)
---

This patch is tested in https://reviews.apache.org/r/72364/.


Thanks,

Greg Mann



Re: Review Request 72267: Added docs for UPDATE_FRAMEWORK call.

2020-05-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72267/#review220646
---



Patch looks great!

Reviews applied: [72267]

Passed command: export OS='ubuntu:16.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/jenkins/buildbot.sh

- Mesos Reviewbot


On May 5, 2020, 7:06 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72267/
> ---
> 
> (Updated May 5, 2020, 7:06 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Greg Mann.
> 
> 
> Bugs: MESOS-9979
> https://issues.apache.org/jira/browse/MESOS-9979
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added docs for UPDATE_FRAMEWORK call.
> 
> 
> Diffs
> -
> 
>   docs/scheduler-http-api.md 9831d527cc1f832a6fb0d0d330ebdc2a0b0f3774 
> 
> 
> Diff: https://reviews.apache.org/r/72267/diff/2/
> 
> 
> Testing
> ---
> 
> Checked rendering in Github: 
> https://github.com/asekretenko/mesos/blob/update_framework_doc/docs/scheduler-http-api.md
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72473: Fixed src/Makefile.am to include Web UI roles tree files.

2020-05-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72473/#review220645
---



Patch looks great!

Reviews applied: [72473]

Passed command: export OS='ubuntu:16.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/jenkins/buildbot.sh

- Mesos Reviewbot


On May 5, 2020, 6:04 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72473/
> ---
> 
> (Updated May 5, 2020, 6:04 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-10125
> https://issues.apache.org/jira/browse/MESOS-10125
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed src/Makefile.am to include Web UI roles tree files.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 22506f2b0e2d2d71d1dab8b848e4f52aa57083ee 
> 
> 
> Diff: https://reviews.apache.org/r/72473/diff/1/
> 
> 
> Testing
> ---
> 
> Ran webui with a multi-role framework in an environment which installs Mesos.
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72473: Fixed src/Makefile.am to include Web UI roles tree files.

2020-05-05 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72473/#review220642
---


Ship it!




Ship It!

- Benjamin Mahler


On May 5, 2020, 6:04 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72473/
> ---
> 
> (Updated May 5, 2020, 6:04 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-10125
> https://issues.apache.org/jira/browse/MESOS-10125
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed src/Makefile.am to include Web UI roles tree files.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 22506f2b0e2d2d71d1dab8b848e4f52aa57083ee 
> 
> 
> Diff: https://reviews.apache.org/r/72473/diff/1/
> 
> 
> Testing
> ---
> 
> Ran webui with a multi-role framework in an environment which installs Mesos.
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72267: Added docs for UPDATE_FRAMEWORK call.

2020-05-05 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72267/#review220644
---


Fix it, then Ship it!





docs/scheduler-http-api.md
Lines 544-546 (patched)


Hm.. this note applies to all of the field updates, but since it's only 
specified for capabilities, the reader will assume there is some semantic 
importance of the agent processing a framework capability update. This is 
probably something they have to understand specific to each particular 
capability.

Perhaps instead just a general note up above about the possible response 
codes and when we send them? (i.e. it's just when the master processes it)



docs/scheduler-http-api.md
Lines 547 (patched)


Instead of "not specified", maybe we explain here that we don't disallow 
capability removal **BUT** there are currently no safeguards in place to 
prevent capability removal when the capability semantically can no longer be 
removed (e.g. if the framework started using shared resources, then 
SHARED_RESOURCES became a requirement, removing it should be disallowed but we 
don't currently protect against it). So: don't remove a capability, and if you 
*really really* need to, contact the dev@ or user@ list first.

A link to the minimum capability stuff would be good too.


- Benjamin Mahler


On May 5, 2020, 7:06 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72267/
> ---
> 
> (Updated May 5, 2020, 7:06 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Greg Mann.
> 
> 
> Bugs: MESOS-9979
> https://issues.apache.org/jira/browse/MESOS-9979
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added docs for UPDATE_FRAMEWORK call.
> 
> 
> Diffs
> -
> 
>   docs/scheduler-http-api.md 9831d527cc1f832a6fb0d0d330ebdc2a0b0f3774 
> 
> 
> Diff: https://reviews.apache.org/r/72267/diff/2/
> 
> 
> Testing
> ---
> 
> Checked rendering in Github: 
> https://github.com/asekretenko/mesos/blob/update_framework_doc/docs/scheduler-http-api.md
> 
> 
> Thanks,
> 
> Andrei Sekretenko
> 
>



Re: Review Request 72368: Fixed a bug in the agent's draining handler.

2020-05-05 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72368/#review220634
---


Fix it, then Ship it!





src/tests/slave_tests.cpp
Lines 12227-12228 (patched)


I 'm not sure if these assertions add any value: these properties of V1 API 
are not specific to agent draining and are covered by corresponding tests. 
Maybe we can just remove them to improve readability?



src/tests/slave_tests.cpp
Lines 12229 (patched)


Given that `drainConfig` in `EXPECT_EQ` below is different from default 
value of `DrainConfig`, isn't this assertion redundant as well?



src/tests/slave_tests.cpp
Lines 12252 (patched)


The purpose of waiting for `removeFramework` doesn't look obvious at all - 
can you also explain it in this comment or nearby?


- Andrei Sekretenko


On May 5, 2020, 5:22 a.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72368/
> ---
> 
> (Updated May 5, 2020, 5:22 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-10118
> https://issues.apache.org/jira/browse/MESOS-10118
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, when the agent had no tasks or operations and
> received a `DrainSlaveMessage`, it would checkpoint the
> `DrainConfig` to disk, implicitly placing it into a "draining"
> state indefinitely. This patch updates the agent's handler to
> avoid checkpointing anything to disk in this case.
> 
> The `SlaveTest.DrainInfoInAPIOutputs` test is also removed
> and its functionality is moved into the test
> `SlaveTest.DrainAgentKillsRunningTask`. The running task in
> the latter test allows us to verify agent API outputs both
> before and after the task's terminal update is acknowleged.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.cpp 1a32c813eeacf21b903db3f9602d034e7fb085c0 
>   src/tests/slave_tests.cpp 6b264d067accb9800ba75ea8f5d27c1e4c5593db 
> 
> 
> Diff: https://reviews.apache.org/r/72368/diff/3/
> 
> 
> Testing
> ---
> 
> Ran the test in https://reviews.apache.org/r/72364/ with this patch applied 
> and verified that the final task reached TASK_RUNNING.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>