Re: Review Request 52083: Changed reconciliation for unregistering, reregistering agents.

2017-03-24 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52083/#review170066
---




src/master/master.hpp
Lines 1726-1737 (original), 1727-1734 (patched)


When reviewing an unrelated patch I was confused by this function since it 
no longer represents "transitioning" agents.

We should either:

(1) Remove this function and directly use the particular transitions we 
care about within reconciliation

(2) Preserve the semantics of this function: to return when an agent is 
transitioning from one state to another. And avoid using it in the 
reconciliation code, in favor of directly using the transitions we care about.

In both cases, the comment at the top of this review should really be 
within the code, as I don't think people can easily figure this out.


- Benjamin Mahler


On Oct. 13, 2016, 2:10 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52083/
> ---
> 
> (Updated Oct. 13, 2016, 2:10 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6206
> https://issues.apache.org/jira/browse/MESOS-6206
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, explicit reconciliation for an agent that was in the process
> of reregistering or unregistering returned no results. This degree of
> cleverness seems unwarranted: if the agent hasn't completed the
> reregistration or unregistration process, it seems simpler for the
> master to return the previous state of the agent. This is what the
> framework would observe if their reconcile request lost the race with
> the reregister/unregister operation, anyway.
> 
> Note that since reregistering agents are no longer considered to be "in
> transition", we need to slightly adjust the rules for how we update the
> `slaves.recovered` collection in the master: an agent remains in the
> "recovered" collection until it has been marked reachable in the
> registry (rather than removing it from "recovered" as soon as the
> reregistration process beings). This is more consistent with how we
> manage the other collections in the master anyway: an agent appears in
> the `recovered` list until the registry operation that reregisters it
> has been successfully applied.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 43518b9bf1bfaa54e26acc7f2e70c4161c667a84 
>   src/master/master.cpp 7ef898781ce5c2349ffeaa3ce43e68dede19c852 
>   src/tests/master_tests.cpp 88cf1e612ad8186ef2cea161b3a52b0df9517305 
>   src/tests/reconciliation_tests.cpp 1412090299df388456f04ed58a1d384ce3ff550a 
> 
> 
> Diff: https://reviews.apache.org/r/52083/diff/2/
> 
> 
> Testing
> ---
> 
> `make check` on OSX, Linux.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Re: Review Request 52083: Changed reconciliation for unregistering, reregistering agents.

2016-10-17 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52083/#review152997
---


Ship it!




Ship It!

- Vinod Kone


On Oct. 13, 2016, 2:10 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52083/
> ---
> 
> (Updated Oct. 13, 2016, 2:10 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6206
> https://issues.apache.org/jira/browse/MESOS-6206
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, explicit reconciliation for an agent that was in the process
> of reregistering or unregistering returned no results. This degree of
> cleverness seems unwarranted: if the agent hasn't completed the
> reregistration or unregistration process, it seems simpler for the
> master to return the previous state of the agent. This is what the
> framework would observe if their reconcile request lost the race with
> the reregister/unregister operation, anyway.
> 
> Note that since reregistering agents are no longer considered to be "in
> transition", we need to slightly adjust the rules for how we update the
> `slaves.recovered` collection in the master: an agent remains in the
> "recovered" collection until it has been marked reachable in the
> registry (rather than removing it from "recovered" as soon as the
> reregistration process beings). This is more consistent with how we
> manage the other collections in the master anyway: an agent appears in
> the `recovered` list until the registry operation that reregisters it
> has been successfully applied.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 43518b9bf1bfaa54e26acc7f2e70c4161c667a84 
>   src/master/master.cpp 7ef898781ce5c2349ffeaa3ce43e68dede19c852 
>   src/tests/master_tests.cpp 88cf1e612ad8186ef2cea161b3a52b0df9517305 
>   src/tests/reconciliation_tests.cpp 1412090299df388456f04ed58a1d384ce3ff550a 
> 
> Diff: https://reviews.apache.org/r/52083/diff/
> 
> 
> Testing
> ---
> 
> `make check` on OSX, Linux.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Re: Review Request 52083: Changed reconciliation for unregistering, reregistering agents.

2016-10-13 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52083/
---

(Updated Oct. 13, 2016, 2:10 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Clarify comments, commit description.


Bugs: MESOS-6206
https://issues.apache.org/jira/browse/MESOS-6206


Repository: mesos


Description (updated)
---

Previously, explicit reconciliation for an agent that was in the process
of reregistering or unregistering returned no results. This degree of
cleverness seems unwarranted: if the agent hasn't completed the
reregistration or unregistration process, it seems simpler for the
master to return the previous state of the agent. This is what the
framework would observe if their reconcile request lost the race with
the reregister/unregister operation, anyway.

Note that since reregistering agents are no longer considered to be "in
transition", we need to slightly adjust the rules for how we update the
`slaves.recovered` collection in the master: an agent remains in the
"recovered" collection until it has been marked reachable in the
registry (rather than removing it from "recovered" as soon as the
reregistration process beings). This is more consistent with how we
manage the other collections in the master anyway: an agent appears in
the `recovered` list until the registry operation that reregisters it
has been successfully applied.


Diffs (updated)
-

  src/master/master.hpp 43518b9bf1bfaa54e26acc7f2e70c4161c667a84 
  src/master/master.cpp 7ef898781ce5c2349ffeaa3ce43e68dede19c852 
  src/tests/master_tests.cpp 88cf1e612ad8186ef2cea161b3a52b0df9517305 
  src/tests/reconciliation_tests.cpp 1412090299df388456f04ed58a1d384ce3ff550a 

Diff: https://reviews.apache.org/r/52083/diff/


Testing
---

`make check` on OSX, Linux.


Thanks,

Neil Conway



Re: Review Request 52083: Changed reconciliation for unregistering, reregistering agents.

2016-10-13 Thread Neil Conway


> On Oct. 12, 2016, 12:54 a.m., Vinod Kone wrote:
> > don't quite follow the second para in the description. you say agent is no 
> > longer in transitioning while re-registering but then you want keep it in 
> > recovered map which results in the `transitioning()` to return true? am i 
> > reading it right?

The proposal rule is: reconciliation should ignore the effect of in-progress 
operations (since they might not succeed). So if an agent was in `recovered` 
but has not finished reregistration, reconciliation should treat it as 
`recovered` but not yet `reregistered` -- i.e., it should be `transitioning()` 
and reconciliation should return no results. So if the agent was recovered from 
the registry but has not yet finished reregistration, explicit reconciliation 
will be a no-op; if the agent was marked unreachable but has not yet finished 
reregistration, explicit reconciliation will return `TASK_UNREACHABLE`.


> On Oct. 12, 2016, 12:54 a.m., Vinod Kone wrote:
> > src/master/master.cpp, line 1894
> > 
> >
> > hmm. this means an agent could be in `recovered` map and 
> > `reregistering` or `markingUnreachable` maps at the same time. didn't we 
> > decide to not do that? are you still planning to do that cleanup?

This is intentional. An agent appears in `recovered` if it has been recovered 
from the registrar and it has EITHER: (a) completed re-registration, or (b) 
completed being marked unreachable. The `reregistering` and 
`markingUnreachable` collections track whether there is an in-progress 
operation attempting to do (a) or (b), respectively. We only updated 
`recovered` when the in-progress operation has completed successfully.


- Neil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52083/#review152239
---


On Sept. 20, 2016, 3:26 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52083/
> ---
> 
> (Updated Sept. 20, 2016, 3:26 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6206
> https://issues.apache.org/jira/browse/MESOS-6206
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, explicit reconciliation for an agent that was in the process
> of reregistering or unregistering returned no results. This degree of
> cleverness seems unwarranted: if the agent hasn't completed the
> reregistration or unregistration process, it seems quite reasonable for
> the master to return the previous state of the agent (this is what the
> framework would observe if their reconcile request lost the race with
> the reregister/unregister, anyway).
> 
> Note that since reregistering agents are no longer considered to be "in
> transition", we need to slightly adjust the rules for how we update the
> `slaves.recovered` collection in the master: an agent remains in the
> "recovered" collection until it has been marked reachable in the
> registry (rather than removing it from "recovered" as soon as the
> reregistration process beings). This is more consistent with how we
> manage the other collections in the master anyway.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 35db198748b8652eb53e17f592f6b40d1e6a3ed9 
>   src/master/master.cpp 66a672f6d16233e96b29e330a9e6c474546fa851 
>   src/tests/master_tests.cpp e6c8362da6a5669e2a2d18f6eb4e454365a84f60 
>   src/tests/reconciliation_tests.cpp 1412090299df388456f04ed58a1d384ce3ff550a 
> 
> Diff: https://reviews.apache.org/r/52083/diff/
> 
> 
> Testing
> ---
> 
> `make check` on OSX, Linux.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Re: Review Request 52083: Changed reconciliation for unregistering, reregistering agents.

2016-10-11 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52083/#review152239
---



don't quite follow the second para in the description. you say agent is no 
longer in transitioning while re-registering but then you want keep it in 
recovered map which results in the `transitioning()` to return true? am i 
reading it right?


src/master/master.cpp 


hmm. this means an agent could be in `recovered` map and `reregistering` or 
`markingUnreachable` maps at the same time. didn't we decide to not do that? 
are you still planning to do that cleanup?



src/tests/reconciliation_tests.cpp (line 466)


s/PartialReregistration/ReregistrationInProgress/



src/tests/reconciliation_tests.cpp (line 566)


s/PartialRemoval/RemovalInProgress/


- Vinod Kone


On Sept. 20, 2016, 3:26 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52083/
> ---
> 
> (Updated Sept. 20, 2016, 3:26 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6206
> https://issues.apache.org/jira/browse/MESOS-6206
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, explicit reconciliation for an agent that was in the process
> of reregistering or unregistering returned no results. This degree of
> cleverness seems unwarranted: if the agent hasn't completed the
> reregistration or unregistration process, it seems quite reasonable for
> the master to return the previous state of the agent (this is what the
> framework would observe if their reconcile request lost the race with
> the reregister/unregister, anyway).
> 
> Note that since reregistering agents are no longer considered to be "in
> transition", we need to slightly adjust the rules for how we update the
> `slaves.recovered` collection in the master: an agent remains in the
> "recovered" collection until it has been marked reachable in the
> registry (rather than removing it from "recovered" as soon as the
> reregistration process beings). This is more consistent with how we
> manage the other collections in the master anyway.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 35db198748b8652eb53e17f592f6b40d1e6a3ed9 
>   src/master/master.cpp 66a672f6d16233e96b29e330a9e6c474546fa851 
>   src/tests/master_tests.cpp e6c8362da6a5669e2a2d18f6eb4e454365a84f60 
>   src/tests/reconciliation_tests.cpp 1412090299df388456f04ed58a1d384ce3ff550a 
> 
> Diff: https://reviews.apache.org/r/52083/diff/
> 
> 
> Testing
> ---
> 
> `make check` on OSX, Linux.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Re: Review Request 52083: Changed reconciliation for unregistering, reregistering agents.

2016-09-20 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52083/#review149695
---



Patch looks great!

Reviews applied: [50235, 50416, 50417, 50418, 50422, 50699, 50700, 50701, 
50702, 50703, 50704, 50705, 50706, 50707, 50844, 50845, 50846, 51020, 51371, 
51374, 51375, 51376, 51377, 51021, 51706, 51653, 51707, 51805, 51845, 51891, 
51909, 51913, 51953, 51954, 51955, 51956, 51957, 51958, 52039, 52083]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Sept. 20, 2016, 3:26 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52083/
> ---
> 
> (Updated Sept. 20, 2016, 3:26 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6206
> https://issues.apache.org/jira/browse/MESOS-6206
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, explicit reconciliation for an agent that was in the process
> of reregistering or unregistering returned no results. This degree of
> cleverness seems unwarranted: if the agent hasn't completed the
> reregistration or unregistration process, it seems quite reasonable for
> the master to return the previous state of the agent (this is what the
> framework would observe if their reconcile request lost the race with
> the reregister/unregister, anyway).
> 
> Note that since reregistering agents are no longer considered to be "in
> transition", we need to slightly adjust the rules for how we update the
> `slaves.recovered` collection in the master: an agent remains in the
> "recovered" collection until it has been marked reachable in the
> registry (rather than removing it from "recovered" as soon as the
> reregistration process beings). This is more consistent with how we
> manage the other collections in the master anyway.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 35db198748b8652eb53e17f592f6b40d1e6a3ed9 
>   src/master/master.cpp 66a672f6d16233e96b29e330a9e6c474546fa851 
>   src/tests/master_tests.cpp e6c8362da6a5669e2a2d18f6eb4e454365a84f60 
>   src/tests/reconciliation_tests.cpp 1412090299df388456f04ed58a1d384ce3ff550a 
> 
> Diff: https://reviews.apache.org/r/52083/diff/
> 
> 
> Testing
> ---
> 
> `make check` on OSX, Linux.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>