Re: Review Request 35702: [WIP] Added /reserve HTTP endpoint to the master.

Mesos ReviewBot Thu, 25 Jun 2015 22:15:07 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review89471
-----------------------------------------------------------



Patch looks great!

Reviews applied: [35714, 35702]

All tests passed.

- Mesos ReviewBot


On June 26, 2015, 4:44 a.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated June 26, 2015, 4:44 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris 
> Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated. I've captured the 
> various approaches, along with their limitations and deal-breakers, 
> here: [Master Endpoint Implementation 
> Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management 
> logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves, and must never leave, the allocator in 
> an over-allocated state, whereas `updateSlave` can and is allowed to.
> * The algorithm:
>     * Initially, the master pessimistically assumes that what seem like 
> "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` 
> call to itself and the master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered 
> resources.
>     * We greedily rescind one offer at a time until we've rescinded 
> sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than 
> `recoverResources(..., None())` so that we can pretty much always win the 
> race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail 
> to satisfy the request.
>     * If we still don't have enough resources after rescinding all offers, be 
> optimistic and forward the request to the allocator, since there may be 
> available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with 
> `PreconditionFailed`. This could be changed to `Forbidden` or `Conflict` 
> instead. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as 
> few offers as possible.
> The challenges of implementing the ideal solution in the current state are 
> described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> TODO(mpark): Return a `Future<Nothing>` rather than `Future<Try<Nothing>>` 
> and use `Future::repair` to propagate the failure state.
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/allocator.hpp 22992c0c77058af4fcd28aa8e4a1191693a16f44 
>   src/Makefile.am a064d17a6b62e6e3c8e190135bcc8cbbb0051ed5 
>   src/master/allocator/mesos/allocator.hpp 
> 72470ec7f56f84a9a9815c09adb88def90ef672f 
>   src/master/allocator/mesos/hierarchical.hpp 
> 3264d145d52b48852878abf7ab9be29ab98208cc 
>   src/master/http.cpp 350383362311cfbc830965e1155a8515f0dfb332 
>   src/master/master.hpp af83d3e82d2c161b3cc4583e78a8cbbd2f9a4064 
>   src/master/master.cpp 0782b543b451921d2240958c7ef612a9e30972df 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
>   src/tests/mesos.hpp 9157ac079808d2686592e54ea26a26e6a0825ed3 
>   src/tests/reserve_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> Added `src/tests/reserve_tests.cpp`.
> 
> 
> Thanks,
> 
> Michael Park
> 
>
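
The rescind-then-forward algorithm in the quoted description can be sketched
roughly as follows. This is a minimal, single-threaded model and not the code
in the patch: resources are reduced to one scalar, and the `Offer` and
`Allocator` types, the `reserve` function, and the simplified
`recoverResources` / `updateAvailable` signatures are all assumptions made for
illustration. The race against `allocate` and the `Filters()` argument are not
modelled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an outstanding offer; resources are a single
// scalar (e.g. CPUs) to keep the sketch small.
struct Offer {
  std::string id;
  double cpus;
};

// Hypothetical stand-in for the allocator. It only tracks resources that are
// currently available, i.e. not offered to any framework.
struct Allocator {
  double available;

  // A rescinded offer's resources go back to the available pool.
  void recoverResources(double cpus) { available += cpus; }

  // Allowed to fail, and never leaves the pool over-allocated.
  bool updateAvailable(double cpus) {
    if (cpus > available) {
      return false;
    }
    available -= cpus;
    return true;
  }
};

// Greedily rescinds one offer at a time until the rescinded offers cover the
// request or no offers remain, then forwards the request to the allocator.
// Returns false where the endpoint would report an error to the user.
bool reserve(double required,
             std::vector<Offer>& offers,
             Allocator& allocator,
             std::vector<std::string>& rescinded) {
  assert(required >= 0.0);

  // Pessimistically ignore what looks available and count only what we
  // recover from rescinded offers.
  double recovered = 0.0;
  while (recovered < required && !offers.empty()) {
    Offer offer = offers.back();
    offers.pop_back();
    allocator.recoverResources(offer.cpus);
    recovered += offer.cpus;
    rescinded.push_back(offer.id);
  }

  // Either enough was recovered, or we optimistically hope the allocator
  // still has enough available resources. The allocator has the final say.
  return allocator.updateAvailable(required);
}
```

In this model, a request for 3 CPUs against three 2-CPU offers rescinds two
offers and leaves the third untouched. A request that exceeds everything
offered plus everything available rescinds every offer and still fails, which
is the non-ideal behaviour the description acknowledges.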
