Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Michael Park Tue, 28 Jul 2015 10:29:13 -0700


> On July 13, 2015, 4:46 p.m., Alexander Rukletsov wrote:
> > src/master/http.cpp, lines 515-516
> > <https://reviews.apache.org/r/35702/diff/9/?file=994080#file994080line515>
> >
> >     It looks like we actually have the role, but it's buried in resources. 
> > Do you envision having resources collection with various roles in one 
> > request? Maybe it makes sense to add a validation step which ensures there 
> > is just one role per request and use it here, also avoiding changes in the 
> > `validate()`function.
> 
> Michael Park wrote:
>     I didn't see a good reason to require a "one role per request" condition. 
> The current interface accurately models the fact that an operator does not 
> have a role associated to it like a framework does, and I don't think 
> "avoiding changes in the `validate()` function" should have any influence in 
> deciding how an interface behaves.
>     
>     If we required such a condition, the per-request atomicity guarantee 
> comes with a limitation that it can only be for a single role. While I'm not 
> sure of its value, I'm also not sure what we gain by requiring it from the 
> user's perspective?
> 
> Alexander Rukletsov wrote:
>     I think I'm missing something, my understanding is that each dynamic 
> reservation is associated with a role, regardless, who issues a reservation 
> request. I don't think limiting users to one role per request gives them any 
> benefit, but it looks like we can be closer to framework-issued request if we 
> do so. What am I missing?


Your understanding is correct. Aside from the resources being associated with a 
role, frameworks are also associated with a role. We check that every resource 
is being reserved for the framework's role because a framework is associated 
with a role and it wouldn't make sense to allow a framework to reserve 
resources for a role that does not match its role. On the contrary, the same 
rule doesn't apply for an operator since there's no such thing as an 
"operator's role".


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review91472
-----------------------------------------------------------


On July 27, 2015, 11:30 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 27, 2015, 11:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris 
> Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated, I've captured the 
> various approaches and limitations and deal-breakers of those approaches 
> here: [Master Endpoint Implementation 
> Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management 
> logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state 
> and must not, whereas `updateSlave` does, and can.
> * The algorithm:
>     * Initially, the master pessimistically assume that what seems like 
> "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` 
> call to itself vs master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered 
> resources.
>     * We greedily rescind one offer at a time until we've rescinded 
> sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than 
> `recoverResources(..., None())` so that we can pretty much always win the 
> race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail 
> to satisfy the request.
>     * If we still don't have enough resources after resciding all offers, be 
> optimistic and forward the request to the allocator since there may be 
> available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with 
> `PreconditionFailed`. This could be updated to be `Forbidden`, or `Conflict` 
> maybe as well. We'll pick one eventually.
> 
> This approach is clearly not ideal, since we would prefer to rescind as 
> little offers as possible.
> The challenges of implementing the ideal solution in the current state is 
> described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>

Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Reply via email to