Folks,

We have updated the allocator section of the design doc [1]. Please take a
look and comment!

[1]
https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit?pli=1#heading=h.cfauc9dedesc

On Thu, Jul 16, 2015 at 11:49 PM, Jörg Schad <jo...@mesosphere.io> wrote:

> Thanks for all your feedback!
> We feel that we gained enough feedback and understanding to start creating
> Jiras for the first part (i.e. HTTP endpoints, validation and persistence
> of quota requests). If you have any objections or additional feedback,
> let us know!
>
> For the open question above: for the MVP we are planning to make the master
> responsible for quota validation on a heuristic basis. We will additionally
> provide a *force* flag allowing operators to skip validation (imagine the
> case where the operator is about to add new agents or a rack and wants to
> request quota prior to that).
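
A minimal Python sketch of what heuristic validation with a force flag could
look like. The capacity check used here (already granted quota plus the new
request must fit into the cluster total) is only an assumed heuristic, and
Resources and validate_quota_request are hypothetical names, not Mesos APIs:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource vector; real Mesos resources are richer."""
    cpus: float = 0.0
    mem_mb: float = 0.0

    def __add__(self, other):
        return Resources(self.cpus + other.cpus, self.mem_mb + other.mem_mb)

    def fits_in(self, other):
        return self.cpus <= other.cpus and self.mem_mb <= other.mem_mb


def validate_quota_request(request, cluster_total, granted, force=False):
    """Return True if the quota request may be accepted.

    Assumed heuristic: the sum of already granted quota plus the new
    request must not exceed the total cluster resources. With force=True
    the check is skipped, e.g. when capacity is about to be added.
    """
    if force:
        return True
    already_granted = Resources()
    for quota in granted:
        already_granted = already_granted + quota
    return (already_granted + request).fits_in(cluster_total)
```

With force=True the operator takes responsibility for the request being
satisfiable later; the check itself stays a cheap capacity comparison.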
>
> Thanks again for the feedback!
> Jörg
>
>
>
>
>
> On Wed, Jul 15, 2015 at 9:07 PM, Alex Rukletsov <a...@mesosphere.com>
> wrote:
>
> > Folks,
> >
> > We have updated the design doc [1] based on numerous comments on the v1.
> >
> > There is one substantial design question left: which entity should decide
> > whether a quota can be granted: the Master, an allocator, or a separate
> > "Quota Manager". I have described the pros and cons of each in the doc;
> > please look there for more information.
> >
> > Apart from that, I have made the following amendments:
> > * Clarified the absence of authz and how this can be mitigated with a
> > firewall;
> > * Updated confusing naming: quota is a pair of guaranteed resources and
> > a limit;
> > * Added a safety design principle;
> > * Updated the section explaining the quota granting decision;
> > * Updated the QuotaInfo protobuf;
> > * Added an alternative quota implementation via dynamic reservations;
> > * Updated the HTTP API to be more REST-like;
> > * Added more ideas to the WIP Allocator section.
> >
> > Please have a look and check whether your concerns have been addressed.
> >
> > [1]:
> >
> >
> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit#
> >
> > On Fri, Jul 10, 2015 at 8:52 AM, Jörg Schad <jo...@mesosphere.io> wrote:
> >
> > > I would like to propose an alternative MVP for the actual quota
> > > implementation.
> > >
> > > Instead of making changes to the allocator, we could have an
> > > allocator-agnostic “Quota manager” which builds on top of the existing
> > > dynamic reservations.
> > >
> > > Beyond the MVP, we would still allow for allocator-based implementations
> > > of more complex quota mechanisms, but this quota manager could also
> > > offer quota support for allocators which are not aware of quota (TBD).
> > >
> > >
> > > General Flow:
> > >
> > > - The Quota manager (QM) leverages dynamic reservations to fulfill quota
> > > requests. It continuously tries to match the desired state (quota
> > > requests) with the actual state (dynamic reservations).
> > >
> > > - It receives quota requests from the master.
> > >
> > > - Using (approximate) slave usage information, it selects a target
> > > slave/agent.
> > >
> > > - It requests a dynamic reservation on that slave (HTTP endpoint?); if
> > > the request fails, it retries.
> > >
> > > - The Master should notify the QM of slave failures and reregistrations.
> > >
> > > - In case of a slave failure, the QM needs to request a new reservation.
> > >
> > > - In case of a slave reregistration, the QM might need to unreserve.
> > > Required state of “Quota Manager”
> > >
> > > - current dynamic reservations
> > >
> > > - current granted quota
> > >
> > > - approximate slave available resources (in order to decide on which
> > > slave to /reserve, but doesn’t matter if slightly out of sync -> in
> > > worst case /reserve fails and we have to retry)
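
The flow and state described above could be sketched roughly as follows in
Python. This is only an illustration under simplifying assumptions: resources
are a single CPU count, the "reserve" callable is a hypothetical stand-in for
whatever the dynamic reservation endpoint turns out to be, and reconcile and
on_agent_lost are made-up names. Unreserving after a reregistration is not
covered.

```python
def reconcile(quotas, reservations, available, reserve):
    """Run one pass of matching desired state to actual state.

    quotas:       role -> desired amount (granted quota)
    reservations: role -> {agent -> reserved amount} (mutated in place)
    available:    agent -> approximate free amount (mutated in place)
    reserve:      hypothetical callable(agent, role, amount) -> bool
    Returns role -> amount that could not be reserved in this pass.
    """
    missing = {}
    for role, desired in quotas.items():
        held = reservations.setdefault(role, {})
        deficit = desired - sum(held.values())
        # Prefer agents with the most (approximately) free resources.
        for agent in sorted(available, key=available.get, reverse=True):
            if deficit <= 0:
                break
            amount = min(deficit, available[agent])
            if amount <= 0:
                continue
            if reserve(agent, role, amount):
                held[agent] = held.get(agent, 0) + amount
                available[agent] -= amount
                deficit -= amount
            # On failure the deficit stays; the next pass retries.
        if deficit > 0:
            missing[role] = deficit
    return missing


def on_agent_lost(agent, reservations, available):
    """Forget a failed agent so the next pass reserves elsewhere."""
    available.pop(agent, None)
    for held in reservations.values():
        held.pop(agent, None)
```

Because the availability view is only approximate, a failed reserve call is
not treated as an error: the deficit is reported and picked up on the next
pass.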
> > >
> > > Failover
> > >
> > > - current granted quota is persisted in the registry by the master
> > >
> > > - current dynamic reservations are reconstructed from slave
> > > reregistrations
> > >
> > > The Master needs to propagate information about slave failures to the
> > > Quota Manager, as it might need to create new dynamic reservations in
> > > that case.
> > >
> > >
> > > Advantages compared to allocator based implementation:
> > >
> > > - no need to change allocator(s) for MVP
> > >
> > > - quota support for external quota-agnostic allocators
> > >
> > > - re-using existing mechanisms
> > >
> > > - minimal implementation effort for MVP
> > >
> > > - almost free support for quota chunks (even in MVP) as dynamic
> > > reservations are per slave
> > >
> > > Disadvantages
> > >
> > > - an allocator-based quota implementation is still needed for more
> > > elaborate implementations
> > >
> > > - as dynamic reservations do not count towards fair share, neither
> > > would quota based on this implementation. In my opinion this is not a
> > > real problem, as a) we have not really defined the semantics of quota
> > > and b) fair share is an allocator-internal (i.e. DRF-internal) notion,
> > > so other allocator implementations are free to do that differently
> > > anyhow.
> > >
> > > Looking forward to feedback!
> > > Jörg
> > >
> > > On Thu, Jul 9, 2015 at 11:52 PM, Tomás Senart <to...@mesosphere.io>
> > wrote:
> > >
> > > > What about "Global Reservations"?
> > > >
> > > > On Thu, Jul 9, 2015 at 3:25 PM, Marco Massenzio <ma...@mesosphere.io
> >
> > > > wrote:
> > > >
> > > > > I've added my two cents in the doc - my vote goes for "Guaranteed
> > > > > Allocation" - not as catchy as "Quota" (and will make naming
> > > > > classes a challenge!) but maybe more helpful in the long term.
> > > > >
> > > > > If anyone has a better suggestion, please share it... I can't really
> > > > > say I'm super-excited by Guaranteed Allocation myself!
> > > > >
> > > > > *Marco Massenzio*
> > > > > *Distributed Systems Engineer*
> > > > >
> > > > > On Thu, Jul 9, 2015 at 1:48 AM, Alex Rukletsov <
> a...@mesosphere.com>
> > > > > wrote:
> > > > >
> > > > > > And you're not the only one who was confused by the terminology!
> > > > > > One of the alternatives that didn't make it to the public doc was
> > > > > > "cluster-wide dynamic reservations". The reason we preferred
> > > > > > "quota" to " ... reservation" is that the latter is already
> > > > > > overloaded with meanings in the Mesos world (static reservations,
> > > > > > dynamic reservations). I had hoped the Terminology section would
> > > > > > help avoid the confusion, but I see it doesn't. We'll think about
> > > > > > how we can solve the problem; we definitely don't want to create
> > > > > > one more "libprocess process represented as a thread in an OS
> > > > > > process" ; ).
> > > > > >
> > > > > > I see your point regarding authorization, you're not alone here
> > > > > > either : ). Some folks mentioned that the lack of authz is a
> > > > > > blocker and will prevent them from upgrading the cluster. I would
> > > > > > propose to treat the MVP as an experimental feature: use it at
> > > > > > your own risk, or disable the endpoints related to quota and hence
> > > > > > the entire feature. Does that make sense?
> > > > > >
> > > > > > On Wed, Jul 8, 2015 at 7:10 PM, James Peach <jor...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > >
> > > > > > > > On Jul 4, 2015, at 3:15 AM, Alex Rukletsov <
> > a...@mesosphere.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > Folks,
> > > > > > > >
> > > > > > > > Jörg and I are working on adding *quota* support to Mesos.
> > > > > > > > Quota can be described as a cluster-wide dynamic reservation.
> > > > > > > > I would like to share the design doc [1] to gather community
> > > > > > > > feedback early in the design phase.
> > > > > > >
> > > > > > > The most confusing part of this document to me was the 'quota'
> > > > > > > terminology. Quotas normally refer to administrative limits
> (esp.
> > > > disk
> > > > > > > quotas with hard and soft limits), not reserving resources.
> Since
> > > > what
> > > > > > you
> > > > > > > are describing is an extension to the resource reservation
> > system,
> > > it
> > > > > > would
> > > > > > > be clearer if it was described in those terms.
> > > > > > >
> > > > > > > I was also concerned that access control / authorization is not
> > > > planned
> > > > > > > for the initial implementation. I think that if Mesos is to
> have
> > an
> > > > > > > authorization policy, it should be applied uniformly following
> > the
> > > > > > > principle of least surprise.
> > > > > > >
> > > > > > > > The doc is a work in progress, especially the part related to
> > > > > > > > quota support in the allocator. We think we can start working
> > > > > > > > on adding quota support to the Mesos Master while fleshing out
> > > > > > > > the design for how quota is handled by the built-in allocator.
> > > > > > > >
> > > > > > > > While working on the design, we faced some challenges and
> > > > > > > > design questions. One of them is which decisions should be
> > > > > > > > deferred to the allocator and which can be made by the Master.
> > > > > > > > We elaborate on this in the doc.
> > > > > > > >
> > > > > > > > Looking forward to your feedback!
> > > > > > > >
> > > > > > > > [1]:
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit?usp=sharing
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
