I read over the docs, and they look like a good start.  Personally I don't see
much of a benefit for dynamically reserved cpu/mem, but I'm excited about
the possibility of building off this for dynamically reserved persistent
volumes.

I would like to see more detail on how a reservation "times out", and the
configuration options per job around that, as I feel like it's the most
complicated part of all of this.  Ideally there would also be hooks into
the host maintenance APIs here.

I also didn't see any mention of it, but I believe Mesos requires the
framework to reserve resources with a role.  By default Aurora runs as the
special "*" role.  Does this mean Aurora will need to have a role specified
now for this to work?  Or does Mesos allow reserving resources without a
role?

On Thu, Mar 2, 2017 at 8:35 AM, Erb, Stephan <stephan....@blue-yonder.com>
wrote:

> Hi everyone,
>
> There have been two documents on Dynamic Reservations as a first step
> towards persistent services:
>
> ·         RFC: https://docs.google.com/document/d/15n29HSQPXuFrnxZAgfVINTRP1Iv47_jfcstJNuMwr5A/edit#heading=h.hcsc8tda08vy
>
> ·         Technical Design Doc: https://docs.google.com/document/d/1L2EKEcKKBPmuxRviSUebyuqiNwaO-2hsITBjt3SgWvE/edit#heading=h.klg3urfbnq3v
>
> For the past couple of days, there have also been two patches online for an
> MVP by Dmitriy:
>
> ·         https://reviews.apache.org/r/56690/
>
> ·         https://reviews.apache.org/r/56691/
>
> From reading the documents, I am under the impression that there is a
> rough consensus on the following points:
>
> ·         We want dynamic reservations. Our general goal is to enable the
> re-scheduling of tasks on the same host they used in a previous run.
>
> ·         Dynamic reservations are a best-effort feature. If in doubt, a
> task will be scheduled somewhere else.
>
> ·         Jobs opt into reserved resources using an appropriate tier
> config.
>
> ·         The tier config is supposed to be neither preemptible nor
> revocable. Reserving resources therefore requires appropriate quota.
>
> ·         Aurora will tag reserved Mesos resources by adding the unique
> instance key of the reserving task instance as a label. Only this task
> instance will be allowed to use those tagged resources.
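
To make the tagging rule in the last point concrete, here is a minimal sketch
in Python. It is not taken from the patches or from Aurora's code: the names
`InstanceKey`, `Resource`, and `usable_by` are hypothetical, and the
`role/env/job/instance` label format is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceKey:
    """Hypothetical unique key of a task instance."""
    role: str
    env: str
    job: str
    instance_id: int

    def label(self) -> str:
        # Assumed label format; the real encoding may differ.
        return f"{self.role}/{self.env}/{self.job}/{self.instance_id}"


@dataclass
class Resource:
    """Hypothetical offered resource; reservation_label is None if unreserved."""
    name: str
    amount: float
    reservation_label: Optional[str] = None


def usable_by(resource: Resource, key: InstanceKey) -> bool:
    """Unreserved resources are usable by anyone; tagged ones only by their owner."""
    return resource.reservation_label in (None, key.label())
```

Under these assumptions, a resource tagged for instance 0 of a job would be
rejected for instance 1 of the same job, while untagged resources stay
available to both.
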
>
> I am unclear on the following general questions, as the documents contain
> contradictory content:
>
> a)       How does the user interact with reservations?  There are several
> proposals in the documents to auto-reserve on `aurora job create` or
> `aurora cron schedule` and to automatically un-reserve on the appropriate
> reverse actions. But will we also allow a user further control over the
> reservations so that they can manage those independent of the task/job
> lifecycle? For example, how does Borg handle this?
>
> b)       The implementation proposal and patches include an
> OfferReconciler, so this implies we don’t want to offer any control for the
> user. The only control mechanism will be the cluster-wide offer wait time
> limiting the number of seconds unused reserved resources can linger before
> they are un-reserved.
>
> c)       Will we allow adhoc/cron jobs to reserve resources? Does it even
> matter if we don’t give control to users and just rely on the
> OfferReconciler?
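
The cluster-wide offer wait time described in b) could be sketched roughly as
follows. This is only an illustration of the timeout idea, not the
OfferReconciler from the patches: `expired_reservations`, `unused_since`, and
`offer_wait_secs` are hypothetical names, and timestamps are plain seconds.

```python
from typing import Dict, List


def expired_reservations(
    unused_since: Dict[str, float],
    now: float,
    offer_wait_secs: float,
) -> List[str]:
    """Return labels of reservations that have been unused longer than the limit.

    unused_since maps a reservation label to the time (in seconds) at which
    its resources were first seen unused in an offer.
    """
    return sorted(
        label
        for label, since in unused_since.items()
        if now - since > offer_wait_secs
    )
```

A reconciler built along these lines would un-reserve everything returned by
this function on each pass, which is why a single cluster-wide limit leaves
the user with no per-job control.
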
>
>
> I have a couple of questions on the MVP and some implementation details. I
> will follow up with those in a separate mail.
>
> Thanks and best regards,
> Stephan
>
