Thanks for the feedback!

There have been some discussions about allowing reservations to multiple
roles (or, more generally, role expressions), which is essentially what
you've suggested, Zhitao. (However, note that what the GPU capability
filtering provides is not quite this; it is actually analogous to a
reservation for multiple schedulers, not roles.) Reservations to multiple
roles seem to be the right replacement for those who rely on the GPU
filtering behavior.
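
To make that distinction concrete, here is a minimal sketch in Python (the
function names are mine and purely illustrative; none of this is Mesos code):
a multi-role reservation would key on the framework's role, while the
GPU_RESOURCES behavior keys on a capability the scheduler itself declares.

```python
# Illustrative only: contrasting the two visibility models.

def visible_under_multi_role_reservation(framework_role, reserved_roles):
    """A (hypothetical) reservation for multiple roles admits any
    framework whose role is in the reserved set."""
    return framework_role in reserved_roles

def visible_under_capability_filter(framework_capabilities):
    """The GPU_RESOURCES behavior: visibility depends on a capability
    the scheduler declares, independent of its role."""
    return "GPU_RESOURCES" in framework_capabilities

# A framework's role decides visibility in the first model...
print(visible_under_multi_role_reservation("analytics", {"analytics", "training"}))  # -> True
# ...while only the scheduler's declared capability matters in the second.
print(visible_under_capability_filter({"GPU_RESOURCES"}))  # -> True
```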

Since we don't have reservations to multiple roles at this point, we
shouldn't deprecate the GPU_RESOURCES capability until this is in place.

With hierarchical roles, it's possible (although potentially clumsy) to
achieve roughly what the GPU filtering provides by using sub-roles, since
reservations made to a "gpu" role would be available to all of the
descendant roles within the tree, e.g. "gpu/analytics",
"gpu/forecasting/training", etc. This is equivalent to a restricted
version of reservations to multiple roles, where the roles are restricted
to the descendants. It can get clumsy because if
"eng/backend/image-processing" wants to get in on the reserved GPUs, the
user would have to place a related role underneath the "gpu" role, e.g.
"gpu/image-processing".
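
As a sketch of the descendant-role matching involved (Python, with
illustrative helper names of my own; the real allocator logic lives in Mesos
itself):

```python
def is_self_or_descendant(role, ancestor):
    """True if `role` is `ancestor` itself or lies below it in the
    '/'-separated role hierarchy."""
    return role == ancestor or role.startswith(ancestor + "/")

def roles_seeing_reservation(reserved_role, roles):
    """Roles that would see resources reserved to `reserved_role`."""
    return [r for r in roles if is_self_or_descendant(r, reserved_role)]

roles = ["gpu/analytics", "gpu/forecasting/training",
         "eng/backend/image-processing", "gpus"]
print(roles_seeing_reservation("gpu", roles))
# -> ['gpu/analytics', 'gpu/forecasting/training']
# Note "gpus" is excluded: matching is on whole path components.
```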

As for the addition of the filter flag, note that it would be a temporary
measure that would be removed when the deprecation cycle of the capability
is complete. It would be good to consider the generalized filtering idea
you brought up independently of that.
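
For reference, the decision that flag would control can be sketched like
this (Python pseudocode of mine, not the master's actual source):

```python
def should_offer(has_gpus, framework_caps, filter_gpu_resources):
    """Sketch of the proposed master behavior: with the flag set,
    offers containing GPUs go only to frameworks that registered
    with the GPU_RESOURCES capability; with it unset, GPU offers
    go to every framework. Non-GPU offers are never filtered."""
    if filter_gpu_resources and has_gpus:
        return "GPU_RESOURCES" in framework_caps
    return True

# Today's behavior (flag=true): a legacy framework misses GPU offers.
print(should_offer(True, set(), True))   # -> False
# With the flag off: everyone sees GPU offers.
print(should_offer(True, set(), False))  # -> True
```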

On Mon, May 22, 2017 at 9:15 AM, Zhitao Li <> wrote:

> Hi Kevin,
> Thanks for engaging with the community on this. My 2 cents:
> 1. I feel that this capability has a particularly useful semantic which is
> lacking in the current reservation system: reserving some scarce resource
> for a *dynamic list of multiple roles*:
> Right now, any reservation (static or dynamic) can only express the
> semantic of "reserving this resource for the given role R". However, in a
> complex cluster, it is possible that we have [R1, R2, ..., RN] which want
> to share the scarce resource among them but there is another set of roles
> which should never see the given resource.
> The new hierarchical role (and/or multi-role?) might be able to provide a
> better solution, but until that's widely available and adopted, the
> capabilities-based hack is the only thing I know that can solve the
> problem.
> In fact, if we are going to go with the `--filter-gpu-resources` path, I
> think we should make the filter more powerful (i.e., able to handle all
> known framework <-> resource/host constraints and more types of scarce
> resources) instead of piecewise patches on a specific use case.
> Happy to chat more on this topic.
> On Sat, May 20, 2017 at 6:45 PM, Kevin Klues <> wrote:
> > Hello GPU users,
> >
> > We are currently considering deprecating the requirement that frameworks
> > register with the GPU_RESOURCES capability in order to receive offers that
> > contain GPUs. Going forward, we will recommend that users rely on Mesos's
> > builtin `reservation` mechanism to achieve similar results.
> >
> > Before deprecating it, we wanted to get a sense from the community if
> > anyone is currently relying on this capability and would like to see it
> > persist. If not, we will begin deprecating it in the next Mesos release
> > and completely remove it in Mesos 2.0.
> >
> > As background, the original motivation for this capability was to keep
> > “legacy” frameworks from inadvertently scheduling jobs that don’t require
> > GPUs on GPU capable machines and thus starving out other frameworks that
> > legitimately want to place GPU jobs on those machines. The assumption
> > here was that most machines in a cluster won't have GPUs installed on
> > them, so
> > some mechanism was necessary to keep legacy frameworks from scheduling
> > jobs on those machines. In essence, it provided an implicit reservation of GPU
> > machines for "GPU aware" frameworks, bypassing the traditional
> > `reservation` mechanism already built into Mesos.
> >
> > In such a setup, legacy frameworks would be free to schedule jobs on
> > non-GPU machines, and "GPU aware" frameworks would be free to schedule
> > jobs on GPU machines and other types of jobs on other machines (or mix and
> > match them however they please).
> >
> > However, the problem comes when *all* machines in a cluster contain GPUs
> > (or even if most of the machines in a cluster contain them). When this is
> > the case, we have the opposite problem we were trying to solve by
> > introducing the GPU_RESOURCES capability in the first place. We end up
> > starving out jobs from legacy frameworks that *don’t* require GPU resources
> > because there are not enough machines available that don’t have GPUs on
> > them to service those jobs. We've actually seen this problem manifest in
> > the wild at least once.
> >
> > An alternative to completely deprecating the GPU_RESOURCES capability
> > would be to
> > add a new flag to the mesos master called `--filter-gpu-resources`. When
> > set to `true`, this flag will cause the mesos master to continue to
> > function as it does today. That is, it would filter offers containing GPU
> > resources and only send them to frameworks that opt into the
> > framework capability. When set to `false`, this flag would cause the
> > master to *not* filter offers containing GPU resources, and
> > indiscriminately send them to all frameworks whether they set the
> > GPU_RESOURCES capability or not.
> >
> > We'd prefer to deprecate the capability completely, but would consider
> > adding this flag if people are currently relying on the GPU_RESOURCES
> > capability and would like to see it persist; this flag would allow them
> > to keep relying on it without disruption.
> >
> > We welcome any feedback you have.
> >
> > Kevin + Ben
> >
> --
> Cheers,
> Zhitao Li
