I filed https://issues.apache.org/jira/browse/MESOS-7574 for reservations to multiple roles. We'll file one that captures the deprecation of the GPU_RESOURCES capability as well, with reservations to multiple roles as a blocker.
On Fri, May 26, 2017 at 8:54 AM, Zhitao Li <zhitaoli...@gmail.com> wrote:

> Hi Benjamin,
>
> Thanks for getting back. Do you have an issue already filed for the
> "reservations to multiple roles" story, or is it folded under another
> JIRA story?
>
> On Fri, May 26, 2017 at 12:44 AM, Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> > Thanks for the feedback!
> >
> > There have been some discussions about allowing reservations to
> > multiple roles (or, more generally, role expressions), which is
> > essentially what you've suggested, Zhitao. (Note, however, that what
> > the GPU capability filtering provides is not quite this; it is
> > actually analogous to a reservation for multiple schedulers, not
> > roles.) Reservations to multiple roles seem to be the right
> > replacement for those who rely on the GPU filtering behavior.
> >
> > Since we don't have reservations to multiple roles at this point, we
> > shouldn't deprecate the GPU_RESOURCES capability until this is in
> > place.
> >
> > With hierarchical roles, it is possible (although potentially clumsy)
> > to achieve roughly what the GPU filtering provides by using
> > sub-roles, since reservations made to a "gpu" role would be available
> > to all of the descendant roles within the tree, e.g. "gpu/analytics",
> > "gpu/forecasting/training", etc. This is equivalent to a restricted
> > version of reservations to multiple roles, where the roles are
> > restricted to the descendant roles. This can get clumsy because if
> > "eng/backend/image-processing" wants to get in on the reserved GPUs,
> > the user would have to place a related role underneath the "gpu"
> > role, e.g. "gpu/eng/backend/image-processing".
>
> The exact reason you mentioned for the "clumsy" part would effectively
> prevent me from implementing this in our org even if it were already
> available.
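(For illustration, the sub-role visibility discussed above can be sketched as a small predicate. This is a hypothetical helper, not Mesos code, and it assumes roles are `/`-separated paths as in the examples in the thread:)

```python
def reservation_visible_to(reservation_role: str, framework_role: str) -> bool:
    """Illustrative only: a reservation made to a role is available to that
    role and to all of its descendant roles in the hierarchy."""
    return (framework_role == reservation_role
            or framework_role.startswith(reservation_role + "/"))

# A reservation to "gpu" is visible to its descendant roles...
assert reservation_visible_to("gpu", "gpu/analytics")
assert reservation_visible_to("gpu", "gpu/forecasting/training")
# ...but not to roles outside the "gpu" subtree, which is the clumsy part:
# "eng/backend/image-processing" would have to be mirrored under "gpu".
assert not reservation_visible_to("gpu", "eng/backend/image-processing")
```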
> > For the addition of the filter, note that this flag would be a
> > temporary measure that would be removed when the deprecation cycle of
> > the capability is complete. It would be good to independently consider
> > the generalized filtering idea you brought up.
> >
> > On Mon, May 22, 2017 at 9:15 AM, Zhitao Li <zhitaoli...@gmail.com>
> > wrote:
> >
> > > Hi Kevin,
> > >
> > > Thanks for engaging with the community on this. My 2 cents:
> > >
> > > 1. I feel that this capability has a particularly useful semantic
> > > which is lacking in the current reservation system: reserving some
> > > scarce resource for a *dynamic list of multiple roles*:
> > >
> > > Right now, any reservation (static or dynamic) can only express the
> > > semantic of "reserving this resource for the given role R". However,
> > > in a complex cluster, it is possible that we have [R1, R2, ..., RN]
> > > which want to share the scarce resource among themselves, while
> > > there is another set of roles which should never see the given
> > > resource.
> > >
> > > The new hierarchical roles (and/or multi-role?) might be able to
> > > provide a better solution, but until that is widely available and
> > > adopted, the capabilities-based hack is the only thing I know of
> > > that can solve the problem.
> > >
> > > In fact, if we are going to go with the `--filter-gpu-resources`
> > > path, I think we should make the filter more powerful (i.e., able to
> > > handle all known framework <-> resource/host constraints and more
> > > types of scarce resources) instead of applying piecewise patches to
> > > a specific use case.
> > >
> > > Happy to chat more on this topic.
> > >
> > > On Sat, May 20, 2017 at 6:45 PM, Kevin Klues <klue...@gmail.com>
> > > wrote:
> > >
> > > > Hello GPU users,
> > > >
> > > > We are currently considering deprecating the requirement that
> > > > frameworks register with the GPU_RESOURCES capability in order to
> > > > receive offers that contain GPUs.
> > > > Going forward, we will recommend that users rely on Mesos's
> > > > builtin `reservation` mechanism to achieve similar results.
> > > >
> > > > Before deprecating it, we wanted to get a sense from the community
> > > > of whether anyone is currently relying on this capability and
> > > > would like to see it persist. If not, we will begin deprecating it
> > > > in the next Mesos release and completely remove it in Mesos 2.0.
> > > >
> > > > As background, the original motivation for this capability was to
> > > > keep "legacy" frameworks from inadvertently scheduling jobs that
> > > > don't require GPUs on GPU-capable machines and thus starving out
> > > > other frameworks that legitimately want to place GPU jobs on those
> > > > machines. The assumption here was that most machines in a cluster
> > > > won't have GPUs installed on them, so some mechanism was necessary
> > > > to keep legacy frameworks from scheduling jobs on those machines.
> > > > In essence, it provided an implicit reservation of GPU machines
> > > > for "GPU-aware" frameworks, bypassing the traditional
> > > > `reservation` mechanism already built into Mesos.
> > > >
> > > > In such a setup, legacy frameworks would be free to schedule jobs
> > > > on non-GPU machines, and "GPU-aware" frameworks would be free to
> > > > schedule GPU jobs on GPU machines and other types of jobs on other
> > > > machines (or mix and match them however they please).
> > > >
> > > > However, the problem comes when *all* machines in a cluster
> > > > contain GPUs (or even when most of the machines in a cluster
> > > > contain them). When this is the case, we face the opposite of the
> > > > problem we were trying to solve by introducing the GPU_RESOURCES
> > > > capability in the first place.
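(For readers unfamiliar with the builtin `reservation` mechanism recommended above, the following sketches what a dynamic reservation of GPUs for a single role might look like. The role name, principal, and GPU count are placeholders, and the exact payload shape should be checked against the Mesos operator HTTP API documentation for your version:)

```python
import json

# Illustrative only: a `resources` payload for the Mesos master's /reserve
# operator endpoint, dynamically reserving two GPUs for the "gpu" role.
# "my-principal" and the role name are placeholders, not prescribed values.
reservation = [{
    "name": "gpus",
    "type": "SCALAR",
    "scalar": {"value": 2},
    "role": "gpu",
    "reservation": {"principal": "my-principal"},
}]

# This JSON string would be sent as the `resources` form field of an HTTP
# POST to the master's /reserve endpoint, along with the target agent's ID.
payload = json.dumps(reservation)
print(payload)
```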
> > > > We end up starving out jobs from legacy frameworks that *don't*
> > > > require GPU resources, because there are not enough machines
> > > > available that don't have GPUs on them to service those jobs.
> > > > We've actually seen this problem manifest in the wild at least
> > > > once.
> > > >
> > > > An alternative to completely deprecating the GPU_RESOURCES flag
> > > > would be to add a new flag to the Mesos master called
> > > > `--filter-gpu-resources`. When set to `true`, this flag would
> > > > cause the Mesos master to continue to function as it does today.
> > > > That is, it would filter offers containing GPU resources and only
> > > > send them to frameworks that opt into the GPU_RESOURCES framework
> > > > capability. When set to `false`, this flag would cause the master
> > > > to *not* filter offers containing GPU resources, and
> > > > indiscriminately send them to all frameworks whether they set the
> > > > GPU_RESOURCES capability or not.
> > > >
> > > > We'd prefer to deprecate the capability completely, but would
> > > > consider adding this flag if people are currently relying on the
> > > > GPU_RESOURCES capability and would like to see it persist; this
> > > > flag would allow them to keep relying on it without disruption.
> > > >
> > > > We welcome any feedback you have.
> > > >
> > > > Kevin + Ben
> > >
> > > --
> > > Cheers,
> > >
> > > Zhitao Li
>
> --
> Cheers,
>
> Zhitao Li
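(The semantics of the proposed `--filter-gpu-resources` flag can be summarized as a small decision function. This is a sketch of the behavior described in the thread, not actual allocator code from Mesos:)

```python
def should_offer_to_framework(filter_gpu_resources: bool,
                              offer_has_gpus: bool,
                              framework_has_gpu_capability: bool) -> bool:
    """Sketch of the proposed flag's semantics (not Mesos source code).

    With --filter-gpu-resources=true the master keeps today's behavior:
    offers containing GPUs go only to frameworks that opted into the
    GPU_RESOURCES capability. With =false, GPU offers go to all frameworks."""
    if filter_gpu_resources and offer_has_gpus:
        return framework_has_gpu_capability
    return True

# Today's (filtering) behavior: legacy frameworks never see GPU offers.
assert not should_offer_to_framework(True, True, False)
assert should_offer_to_framework(True, True, True)
# With the flag disabled, GPU offers flow to legacy frameworks too.
assert should_offer_to_framework(False, True, False)
# Non-GPU offers are unaffected either way.
assert should_offer_to_framework(True, False, False)
```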