On Wed, 5 Apr 2023 at 11:11, Tvrtko Ursulin <tvrtko.ursu...@linux.intel.com> wrote:
>
> On 05/04/2023 09:28, Daniel Vetter wrote:
> > On Tue, 4 Apr 2023 at 12:45, Tvrtko Ursulin <tvrtko.ursu...@linux.intel.com> wrote:
> >>
> >> Hi,
> >>
> >> On 03/04/2023 20:40, Joshua Ashton wrote:
> >>> Hello all!
> >>>
> >>> I would like to propose a new API for allowing processes to control the priority of GPU queues, similar to RLIMIT_NICE/RLIMIT_RTPRIO.
> >>>
> >>> The main reason for this is for compositors such as Gamescope and SteamVR vrcompositor to be able to create realtime async compute queues on AMD without the need for CAP_SYS_NICE.
> >>>
> >>> The current situation is bad for a few reasons, one being that in order to setcap the executable, typically one must run as root, which is a pretty high privilege escalation in order to achieve one small feat: a realtime async compute queue for VR or a compositor. The executable cannot be setcap'ed inside a container, nor can the setcap'ed executable be run in a container with NO_NEW_PRIVS.
> >>>
> >>> I go into more detail in the description in `uapi: Add RLIMIT_GPUPRIO`.
> >>>
> >>> My initial proposal here is to add a new RLIMIT, `RLIMIT_GPUPRIO`, which seems to me to make the most initial sense as a way to solve the problem.
> >>>
> >>> I am definitely not set on this being the best formulation, however, or on whether it should be linked to DRM (in terms of its scheduler priority enum/definitions) in any way, and would really like other people's opinions across the stack on this.
> >>>
> >>> One initial concern is that this RLIMIT could potentially outlive the lifespan of DRM. It sounds crazy saying it right now, but it definitely popped into my mind when touching `resource.h`. :-)
> >>>
> >>> Anyway, please let me know what you think! Definitely open to any feedback and advice you may have. :D
> >>
> >> Interesting! I have tried to solve a similar problem twice in the past already.
> >>
> >> The first time I proposed tying nice to DRM scheduling priority [1] - if the latter has been left at default - drawing an analogy with the nice+ionice handling. That was rejected and I was nudged towards the cgroups route.
> >>
> >> So with the second attempt I implemented a hierarchical opaque drm.priority cgroup controller [2]. I think it would allow you to solve your use case too, by placing your compositor in a cgroup with an elevated priority level.
> >>
> >> Implementation-wise, in my proposal it was left to individual drivers to "meld" the opaque cgroup drm.priority with the driver-specific priority concept.
> >>
> >> That too wasn't very popular, with the feedback (AFAIR) that priority is too subsystem-specific a concept.
> >>
> >> Finally I was left with a weight-based drm cgroup controller, exactly following the controls of the CPU and IO ones, but with much looser runtime guarantees. [3]
> >>
> >> I don't think this last one works for your use case, at least not in the current state of drm scheduling capability, where the implementation is a "bit" too reactive for realtime.
> >>
> >> Depending on how the discussion around your rlimit proposal goes, perhaps one alternative could be to go the cgroup route and add an attribute like drm.realtime. That perhaps sounds abstract and generic enough to be passable.
> >> Built as a simplification of [2], it wouldn't be too complicated.
> >>
> >> On the actual proposal of RLIMIT_GPUPRIO...
> >>
> >> The name would be problematic since we have generic hw accelerators (not just GPUs) under the DRM subsystem. Perhaps RLIMIT_DRMPRIO would be better, but I think you will need to copy some more mailing lists and people on that one, because I can imagine one or two more fundamental questions this opens up, as you have alluded to in your cover letter as well.
> >
> > So I don't want to get into the bikeshed; I think Tvrtko summarized pretty well that this is a hard problem with lots of attempts (I think some more from amd too). I think what we need here are really two pieces:
> > - A solid summary of all the previous attempts from everyone in this space of trying to manage gpu compute resources (all the various cgroup attempts, sched priority), listing the pros/cons. There's also the fdinfo stuff just for reporting gpu usage, which blew up kinda badly and didn't have much discussion among all the stakeholders.
> > - Everyone on cc who's doing new drivers using drm/sched (which I think is everyone really, or at least everyone using it currently). So that's etnaviv, lima, amd, intel with the new xe, probably the new nouveau driver too, panfrost, asahi. Please cc everyone.
> >
> > Unless we have some actual rough consensus in this space across all stakeholders, I think all we'll achieve is just yet another rfc that goes nowhere. Or maybe something like the minimal fdinfo stuff (minimal I guess to avoid wider discussion), which then blew up because it wasn't thought out well enough.
>
> On the particular point of how fdinfo allegedly blew up - are you referring to client usage stats? If so, this would be the first time I hear about any problems in that space. Which would be "a bit" surprising given it's the thing I drove standardisation of. All I heard were positive comments - both "works for us" from driver implementors and positives from the users.
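
To make the uapi shape under discussion concrete, here is a minimal userspace sketch. RLIMIT_GPUPRIO does not exist in any kernel; the constant, its value, and the meaning of the limit values below are placeholders for illustration only, not taken from the posted series.

/* Hypothetical usage of the proposed limit; RLIMIT_GPUPRIO and its
 * value are made up for illustration. */
#include <stdio.h>
#include <sys/resource.h>

#ifndef RLIMIT_GPUPRIO
#define RLIMIT_GPUPRIO 16	/* placeholder, no value is allocated upstream */
#endif

int main(void)
{
	struct rlimit rl = {
		.rlim_cur = 2,	/* e.g. "may request realtime GPU queues" */
		.rlim_max = 2,
	};

	/* A privileged launcher (or pam_limits) would raise the limit
	 * before the unprivileged compositor starts. */
	if (setrlimit(RLIMIT_GPUPRIO, &rl))
		perror("setrlimit(RLIMIT_GPUPRIO)");

	/* The compositor, or the driver on its behalf, can then check
	 * what priority it is allowed to ask for. */
	if (!getrlimit(RLIMIT_GPUPRIO, &rl))
		printf("gpu prio limit: cur=%llu max=%llu\n",
		       (unsigned long long)rl.rlim_cur,
		       (unsigned long long)rl.rlim_max);
	return 0;
}

On a kernel without such a patch, setrlimit() here simply fails with EINVAL, which is why the error is only reported rather than treated as fatal.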
The drm/sched implementation blew up - not the overall spec or the i915 implementation. See the reverts in -rc5 and drm-misc-next. I think a tad more coordination, and maybe more shared code for drm/sched-using drivers, is probably what we want for this. Or at least a bit more cross-driver collaboration than here, where one side reverts while the other pushes more patches.

-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
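
For readers without the original cover letter handy, the permission problem Joshua describes boils down to a driver-side gate of roughly the following shape. This is a sketch, not any particular driver's code: the helper name is made up, and only the pattern of requiring CAP_SYS_NICE for above-normal drm/sched priority is the point.

/* Sketch of the kind of gate that forces compositors down the
 * setcap/CAP_SYS_NICE path today; example_priority_permit() is a
 * made-up name, not an existing kernel function. */
#include <linux/capability.h>
#include <drm/gpu_scheduler.h>

static int example_priority_permit(enum drm_sched_priority prio)
{
	/* Normal and below is always allowed. */
	if (prio <= DRM_SCHED_PRIORITY_NORMAL)
		return 0;

	/* Anything higher currently needs CAP_SYS_NICE, which an
	 * unprivileged compositor in a container cannot easily get. */
	if (capable(CAP_SYS_NICE))
		return 0;

	return -EACCES;
}

An rlimit, or a cgroup attribute along the lines of the drm.realtime idea above, would give such a check something other than capable() to consult.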