On Wed, 5 Apr 2023 at 11:11, Tvrtko Ursulin
<tvrtko.ursu...@linux.intel.com> wrote:
>
>
> On 05/04/2023 09:28, Daniel Vetter wrote:
> > On Tue, 4 Apr 2023 at 12:45, Tvrtko Ursulin
> > <tvrtko.ursu...@linux.intel.com> wrote:
> >>
> >>
> >> Hi,
> >>
> >> On 03/04/2023 20:40, Joshua Ashton wrote:
> >>> Hello all!
> >>>
> >>> I would like to propose a new API to allow processes to control the
> >>> priority of GPU queues, similar to RLIMIT_NICE/RLIMIT_RTPRIO.
> >>>
> >>> The main reason for this is for compositors such as Gamescope and
> >>> SteamVR vrcompositor to be able to create realtime async compute
> >>> queues on AMD without the need of CAP_SYS_NICE.
> >>>
> >>> The current situation is bad for a few reasons, one being that in order
> >>> to setcap the executable, one typically must run as root, which is a
> >>> pretty large privilege escalation just to achieve one small feat: a
> >>> realtime async compute queue for VR or a compositor. The executable
> >>> also cannot be setcap'ed inside a container, nor can the setcap'ed
> >>> executable be run in a container with NO_NEW_PRIVS.
> >>>
> >>> I go into more detail in the description in
> >>> `uapi: Add RLIMIT_GPUPRIO`.
> >>>
> >>> My initial proposal here is to add a new RLIMIT, `RLIMIT_GPUPRIO`,
> >>> which seems to me like the most sensible initial way to solve the problem.
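> >>>
> >>> Purely as an illustration of the idea (nothing here is from an actual
> >>> patch; RLIMIT_GPUPRIO and the function name are made up for this
> >>> sketch), a driver could consult the new limit where it currently
> >>> requires CAP_SYS_NICE for an elevated queue/context priority:
> >>>
> >>>   #include <linux/capability.h>
> >>>   #include <linux/errno.h>
> >>>   #include <linux/sched/signal.h>   /* rlimit() */
> >>>
> >>>   /* Hypothetical permission check for a requested priority level,
> >>>    * keeping CAP_SYS_NICE as an override.
> >>>    */
> >>>   static int example_gpu_priority_permit(int requested_prio)
> >>>   {
> >>>           /* Low/normal priorities stay unprivileged, as today. */
> >>>           if (requested_prio <= 0)
> >>>                   return 0;
> >>>
> >>>           if (capable(CAP_SYS_NICE))
> >>>                   return 0;
> >>>
> >>>           /* Otherwise the task's rlimit must cover the request. */
> >>>           if (requested_prio <= rlimit(RLIMIT_GPUPRIO))
> >>>                   return 0;
> >>>
> >>>           return -EACCES;
> >>>   }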
> >>>
> >>> I am definitely not set on this being the best formulation, however,
> >>> nor on whether this should be linked to DRM (in terms of its scheduler
> >>> priority enum/definitions) in any way, and I would really like other
> >>> people's opinions across the stack on this.
> >>>
> >>> One initial concern is that this RLIMIT could potentially outlive
> >>> DRM itself. It sounds crazy saying it right now, but it is something
> >>> that definitely popped into my mind when touching `resource.h`. :-)
> >>>
> >>> Anyway, please let me know what you think!
> >>> Definitely open to any feedback and advice you may have. :D
> >>
> >> Interesting! I have tried to solve a similar problem twice in the past
> >> already.
> >>
> >> The first time I proposed tying nice to DRM scheduling priority [1] - if
> >> the latter had been left at default - drawing an analogy with the
> >> nice+ionice handling. That was rejected and I was nudged towards the
> >> cgroups route.
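> >>
> >> (A minimal sketch of that idea, not the code from [1] - the thresholds
> >> and function name are invented here, and the enum names are the
> >> drm/sched ones at the time of writing, modulo ongoing renames:)
> >>
> >>   #include <linux/sched.h>          /* task_nice() */
> >>   #include <drm/gpu_scheduler.h>    /* enum drm_sched_priority */
> >>
> >>   /* Derive a default drm_sched_priority from task nice when userspace
> >>    * has not set an explicit context priority.
> >>    */
> >>   static enum drm_sched_priority example_default_prio(struct task_struct *tsk)
> >>   {
> >>           int nice = task_nice(tsk);
> >>
> >>           if (nice < 0)
> >>                   return DRM_SCHED_PRIORITY_HIGH;
> >>           if (nice > 0)
> >>                   return DRM_SCHED_PRIORITY_MIN;
> >>
> >>           return DRM_SCHED_PRIORITY_NORMAL;
> >>   }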
> >>
> >> So for the second attempt I implemented a hierarchical opaque
> >> drm.priority cgroup controller [2]. I think it would allow you to solve
> >> your use case too, by placing your compositor in a cgroup with an elevated
> >> priority level.
> >>
> >> Implementation-wise, my proposal left it to individual drivers to
> >> "meld" the opaque cgroup drm.priority with the driver-specific priority
> >> concept.
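> >>
> >> (Illustration only, with made-up names - drm_cgroup_priority() is a
> >> hypothetical helper that would return the effective cgroup level:)
> >>
> >>   #include <linux/minmax.h>
> >>   #include <drm/drm_file.h>
> >>
> >>   /* Driver-side "melding": combine the opaque cgroup level with the
> >>    * driver's native priority request, e.g. by clamping.
> >>    */
> >>   static int example_effective_priority(struct drm_file *file_priv,
> >>                                         int requested_prio)
> >>   {
> >>           int cgroup_level = drm_cgroup_priority(file_priv); /* hypothetical */
> >>
> >>           return min(requested_prio, cgroup_level);
> >>   }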
> >>
> >> That too wasn't very popular, the feedback (AFAIR) being that priority
> >> is too subsystem-specific a concept.
> >>
> >> Finally I was left with a weight based drm cgroup controller, exactly 
> >> following the controls of the CPU and IO ones, but with much looser 
> >> runtime guarantees. [3]
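> >>
> >> (Roughly in the mould of the existing cpu.weight / io.weight interface;
> >> the file name, layout and values below are just an illustration of the
> >> idea in [3], not its exact uapi:)
> >>
> >>   /sys/fs/cgroup/session/compositor/drm.weight   1000
> >>   /sys/fs/cgroup/session/clients/drm.weight       100
> >>
> >> i.e. a purely relative weight (cpu.weight uses the range 1..10000 with a
> >> default of 100), with no absolute runtime guarantees attached.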
> >>
> >> I don't think this last one works for your use case, at least not in the
> >> current state of drm scheduling capability, where the implementation is a
> >> "bit" too reactive for realtime.
> >>
> >> Depending on how the discussion around your rlimit proposal goes, perhaps 
> >> one alternative could be to go the cgroup route and add an attribute like 
> >> drm.realtime. That perhaps sounds abstract and generic enough to be 
> >> passable. Built as a simplification of [2] it wouldn't be too complicated.
> >>
> >> On the actual proposal of RLIMIT_GPUPRIO...
> >>
> >> The name would be problematic since we have generic hw accelerators (not
> >> just GPUs) under the DRM subsystem. Perhaps RLIMIT_DRMPRIO would be better,
> >> but I think you will need to copy some more mailing lists and people on
> >> that one, because I can imagine one or two more fundamental questions this
> >> opens up, as you have alluded to in your cover letter as well.
> >
> > So I don't want to get into the bikeshed; I think Tvrtko summarized
> > pretty well that this is a hard problem with lots of past attempts (I
> > think some more from amd too). I think what we need here are really two
> > pieces:
> > - A solid summary of all the previous attempts from everyone in this
> > space of trying to manage gpu compute resources (all the various
> > cgroup attempts, sched priority), listing the pros/cons. There's
> > also the fdinfo stuff just for reporting gpu usage, which blew up kinda
> > badly and didn't have much discussion among all the stakeholders.
> > - Everyone on cc who's doing new drivers using drm/sched, or who is
> > using it currently (which I think is everyone really). So that's
> > etnaviv, lima, amd, intel with the new xe, probably the new nouveau
> > driver too, panfrost, asahi. Please cc everyone.
> >
> > Unless we do have some actual rough consensus in this space across all
> > stakeholders, I think all we'll achieve is just yet another rfc that
> > goes nowhere. Or maybe something like the minimal fdinfo stuff
> > (minimal, I guess, to avoid wider discussion), which then blew up because
> > it wasn't thought out well enough.
>
> On the particular point of how fdinfo allegedly blew up - are you referring
> to client usage stats? If so, this would be the first time I have heard about
> any problems in that space, which would be "a bit" surprising given it's
> the thing I drove the standardisation of. All I have heard were positive
> comments, both "works for us" from driver implementors and positives
> from the users.
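>
> (For reference, the standardised per-client keys look roughly like the
> following in an fd's fdinfo - see Documentation/gpu/drm-usage-stats.rst;
> the values here are made up:)
>
>   drm-driver:             i915
>   drm-client-id:          7
>   drm-engine-render:      25662044495 ns
>   drm-engine-copy:        0 ns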

The drm/sched implementation blew up, not the overall spec or the i915
implementation. See the reverts in -rc5 and drm-misc-next.

I think a tad more coordination, and maybe more shared code for
drm/sched-using drivers, is probably what we want for this. Or at least a
bit more cross-driver collaboration than here, where one side reverts
while the other pushes more patches.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
