On Fri, 2025-05-16 at 13:10 +0100, Tvrtko Ursulin wrote:
> 
> On 16/05/2025 12:53, Tvrtko Ursulin wrote:
> > 
> > On 16/05/2025 08:28, Philipp Stanner wrote:
> > > On Thu, 2025-05-15 at 17:17 +0100, Tvrtko Ursulin wrote:
> > > > 
> > > > On 15/05/2025 16:00, Christian König wrote:
> > > > > Sometimes drivers need to be able to submit multiple jobs which depend on
> > > > > each other to different schedulers at the same time, but using
> > > > > drm_sched_job_add_dependency() can't fail any more after the first job is
> > > > > initialized.
> > > > > 
> > > > > This function preallocate memory for dependency slots so that no ENOMEM
> > > > > can come later while adding dependencies.
> > > > > 
> > > > > v2: rework implementation an documentation
> > > > > v3: rework from scratch, use separate function to add preallocated deps
> > > I think we agreed to not put change logs into commit messages anymore :)
> > > 
> > > They aren't useful for any reader. Who needs the changelog afterwards
> > > can retreive it through the mail thread link that we add.
> > > 
> > > 
> > > > > Signed-off-by: Christian König <christian.koe...@amd.com>
> > > > > ---
> > > > >   drivers/gpu/drm/scheduler/sched_main.c | 45 ++++++++++++++++++++++++++
> > > > >   include/drm/gpu_scheduler.h            |  4 +++
> > > > >   2 files changed, 49 insertions(+)
> > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > index f7118497e47a..b95e7089aa70 100644
> > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > @@ -858,6 +858,51 @@ void drm_sched_job_arm(struct drm_sched_job *job)
> > > > >  }
> > > > >  EXPORT_SYMBOL(drm_sched_job_arm);
> > > > > +/**
> > > > > + * drm_sched_job_prealloc_dependency_slot - avoid ENOMEM on adding dependencies
> > > > > + * @job: scheduler job where dependencies will be added
> > > > > + * @id: id for the allocated slot
> > > > > + *
> > > > > + * Sometimes drivers need to be able to submit multiple jobs which depend on
> > > > > + * each other to different schedulers at the same time, but using
> > > > > + * drm_sched_job_add_dependency() can't fail any more after the first job is
> > > > > + * initialized.
> > > > > + *
> > > > > + * This function preallocate memory for a dependency slot so that no ENOMEM can
> > > > > + * come later while adding dependencies. The index of the preallocated slot is
> > > > > + * returned in @id.
> > > > > + *
> > > > > + * Return:
> > > > > + * 0 on success, or an error on failing to expand the array.
> > > > > + */
> > > > > +int drm_sched_job_prealloc_dependency_slot(struct drm_sched_job *job,
> > > > > +                                           u32 *id)
> > > > > +{
> > > > > +        return xa_alloc(&job->dependencies, id, NULL, xa_limit_32b, GFP_KERNEL);
> > > > > +}
> > > > > +EXPORT_SYMBOL(drm_sched_job_prealloc_dependency_slot);
> > > > > +
> > > > > +/**
> > > > > + * drm_sched_job_add_prealloc_dep - add dependency to preallocated slot
> > > > > + * @job: scheduler job where dependencies will be added
> > > > > + * @id: the preallocated slot index
> > > > > + * @fence: the dependency to add
> > > > > + *
> > > > > + * Consumes @fence and adds it to the preallocated slot dependency.
> > > > > + */
> > > > > +void drm_sched_job_add_prealloc_dep(struct drm_sched_job *job, u32 id,
> > > > > +                                    struct dma_fence *fence)
> > > > > +{
> > > > > +        fence = xa_store(&job->dependencies, id, fence, GFP_ATOMIC);
> > > > 
> > > > Add assert that the passed id exists (was preallocated) and is NULL?
> > > 
> > > You
> > 
> > Hm?
> > 
> > > > 
> > > > Also, if someone preallocates and does not consume the slot will that
> > > > confuse the iteration in drm_sched_job_dependency()?
> > > 
> > > drm_sched_job_add_dependency() you mean.
> > 
> > I was actually thinking of drm_sched_job_dependency() because that
> > looked it would skip dependencies upon encountering an unconsumed
> > preallocated slot, but yes, drm_sched_job_add_dependency() could explode
> > even earlier if adding a normal dependency after preallocating a slot.
> > 
> > > Yes, it would. All operations simply give you NULL for those slots. So
> > > seems to me you have to check for NULL wherever a preallocated slot
> > > might drop out. That would then be a bug.
> > > 
> > > It's kind of tricky, all that. It's a pity that Wilcox didn't answer
> > > our questions about the idiomatic way to do it.
> > > 
> > > Maybe reserving slots with already signaled fences wasn't such a bad
> > > idea after all?
> > > 
> > > If we go for the NULL approach, it's probably the only sane way to then
> > > check for NULL wherever dependencies are accessed :(
> > > 
> > > Opinions?
> > 
> > Well if the xarray API returns the NULL consistently the approach from
> > this patch is fine I think.
> > 
> > We just need to add two more checks to the above mentioned functions,
> 
> I need to correct myself, drm_sched_job_dependency() wouldn't be able to
> just skip NULLs since it relies on NULL for "no more dependencies". We
> would need to track something like job->max_dependency and terminate on
> job->last_dependency > job->max_dependency or so.
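
(To spell the problem out, the lookup in question is roughly the
following; simplified and from memory, not copied verbatim from the
tree:

        f = xa_load(&job->dependencies, job->last_dependency);
        if (f) {
                job->last_dependency++;
                return dma_fence_get(f);
        }
        /* a NULL entry is currently read as "no more dependencies" */
        return NULL;

so a still unconsumed preallocated slot, i.e. a NULL entry, would end
the walk early and silently drop every dependency stored after it.)
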
Agreed, that would have to be fixed.

I believe we should reconsider Christian's first idea [1]. Thinking
about it some more:

 * With the NULL version, the xarray, which so far only ever contained
   valid dependencies, can suddenly contain NULL entries.
 * If we could create our own tag, entries could be returned that are
   neither NULL nor valid fences, which would also require checks
   'everywhere'.
 * Only the "signaled fence as prealloc reservation" approach is fully
   backwards compatible and will never cause anyone to block after
   later reworks.

So maybe it's actually the best idea? Sorry for the zig-zag. No hard
requirements intended from my side, I'm willing to go with whatever you
guys think. Just saying, at least by now I think that the already
signaled fence is the most elegant solution. And since there is a
function for that (dma_fence_get_stub()), it seems to be in alignment
with the official dma_fence rules.

Philipp

[1] https://lore.kernel.org/all/20250318120313.19099-2-christian.koe...@amd.com/

> 
> Regards,
> 
> Tvrtko
> 
> > some more unit tests probably to make sure, and that should be fine
> > for now.
> > 
> > On the bikeshedding front I would perhaps suggest:
> > 
> >  - drm_sched_job_preallocate_dependency()
> >  - drm_sched_job_replace_dependency()
> > 
> > Reads a little bit more aligned with the rest of the API and a bit
> > easier on the eyes, to my eyes at least.
> > 
> > Regards,
> > 
> > Tvrtko
> > 
> 
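
P.S. To make the "signaled fence as prealloc reservation" idea a bit
more concrete, here is a rough sketch of how the two helpers from this
patch could look with it. Untested and only meant to illustrate the
direction, not necessarily what [1] did:

int drm_sched_job_prealloc_dependency_slot(struct drm_sched_job *job,
                                           u32 *id)
{
        /*
         * Reserve the slot with a reference to the (already signaled)
         * stub fence instead of NULL, so every entry in
         * job->dependencies stays a valid fence at all times.
         */
        return xa_alloc(&job->dependencies, id, dma_fence_get_stub(),
                        xa_limit_32b, GFP_KERNEL);
}

void drm_sched_job_add_prealloc_dep(struct drm_sched_job *job, u32 id,
                                    struct dma_fence *fence)
{
        struct dma_fence *old;

        /*
         * Replace the stub with the real dependency and drop the
         * stub's reference; nobody ever sees a NULL entry.
         */
        old = xa_store(&job->dependencies, id, fence, GFP_ATOMIC);
        WARN_ON(!old);
        dma_fence_put(old);
}

An unconsumed slot would then just hold an already signaled fence, so
drm_sched_job_dependency() and drm_sched_job_add_dependency() should
keep working unchanged.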