On Fri, 2025-05-16 at 15:30 +0100, Tvrtko Ursulin wrote: > > On 16/05/2025 14:38, Philipp Stanner wrote: > > On Fri, 2025-05-16 at 13:10 +0100, Tvrtko Ursulin wrote: > > > > > > On 16/05/2025 12:53, Tvrtko Ursulin wrote: > > > > > > > > On 16/05/2025 08:28, Philipp Stanner wrote: > > > > > On Thu, 2025-05-15 at 17:17 +0100, Tvrtko Ursulin wrote: > > > > > > > > > > > > On 15/05/2025 16:00, Christian König wrote: > > > > > > > Sometimes drivers need to be able to submit multiple jobs > > > > > > > which > > > > > > > depend on > > > > > > > each other to different schedulers at the same time, but > > > > > > > using > > > > > > > drm_sched_job_add_dependency() can't fail any more after > > > > > > > the > > > > > > > first > > > > > > > job is > > > > > > > initialized. > > > > > > > > > > > > > > This function preallocate memory for dependency slots so > > > > > > > that > > > > > > > no > > > > > > > ENOMEM > > > > > > > can come later while adding dependencies. > > > > > > > > > > > > > > v2: rework implementation an documentation > > > > > > > v3: rework from scratch, use separate function to add > > > > > > > preallocated > > > > > > > deps > > > > > > > > > > I think we agreed to not put change logs into commit messages > > > > > anymore > > > > > :) > > > > > > > > > > They aren't useful for any reader. Who needs the changelog > > > > > afterwards > > > > > can retreive it through the mail thread link that we add. > > > > > > > > > > > > > > > > > > > Signed-off-by: Christian König <christian.koe...@amd.com> > > > > > > > --- > > > > > > > drivers/gpu/drm/scheduler/sched_main.c | 45 > > > > > > > ++++++++++++++++++++++++++ > > > > > > > include/drm/gpu_scheduler.h | 4 +++ > > > > > > > 2 files changed, 49 insertions(+) > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > index f7118497e47a..b95e7089aa70 100644 > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > @@ -858,6 +858,51 @@ void drm_sched_job_arm(struct > > > > > > > drm_sched_job > > > > > > > *job) > > > > > > > } > > > > > > > EXPORT_SYMBOL(drm_sched_job_arm); > > > > > > > +/** > > > > > > > + * drm_sched_job_prealloc_dependency_slot - avoid ENOMEM > > > > > > > on > > > > > > > adding > > > > > > > dependencies > > > > > > > + * @job: scheduler job where dependencies will be added > > > > > > > + * @id: id for the allocated slot > > > > > > > + * > > > > > > > + * Sometimes drivers need to be able to submit multiple > > > > > > > jobs > > > > > > > which > > > > > > > depend on > > > > > > > + * each other to different schedulers at the same time, > > > > > > > but > > > > > > > using > > > > > > > + * drm_sched_job_add_dependency() can't fail any more > > > > > > > after > > > > > > > the > > > > > > > first job is > > > > > > > + * initialized. > > > > > > > + * > > > > > > > + * This function preallocate memory for a dependency > > > > > > > slot so > > > > > > > that > > > > > > > no ENOMEM can > > > > > > > + * come later while adding dependencies. The index of > > > > > > > the > > > > > > > preallocated slot is > > > > > > > + * returned in @id. > > > > > > > + * > > > > > > > + * Return: > > > > > > > + * 0 on success, or an error on failing to expand the > > > > > > > array. > > > > > > > + */ > > > > > > > +int drm_sched_job_prealloc_dependency_slot(struct > > > > > > > drm_sched_job > > > > > > > *job, > > > > > > > + u32 *id) > > > > > > > +{ > > > > > > > + return xa_alloc(&job->dependencies, id, NULL, > > > > > > > xa_limit_32b, GFP_KERNEL); > > > > > > > +} > > > > > > > +EXPORT_SYMBOL(drm_sched_job_prealloc_dependency_slot); > > > > > > > + > > > > > > > +/** > > > > > > > + * drm_sched_job_add_prealloc_dep - add dependency to > > > > > > > preallocated > > > > > > > slot > > > > > > > + * @job: scheduler job where dependencies will be added > > > > > > > + * @id: the preallocated slot index > > > > > > > + * @fence: the dependency to add > > > > > > > + * > > > > > > > + * Consumes @fence and adds it to the preallocated slot > > > > > > > dependency. > > > > > > > + */ > > > > > > > +void drm_sched_job_add_prealloc_dep(struct drm_sched_job > > > > > > > *job, u32 > > > > > > > id, > > > > > > > + struct dma_fence *fence) > > > > > > > +{ > > > > > > > + fence = xa_store(&job->dependencies, id, fence, > > > > > > > GFP_ATOMIC); > > > > > > > > > > > > Add assert that the passed id exists (was preallocated) and > > > > > > is > > > > > > NULL? > > > > > > > > > > You > > > > > > > > Hm? > > > > > > > > > > > > > > > > Also, if someone preallocates and does not consume the slot > > > > > > will that > > > > > > confuse the iteration in drm_sched_job_dependency()? > > > > > > > > > > drm_sched_job_add_dependency() you mean. > > > > > > > > I was actually thinking of drm_sched_job_dependency() because > > > > that > > > > looked it would skip dependencies upon encountering an > > > > unconsumed > > > > preallocated slot, but yes, drm_sched_job_add_dependency() > > > > could > > > > explode > > > > even earlier if adding a normal dependency after preallocating > > > > a > > > > slot. > > > > > > > > > Yes, it would. All operations simply give you NULL for those > > > > > slots. So > > > > > seems to me you have to check for NULL wherever a > > > > > preallocated > > > > > slot > > > > > might drop out. That would then be a bug. > > > > > > > > > > It's kind of tricky, all that. It's a pity that Wilcox didn't > > > > > answer > > > > > our questions about the idiomatic way to do it. > > > > > > > > > > Maybe reserving slots with already signaled fences wasn't > > > > > such a > > > > > bad > > > > > idea after all? > > > > > > > > > > If we go for the NULL approach, it's probably the only sane > > > > > way > > > > > to then > > > > > check for NULL wherever dependencies are accessed :( > > > > > > > > > > Opinions? > > > > > > > > Well if the xarray API returns the NULL consistently the > > > > approach > > > > from > > > > this patch is fine I think. > > > > > > > > We just need to add two more checks to the above mentioned > > > > functions, > > > > > > I need to correct myself, drm_sched_job_dependency() wouldn't be > > > able > > > to > > > just skip NULLs since it relies on NULL for "no more > > > dependencies". > > > We > > > would need to track something like job->max_dependency and > > > terminate > > > on > > > job->last_dependency > job->max_dependency or so. > > > > Agreed, that would have to be fixed. > > > > I believe we should reconsider Christian's first idea [1]. > > > > Thinking about it some more: > > * With the NULL version, suddenly the xarray containing only > > valid > > dependencies can sometimes contain NULL entries. > > * If we could create our own tag, entries could be returned that > > were > > neither NULL nor valid fences, also requiring checks > > 'everywhere'. > > * Only the "signaled fence as prealloc reservation" approach is > > fully > > backwards compatible and will never cause anyone to block after > > later reworks. > > > > So maybe it's actually the best idea? > > > > Sorry for the zigg-zagg. No hard requirements intended from my > > side, > > I'm willing to go with what you guys think. > > > > Just saying, at least now I think that the already-signaled fence > > seems > > the most elegant solution. And since there's a function > > (dma_fence_get_stub()) for that, it seems to be in alignment with > > official dma_fence rules. > > Potential problem there was dma_fence_is_signaled() and fence > signaling > annotations. In case some driver is holding a lock over the arm+push > pair. I wish we had a non-signaling is_signaled helper.. >
Yes! +1! But Christian doesn't like that direction: https://lore.kernel.org/all/20250409120640.106408-2-pha...@kernel.org/ P. > > > > > Anyway, I think both options are passable. I even like the NULL entry > slightly more since it is simpler in a way and I don't mind some > extra > checks completely hidden in scheduler internals. > > Regards, > > Tvrtko > > > > > > > Philipp > > > > > > [1] > > https://lore.kernel.org/all/20250318120313.19099-2-christian.koe...@amd.com > > / > > > > > > > > > > Regards, > > > > > > Tvrtko > > > > > > > some more unit tests probably to make sure, and that should be > > > > fine > > > > for > > > > now. > > > > > > > > On the bikeshedding front I would perhaps suggest: > > > > > > > > - drm_sched_job_preallocate_dependency() > > > > - drm_sched_job_replace_dependency() > > > > > > > > Reads a little bit more aligned with the rest of the API and a > > > > bit > > > > easier on the eyes, to my eyes at least. > > > > > > > > Regards, > > > > > > > > Tvrtko > > > > > > > > > >