On 16/05/2025 18:16, Philipp Stanner wrote:
On Fri, 2025-05-16 at 15:30 +0100, Tvrtko Ursulin wrote:

On 16/05/2025 14:38, Philipp Stanner wrote:
On Fri, 2025-05-16 at 13:10 +0100, Tvrtko Ursulin wrote:

On 16/05/2025 12:53, Tvrtko Ursulin wrote:

On 16/05/2025 08:28, Philipp Stanner wrote:
On Thu, 2025-05-15 at 17:17 +0100, Tvrtko Ursulin wrote:

On 15/05/2025 16:00, Christian König wrote:
Sometimes drivers need to be able to submit multiple jobs
which
depend on
each other to different schedulers at the same time, but
using
drm_sched_job_add_dependency() can't fail any more after
the
first
job is
initialized.

This function preallocate memory for dependency slots so
that
no
ENOMEM
can come later while adding dependencies.

v2: rework implementation an documentation
v3: rework from scratch, use separate function to add
preallocated
deps

I think we agreed to not put change logs into commit messages
anymore
:)

They aren't useful for any reader. Who needs the changelog
afterwards
can retreive it through the mail thread link that we add.


Signed-off-by: Christian König <christian.koe...@amd.com>
---
     drivers/gpu/drm/scheduler/sched_main.c | 45
++++++++++++++++++++++++++
     include/drm/gpu_scheduler.h            |  4 +++
     2 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index f7118497e47a..b95e7089aa70 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -858,6 +858,51 @@ void drm_sched_job_arm(struct
drm_sched_job
*job)
     }
     EXPORT_SYMBOL(drm_sched_job_arm);
+/**
+ * drm_sched_job_prealloc_dependency_slot - avoid ENOMEM
on
adding
dependencies
+ * @job: scheduler job where dependencies will be added
+ * @id: id for the allocated slot
+  *
+ * Sometimes drivers need to be able to submit multiple
jobs
which
depend on
+ * each other to different schedulers at the same time,
but
using
+ * drm_sched_job_add_dependency() can't fail any more
after
the
first job is
+ * initialized.
+ *
+ * This function preallocate memory for a dependency
slot so
that
no ENOMEM can
+ * come later while adding dependencies. The index of
the
preallocated slot is
+ * returned in @id.
+ *
+ * Return:
+ * 0 on success, or an error on failing to expand the
array.
+ */
+int drm_sched_job_prealloc_dependency_slot(struct
drm_sched_job
*job,
+                       u32 *id)
+{
+    return xa_alloc(&job->dependencies, id, NULL,
xa_limit_32b, GFP_KERNEL);
+}
+EXPORT_SYMBOL(drm_sched_job_prealloc_dependency_slot);
+
+/**
+ * drm_sched_job_add_prealloc_dep - add dependency to
preallocated
slot
+ * @job: scheduler job where dependencies will be added
+ * @id: the preallocated slot index
+ * @fence: the dependency to add
+ *
+ * Consumes @fence and adds it to the preallocated slot
dependency.
+ */
+void drm_sched_job_add_prealloc_dep(struct drm_sched_job
*job, u32
id,
+                    struct dma_fence *fence)
+{
+    fence = xa_store(&job->dependencies, id, fence,
GFP_ATOMIC);

Add assert that the passed id exists (was preallocated) and
is
NULL?

You

Hm?


Also, if someone preallocates and does not consume the slot
will that
confuse the iteration in drm_sched_job_dependency()?

drm_sched_job_add_dependency() you mean.

I was actually thinking of drm_sched_job_dependency() because
that
looked it would skip dependencies upon encountering an
unconsumed
preallocated slot, but yes, drm_sched_job_add_dependency()
could
explode
even earlier if adding a normal dependency after preallocating
a
slot.

Yes, it would. All operations simply give you NULL for those
slots. So
seems to me you have to check for NULL wherever a
preallocated
slot
might drop out. That would then be a bug.

It's kind of tricky, all that. It's a pity that Wilcox didn't
answer
our questions about the idiomatic way to do it.

Maybe reserving slots with already signaled fences wasn't
such a
bad
idea after all?

If we go for the NULL approach, it's probably the only sane
way
to then
check for NULL wherever dependencies are accessed :(

Opinions?

Well if the xarray API returns the NULL consistently the
approach
from
this patch is fine I think.

We just need to add two more checks to the above mentioned
functions,

I need to correct myself, drm_sched_job_dependency() wouldn't be
able
to
just skip NULLs since it relies on NULL for "no more
dependencies".
We
would need to track something like job->max_dependency and
terminate
on
job->last_dependency > job->max_dependency or so.

Agreed, that would have to be fixed.

I believe we should reconsider Christian's first idea [1].

Thinking about it some more:
   * With the NULL version, suddenly the xarray containing only
valid
     dependencies can sometimes contain NULL entries.
   * If we could create our own tag, entries could be returned that
were
     neither NULL nor valid fences, also requiring checks
'everywhere'.
   * Only the "signaled fence as prealloc reservation" approach is
fully
     backwards compatible and will never cause anyone to block after
     later reworks.

So maybe it's actually the best idea?

Sorry for the zigg-zagg. No hard requirements intended from my
side,
I'm willing to go with what you guys think.

Just saying, at least now I think that the already-signaled fence
seems
the most elegant solution. And since there's a function
(dma_fence_get_stub()) for that, it seems to be in alignment with
official dma_fence rules.

Potential problem there was dma_fence_is_signaled() and fence
signaling
annotations. In case some driver is holding a lock over the arm+push
pair. I wish we had a non-signaling is_signaled helper..


Yes! +1!

But Christian doesn't like that direction:

https://lore.kernel.org/all/20250409120640.106408-2-pha...@kernel.org/

Thanks, I read this but ended up uncertain on the conclusion.

For instance Christian at the end comments like this:

"""
You can test the flag if you know what the fence means to you, that is not a problem at all.
"""

That was in the context of testing the signaled bit without opportunistic signaling.

For me, from the scheduler dependencies side, that should exactly apply. Scheduler knows it does not need to add a signaled fence to the dep array so AFAICS it is fine to skip it. And it may easily be opportunistic signaling ends up a problem for the scheduler.

So maybe such helper would be okay after all.

Or if the concern is helper might encourage some potentially unsafe usage, in that case it should come with kerneldoc describing. It is not like review is guaranteed to catch someone using test_bit directly anyway so for me, on balance, helper is always better.

Regards,

Tvrtko





P.







Anyway, I think both options are passable. I even like the NULL entry
slightly more since it is simpler in a way and I don't mind some
extra
checks completely hidden in scheduler internals.

Regards,

Tvrtko



Philipp


[1]
https://lore.kernel.org/all/20250318120313.19099-2-christian.koe...@amd.com
/



Regards,

Tvrtko

some more unit tests probably to make sure, and that should be
fine
for
now.

On the bikeshedding front I would perhaps suggest:

    - drm_sched_job_preallocate_dependency()
    - drm_sched_job_replace_dependency()

Reads a little bit more aligned with the rest of the API and a
bit
easier on the eyes, to my eyes at least.

Regards,

Tvrtko






Reply via email to