On 19/05/2025 10:04, Philipp Stanner wrote:
On Mon, 2025-05-19 at 09:51 +0100, Tvrtko Ursulin wrote:
On 16/05/2025 18:16, Philipp Stanner wrote:
On Fri, 2025-05-16 at 15:30 +0100, Tvrtko Ursulin wrote:
On 16/05/2025 14:38, Philipp Stanner wrote:
On Fri, 2025-05-16 at 13:10 +0100, Tvrtko Ursulin wrote:
On 16/05/2025 12:53, Tvrtko Ursulin wrote:
[snip]
+
+/**
+ * drm_sched_job_add_prealloc_dep - add dependency
to
preallocated
slot
+ * @job: scheduler job where dependencies will be
added
+ * @id: the preallocated slot index
+ * @fence: the dependency to add
+ *
+ * Consumes @fence and adds it to the preallocated
slot
dependency.
+ */
+void drm_sched_job_add_prealloc_dep(struct
drm_sched_job
*job, u32
id,
+ struct dma_fence *fence)
+{
+ fence = xa_store(&job->dependencies, id, fence,
GFP_ATOMIC);
Add assert that the passed id exists (was preallocated)
and
is
NULL?
You
Hm?
Also, if someone preallocates and does not consume the
slot
will that
confuse the iteration in drm_sched_job_dependency()?
drm_sched_job_add_dependency() you mean.
I was actually thinking of drm_sched_job_dependency()
because
that
looked it would skip dependencies upon encountering an
unconsumed
preallocated slot, but yes, drm_sched_job_add_dependency()
could
explode
even earlier if adding a normal dependency after
preallocating
a
slot.
Yes, it would. All operations simply give you NULL for
those
slots. So
seems to me you have to check for NULL wherever a
preallocated
slot
might drop out. That would then be a bug.
It's kind of tricky, all that. It's a pity that Wilcox
didn't
answer
our questions about the idiomatic way to do it.
Maybe reserving slots with already signaled fences wasn't
such a
bad
idea after all?
If we go for the NULL approach, it's probably the only
sane
way
to then
check for NULL wherever dependencies are accessed :(
Opinions?
Well if the xarray API returns the NULL consistently the
approach
from
this patch is fine I think.
We just need to add two more checks to the above mentioned
functions,
I need to correct myself, drm_sched_job_dependency() wouldn't
be
able
to
just skip NULLs since it relies on NULL for "no more
dependencies".
We
would need to track something like job->max_dependency and
terminate
on
job->last_dependency > job->max_dependency or so.
Agreed, that would have to be fixed.
I believe we should reconsider Christian's first idea [1].
Thinking about it some more:
* With the NULL version, suddenly the xarray containing only
valid
dependencies can sometimes contain NULL entries.
* If we could create our own tag, entries could be returned
that
were
neither NULL nor valid fences, also requiring checks
'everywhere'.
* Only the "signaled fence as prealloc reservation" approach
is
fully
backwards compatible and will never cause anyone to block
after
later reworks.
So maybe it's actually the best idea?
Sorry for the zigg-zagg. No hard requirements intended from my
side,
I'm willing to go with what you guys think.
Just saying, at least now I think that the already-signaled
fence
seems
the most elegant solution. And since there's a function
(dma_fence_get_stub()) for that, it seems to be in alignment
with
official dma_fence rules.
Potential problem there was dma_fence_is_signaled() and fence
signaling
annotations. In case some driver is holding a lock over the
arm+push
pair. I wish we had a non-signaling is_signaled helper..
Yes! +1!
But Christian doesn't like that direction:
https://lore.kernel.org/all/20250409120640.106408-2-pha...@kernel.org/
Thanks, I read this but ended up uncertain on the conclusion.
For instance Christian at the end comments like this:
"""
You can test the flag if you know what the fence means to you, that
is
not a problem at all.
"""
That was in the context of testing the signaled bit without
opportunistic signaling.
For me, from the scheduler dependencies side, that should exactly
apply.
Scheduler knows it does not need to add a signaled fence to the dep
array so AFAICS it is fine to skip it. And it may easily be
opportunistic signaling ends up a problem for the scheduler.
So maybe such helper would be okay after all.
The thing is that, if I understand him correctly, Christian doesn't
want a helper. He wants "us" to just use test_bit().
I suspected it may be only about not renaming into some variant of
check_and_signal etc. With which I agree with, that the name should be
left alone, because...
My point is just that dma_fence_is_signaled() is a horrible name.
... as dodgy as it is, and despite me also failing to really understand
the reason why it _has_ to be have opportunistic signaling, I think
those semantics are by now pretty much ingrained into people dealing
with this API so I would leave it as is.
What I was thinking was the double underscore version of the helper and
some kerneldoc to say when it is safe to use. For me a helper is better
than poking directly.
The function pci_is_enabled() tells you whether the PCI device is
enabled. What it doesn't do is
bool pci_is_enabled(pdev)
{
if (crazy_callback_is_implemented()) {
pci_enable_device();
return true;
}
...
}
It's not intuitive that a function called "{something}_is_signaled()"
does signal that thing. Although I get that the syntactical idea
probably is that from the GPUs POV the fence is already signaled when
this or that seqno has been reached.
Anyways, judging aside, if a wrapper for test_bit(dma_fence) were
acceptable, then it would need to steal dma_fence_is_signaled()'s name,
and dma_fence_is_signaled() would have to get a new name. Which is
precisely what was rejected, as I see it.
__dma_fence_is_signaled() is what I would do and leave the existing one
as is.
Regards,
Tvrtko
Or if the concern is helper might encourage some potentially unsafe
usage, in that case it should come with kerneldoc describing. It is
not
like review is guaranteed to catch someone using test_bit directly
anyway so for me, on balance, helper is always better.
Regards,
Tvrtko
P.
Anyway, I think both options are passable. I even like the NULL
entry
slightly more since it is simpler in a way and I don't mind some
extra
checks completely hidden in scheduler internals.
Regards,
Tvrtko
Philipp
[1]
https://lore.kernel.org/all/20250318120313.19099-2-christian.koe...@amd.com
/
Regards,
Tvrtko
some more unit tests probably to make sure, and that should
be
fine
for
now.
On the bikeshedding front I would perhaps suggest:
- drm_sched_job_preallocate_dependency()
- drm_sched_job_replace_dependency()
Reads a little bit more aligned with the rest of the API
and a
bit
easier on the eyes, to my eyes at least.
Regards,
Tvrtko