sched: add drm_sched_prealloc_dependency_slots v3

Christian König Wed, 21 May 2025 01:26:23 -0700

Sorry for the delayed reply.

On 5/19/25 11:04, Philipp Stanner wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>> Also, if someone preallocates and does not consume the
>>>>>>>>> slot
>>>>>>>>> will that
>>>>>>>>> confuse the iteration in drm_sched_job_dependency()?


No it doesn't. The xarray is filtering NULL and XA_ZERO_ENTRY values in the 
iterator.

As far as I know that is *not* documented anywhere and it took me quite some 
time to figure that out.

E.g. you can explicitly ask for a NULL/XA_ZERO_ENTRY entry with xa_load(), but 
you won't see it in xa_for_each().

>>>>>>>>
>>>>>>>> drm_sched_job_add_dependency() you mean.
>>>>>>>
>>>>>>> I was actually thinking of drm_sched_job_dependency()
>>>>>>> because
>>>>>>> that
>>>>>>> looked it would skip dependencies upon encountering an
>>>>>>> unconsumed
>>>>>>> preallocated slot, but yes, drm_sched_job_add_dependency()
>>>>>>> could
>>>>>>> explode
>>>>>>> even earlier if adding a normal dependency after
>>>>>>> preallocating
>>>>>>> a
>>>>>>> slot.
>>>>>>>
>>>>>>>> Yes, it would. All operations simply give you NULL for
>>>>>>>> those
>>>>>>>> slots. So
>>>>>>>> seems to me you have to check for NULL wherever a
>>>>>>>> preallocated
>>>>>>>> slot
>>>>>>>> might drop out. That would then be a bug.
>>>>>>>>
>>>>>>>> It's kind of tricky, all that. It's a pity that Wilcox
>>>>>>>> didn't
>>>>>>>> answer
>>>>>>>> our questions about the idiomatic way to do it.
>>>>>>>>
>>>>>>>> Maybe reserving slots with already signaled fences wasn't
>>>>>>>> such a
>>>>>>>> bad
>>>>>>>> idea after all?
>>>>>>>>
>>>>>>>> If we go for the NULL approach, it's probably the only
>>>>>>>> sane
>>>>>>>> way
>>>>>>>> to then
>>>>>>>> check for NULL wherever dependencies are accessed :(
>>>>>>>>
>>>>>>>> Opinions?
>>>>>>>
>>>>>>> Well if the xarray API returns the NULL consistently the
>>>>>>> approach
>>>>>>> from
>>>>>>> this patch is fine I think.
>>>>>>>
>>>>>>> We just need to add two more checks to the above mentioned
>>>>>>> functions,
>>>>>>
>>>>>> I need to correct myself, drm_sched_job_dependency() wouldn't
>>>>>> be
>>>>>> able
>>>>>> to
>>>>>> just skip NULLs since it relies on NULL for "no more
>>>>>> dependencies".
>>>>>> We
>>>>>> would need to track something like job->max_dependency and
>>>>>> terminate
>>>>>> on
>>>>>> job->last_dependency > job->max_dependency or so.
>>>>>
>>>>> Agreed, that would have to be fixed.
>>>>>
>>>>> I believe we should reconsider Christian's first idea [1].
>>>>>
>>>>> Thinking about it some more:
>>>>>    * With the NULL version, suddenly the xarray containing only
>>>>> valid
>>>>>      dependencies can sometimes contain NULL entries.
>>>>>    * If we could create our own tag, entries could be returned
>>>>> that
>>>>> were
>>>>>      neither NULL nor valid fences, also requiring checks
>>>>> 'everywhere'.
>>>>>    * Only the "signaled fence as prealloc reservation" approach
>>>>> is
>>>>> fully
>>>>>      backwards compatible and will never cause anyone to block
>>>>> after
>>>>>      later reworks.
>>>>>
>>>>> So maybe it's actually the best idea?
>>>>>
>>>>> Sorry for the zigg-zagg. No hard requirements intended from my
>>>>> side,
>>>>> I'm willing to go with what you guys think.
>>>>>
>>>>> Just saying, at least now I think that the already-signaled
>>>>> fence
>>>>> seems
>>>>> the most elegant solution. And since there's a function
>>>>> (dma_fence_get_stub()) for that, it seems to be in alignment
>>>>> with
>>>>> official dma_fence rules.
>>>>
>>>> Potential problem there was dma_fence_is_signaled() and fence
>>>> signaling
>>>> annotations. In case some driver is holding a lock over the
>>>> arm+push
>>>> pair. I wish we had a non-signaling is_signaled helper..
>>>>
>>>
>>> Yes! +1!
>>>
>>> But Christian doesn't like that direction:
>>>
>>> https://lore.kernel.org/all/20250409120640.106408-2-pha...@kernel.org/
>>
>> Thanks, I read this but ended up uncertain on the conclusion.
>>
>> For instance Christian at the end comments like this:
>>
>> """
>> You can test the flag if you know what the fence means to you, that
>> is 
>> not a problem at all.
>> """
>>
>> That was in the context of testing the signaled bit without 
>> opportunistic signaling.
>>
>> For me, from the scheduler dependencies side, that should exactly
>> apply. 
>> Scheduler knows it does not need to add a signaled fence to the dep 
>> array so AFAICS it is fine to skip it. And it may easily be 
>> opportunistic signaling ends up a problem for the scheduler.

Well that was not what I meant.

The code which issued the fence can test the fence flags directly.

But the scheduler is a consumer of fences somebody else issued.

In other words we could do something like dma_fence_is_stub() or similar, to 
have a clean interface. But please no testing of the fence flags in the 
scheduler directly.

Anyway, we don't need it. The general assumption that the xarray returns NULL 
for the iterator is incorrect. NULL entries are simply skipped over.

Regards,
Christian.

>>
>> So maybe such helper would be okay after all.
> 
> The thing is that, if I understand him correctly, Christian doesn't
> want a helper. He wants "us" to just use test_bit().
> 
> My point is just that dma_fence_is_signaled() is a horrible name.
> 
> The function pci_is_enabled() tells you whether the PCI device is
> enabled. What it doesn't do is
> 
> bool pci_is_enabled(pdev)
> {
>    if (crazy_callback_is_implemented()) {
>       pci_enable_device();
>       return true;
>    }
> 
>   ...
> }
> 
> It's not intuitive that a function called "{something}_is_signaled()"
> does signal that thing. Although I get that the syntactical idea
> probably is that from the GPUs POV the fence is already signaled when
> this or that seqno has been reached.
> 
> Anyways, judging aside, if a wrapper for test_bit(dma_fence) were
> acceptable, then it would need to steal dma_fence_is_signaled()'s name,
> and dma_fence_is_signaled() would have to get a new name. Which is
> precisely what was rejected, as I see it.
> 
> P.
> 
>>
>> Or if the concern is helper might encourage some potentially unsafe 
>> usage, in that case it should come with kerneldoc describing. It is
>> not 
>> like review is guaranteed to catch someone using test_bit directly 
>> anyway so for me, on balance, helper is always better.
>>
>> Regards,
>>
>> Tvrtko
>>
>>
>>
>>
>>>
>>> P.
>>>
>>>>
>>>>
>>>
>>>
>>>>
>>>>
>>>> Anyway, I think both options are passable. I even like the NULL
>>>> entry
>>>> slightly more since it is simpler in a way and I don't mind some
>>>> extra
>>>> checks completely hidden in scheduler internals.
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>>
>>>>>
>>>>>
>>>>> Philipp
>>>>>
>>>>>
>>>>> [1]
>>>>> https://lore.kernel.org/all/20250318120313.19099-2-christian.koe...@amd.com
>>>>> /
>>>>>
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Tvrtko
>>>>>>
>>>>>>> some more unit tests probably to make sure, and that should
>>>>>>> be
>>>>>>> fine
>>>>>>> for
>>>>>>> now.
>>>>>>>
>>>>>>> On the bikeshedding front I would perhaps suggest:
>>>>>>>
>>>>>>>     - drm_sched_job_preallocate_dependency()
>>>>>>>     - drm_sched_job_replace_dependency()
>>>>>>>
>>>>>>> Reads a little bit more aligned with the rest of the API
>>>>>>> and a
>>>>>>> bit
>>>>>>> easier on the eyes, to my eyes at least.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Tvrtko
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: [PATCH 1/3] drm/sched: add drm_sched_prealloc_dependency_slots v3

Reply via email to