On Tue, 2025-05-20 at 17:15 +0100, Tvrtko Ursulin wrote: > > On 19/05/2025 10:04, Philipp Stanner wrote: > > On Mon, 2025-05-19 at 09:51 +0100, Tvrtko Ursulin wrote: > > > > > > On 16/05/2025 18:16, Philipp Stanner wrote: > > > > On Fri, 2025-05-16 at 15:30 +0100, Tvrtko Ursulin wrote: > > > > > > > > > > On 16/05/2025 14:38, Philipp Stanner wrote: > > > > > > On Fri, 2025-05-16 at 13:10 +0100, Tvrtko Ursulin wrote: > > > > > > > > > > > > > > On 16/05/2025 12:53, Tvrtko Ursulin wrote: > > > > > > > > > > > > > > > > > > > > [snip] > > > > > > > > > > > > > + > > > > > > > > > > > +/** > > > > > > > > > > > + * drm_sched_job_add_prealloc_dep - add > > > > > > > > > > > dependency > > > > > > > > > > > to > > > > > > > > > > > preallocated > > > > > > > > > > > slot > > > > > > > > > > > + * @job: scheduler job where dependencies will > > > > > > > > > > > be > > > > > > > > > > > added > > > > > > > > > > > + * @id: the preallocated slot index > > > > > > > > > > > + * @fence: the dependency to add > > > > > > > > > > > + * > > > > > > > > > > > + * Consumes @fence and adds it to the > > > > > > > > > > > preallocated > > > > > > > > > > > slot > > > > > > > > > > > dependency. > > > > > > > > > > > + */ > > > > > > > > > > > +void drm_sched_job_add_prealloc_dep(struct > > > > > > > > > > > drm_sched_job > > > > > > > > > > > *job, u32 > > > > > > > > > > > id, > > > > > > > > > > > + struct dma_fence *fence) > > > > > > > > > > > +{ > > > > > > > > > > > + fence = xa_store(&job->dependencies, id, > > > > > > > > > > > fence, > > > > > > > > > > > GFP_ATOMIC); > > > > > > > > > > > > > > > > > > > > Add assert that the passed id exists (was > > > > > > > > > > preallocated) > > > > > > > > > > and > > > > > > > > > > is > > > > > > > > > > NULL? > > > > > > > > > > > > > > > > > > You > > > > > > > > > > > > > > > > Hm? > > > > > > > > > > > > > > > > > > > > > > > > > > > > Also, if someone preallocates and does not consume > > > > > > > > > > the > > > > > > > > > > slot > > > > > > > > > > will that > > > > > > > > > > confuse the iteration in > > > > > > > > > > drm_sched_job_dependency()? > > > > > > > > > > > > > > > > > > drm_sched_job_add_dependency() you mean. > > > > > > > > > > > > > > > > I was actually thinking of drm_sched_job_dependency() > > > > > > > > because > > > > > > > > that > > > > > > > > looked it would skip dependencies upon encountering an > > > > > > > > unconsumed > > > > > > > > preallocated slot, but yes, > > > > > > > > drm_sched_job_add_dependency() > > > > > > > > could > > > > > > > > explode > > > > > > > > even earlier if adding a normal dependency after > > > > > > > > preallocating > > > > > > > > a > > > > > > > > slot. > > > > > > > > > > > > > > > > > Yes, it would. All operations simply give you NULL > > > > > > > > > for > > > > > > > > > those > > > > > > > > > slots. So > > > > > > > > > seems to me you have to check for NULL wherever a > > > > > > > > > preallocated > > > > > > > > > slot > > > > > > > > > might drop out. That would then be a bug. > > > > > > > > > > > > > > > > > > It's kind of tricky, all that. It's a pity that > > > > > > > > > Wilcox > > > > > > > > > didn't > > > > > > > > > answer > > > > > > > > > our questions about the idiomatic way to do it. > > > > > > > > > > > > > > > > > > Maybe reserving slots with already signaled fences > > > > > > > > > wasn't > > > > > > > > > such a > > > > > > > > > bad > > > > > > > > > idea after all? > > > > > > > > > > > > > > > > > > If we go for the NULL approach, it's probably the > > > > > > > > > only > > > > > > > > > sane > > > > > > > > > way > > > > > > > > > to then > > > > > > > > > check for NULL wherever dependencies are accessed :( > > > > > > > > > > > > > > > > > > Opinions? > > > > > > > > > > > > > > > > Well if the xarray API returns the NULL consistently > > > > > > > > the > > > > > > > > approach > > > > > > > > from > > > > > > > > this patch is fine I think. > > > > > > > > > > > > > > > > We just need to add two more checks to the above > > > > > > > > mentioned > > > > > > > > functions, > > > > > > > > > > > > > > I need to correct myself, drm_sched_job_dependency() > > > > > > > wouldn't > > > > > > > be > > > > > > > able > > > > > > > to > > > > > > > just skip NULLs since it relies on NULL for "no more > > > > > > > dependencies". > > > > > > > We > > > > > > > would need to track something like job->max_dependency > > > > > > > and > > > > > > > terminate > > > > > > > on > > > > > > > job->last_dependency > job->max_dependency or so. > > > > > > > > > > > > Agreed, that would have to be fixed. > > > > > > > > > > > > I believe we should reconsider Christian's first idea [1]. > > > > > > > > > > > > Thinking about it some more: > > > > > > * With the NULL version, suddenly the xarray containing > > > > > > only > > > > > > valid > > > > > > dependencies can sometimes contain NULL entries. > > > > > > * If we could create our own tag, entries could be > > > > > > returned > > > > > > that > > > > > > were > > > > > > neither NULL nor valid fences, also requiring checks > > > > > > 'everywhere'. > > > > > > * Only the "signaled fence as prealloc reservation" > > > > > > approach > > > > > > is > > > > > > fully > > > > > > backwards compatible and will never cause anyone to > > > > > > block > > > > > > after > > > > > > later reworks. > > > > > > > > > > > > So maybe it's actually the best idea? > > > > > > > > > > > > Sorry for the zigg-zagg. No hard requirements intended from > > > > > > my > > > > > > side, > > > > > > I'm willing to go with what you guys think. > > > > > > > > > > > > Just saying, at least now I think that the already-signaled > > > > > > fence > > > > > > seems > > > > > > the most elegant solution. And since there's a function > > > > > > (dma_fence_get_stub()) for that, it seems to be in > > > > > > alignment > > > > > > with > > > > > > official dma_fence rules. > > > > > > > > > > Potential problem there was dma_fence_is_signaled() and fence > > > > > signaling > > > > > annotations. In case some driver is holding a lock over the > > > > > arm+push > > > > > pair. I wish we had a non-signaling is_signaled helper.. > > > > > > > > > > > > > Yes! +1! > > > > > > > > But Christian doesn't like that direction: > > > > > > > > https://lore.kernel.org/all/20250409120640.106408-2-pha...@kernel.org/ > > > > > > Thanks, I read this but ended up uncertain on the conclusion. > > > > > > For instance Christian at the end comments like this: > > > > > > """ > > > You can test the flag if you know what the fence means to you, > > > that > > > is > > > not a problem at all. > > > """ > > > > > > That was in the context of testing the signaled bit without > > > opportunistic signaling. > > > > > > For me, from the scheduler dependencies side, that should exactly > > > apply. > > > Scheduler knows it does not need to add a signaled fence to the > > > dep > > > array so AFAICS it is fine to skip it. And it may easily be > > > opportunistic signaling ends up a problem for the scheduler. > > > > > > So maybe such helper would be okay after all. > > > > The thing is that, if I understand him correctly, Christian doesn't > > want a helper. He wants "us" to just use test_bit(). > > I suspected it may be only about not renaming into some variant of > check_and_signal etc. With which I agree with, that the name should > be > left alone, because... > > > My point is just that dma_fence_is_signaled() is a horrible name. > > ... as dodgy as it is, and despite me also failing to really > understand > the reason why it _has_ to be have opportunistic signaling, I think > those semantics are by now pretty much ingrained into people dealing > with this API so I would leave it as is. > > What I was thinking was the double underscore version of the helper > and > some kerneldoc to say when it is safe to use. For me a helper is > better > than poking directly. > > > The function pci_is_enabled() tells you whether the PCI device is > > enabled. What it doesn't do is > > > > bool pci_is_enabled(pdev) > > { > > if (crazy_callback_is_implemented()) { > > pci_enable_device(); > > return true; > > } > > > > ... > > } > > > > It's not intuitive that a function called > > "{something}_is_signaled()" > > does signal that thing. Although I get that the syntactical idea > > probably is that from the GPUs POV the fence is already signaled > > when > > this or that seqno has been reached. > > > > Anyways, judging aside, if a wrapper for test_bit(dma_fence) were > > acceptable, then it would need to steal dma_fence_is_signaled()'s > > name, > > and dma_fence_is_signaled() would have to get a new name. Which is > > precisely what was rejected, as I see it. > > __dma_fence_is_signaled() is what I would do and leave the existing > one > as is.
Fine by me. If you provide a patch for dma_fence, I can do a review. P. > > Regards, > > Tvrtko > > > > Or if the concern is helper might encourage some potentially > > > unsafe > > > usage, in that case it should come with kerneldoc describing. It > > > is > > > not > > > like review is guaranteed to catch someone using test_bit > > > directly > > > anyway so for me, on balance, helper is always better. > > > > > > Regards, > > > > > > Tvrtko > > > > > > > > > > > > > > > > > > > > P. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Anyway, I think both options are passable. I even like the > > > > > NULL > > > > > entry > > > > > slightly more since it is simpler in a way and I don't mind > > > > > some > > > > > extra > > > > > checks completely hidden in scheduler internals. > > > > > > > > > > Regards, > > > > > > > > > > Tvrtko > > > > > > > > > > > > > > > > > > > > > > > Philipp > > > > > > > > > > > > > > > > > > [1] > > > > > > https://lore.kernel.org/all/20250318120313.19099-2-christian.koe...@amd.com > > > > > > / > > > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > > > Tvrtko > > > > > > > > > > > > > > > some more unit tests probably to make sure, and that > > > > > > > > should > > > > > > > > be > > > > > > > > fine > > > > > > > > for > > > > > > > > now. > > > > > > > > > > > > > > > > On the bikeshedding front I would perhaps suggest: > > > > > > > > > > > > > > > > - drm_sched_job_preallocate_dependency() > > > > > > > > - drm_sched_job_replace_dependency() > > > > > > > > > > > > > > > > Reads a little bit more aligned with the rest of the > > > > > > > > API > > > > > > > > and a > > > > > > > > bit > > > > > > > > easier on the eyes, to my eyes at least. > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > > > > > Tvrtko > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >