On Tue, 2025-10-14 at 17:54 +0200, Christian König wrote: > On 13.10.25 16:54, Philipp Stanner wrote: > > On Mon, 2025-10-13 at 15:48 +0200, Christian König wrote: > > > Hi everyone, > > > > > > dma_fences have ever lived under the tyranny dictated by the module > > > lifetime of their issuer, leading to crashes should anybody still holding > > > a reference to a dma_fence when the module of the issuer was unloaded. > > > > > > But those days are over! The patch set following this mail finally > > > implements a way for issuers to release their dma_fence out of this > > > slavery and outlive the module who originally created them. > > > > > > Previously various approaches have been discussed, including changing the > > > locking semantics of the dma_fence callbacks (by me) as well as using the > > > drm scheduler as intermediate layer (by Sima) to disconnect dma_fences > > > from their actual users. > > > > > > Changing the locking semantics turned out to be much more trickier than > > > originally thought because especially on older drivers (nouveau, radeon, > > > but also i915) this locking semantics is actually needed for correct > > > operation. > > > > > > Using the drm_scheduler as intermediate layer is still a good idea and > > > should probably be implemented to make live simpler for some drivers, but > > > doesn't work for all use cases. Especially TLB flush fences, preemption > > > fences and userqueue fences don't go through the drm scheduler because it > > > doesn't make sense for them. > > > > > > Tvrtko did some really nice prerequisite work by protecting the returned > > > strings of the dma_fence_ops by RCU. This way dma_fence creators where > > > able to just wait for an RCU grace period after fence signaling before > > > they could be save to free those data structures. > > > > > > Now this patch set here goes a step further and protects the whole > > > dma_fence_ops structure by RCU, so that after the fence signals the > > > pointer to the dma_fence_ops is set to NULL when there is no wait nor > > > release callback given. All functionality which use the dma_fence_ops > > > reference are put inside an RCU critical section, except for the > > > deprecated issuer specific wait and of course the optional release > > > callback. > > > > > > Additional to the RCU changes the lock protecting the dma_fence state > > > previously had to be allocated external. This set here now changes the > > > functionality to make that external lock optional and allows dma_fences > > > to use an inline lock and be self contained. > > > > Allowing for an embedded lock, is that actually necessary for the goals > > of this series, or is it an optional change / improvement? > > It is kind of necessary because otherwise you can't fully determine the > lifetime of the lock. > > The lock is used to avoid signaling a dma_fence when you modify the linked > list of callbacks for example. > > An alternative would be to protect the lock by RCU as well instead of > embedding it in the structure, but that would make things even more > complicated. > > > If I understood you correctly at XDC you wanted to have an embedded > > lock because it improves the memory footprint and because an external > > lock couldn't achieve some goals about fence-signaling-order originally > > intended. Can you elaborate on that? > > The embedded lock is also nice to have for the dma_fence_array, > dma_fence_chain and drm_sched_fence, but that just saves a few cache lines in > some use cases. > > The fence-signaling-order is important for drivers like radeon where the > external lock is protecting multiple fences from signaling at the same time > and makes sure that everything stays in order.
I mean, neither external nor internal lock can somehow force the driver to signal fences in order, can they? Only the driver can ensure this. I am, however, considering modeling something like that on a FenceContext object: fctx.signal_all_fences_up_to_ordered(seqno); P. > > While it is possible to change the locking semantics on such old drivers, > it's probably just better to stay away from it. > > Regards, > Christian. > > > > > P. > > > > > > > > > > The new approach is then applied to amdgpu allowing the module to be > > > unloaded even when dma_fences issued by it are still around. > > > > > > Please review and comment, > > > Christian. > > > > > >
