On Tue, Feb 17, 2026 at 3:04 PM Philipp Stanner <[email protected]> wrote:
>
> On Tue, 2026-02-10 at 16:45 +0100, Christian König wrote:
> > On 2/10/26 16:07, Alice Ryhl wrote:
> > > On Tue, Feb 10, 2026 at 02:56:52PM +0100, Christian König wrote:
> > > > On 2/10/26 14:49, Alice Ryhl wrote:
> > > > > On Tue, Feb 10, 2026 at 02:26:31PM +0100, Boris Brezillon wrote:
> > > > > > On Tue, 10 Feb 2026 13:15:31 +0000
> > > > > > Alice Ryhl <[email protected]> wrote:
> > > > > >
> > > > > > > On Tue, Feb 10, 2026 at 01:36:17PM +0100, Boris Brezillon wrote:
> > > > > > > > On Tue, 10 Feb 2026 10:15:04 +0000
> > > > > > > > Alice Ryhl <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > impl MustBeSignalled<'_> {
> > > > > > > > >     /// Drivers generally should not use this one.
> > > > > > > > >     fn i_promise_it_will_be_signalled(self) -> WillBeSignalled { ... }
> > > > > > > > >
> > > > > > > > >     /// One way to ensure the fence has been signalled is to signal it.
> > > > > > > > >     fn signal_fence(self) -> WillBeSignalled {
> > > > > > > > >         self.fence.signal();
> > > > > > > > >         self.i_promise_it_will_be_signalled()
> > > > > > > > >     }
> > > > > > > > >
> > > > > > > > >     /// Another way to ensure the fence will be signalled is to spawn a
> > > > > > > > >     /// workqueue item that promises to signal it.
> > > > > > > > >     fn transfer_to_wq(
> > > > > > > > >         self,
> > > > > > > > >         wq: &Workqueue,
> > > > > > > > >         item: impl DmaFenceWorkItem,
> > > > > > > > >     ) -> WillBeSignalled {
> > > > > > > > >         // briefly obtain the lock class of the wq to indicate to
> > > > > > > > >         // lockdep that the signalling path "blocks" on arbitrary jobs
> > > > > > > > >         // from this wq completing
> > > > > > > > >         bindings::lock_acquire(&wq->key);
> > > > > > > > >         bindings::lock_release(&wq->key);
> > > > > > > > >
> > > > > > > > >         // enqueue the job
> > > > > > > > >         wq.enqueue(item, wq);
> > > > > > > > >
> > > > > > > > >         // The signature of DmaFenceWorkItem::run() promises to arrange
> > > > > > > > >         // for it to be signalled.
> > > > > > > > >         self.i_promise_it_will_be_signalled()
> > > > > > > > >     }
> > > > > > > >
> > > > > > > > I guess what's still missing is some sort of `transfer_to_hw()`
> > > > > > > > function and a way to flag the IRQ handler taking over the fence
> > > > > > > > signaling token.
> > > > > > >
> > > > > > > Yes, transfer to hardware needs to be another piece of logic similar
> > > > > > > to transfer to wq. And I imagine there are many ways such a transfer
> > > > > > > to hardware could work.
> > > > > > >
> > > > > > > Unless you have a timeout on it, in which case the WillBeSignalled is
> > > > > > > satisfied by the fact you have a timeout alone, and the signalling
> > > > > > > that happens from the irq is just an opportunistic signal from
> > > > > > > outside the dma fence signalling critical path.
> > > > > >
> > > > > > Yes and no. If it deadlocks in the completion WorkItem because of
> > > > > > allocations (or any of the forbidden use cases), I think we want to
> > > > > > catch that, because that's a sign fences are likely to end up with
> > > > > > timeouts when they should have otherwise been signaled properly.
> > > > > >
> > > > > > > Well ... unless triggering timeouts can block on GFP_KERNEL
> > > > > > > allocations...
> > > > > >
> > > > > > I mean, the timeout handler should also be considered a DMA-signalling
> > > > > > path, and the same rules should apply to it.
> > > > >
> > > > > I guess that's fair. Even with a timeout you want both to be signalling
> > > > > paths.
> > > > >
> > > > > I guess more generally, if a fence is signalled by mechanism A or B,
> > > > > whichever happens first, you have the choice between:
> > > >
> > > > That doesn't happen in practice.
> > > >
> > > > For each fence you only have one signaling path you need to guarantee
> > > > forward progress for.
> > > >
> > > > All other signaling paths are just opportunistic optimizations which
> > > > *can* signal the fence, but there is no guarantee that they will.
> > > >
> > > > We used to have some exceptions to that, especially around aborting
> > > > submissions, but those turned out to be a really bad idea as well.
> > > >
> > > > Thinking more about it, you should probably enforce that there is only
> > > > one signaling path for each fence.
> > >
> > > I'm not really convinced by this.
> > >
> > > First, the timeout path must be a fence signalling path because the
> > > reason you have a timeout in the first place is that the hw might
> > > never signal the fence. So if the timeout path deadlocks on a
> > > kmalloc(GFP_KERNEL) and the hw never comes around to wake you up, boom.
> >
> > Mhm, good point. On the other hand, the timeout handling should probably
> > be considered part of the normal signaling path.
>
> Why would anyone want to allocate in a timeout path in the first place –
> especially for jobqueue?
>
> Timeout -> close the associated ring. Done.
> JobQueue will signal the done_fences with -ECANCELED.
>
> What would the driver want to allocate in its timeout path, i.e. the
> timeout callback?
Maybe you need an allocation to hold the struct delayed_work field that you
use to enqueue the timeout?

Alice
