On Tue, 2026-02-10 at 16:45 +0100, Christian König wrote:
> On 2/10/26 16:07, Alice Ryhl wrote:
> > On Tue, Feb 10, 2026 at 02:56:52PM +0100, Christian König wrote:
> > > On 2/10/26 14:49, Alice Ryhl wrote:
> > > > On Tue, Feb 10, 2026 at 02:26:31PM +0100, Boris Brezillon wrote:
> > > > > On Tue, 10 Feb 2026 13:15:31 +0000
> > > > > Alice Ryhl <[email protected]> wrote:
> > > > > 
> > > > > > On Tue, Feb 10, 2026 at 01:36:17PM +0100, Boris Brezillon wrote:
> > > > > > > On Tue, 10 Feb 2026 10:15:04 +0000
> > > > > > > Alice Ryhl <[email protected]> wrote:
> > > > > > >   
> > > > > > > > impl MustBeSignalled<'_> {
> > > > > > > >     /// Drivers generally should not use this one.
> > > > > > > >     fn i_promise_it_will_be_signalled(self) -> WillBeSignalled { ... }
> > > > > > > > 
> > > > > > > >     /// One way to ensure the fence has been signalled is to signal it.
> > > > > > > >     fn signal_fence(self) -> WillBeSignalled {
> > > > > > > >         self.fence.signal();
> > > > > > > >         self.i_promise_it_will_be_signalled()
> > > > > > > >     }
> > > > > > > > 
> > > > > > > >     /// Another way to ensure the fence will be signalled is to spawn a
> > > > > > > >     /// workqueue item that promises to signal it.
> > > > > > > >     fn transfer_to_wq(
> > > > > > > >         self,
> > > > > > > >         wq: &Workqueue,
> > > > > > > >         item: impl DmaFenceWorkItem,
> > > > > > > >     ) -> WillBeSignalled {
> > > > > > > >         // Briefly obtain the lock class of the wq to indicate to
> > > > > > > >         // lockdep that the signalling path "blocks" on arbitrary
> > > > > > > >         // jobs from this wq completing.
> > > > > > > >         bindings::lock_acquire(&wq.key);
> > > > > > > >         bindings::lock_release(&wq.key);
> > > > > > > > 
> > > > > > > >         // Enqueue the job.
> > > > > > > >         wq.enqueue(item);
> > > > > > > > 
> > > > > > > >         // The signature of DmaFenceWorkItem::run() promises to
> > > > > > > >         // arrange for it to be signalled.
> > > > > > > >         self.i_promise_it_will_be_signalled()
> > > > > > > >     }
> > > > > > > 
> > > > > > > I guess what's still missing is some sort of `transfer_to_hw()`
> > > > > > > function and way to flag the IRQ handler taking over the fence
> > > > > > > signaling token.  
> > > > > > 
> > > > > > Yes, transfer to hardware needs to be another piece of logic 
> > > > > > similar to
> > > > > > transfer to wq. And I imagine there are many ways such a transfer to
> > > > > > hardware could work.
> > > > > > 
> > > > > > Unless you have a timeout on it, in which case the WillBeSignalled 
> > > > > > is
> > > > > > satisfied by the fact you have a timeout alone, and the signalling 
> > > > > > that
> > > > > > happens from the irq is just an opportunistic signal from outside 
> > > > > > the
> > > > > > dma fence signalling critical path.
> > > > > 
> > > > > Yes and no. If it deadlocks in the completion WorkItem because of
> > > > > allocations (or any of the forbidden use cases), I think we want to
> > > > > catch that, because that's a sign fences are likely to end up with
> > > > > timeouts when they should have otherwise been signaled properly.
> > > > > 
> > > > > > Well ... unless triggering timeouts can block on GFP_KERNEL
> > > > > > allocations...
> > > > > 
> > > > > I mean, the timeout handler should also be considered a DMA-signalling
> > > > > path, and the same rules should apply to it.
> > > > 
> > > > I guess that's fair. Even with a timeout, you want both to be
> > > > signalling paths.
> > > > 
> > > > I guess more generally, if a fence is signalled by mechanism A or B,
> > > > whichever happens first, you have the choice between:
> > > 
> > > That doesn't happen in practice.
> > > 
> > > For each fence you only have one signaling path you need to guarantee
> > > forward progress for.
> > > 
> > > All other signaling paths are just opportunistic optimizations
> > > which *can* signal the fence, but there is no guarantee that they
> > > will.
> > > 
> > > We used to have some exceptions to that, especially around aborting
> > > submissions, but those turned out to be a really bad idea as well.  
> > > 
> > > Thinking more about it, you should probably enforce that there is
> > > only one guaranteed signaling path for each fence.
> > 
> > I'm not really convinced by this.
> > 
> > First, the timeout path must be a fence signalling path because the
> > reason you have a timeout in the first place is because the hw might
> > never signal the fence. So if the timeout path deadlocks on a
> > kmalloc(GFP_KERNEL) and the hw never comes around to wake you up, boom.
> 
> Mhm, good point. On the other hand, the timeout handling should probably
> be considered part of the normal signaling path.


Why would anyone want to allocate in a timeout path in the first place,
especially for JobQueue?

Timeout -> close the associated ring. Done.
JobQueue will signal the done_fences with -ECANCELED. 

What would the driver even want to allocate in its timeout path, i.e. its
timeout callback?


P.
