On Wed, 11 Feb 2026 08:16:38 +0000
Alice Ryhl <[email protected]> wrote:
> On Tue, Feb 10, 2026 at 03:50:25PM +0100, Boris Brezillon wrote:
> > On Tue, 10 Feb 2026 14:11:12 +0000
> > Alice Ryhl <[email protected]> wrote:
> >
> > > On Tue, Feb 10, 2026 at 02:51:56PM +0100, Boris Brezillon wrote:
> > > > On Tue, 10 Feb 2026 13:26:48 +0000
> > > > Alice Ryhl <[email protected]> wrote:
> > > >
> > > > > On Tue, Feb 10, 2026 at 01:49:13PM +0100, Boris Brezillon wrote:
> > > > > > On Tue, 10 Feb 2026 10:15:04 +0000
> > > > > > Alice Ryhl <[email protected]> wrote:
> > > > > >
> > > > > > > /// The owner of this value must ensure that this fence is
> > > > > > > /// signalled.
> > > > > > > struct MustBeSignalled<'fence> { ... }
> > > > > > > /// Proof value indicating that the fence has either already been
> > > > > > > /// signalled, or it will be. The lifetime ensures that you
> > > > > > > /// cannot mix up the proof value.
> > > > > > > struct WillBeSignalled<'fence> { ... }
> > > > > >
> > > > > > Sorry, I have more questions, unfortunately. Seems that
> > > > > > {Must,Will}BeSignalled are targeting specific fences (at least
> > > > > > that's what the doc and 'fence lifetime say), but in practice,
> > > > > > the WorkItem backing the scheduler can queue 0-N jobs (0 if no
> > > > > > jobs have their deps met, and N > 1 if more than one job is
> > > > > > ready). Similarly, an IRQ handler can signal 0-N fences (can be
> > > > > > that the IRQ has nothing to do with job completion, or, it can
> > > > > > be that multiple jobs have completed). How is this
> > > > > > MustBeSignalled object going to be instantiated in practice if
> > > > > > it's done before the DmaFenceWorkItem::run() function is called?
> > > > >
> > > > > The {Must,Will}BeSignalled closure pair needs to wrap the piece of
> > > > > code that ensures a specific fence is signalled.
> > > > > If you have code that
> > > > > manages a collection of fences and invokes code for specific fences
> > > > > depending on outside conditions, then that's a different matter.
> > > > >
> > > > > After all, transfer_to_wq() has two components:
> > > > > 1. Logic to ensure any spawned workqueue job eventually gets to run.
> > > > > 2. Once the individual job runs, logic specific to the one fence
> > > > >    ensures that this one fence gets signalled.
> > > >
> > > > Okay, that's a change compared to how things are modeled in C (and in
> > > > JobQueue) at the moment: the WorkItem is not embedded in a specific
> > > > job, it's something that's attached to the JobQueue. The idea being
> > > > that the WorkItem represents a task to be done on the queue itself
> > > > (check if the first element in the queue is ready for execution), not
> > > > on a particular job. Now, we could change that and have a per-job
> > > > WorkItem, but ultimately, we'll have to make sure jobs are dequeued
> > > > in order (deps on JobN can be met before deps on Job0, but we still
> > > > want JobN to be submitted after Job0), and we'd pay the WorkItem
> > > > overhead once per Job instead of once per JobQueue. Probably not the
> > > > end of the world, but it's worth considering, still.
> > >
> > > It sounds like the fix here is to have transfer_to_job_queue() instead
> > > of trying to do it at the workqueue level.
> >
> > Hm, so Job would be something like that (naming/trait-def are just
> > suggestions to get the discussion going):
> >
> > trait JobConsumer {
> >     type FenceType;
> >     type JobData;
> >
> >     fn run(self: MustBeSignalled<T::FenceType>) ->
> >         Result<WillBeSignaled<Self::FenceType>>;
> > }
> >
> > struct Job<T: JobConsumer> {
> >     fence: MustBeSignalled<T::FenceType>,
> >     data: T::JobData,
> > }
>
> The fence field of Job would be PublishedFence or PrivateFence (or
> just DriverDmaFence).
> The MustBeSignalled/WillBeSignaled types should only exist temporarily
> in a function scope.
>
> Any time you transfer from one function scope to another (like our
> transfer_to_job_queue() or transfer_to_wq() examples), that results in
> finishing the MustBeSignalled/WillBeSignaled scope on one thread and
> creating a new MustBeSignalled/WillBeSignaled scope on another thread.

Makes sense.

> One could imagine a model where there is no lifetime and you can carry
> it around as you wish. That model works okay in most regards, but it
> gives up the ability to ensure that dma_fence_lockdep_map is properly
> configured to catch mistakes.
>
> The lifetime prohibits you from using the normal ownership semantics to
> e.g. transfer the MustBeSignalled into a random workqueue, enforcing
> that you can only transfer it into a workqueue by using the provided
> methods, which sets up the lockdep dependencies correctly and ensures
> that dma_fence_lockdep_map is taken in the workqueue job too.
>
> > I guess that would do.
> >
> > And then we need to flag the WorkItem that's exposed by the
> > JobQueue as a DmaFenceWorkItem so that
> > bindings::dma_fence_begin_signalling() is called before entry and
> > lockdep can do its job and check that nothing forbidden happens in
> > this WorkItem.
>
> In the case of JobQueue, it may make sense to just have the job queue
> implementation do that manually. I do not think the workqueue-level API
> can fully enforce that the job queue can't make mistakes here.
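To make the scoped-token model above concrete, here is a rough sketch. Every name and signature in it is illustrative only (a toy stand-in, not the proposed kernel API, and the "fence" is a plain bool rather than a real dma_fence): the point is just that the proof value can only be produced by consuming the obligation value, so the closure cannot return without discharging it.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

/// Illustrative stand-in for a driver fence; not the real kernel type.
pub struct Fence {
    signalled: Cell<bool>,
}

/// Obligation token: whoever holds this must ensure the borrowed fence
/// is signalled. The lifetime ties the token to one fence borrow.
pub struct MustBeSignalled<'fence> {
    fence: &'fence Fence,
}

/// Proof token: the corresponding fence has been (or will be) signalled.
pub struct WillBeSignalled<'fence> {
    _brand: PhantomData<&'fence Fence>,
}

impl Fence {
    pub fn new() -> Self {
        Fence { signalled: Cell::new(false) }
    }

    /// Scope the obligation to a closure: the only way to obtain the
    /// proof token the closure must return is to consume the obligation
    /// token, so "forgot to signal" becomes a type error instead of a
    /// runtime bug.
    pub fn with_signalling_scope<'a, R>(
        &'a self,
        f: impl FnOnce(MustBeSignalled<'a>) -> (WillBeSignalled<'a>, R),
    ) -> R {
        let (_proof, ret) = f(MustBeSignalled { fence: self });
        ret
    }
}

impl<'fence> MustBeSignalled<'fence> {
    /// Discharge the obligation by signalling, yielding the proof.
    pub fn signal(self) -> WillBeSignalled<'fence> {
        self.fence.signalled.set(true);
        WillBeSignalled { _brand: PhantomData }
    }
}
```

Note this simplified brand is per fence borrow, not per fence instance; it is only meant to show why the tokens existing "temporarily in a function scope" falls out of the lifetime.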
Sure, it can't ensure the fence will be signaled in finite time (like,
you can't prevent an infinite loop, or jobs being dropped on the floor
without having their fences signaled), but it could at least check that
no prohibited ops (blocking allocs, prohibited locks taken, etc.) are
done in the common JobQueue implementation if we introduce some sort of
MaySignalDmaFencesWorkItem abstraction (call it what you want, I just
made the name super explicit for clarity) that does the annotation
around ::run(), and then have JobQueue use this instead of a regular
WorkItem.

Note that we'll need this MaySignalDmaFencesWorkItem (which is
basically the thing I've been describing in my previous replies) in Tyr
if we want to mimic Panthor, because the IRQHandler doesn't directly
signal the job fences. We have an extra work item that's scheduled when
we receive FW events related to scheduling, and we do the fence
signalling from a workqueue context after having demuxed all the events
and extracted which GPU context made progress.

Of course, devs can very much call some
{Driver,Published}DmaFence::may_signal_fences() (which would return
this Guard we were discussing) manually in their WorkItem::run()
implementation. But then it becomes an explicit operation, with the
risk of forgetting (intentionally or not) to flag those sections as
being part of the signalling path. If we make it an explicit object,
with a dedicated DmaFenceWorkqueue abstraction preventing anything but
MaySignalDmaFencesWorkItem from being scheduled on it, all of a sudden
this explicit model becomes implicit, and it strengthens the
requirements, which is a good thing, I think.

> > > > > And {Must,Will}BeSignalled exists to help model part (2.). But
> > > > > what you described with the IRQ callback falls into (1.)
> > > > > instead, which is outside the scope of {Must,Will}BeSignalled
> > > > > (or at least requires more complex APIs).
> > > >
> > > > For IRQ callbacks, it's not just about making sure they run, but
> > > > also making sure nothing in there can lead to deadlocks, which is
> > > > basically #2, except it's not scoped to a particular fence. It's
> > > > just a "fences can be signaled from there" marker. We could
> > > > restrict it to "fences of this particular implementation can be
> > > > signaled from there" but not "this particular fence instance will
> > > > be signaled next, if any", because that we don't know until we've
> > > > walked some HW state to figure out which job is complete and thus
> > > > which fence we need to signal (the interrupt we get is most likely
> > > > multiplexing completion on multiple GPU contexts, so before we can
> > > > even get to our per-context in-flight-jobs FIFO, we need to demux
> > > > this thing).
> > >
> > > All I can say is that this is a different use-case for the C API
> > > dma_fence_begin_signalling(). This different usage also seems
> > > useful, but it would be one that does not involve
> > > {Must,Will}BeSignalled arguments at all.
> > >
> > > After all, dma_fence_begin_signalling() only requires those
> > > arguments if you want to convert a PrivateFence into a
> > > PublishedFence. (I guess a better name is PublishableFence.) If
> > > you're not trying to prove that a specific fence will be signalled,
> > > then you don't need the {Must,Will}BeSignalled arguments.
> >
> > Okay, so that would be another function returning some sort of guard
> > then? What I find confusing is the fact
> > dma_fence::dma_fence_begin_signalling() matches the C function name
> > which is not per-fence, but just this lock-guard model flagging a
> > section from which any fence can be signalled, so maybe we should
> > name your dma_fence_begin_signalling() proposal differently, dunno.
>
> Yes, we would need multiple methods that call
> dma_fence_begin_signalling() depending on why you are calling it.

Ack.
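For the MaySignalDmaFencesWorkItem idea discussed above, a hedged sketch of the shape it could take follows. The names are the placeholder ones from this thread, and the guard is a toy depth counter standing in for the lockdep annotation that dma_fence_begin_signalling()/dma_fence_end_signalling() perform in C; none of this is an existing kernel API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Toy stand-in for the lockdep state; in the kernel this would be the
// dma_fence_lockdep_map machinery, not a counter.
static SIGNALLING_DEPTH: AtomicUsize = AtomicUsize::new(0);

/// Guard marking a "fences may be signalled from here" section.
pub struct SignallingGuard(());

impl SignallingGuard {
    fn enter() -> Self {
        SIGNALLING_DEPTH.fetch_add(1, Ordering::SeqCst);
        SignallingGuard(())
    }
}

impl Drop for SignallingGuard {
    fn drop(&mut self) {
        SIGNALLING_DEPTH.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Work items that may signal fences implement only the inner hook; the
/// guard parameter proves the annotation is active for the whole run.
pub trait MaySignalDmaFencesWorkItem {
    fn run(&mut self, guard: &SignallingGuard);
}

/// The (hypothetical) queue side: every dispatch is bracketed by the
/// guard, so a driver cannot forget the annotation, which is what makes
/// the explicit model implicit.
pub fn dispatch(item: &mut dyn MaySignalDmaFencesWorkItem) {
    let guard = SignallingGuard::enter();
    item.run(&guard);
    // `guard` is dropped here: the annotated section ends exactly when
    // the work item returns.
}
```

A DmaFenceWorkqueue abstraction would then only accept types implementing MaySignalDmaFencesWorkItem, so scheduling a plain WorkItem on it is rejected at compile time rather than caught (or missed) at runtime.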
