On Tue, 10 Feb 2026 13:26:48 +0000
Alice Ryhl <[email protected]> wrote:

> On Tue, Feb 10, 2026 at 01:49:13PM +0100, Boris Brezillon wrote:
> > On Tue, 10 Feb 2026 10:15:04 +0000
> > Alice Ryhl <[email protected]> wrote:
> >   
> > > /// The owner of this value must ensure that this fence is signalled.
> > > struct MustBeSignalled<'fence> { ... }
> > > /// Proof value indicating that the fence has either already been
> > > /// signalled, or it will be. The lifetime ensures that you cannot mix
> > > /// up the proof value.
> > > struct WillBeSignalled<'fence> { ... }  
> > 
> > Sorry, I have more questions, unfortunately. Seems that
> > {Must,Will}BeSignalled are targeting specific fences (at least that's
> > what the doc and 'fence lifetime say), but in practice, the WorkItem
> > backing the scheduler can queue 0-N jobs (0 if no jobs have their deps
> > met, and N > 1 if more than one job is ready). Similarly, an IRQ
> > handler can signal 0-N fences (the IRQ may have nothing to do with
> > job completion, or multiple jobs may have completed). How
> > is this MustBeSignalled object going to be instantiated in practice if
> > it's done before the DmaFenceWorkItem::run() function is called?  
> 
> The {Must,Will}BeSignalled closure pair needs to wrap the piece of code
> that ensures a specific fence is signalled. If you have code that
> manages a collection of fences and invokes code for specific fences
> depending on outside conditions, then that's a different matter.
> 
> After all, transfer_to_wq() has two components:
> 1. Logic to ensure any spawned workqueue job eventually gets to run.
> 2. Once the individual job runs, logic specific to the one fence ensures
>    that this one fence gets signalled.
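
For reference, here is a minimal userspace sketch of how I read the
closure pair in part (2.). Only the two struct names come from the
proposal; Fence, signalling_scope(), signal() and the invariant-lifetime
brand are assumptions made up for illustration, not the proposed kernel
API.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

/// Makes `'fence` invariant, so proofs from different scopes cannot be
/// unified through subtyping.
type Invariant<'a> = PhantomData<fn(&'a ()) -> &'a ()>;

/// Toy stand-in for a fence (hypothetical, not the kernel type).
pub struct Fence {
    signalled: Cell<bool>,
}

impl Fence {
    pub fn new() -> Self {
        Fence { signalled: Cell::new(false) }
    }

    pub fn is_signalled(&self) -> bool {
        self.signalled.get()
    }
}

/// The owner of this value must ensure that this fence is signalled.
pub struct MustBeSignalled<'fence> {
    fence: &'fence Fence,
    _brand: Invariant<'fence>,
}

/// Proof value: the fence has been signalled, or it will be.
pub struct WillBeSignalled<'fence> {
    _brand: Invariant<'fence>,
}

impl<'fence> MustBeSignalled<'fence> {
    /// Consumes the obligation and hands back the matching proof.
    pub fn signal(self) -> WillBeSignalled<'fence> {
        self.fence.signalled.set(true);
        WillBeSignalled { _brand: PhantomData }
    }
}

/// Hypothetical entry point. The closure is generic over `'fence`, so the
/// only proof it can return normally is one derived from the obligation
/// it was handed.
pub fn signalling_scope<F>(fence: &Fence, body: F)
where
    F: for<'fence> FnOnce(MustBeSignalled<'fence>) -> WillBeSignalled<'fence>,
{
    let _proof = body(MustBeSignalled { fence, _brand: PhantomData });
}
```

Calling signalling_scope(&fence, |must| must.signal()) type-checks, while
a closure that tries to return a proof obtained for a different fence
should be rejected by the borrow checker. Note that this covers exactly
one fence per scope, which is what the questions below are about.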

Okay, that's a change compared to how things are modeled in C (and in
JobQueue) at the moment: the WorkItem is not embedded in a specific
job, it's something that's attached to the JobQueue. The idea being
that the WorkItem represents a task to be done on the queue itself
(check if the first element in the queue is ready for execution), not on
a particular job. Now, we could change that and have a per-job WorkItem,
but ultimately, we'll have to make sure jobs are dequeued in order
(deps on JobN can be met before deps on Job0, but we still want JobN to
be submitted after Job0), and we'd pay the WorkItem overhead once per
Job instead of once per JobQueue. Probably not the end of the world,
but it's worth considering, still.
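
To make the per-queue model concrete, here is a rough userspace sketch
of what the queue-level work item does. All names (Job, JobQueue,
deps_met, run_work_item) are hypothetical stand-ins, not the actual
JobQueue code; deps_met stands in for "all dependency fences have
signalled".

```rust
use std::collections::VecDeque;

/// Toy job (hypothetical). `deps_met` stands in for "all dependency
/// fences of this job have signalled".
pub struct Job {
    pub id: u32,
    pub deps_met: bool,
}

/// Toy queue. One work item serves the whole queue, not a single job.
pub struct JobQueue {
    pending: VecDeque<Job>,
}

impl JobQueue {
    pub fn new() -> Self {
        JobQueue { pending: VecDeque::new() }
    }

    pub fn push(&mut self, job: Job) {
        self.pending.push_back(job);
    }

    /// Marks the dependencies of job `id` as met, if it is still queued.
    pub fn mark_deps_met(&mut self, id: u32) {
        for job in self.pending.iter_mut() {
            if job.id == id {
                job.deps_met = true;
            }
        }
    }

    /// Body of the per-queue work item: submit jobs from the head of the
    /// queue for as long as the head is ready. Returns the submitted job
    /// ids in submission order, which can be anything from 0 to N jobs.
    pub fn run_work_item(&mut self) -> Vec<u32> {
        let mut submitted = Vec::new();
        while self.pending.front().map_or(false, |job| job.deps_met) {
            let job = self.pending.pop_front().expect("front() was Some");
            submitted.push(job.id);
        }
        submitted
    }
}
```

With Job0 not ready and Job1 ready, run_work_item() submits nothing;
once Job0's deps are met, a single run submits both, in order. A per-job
WorkItem would need some equivalent of this head-of-queue check to keep
the ordering.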

> And {Must,Will}BeSignalled exists to help model part (2.). But what you
> described with the IRQ callback falls into (1.) instead, which is
> outside the scope of {Must,Will}BeSignalled (or at least requires more
> complex APIs).

For IRQ callbacks, it's not just about making sure they run, but also
making sure nothing in there can lead to deadlocks, which is basically
#2, except it's not scoped to a particular fence. It's just a "fences
can be signaled from there" marker. We could restrict it to "fences of
this particular implementation can be signaled from there" but not
"this particular fence instance will be signaled next, if any", because
we don't know that until we've walked some HW state to figure out which
job is complete and thus which fence we need to signal (the interrupt
we get is most likely multiplexing completion on multiple GPU contexts,
so before we can even get to our per-context in-flight-jobs FIFO, we
need to demux this thing).
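
Roughly, what I have in mind for the IRQ path looks like the userspace
sketch below. Everything in it is an assumption for illustration: the
SignallingSection marker, the bitmask layout of the interrupt status
(bit i means context i made progress, at most 32 contexts), and the
per-context "last completed seqno" read back from HW state. The point
is only that the marker is scoped to the handler, not to one fence.

```rust
use std::collections::VecDeque;

/// Marker meaning "fences may be signalled from here". It is not tied to
/// any particular fence instance (hypothetical type).
pub struct SignallingSection {
    _private: (),
}

impl SignallingSection {
    pub fn enter() -> Self {
        SignallingSection { _private: () }
    }
}

/// Toy fence identified by a per-context sequence number.
pub struct Fence {
    seqno: u64,
}

impl Fence {
    /// Signalling requires proof that we are inside a signalling section.
    pub fn signal(self, _section: &SignallingSection) -> u64 {
        self.seqno
    }
}

/// Toy device with one in-flight FIFO per GPU context.
pub struct Gpu {
    contexts: Vec<VecDeque<Fence>>,
}

impl Gpu {
    pub fn new(num_contexts: usize) -> Self {
        Gpu { contexts: (0..num_contexts).map(|_| VecDeque::new()).collect() }
    }

    pub fn submit(&mut self, ctx_id: usize, seqno: u64) {
        self.contexts[ctx_id].push_back(Fence { seqno });
    }

    /// Demuxes one interrupt. `completed[i]` is the last completed seqno
    /// of context i and must have one entry per context. Returns the
    /// (context, seqno) pairs signalled, which can be 0 to N fences.
    pub fn handle_irq(&mut self, irq_status: u32, completed: &[u64]) -> Vec<(usize, u64)> {
        let section = SignallingSection::enter();
        let mut signalled = Vec::new();
        for (ctx_id, fifo) in self.contexts.iter_mut().enumerate() {
            if irq_status & (1u32 << ctx_id) == 0 {
                continue; // nothing happened on this context
            }
            let done = completed[ctx_id];
            while fifo.front().map_or(false, |f| f.seqno <= done) {
                let fence = fifo.pop_front().expect("front() was Some");
                signalled.push((ctx_id, fence.signal(&section)));
            }
        }
        signalled
    }
}
```

Which fences get signalled is only known inside the loop, after the
demux, so a per-fence MustBeSignalled could not be created before the
handler starts running.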
