Re: Opportunistic dma_fence polling

Philipp Stanner Tue, 24 Feb 2026 02:32:21 -0800

On Mon, 2026-02-23 at 12:42 +0100, Christian König wrote:
> Hi Philip,
> 
> I only found this message by coincidence, please make sure to always CC my AMD 
> work email address as well.


You were the direct recipient, in the To: header field :)

> 
> On 2/19/26 12:06, Philipp Stanner wrote:
> > Yo Christian,
> > 
> > I'd like to discuss the dma_fence fast path optimization
> > (ops.is_signaled) again.
> > 
> > As far as I understand by now, the use case is that some drivers will
> > never signal fences; but the consumer of the fence actively polls
> > whether a fence is signaled or not.
> > 
> > Right?
> 
> Close but not 100% right. The semantics are that enable_signaling is only
> called when somebody actively waits for the dma_fence to finish.
> 
> So as long as both userspace and kernel only poll for the fence status,
> enable_signaling is never called and only is_signaled is called.

So you're telling me that enable_signaling enables interrupt-driven
signaling, typically. IOW in some cases you can request that a specific
fence gets signaled the expensive way (interrupt) while polling on the
others.
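
If I understand the split correctly, it could be modeled roughly like
the toy userspace sketch below. None of these names are the kernel's
dma_fence API; toy_fence, its fields and the seqno comparison are
assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model. Hypothetical names, NOT the kernel dma_fence API. */
struct toy_fence;

struct toy_fence_ops {
	bool (*is_signaled)(struct toy_fence *f);      /* cheap status check */
	void (*enable_signaling)(struct toy_fence *f); /* arm the interrupt */
};

struct toy_fence {
	const struct toy_fence_ops *ops;
	bool signaled;             /* sticky once set */
	unsigned int hw_seqno;     /* stand-in for a counter the hardware writes */
	unsigned int wanted_seqno; /* value this fence is waiting for */
	int irq_enables;           /* how often enable_signaling was called */
};

static bool toy_is_signaled(struct toy_fence *f)
{
	return f->hw_seqno >= f->wanted_seqno;
}

static void toy_enable_signaling(struct toy_fence *f)
{
	f->irq_enables++;
}

static const struct toy_fence_ops toy_ops = {
	.is_signaled = toy_is_signaled,
	.enable_signaling = toy_enable_signaling,
};

/* Poll path: only ever calls is_signaled, never enable_signaling. */
static bool toy_fence_poll(struct toy_fence *f)
{
	if (!f->signaled && f->ops->is_signaled && f->ops->is_signaled(f))
		f->signaled = true;
	return f->signaled;
}

/*
 * Wait path, first half: returns true if the fence is already done,
 * otherwise requests interrupt-driven signaling before going to sleep.
 */
static bool toy_fence_wait_prepare(struct toy_fence *f)
{
	assert(f->ops && f->ops->enable_signaling);
	if (toy_fence_poll(f))
		return true;
	f->ops->enable_signaling(f);
	return false;
}

/* Simulated interrupt handler: signals only if the work has completed. */
static void toy_fence_irq(struct toy_fence *f)
{
	if (f->irq_enables > 0 && f->hw_seqno >= f->wanted_seqno)
		f->signaled = true;
}
```

In this model a consumer that only calls toy_fence_poll() never
increments irq_enables, whereas the first toy_fence_wait_prepare() on an
unsignaled fence does.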

What is the hw->hw signaling that the documentation details?

hw->sw signaling seems to refer to interrupts.

> 
> What drivers/fence implementations do with that is up to them. For example 
> userqueues use it as preemption signaling, but most drivers simply try to 
> avoid waking up the system with IRQs.
> 
> > I have a bunch of questions regarding that:
> > 
> >    1. What does the party polling the fence typically look like? I bet
> >       it's not userspace, is it? Userspace I'd expect to use poll() on
> >       a FD, thus an underlying driver has to check the fence somehow.
> 
> No no, that is indeed userspace.

Userspace has no direct access to a fence. Ultimately, it's a kernel
ioctl through which userspace can check a fence. That's what I meant:
it's kernel code implemented in the driver [but running in the user's
process context].

> 
> As soon as the kernel starts to call dma_fence_wait() (for example) we have
> the normal guaranteed-to-signal semantics we always have.
> 
> >    2. What if that party checks the fence and determines it is
> >       unsignaled? Will it then try again later?
> 
> I have no idea, that depends on how the userspace component is implemented.
> 
> >    3. If it tries again later anyway, then what is the problem with
> >       the fence-issuing driver itself checking every 5, 10 or 50
> >       milliseconds what the counter in the GPU ring buffer is, and then
> >       signaling all those fences?
> 
> That you need to wake up for that, which costs quite a lot of power.
> 
> Consider the two different approaches:
> 
> 1. Interrupt driven, e.g. somebody says signal me as soon as possible when 
> the work is done.
> 
> 2. Poll driven, e.g. userspace wakes up every N milliseconds anyway and it 
> doesn't matter if the status changes a bit later.

Makes sense, I guess.

> 
> > So it all revolves around the question of why ops.is_signaled is
> > supposedly unavoidable.
> 
> In addition to the interrupt/poll handling, it is also a really important
> optimization for multicore systems, e.g. it makes the signaling state visible
> to other CPU cores even when the core handling the IRQ is still busy.

What is the "signaling state"?

A fence's signaled status is indicated through an atomic flag which
becomes visible globally once someone, like said interrupt, has
signaled the fence.
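
To illustrate what I mean by an atomic flag, here is a minimal
userspace sketch using C11 atomics. The type and function names are
made up for this mail, and the memory orderings are my assumption of
what such a flag would need; this is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical fence that carries nothing but its signaled flag. */
struct toy_flag_fence {
	atomic_bool signaled;
};

static void toy_flag_fence_init(struct toy_flag_fence *f)
{
	assert(f);
	atomic_init(&f->signaled, false);
}

/*
 * Set the flag. Returns true only for the caller that performed the
 * unsignaled -> signaled transition, so a second signal is a no-op.
 */
static bool toy_flag_fence_signal(struct toy_flag_fence *f)
{
	return !atomic_exchange_explicit(&f->signaled, true,
					 memory_order_acq_rel);
}

/* Any thread on any core can read the flag without taking a lock. */
static bool toy_flag_fence_test(struct toy_flag_fence *f)
{
	return atomic_load_explicit(&f->signaled, memory_order_acquire);
}
```

Once toy_flag_fence_signal() has run on one core, a later
toy_flag_fence_test() on any other core observes the flag as set.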


P.

> 
> That is also really important for some use cases as far as I know. Keep in
> mind that this framework drives everything from Android mobiles all the way
> up to supercomputers.
> 
> I mean what we could potentially do is to fix the locking invariant of the 
> is_signaled callback, but that is probably the only simplification possible 
> without breaking tons of use cases.
> 
> Regards,
> Christian.
> 
> > 
> > Regards
> > P.
>
