On Thu, 2025-04-10 at 14:58 +0200, Christian König wrote: > Am 10.04.25 um 11:24 schrieb Philipp Stanner: > > Nouveau currently relies on the assumption that dma_fences will > > only > > ever get signaled through nouveau_fence_signal(), which takes care > > of > > removing a signaled fence from the list nouveau_fence_chan.pending. > > > > This self-imposed rule is violated in nouveau_fence_done(), where > > dma_fence_is_signaled() (somewhat surprisingly, considering its > > name) > > can signal the fence without removing it from the list. This > > enables > > accesses to already signaled fences through the list, which is a > > bug. > > > > In particular, it can race with nouveau_fence_context_kill(), which > > would then attempt to set an error code on an already signaled > > fence, > > which is illegal. > > > > In nouveau_fence_done(), the call to nouveau_fence_update() already > > ensures to signal all ready fences. Thus, the signaling potentially > > performed by dma_fence_is_signaled() is actually not necessary. > > Ah, I now got what you are trying to do here! But that won't help. > > The problem is it is perfectly valid for somebody external (e.g. > other driver, TTM etc...) to call dma_fence_is_signaled() on a > nouveau fence. > > This will then in turn still signal the fence and leave it on the > pending list and creating the problem you have.
Good to hear – precisely that then is the use case for a dma_fence callback! ^_^ It guarantees that, no matter who signals a fence, no matter at what place, a certain action will always be performed. I can't think of any other mechanism which could guarantee that a signaled fence immediately gets removed from nouveau's pending list, other than the callbacks. But seriously, I don't think that anyone does this currently, nor do I think that anyone could get away with doing it without the entire computer burning down. P. > > Regards, > Christian. > > > > > Replace the call to dma_fence_is_signaled() with > > nouveau_fence_base_is_signaled(). > > > > Cc: <sta...@vger.kernel.org> # 4.10+, precise commit not to be > > determined > > Signed-off-by: Philipp Stanner <pha...@kernel.org> > > --- > > drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c > > b/drivers/gpu/drm/nouveau/nouveau_fence.c > > index 7cc84472cece..33535987d8ed 100644 > > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c > > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c > > @@ -274,7 +274,7 @@ nouveau_fence_done(struct nouveau_fence *fence) > > nvif_event_block(&fctx->event); > > spin_unlock_irqrestore(&fctx->lock, flags); > > } > > - return dma_fence_is_signaled(&fence->base); > > + return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence- > > >base.flags); > > } > > > > static long >