On 24/10/2025 08:50, Philipp Stanner wrote:
To decouple the dma_fence_ops lifetime from dma_fences lifetime RCU
support was added to said function, coupled with using the signaled bit
to detect whether the fence_ops might be gone already.
When implementing that a wrong string was set as a default return
parameter, indicating that every driver whose fence is already signalled
must be detached, which is frankly wrong.
Depends on how you look at it. After being signaled fence has to be
detached from the driver. Ie. nothing belonging to this driver must be
accessed via the fence.
I started with names and Christian has recently continued with ops.
Reported-by: Danilo Krummrich <[email protected]>
Fixes: 506aa8b02a8d ("dma-fence: Add safe access helpers and document the
rules")
Signed-off-by: Philipp Stanner <[email protected]>
---
When this was merged, it sadly slipped by me. I think this entire RCU
mechanism was / is an overengineered idea.
If we look at who actually uses dma_fence_driver_name() and
dma_fence_timeline_name() – functions from which the largest share of
the fence_ops vs. fence lifetime issue stems from – we discover that
there is a single user:
i915.
Not quite. The trigger event for fixing this was actually xe where use
after free was achievable with a trivial set of userspace steps. See
reproducer at:
https://lore.kernel.org/igt-dev/[email protected]/
Essentially any fence exporter whose fence can be exported either
directly via sync file, or via syncobj to sync file export, and has
state accessible via the fence ops which may be freed after the fence is
signalled, or if the driver can be unbound from the device and unloaded,
is vulnerable.
Isn't that driver even deprecated?
Not exactly. It is not getting support for new hardware generations,
while the new driver is not supporting old. There is a cut off point and
an overlap of around one generation. Although I am not even sure this
overlap is officially supported by Intel.
I think the better thing to do is: remove these functions alltogether,
or at least deprecate them. Then the only lifetime issue left so solve
is the callback functions.
That would be nice, I also do not see much value in exporting names to
userspace. But first more conversation around breaking the sync file ABI
needs to happen. I think we had a little bit of it when changing the
names of signalled fences and thinking was existing tools which look at
the names will mostly survive it. Not sure if they would if unsignalled
names would change.
P.
---
drivers/dma-buf/dma-fence.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 3f78c56b58dc..1875a0abebd3 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -1111,7 +1111,7 @@ const char __rcu *dma_fence_driver_name(struct dma_fence
*fence)
if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
return fence->ops->get_driver_name(fence);
else
- return "detached-driver";
+ return "driver-whose-fence-is-already-signalled";
IMHO unnecessarily verbose and whether or not changing it to anything
different warrants a Fixes: tag is debatable.
Regards,
Tvrtko
}
EXPORT_SYMBOL(dma_fence_driver_name);