Thank you for your contribution! Sashiko AI review found 2 potential issue(s) 
to consider:
- [High] Double acquisition of the fence spinlock in 
`__dma_fence_enable_signaling` causes an unconditional self-deadlock.
- [High] Acquiring the fence spinlock in `dma_fence_set_deadline` causes 
deadlocks with backend drivers.
--

commit a0ed8f208472e7d15328c05be3bdfe3b20342e3b
Author: Philipp Stanner <[email protected]>

dma-fence: Fix races of fence callbacks versus destructors by locking

This patch modifies dma-fence backend operations to take the fence lock
before executing callbacks. This aims to prevent races between a driver
signaling a fence and freeing its resources while a callback is accessing
driver-specific data.

> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
[ ... ]
> @@ -629,7 +629,8 @@ EXPORT_SYMBOL(dma_fence_free);
>  static bool __dma_fence_enable_signaling(struct dma_fence *fence)
>  {
[ ... ]
>       if (!was_set && ops && ops->enable_signaling) {
>               trace_dma_fence_enable_signal(fence);
>  
> -             if (!ops->enable_signaling(fence)) {
> +             dma_fence_lock_irqsave(fence, flags);
> +             success = ops->enable_signaling(fence);
> +             dma_fence_unlock_irqrestore(fence, flags);
> +             if (!success) {

[Severity: High]
Will this cause an unconditional deadlock?

Looking at the beginning of __dma_fence_enable_signaling(), there is a
dma_fence_assert_held(fence) check, meaning the fence spinlock is already
held when entering this function.

Since standard Linux spinlocks are not recursive, won't attempting to
acquire the same lock a second time here immediately deadlock the CPU?

[ ... ]
> @@ -1020,11 +1024,20 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>  void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
>  {
[ ... ]
> +     dma_fence_lock_irqsave(fence, flags);
> +     if (!dma_fence_is_signaled_locked(fence))
>               ops->set_deadline(fence, deadline);
> +
> +     dma_fence_unlock_irqrestore(fence, flags);
>       rcu_read_unlock();
>  }

[Severity: High]
Could taking the lock around this callback cause deadlocks with backend
drivers?

Historically this callback was invoked unlocked, and multiple backend
implementations (such as drm_sched_fence_set_deadline_finished or
msm_fence_set_deadline) explicitly acquire the fence spinlock internally.

If the core now calls them with the lock already held, it appears they will
deadlock trying to acquire the non-recursive lock again.

-- 
Sashiko AI review · 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to