From: Mingyu Wang <[email protected]>

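A deadlock can occur when the CRTC is disabled while the VKMS vblank
hrtimer callback is running. drm_crtc_vblank_off() takes dev->vbl_lock
and then, through the disable_vblank() callback, waits in hrtimer_cancel()
for the timer callback to finish. The callback in turn is waiting for
dev->vbl_lock in drm_crtc_handle_vblank(), so neither side can make
progress (an ABBA-style circular dependency).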

NMI backtraces confirm the deadlock:
CPU 0 (IRQ Context):
 [ <0>] hrtimer_interrupt
 [ <0>] vkms_vblank_simulate
 [ <0>] drm_crtc_handle_vblank
 [ <0>] _raw_spin_lock_irqsave (waiting for dev->vbl_lock)

CPU 2 (Process Context):
 [ <2>] drm_crtc_vblank_off
 [ <2>] vmw_vkms_disable_vblank
 [ <2>] hrtimer_cancel (blocks waiting for timer callback)

This results in a system lockup and RCU stall:
[ 3367.370429] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks
[ 3367.912523] rcu: rcu_preempt kthread starved for 10504 jiffies!

The driver incorrectly calls the blocking hrtimer_cancel() while holding
dev->vbl_lock inside the disable_vblank() callback.

Fix this by using hrtimer_try_to_cancel() in vmw_vkms_disable_vblank().
This callback must remain non-blocking as it is called with dev->vbl_lock
held by the DRM core. Subsequently, call hrtimer_cancel() in
vmw_vkms_crtc_atomic_disable() *after* drm_crtc_vblank_off() has released
the lock. This ensures the timer is safely and synchronously stopped
without inducing a deadlock.

Signed-off-by: Mingyu Wang <[email protected]>
---
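The ordering described in the commit message can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: a pthread
mutex stands in for dev->vbl_lock, a worker thread stands in for the hrtimer
callback, and model_try_to_cancel(), model_cancel(), and run_model() are
hypothetical names that do not exist in the kernel. The model ignores the
return values of the real hrtimer_try_to_cancel().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-ins: a mutex for dev->vbl_lock, a thread for the timer. */
static pthread_mutex_t vbl_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool timer_armed;
static atomic_int ticks;
static pthread_t timer_thread;

/* Models the timer callback: it needs vbl_lock on every tick. */
static void *timer_fn(void *arg)
{
	(void)arg;
	while (atomic_load(&timer_armed)) {
		pthread_mutex_lock(&vbl_lock);
		atomic_fetch_add(&ticks, 1);
		pthread_mutex_unlock(&vbl_lock);
		usleep(1000);
	}
	return NULL;
}

/* Models the non-blocking cancel: request a stop, never wait. */
static void model_try_to_cancel(void)
{
	atomic_store(&timer_armed, false);
}

/* Models the blocking cancel: must NOT be called with vbl_lock held. */
static void model_cancel(void)
{
	atomic_store(&timer_armed, false);
	pthread_join(timer_thread, NULL);
}

/* Runs the disable sequence and returns the final tick count. */
static int run_model(void)
{
	int rc;

	atomic_store(&ticks, 0);
	atomic_store(&timer_armed, true);
	rc = pthread_create(&timer_thread, NULL, timer_fn, NULL);
	assert(rc == 0);

	/* Wait until the "timer" has fired at least once. */
	while (atomic_load(&ticks) == 0)
		usleep(100);

	/* Under the lock: only the non-blocking variant is safe. */
	pthread_mutex_lock(&vbl_lock);
	model_try_to_cancel();
	pthread_mutex_unlock(&vbl_lock);

	/* Lock released: now it is safe to wait for the callback to end. */
	model_cancel();
	return atomic_load(&ticks);
}
```

Build with -pthread and call run_model() from main(). Moving the
pthread_join() into the locked region would reproduce the hang in this model,
because the worker could be blocked on vbl_lock while the caller waits for it
to exit.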
 drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
index 5abd7f5ad2db..96fc856b9e06 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_vkms.c
@@ -305,7 +305,10 @@ vmw_vkms_disable_vblank(struct drm_crtc *crtc)
        if (!vmw->vkms_enabled)
                return;
 
-       hrtimer_cancel(&du->vkms.timer);
+       /*
+        * Non-blocking cancel to avoid ABBA deadlock while holding vbl_lock.
+        */
+       hrtimer_try_to_cancel(&du->vkms.timer);
        du->vkms.surface = NULL;
        du->vkms.period_ns = ktime_set(0, 0);
 }
@@ -390,9 +393,16 @@ vmw_vkms_crtc_atomic_disable(struct drm_crtc *crtc,
                             struct drm_atomic_state *state)
 {
        struct vmw_private *vmw = vmw_priv(crtc->dev);
+       struct vmw_display_unit *du = vmw_crtc_to_du(crtc);
 
-       if (vmw->vkms_enabled)
-               drm_crtc_vblank_off(crtc);
+       if (vmw->vkms_enabled) {
+               drm_crtc_vblank_off(crtc);
+               /*
+                * Synchronously stop the timer after releasing the vbl_lock
+                * to ensure no further callbacks occur.
+                */
+               hrtimer_cancel(&du->vkms.timer);
+       }
 }
 
 static bool
-- 
2.34.1
