There is a race between panthor_device_unplug() and
panthor_device_suspend() which can lead to IRQ handlers running on a
powered-down GPU. It can happen as follows:
- panthor_device_unplug() calls drm_dev_unplug(), marking the device as
  unplugged.
- panthor_device_suspend() can now execute, and will skip a lot of
  important work because the device is marked as unplugged.
- IRQs remain active in this case, so IRQ handlers can still try to
  access the powered-down GPU.

The fix is to take the PM ref in panthor_device_unplug() a little
earlier, before calling drm_dev_unplug(), so the device cannot be
suspended once it has been marked as unplugged.
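
The ordering problem can be illustrated with a small userspace model.
This is only a sketch and not driver code: struct toy_dev and all toy_*
helpers are hypothetical names, and the model assumes that holding a PM
reference keeps the suspend path from running, which is what the fix
relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical fields, not the real panthor structures. */
struct toy_dev {
	bool unplugged;   /* set by the drm_dev_unplug() stand-in */
	bool irqs_active; /* IRQ handlers may still fire */
	bool powered;     /* GPU power state */
	int pm_refs;      /* stand-in for the runtime PM usage count */
};

/* Stand-in for taking a PM reference: the device stays powered. */
static void toy_pm_get(struct toy_dev *d)
{
	d->pm_refs++;
	d->powered = true;
}

/* Stand-in for the suspend path. Assumption: it cannot run while a PM
 * reference is held, and it skips IRQ teardown on an unplugged device.
 */
static void toy_suspend(struct toy_dev *d)
{
	if (d->pm_refs > 0)
		return;
	if (!d->unplugged)
		d->irqs_active = false;
	d->powered = false;
}

/* Runs the unplug sequence with a suspend attempt landing right after
 * the device is marked unplugged. Returns true if, at that point, IRQs
 * are still active while the GPU is powered down.
 */
bool toy_unplug_race(bool take_ref_first)
{
	struct toy_dev d = { .irqs_active = true, .powered = true };
	bool hazard;

	if (take_ref_first)
		toy_pm_get(&d);         /* new ordering: ref before unplug */

	d.unplugged = true;             /* drm_dev_unplug() stand-in */
	toy_suspend(&d);                /* racing suspend attempt */
	hazard = d.irqs_active && !d.powered;

	if (!take_ref_first)
		toy_pm_get(&d);         /* old ordering: ref taken too late */

	return hazard;
}
```

In this model, toy_unplug_race(false) reports the hazard and
toy_unplug_race(true) does not, so assert(toy_unplug_race(false)) and
assert(!toy_unplug_race(true)) both hold.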

Signed-off-by: Ketil Johnsen <[email protected]>
Fixes: 5fe909cae118a ("drm/panthor: Add the device logical block")
---
 drivers/gpu/drm/panthor/panthor_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index 81df49880bd87..962a10e00848e 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -83,6 +83,8 @@ void panthor_device_unplug(struct panthor_device *ptdev)
                return;
        }
 
+       drm_WARN_ON(&ptdev->base, pm_runtime_get_sync(ptdev->base.dev) < 0);
+
        /* Call drm_dev_unplug() so any access to HW blocks happening after
         * that point get rejected.
         */
@@ -93,8 +95,6 @@ void panthor_device_unplug(struct panthor_device *ptdev)
         */
        mutex_unlock(&ptdev->unplug.lock);
 
-       drm_WARN_ON(&ptdev->base, pm_runtime_get_sync(ptdev->base.dev) < 0);
-
        /* Now, try to cleanly shutdown the GPU before the device resources
         * get reclaimed.
         */
-- 
2.47.2
