There is a race between panthor_device_unplug() and panthor_device_suspend() which can lead to IRQ handlers running on a powered down GPU. This is how it can happen: - unplug routine calls drm_dev_unplug() - panthor_device_suspend() can now execute, and will skip a lot of important work because the device is currently marked as unplugged. - IRQs will remain active in this case and IRQ handlers can therefore try to access a powered down GPU.
The fix is simply to take the PM ref in panthor_device_unplug() a little bit earlier, before drm_dev_unplug(). Signed-off-by: Ketil Johnsen <[email protected]> Fixes: 5fe909cae118a ("drm/panthor: Add the device logical block") --- drivers/gpu/drm/panthor/panthor_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c index 81df49880bd87..962a10e00848e 100644 --- a/drivers/gpu/drm/panthor/panthor_device.c +++ b/drivers/gpu/drm/panthor/panthor_device.c @@ -83,6 +83,8 @@ void panthor_device_unplug(struct panthor_device *ptdev) return; } + drm_WARN_ON(&ptdev->base, pm_runtime_get_sync(ptdev->base.dev) < 0); + /* Call drm_dev_unplug() so any access to HW blocks happening after * that point get rejected. */ @@ -93,8 +95,6 @@ void panthor_device_unplug(struct panthor_device *ptdev) */ mutex_unlock(&ptdev->unplug.lock); - drm_WARN_ON(&ptdev->base, pm_runtime_get_sync(ptdev->base.dev) < 0); - /* Now, try to cleanly shutdown the GPU before the device resources * get reclaimed. */ -- 2.47.2
