As reported by Chia-I [1], a race exists between drm_gem_lru_remove() and drm_gem_lru_scan(), causing a UAF on a stack-allocated object.
The first patch fixes the problem at the panthor level by making sure we never use drm_gem_lru_remove(). The second one fixes an undetected race between drm_gem_lru_scan() and drm_gem_object_release(). The third one kills drm_gem_lru_remove() so no one else relying on the drm_gem_lru infra gets bitten by this race again. And the last one tries to simplify the locking around LRU updates so we can solve the chicken/egg problem where the lock that needs to be acquired is gem->lru->lock, and gem->lru is itself supposed to be accessed with the lru->lock held.

Note that patches 1, 2 and 3 could be skipped if we go directly for the approach in patch 4. Panthor wouldn't be impacted because its shrinker support hasn't landed in Linus' tree yet, so there is no fix to backport there. We might still want patch 2 so it can easily be backported (if the bug is deemed important for MSM).

Rob, I'll leave it up to you, but no matter what we decide, I'd really like to have some fix in before the next merge window.

Liviu, Chia-I, Steve, I've intentionally dropped your R-b on patches 2 and 3 because they changed a bit.

[1] https://gitlab.freedesktop.org/panfrost/linux/-/work_items/86

---
Changes in v2:
- Collect R-b
- Drop a useless obj->lru != NULL check in drm_gem_lru_scan()
- Fix another race introduced in patch 2
- Document why the lru != NULL check done without the lru lock held in drm_gem_lru_remove() is safe
- Add a patch to sanitize the GEM LRU locking: the lock is now part of drm_device, meaning we don't have this chicken/egg problem where the lock that needs to be acquired is gem->lru->lock, and gem->lru is itself supposed to be accessed with the lru->lock held
- Fix typos in commit messages and comments
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Boris Brezillon (4):
      drm/panthor: Don't use the racy drm_gem_lru_remove() helper
      drm/gem: Fix a race between drm_gem_lru_scan() and drm_gem_object_release()
      drm/gem: Stop exposing the racy/unsafe drm_gem_lru_remove() helper
      drm/gem: Make the GEM LRU lock part of drm_device

 drivers/gpu/drm/drm_drv.c                |  2 +
 drivers/gpu/drm/drm_gem.c                | 79 +++++++++++++-------------------
 drivers/gpu/drm/msm/msm_drv.c            | 11 ++---
 drivers/gpu/drm/msm/msm_drv.h            |  7 ---
 drivers/gpu/drm/msm/msm_gem.c            | 32 ++++++-------
 drivers/gpu/drm/msm/msm_gem_shrinker.c   |  4 +-
 drivers/gpu/drm/msm/msm_gem_submit.c     |  6 +--
 drivers/gpu/drm/msm/msm_gem_vma.c        | 12 ++---
 drivers/gpu/drm/msm/msm_ringbuffer.c     |  6 +--
 drivers/gpu/drm/panthor/panthor_device.h | 11 ++++-
 drivers/gpu/drm/panthor/panthor_gem.c    | 24 +++++-----
 drivers/gpu/drm/panthor/panthor_mmu.c    | 29 ++++++------
 include/drm/drm_device.h                 |  7 +++
 include/drm/drm_gem.h                    | 21 ++++-----
 14 files changed, 120 insertions(+), 131 deletions(-)
---
base-commit: c006978163fd001fbca55e5fa57bddcf49f47ad9
change-id: 20260506-panthor-shrinker-fixes-58c1f45cfc41

Best regards,
--
Boris Brezillon <[email protected]>
