As reported by Chia-I [1], a race exists between drm_gem_lru_remove()
and drm_gem_lru_scan(), causing a UAF on a stack-allocated object.

The first patch fixes the problem at the panthor level by making
sure we never use drm_gem_lru_remove(). The second one fixes a
previously undetected race between drm_gem_lru_scan() and
drm_gem_object_release(). The third one kills drm_gem_lru_remove()
so no one else relying on the drm_gem_lru infra gets bitten by this
race again. And the last one tries to simplify the locking around
LRU updates so we can solve the chicken/egg problem where the lock
that needs to be acquired is gem->lru->lock, but gem->lru is itself
supposed to be accessed with the lru->lock held.
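
To make the chicken/egg problem more concrete, here is a minimal
userspace sketch of the device-level locking idea. It is only an
illustration under stated assumptions: pthread mutexes stand in for
kernel locks, and every name (toy_dev, toy_lru, toy_obj and the
helpers) is hypothetical, not the actual drm_gem code from patch 4.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the real drm_gem structures. */
struct toy_lru {
	long count;               /* number of objects on this LRU */
};

struct toy_obj {
	struct toy_lru *lru;      /* protected by toy_dev.lru_lock */
};

struct toy_dev {
	pthread_mutex_t lru_lock; /* one lock for every LRU of the device */
};

/*
 * With a per-LRU lock, we would have to read obj->lru to find the lock,
 * but obj->lru may only be read with that very lock held. With a
 * device-level lock, the lock is reachable without touching obj->lru.
 */
static void toy_lru_remove(struct toy_dev *dev, struct toy_obj *obj)
{
	pthread_mutex_lock(&dev->lru_lock);
	if (obj->lru) {
		assert(obj->lru->count > 0);
		obj->lru->count--;
		obj->lru = NULL;
	}
	pthread_mutex_unlock(&dev->lru_lock);
}

/* Move obj to lru, detaching it from its previous LRU if it had one. */
static void toy_lru_move_tail(struct toy_dev *dev, struct toy_lru *lru,
			      struct toy_obj *obj)
{
	pthread_mutex_lock(&dev->lru_lock);
	if (obj->lru)
		obj->lru->count--;
	obj->lru = lru;
	lru->count++;
	pthread_mutex_unlock(&dev->lru_lock);
}
```

In this sketch, obj->lru is only ever read or written with
dev->lru_lock held, so there is no window where the pointer used to
locate the lock can change underneath the caller.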

Note that patches 1, 2 and 3 could be skipped if we go directly for
the approach in patch 4. Panthor wouldn't be impacted because the
shrinker support hasn't landed in Linus' tree yet, so there is no fix
to backport there. We might still want patch 2 so it can easily be
backported (if the bug is deemed important for MSM).

Rob, I'll leave it up to you, but no matter what we decide, I'd really
like to have some fix in before the next merge window.

Liviu, Chia-I, Steve, I've intentionally dropped your R-b on patches 2
and 3 because they changed a bit.

[1] https://gitlab.freedesktop.org/panfrost/linux/-/work_items/86

---
Changes in v2:
- Collect R-b
- Drop a useless obj->lru != NULL check in drm_gem_lru_scan()
- Fix another race introduced in patch 2
- Document why the lru != NULL check done without the lru lock held
  in drm_gem_lru_remove() is safe
- Add a patch to sanitize the GEM LRU locking: the lock is now part of
  drm_device, meaning we don't have this chicken/egg problem where
  the lock that needs to be acquired is gem->lru->lock, but
  gem->lru is itself supposed to be accessed with the lru->lock held
- Fix typos in commit messages and comments
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Boris Brezillon (4):
      drm/panthor: Don't use the racy drm_gem_lru_remove() helper
      drm/gem: Fix a race between drm_gem_lru_scan() and drm_gem_object_release()
      drm/gem: Stop exposing the racy/unsafe drm_gem_lru_remove() helper
      drm/gem: Make the GEM LRU lock part of drm_device

 drivers/gpu/drm/drm_drv.c                |  2 +
 drivers/gpu/drm/drm_gem.c                | 79 +++++++++++++-------------------
 drivers/gpu/drm/msm/msm_drv.c            | 11 ++---
 drivers/gpu/drm/msm/msm_drv.h            |  7 ---
 drivers/gpu/drm/msm/msm_gem.c            | 32 ++++++-------
 drivers/gpu/drm/msm/msm_gem_shrinker.c   |  4 +-
 drivers/gpu/drm/msm/msm_gem_submit.c     |  6 +--
 drivers/gpu/drm/msm/msm_gem_vma.c        | 12 ++---
 drivers/gpu/drm/msm/msm_ringbuffer.c     |  6 +--
 drivers/gpu/drm/panthor/panthor_device.h | 11 ++++-
 drivers/gpu/drm/panthor/panthor_gem.c    | 24 +++++-----
 drivers/gpu/drm/panthor/panthor_mmu.c    | 29 ++++++------
 include/drm/drm_device.h                 |  7 +++
 include/drm/drm_gem.h                    | 21 ++++-----
 14 files changed, 120 insertions(+), 131 deletions(-)
---
base-commit: c006978163fd001fbca55e5fa57bddcf49f47ad9
change-id: 20260506-panthor-shrinker-fixes-58c1f45cfc41

Best regards,
-- 
Boris Brezillon <[email protected]>
