Hi Steve,

On 2/2/26 17:09, Steven Price wrote:
On 02/02/2026 11:36, Boris Brezillon wrote:
From: Akash Goel <[email protected]>

This implementation is loosely based on the MSM shrinker, and it relies
on the drm_gpuvm eviction/validation infrastructure.

Right now we only support swapout/eviction, but we could add an extra
flag to specify when buffer content doesn't need to be preserved to
avoid the swapout/swapin dance.

Locking is a bit of a nightmare, but using _trylock() all the way in
the reclaim path seems to make lockdep happy. And yes, we might be
missing opportunities to reclaim when the system is under heavy GPU
load/heavy memory pressure/heavy GPU VM activity, but that's better
than no reclaim at all.
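
Roughly, the pattern looks like this (a simplified sketch only;
panthor_try_reclaim_one() is a made-up name, and the real scan path in the
patch below also has to deal with GPU mappings and VM idleness):

#include <linux/dma-resv.h>
#include <linux/mutex.h>

/* Sketch only: assumes the BO was already deemed reclaimable. */
static bool panthor_try_reclaim_one(struct panthor_gem_object *bo)
{
	bool reclaimed = false;

	/* Never block in the reclaim path: skip contended objects. */
	if (!dma_resv_trylock(bo->base.resv))
		return false;

	if (mutex_trylock(&bo->base.gpuva.lock)) {
		/* Tear down the device mapping and drop the backing pages. */
		panthor_gem_evict_locked(bo);
		reclaimed = true;
		mutex_unlock(&bo->base.gpuva.lock);
	}

	dma_resv_unlock(bo->base.resv);
	return reclaimed;
}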

v2:
- Move gpu_mapped_shared next to the mmapped LRU
- Add a bunch of missing is_[vm_bo,vma]_evicted() tests
- Only test mmap_count to check if a BO is mmapped
- Remove stale comment about shrinker not being a thing
- Allow pin_count to be non-zero in panthor_gem_swapin_locked()
- Fix panthor_gem_sync() to check for BO residency before doing the CPU sync
- Fix the value returned by panthor_gem_shrinker_count() in case some
   memory has been released
- Check drmm_mutex_init() ret code
- Explicitly mention that PANTHOR_GEM_UNRECLAIMABLE is the initial state
   of all BOs

Signed-off-by: Akash Goel <[email protected]>
Co-developed-by: Boris Brezillon <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
---
  drivers/gpu/drm/panthor/panthor_device.c |  11 +-
  drivers/gpu/drm/panthor/panthor_device.h |  73 ++++
  drivers/gpu/drm/panthor/panthor_gem.c    | 460 ++++++++++++++++++++++-
  drivers/gpu/drm/panthor/panthor_gem.h    |  70 ++++
  drivers/gpu/drm/panthor/panthor_mmu.c    | 345 ++++++++++++++++-
  drivers/gpu/drm/panthor/panthor_mmu.h    |   8 +
  6 files changed, 938 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index 54fbb1aa07c5..bc62a498a8a8 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -2,6 +2,7 @@
  /* Copyright 2018 Marty E. Plummer <[email protected]> */
  /* Copyright 2019 Linaro, Ltd, Rob Herring <[email protected]> */
  /* Copyright 2023 Collabora ltd. */
+/* Copyright 2025 ARM Limited. All rights reserved. */
#include <linux/clk.h>
  #include <linux/mm.h>
@@ -122,6 +123,7 @@ void panthor_device_unplug(struct panthor_device *ptdev)
        panthor_sched_unplug(ptdev);
        panthor_fw_unplug(ptdev);
        panthor_mmu_unplug(ptdev);
+       panthor_gem_shrinker_unplug(ptdev);
        panthor_gpu_unplug(ptdev);
        panthor_pwr_unplug(ptdev);
@@ -291,10 +293,14 @@ int panthor_device_init(struct panthor_device *ptdev)
        if (ret)
                goto err_unplug_gpu;
- ret = panthor_mmu_init(ptdev);
+       ret = panthor_gem_shrinker_init(ptdev);
        if (ret)
                goto err_unplug_gpu;
+ ret = panthor_mmu_init(ptdev);
+       if (ret)
+               goto err_unplug_shrinker;
+
        ret = panthor_fw_init(ptdev);
        if (ret)
                goto err_unplug_mmu;
@@ -326,6 +332,9 @@ int panthor_device_init(struct panthor_device *ptdev)
  err_unplug_mmu:
        panthor_mmu_unplug(ptdev);
+err_unplug_shrinker:
+       panthor_gem_shrinker_unplug(ptdev);
+
  err_unplug_gpu:
        panthor_gpu_unplug(ptdev);
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index b6696f73a536..5cba272f9b4d 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -14,6 +14,7 @@
  #include <linux/spinlock.h>
#include <drm/drm_device.h>
+#include <drm/drm_gem.h>
  #include <drm/drm_mm.h>
  #include <drm/gpu_scheduler.h>
  #include <drm/panthor_drm.h>
@@ -178,6 +179,78 @@ struct panthor_device {
        /** @devfreq: Device frequency scaling management data. */
        struct panthor_devfreq *devfreq;
+ /** @reclaim: Reclaim related stuff */
+       struct {
+               /** @reclaim.shrinker: Shrinker instance */
+               struct shrinker *shrinker;
+
+               /** @reclaim.lock: Lock protecting all LRUs */
+               struct mutex lock;
+
+               /**
+                * @reclaim.unused: BOs with unused pages
+                *
+                * Basically all buffers that got mmapped, vmapped or GPU mapped and
+                * then unmapped. There should be no contention on these buffers,
+                * making them ideal to reclaim.
+                */
+               struct drm_gem_lru unused;
+
+               /**
+                * @reclaim.mmapped: mmap()-ed buffers
+                *
+                * Those are relatively easy to reclaim since we don't need user
+                * agreement, we can simply tear down the mapping and let it fault on
+                * the next access.
+                */
+               struct drm_gem_lru mmapped;
+
+               /**
+                * @reclaim.gpu_mapped_shared: shared BO LRU list
+                *
+                * That's the trickiest BO type to reclaim, because it involves
+                * tearing down all mappings in all VMs where this BO is mapped,
+                * which increases the risk of contention and thus decreases the
+                * likelihood of success.
+                */
+               struct drm_gem_lru gpu_mapped_shared;
+
+               /**
+                * @reclaim.vms: VM LRU list
+                *
+                * VMs that have reclaimable BOs only mapped to a single VM are placed
+                * in this LRU. Reclaiming such BOs implies waiting for VM idleness
+                * (no in-flight GPU jobs targeting this VM), meaning we can't reclaim
+                * those if we're in a context where we can't block/sleep.
+                */
+               struct list_head vms;
+
+               /**
+                * @reclaim.gpu_mapped_count: Global counter of pages that are GPU mapped
+                *
+                * Allows us to get the number of reclaimable pages without walking
+                * the vms and gpu_mapped_shared LRUs.
+                */
+               long gpu_mapped_count;
+
+               /**
+                * @reclaim.retry_count: Number of times we ran the shrinker without being
+                * able to reclaim stuff
+                *
+                * Used to stop scanning GEMs when too many attempts were made
+                * without progress.
+                */
+               atomic_t retry_count;
+
+#ifdef CONFIG_DEBUG_FS
+               /**
+                * @reclaim.nr_pages_reclaimed_on_last_scan: Number of pages reclaimed on the last
+                * shrinker scan
+                */
+               unsigned long nr_pages_reclaimed_on_last_scan;
+#endif
+       } reclaim;
+
        /** @unplug: Device unplug related fields. */
        struct {
                /** @lock: Lock used to serialize unplug operations. */
diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
index 26fe4be10a86..7af9285447c3 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.c
+++ b/drivers/gpu/drm/panthor/panthor_gem.c
@@ -2,8 +2,10 @@
  /* Copyright 2019 Linaro, Ltd, Rob Herring <[email protected]> */
  /* Copyright 2023 Collabora ltd. */
  /* Copyright 2025 Amazon.com, Inc. or its affiliates */
+/* Copyright 2025 ARM Limited. All rights reserved. */
#include <linux/cleanup.h>
+#include <linux/debugfs.h>
  #include <linux/dma-buf.h>
  #include <linux/dma-mapping.h>
  #include <linux/err.h>
@@ -12,6 +14,8 @@
#include <drm/drm_debugfs.h>
  #include <drm/drm_file.h>
+#include <drm/drm_gpuvm.h>
+#include <drm/drm_managed.h>
  #include <drm/drm_prime.h>
  #include <drm/drm_print.h>
  #include <drm/panthor_drm.h>
@@ -114,6 +118,103 @@ should_map_wc(struct panthor_gem_object *bo)
        return true;
  }
+static bool is_gpu_mapped(struct panthor_gem_object *bo,
+                         enum panthor_gem_reclaim_state *state)
+{
+       struct drm_gpuvm *vm = NULL;
+       struct drm_gpuvm_bo *vm_bo;
+
+       drm_gem_for_each_gpuvm_bo(vm_bo, &bo->base) {
+               /* Skip evicted GPU mappings. */
+               if (vm_bo->evicted)
+                       continue;
+
+               if (!vm) {
+                       *state = PANTHOR_GEM_GPU_MAPPED_PRIVATE;
+                       vm = vm_bo->vm;
+               } else if (vm != vm_bo->vm) {
+                       *state = PANTHOR_GEM_GPU_MAPPED_SHARED;
+                       break;
+               }
+       }
+
+       return !!vm;
+}
+
+static enum panthor_gem_reclaim_state
+panthor_gem_evaluate_reclaim_state_locked(struct panthor_gem_object *bo)
+{
+       enum panthor_gem_reclaim_state gpu_mapped_state;
+
+       dma_resv_assert_held(bo->base.resv);
+       lockdep_assert_held(&bo->base.gpuva.lock);
+
+       /* If pages have not been allocated, there's nothing to reclaim. */
+       if (!bo->backing.pages)
+               return PANTHOR_GEM_UNRECLAIMABLE;
+
+       /* If memory is pinned, we prevent reclaim. */
+       if (refcount_read(&bo->backing.pin_count))
+               return PANTHOR_GEM_UNRECLAIMABLE;
+
+       if (is_gpu_mapped(bo, &gpu_mapped_state))
+               return gpu_mapped_state;
+
+       if (refcount_read(&bo->cmap.mmap_count))
+               return PANTHOR_GEM_MMAPPED;
+
+       return PANTHOR_GEM_UNUSED;
+}
+
+void panthor_gem_update_reclaim_state_locked(struct panthor_gem_object *bo,
+                                            enum panthor_gem_reclaim_state *old_statep)
+{
+       struct panthor_device *ptdev = container_of(bo->base.dev, struct panthor_device, base);
+       enum panthor_gem_reclaim_state old_state = bo->reclaim_state;
+       enum panthor_gem_reclaim_state new_state;
+       bool was_gpu_mapped, is_gpu_mapped;
+
+       if (old_statep)
+               *old_statep = old_state;
+
+       new_state = panthor_gem_evaluate_reclaim_state_locked(bo);
+       if (new_state == old_state)
+               return;
+
+       was_gpu_mapped = old_state == PANTHOR_GEM_GPU_MAPPED_SHARED ||
+                        old_state == PANTHOR_GEM_GPU_MAPPED_PRIVATE;
+       is_gpu_mapped = new_state == PANTHOR_GEM_GPU_MAPPED_SHARED ||
+                       new_state == PANTHOR_GEM_GPU_MAPPED_PRIVATE;
+
+       if (is_gpu_mapped && !was_gpu_mapped)
+               ptdev->reclaim.gpu_mapped_count += bo->base.size >> PAGE_SHIFT;
+       else if (!is_gpu_mapped && was_gpu_mapped)
+               ptdev->reclaim.gpu_mapped_count -= bo->base.size >> PAGE_SHIFT;
+
+       switch (new_state) {
+       case PANTHOR_GEM_UNUSED:
+               drm_gem_lru_move_tail(&ptdev->reclaim.unused, &bo->base);
+               break;
+       case PANTHOR_GEM_MMAPPED:
+               drm_gem_lru_move_tail(&ptdev->reclaim.mmapped, &bo->base);
+               break;
+       case PANTHOR_GEM_GPU_MAPPED_PRIVATE:
+               panthor_vm_update_bo_reclaim_lru_locked(bo);
+               break;
+       case PANTHOR_GEM_GPU_MAPPED_SHARED:
+               drm_gem_lru_move_tail(&ptdev->reclaim.gpu_mapped_shared, &bo->base);
+               break;
+       case PANTHOR_GEM_UNRECLAIMABLE:
+               drm_gem_lru_remove(&bo->base);
+               break;
+       default:
+               drm_WARN(&ptdev->base, true, "invalid GEM reclaim state (%d)\n", new_state);
+               break;
+       }
+
+       bo->reclaim_state = new_state;
+}
+
  static void
  panthor_gem_backing_cleanup_locked(struct panthor_gem_object *bo)
  {
@@ -157,8 +258,12 @@ static int panthor_gem_backing_pin_locked(struct panthor_gem_object *bo)
                return 0;
ret = panthor_gem_backing_get_pages_locked(bo);
-       if (!ret)
+       if (!ret) {
                refcount_set(&bo->backing.pin_count, 1);
+               mutex_lock(&bo->base.gpuva.lock);
+               panthor_gem_update_reclaim_state_locked(bo, NULL);
+               mutex_unlock(&bo->base.gpuva.lock);
+       }
return ret;
  }
@@ -172,6 +277,9 @@ static void panthor_gem_backing_unpin_locked(struct panthor_gem_object *bo)
                /* We don't release anything when pin_count drops to zero.
                 * Pages stay there until an explicit cleanup is requested.
                 */
+               mutex_lock(&bo->base.gpuva.lock);
+               panthor_gem_update_reclaim_state_locked(bo, NULL);
+               mutex_unlock(&bo->base.gpuva.lock);
        }
  }
@@ -203,9 +311,6 @@ panthor_gem_dev_map_get_sgt_locked(struct panthor_gem_object *bo)
        if (drm_WARN_ON_ONCE(bo->base.dev, !bo->backing.pages))
                return ERR_PTR(-EINVAL);
- /* Pages stay around after they've been allocated. At least that stands
-        * until we add a shrinker.
-        */
        ret = panthor_gem_backing_get_pages_locked(bo);
        if (ret)
                return ERR_PTR(ret);
@@ -534,6 +639,46 @@ void panthor_gem_unpin(struct panthor_gem_object *bo)
                panthor_gem_backing_unpin_locked(bo);
  }
+int panthor_gem_swapin_locked(struct panthor_gem_object *bo)
+{
+       struct sg_table *sgt;
+       int ret;
+
+       dma_resv_assert_held(bo->base.resv);
+
+       if (drm_WARN_ON_ONCE(bo->base.dev, drm_gem_is_imported(&bo->base)))
+               return -EINVAL;
+
+       ret = panthor_gem_backing_get_pages_locked(bo);
+       if (ret)
+               return ret;
+
+       sgt = panthor_gem_dev_map_get_sgt_locked(bo);
+       if (IS_ERR(sgt))
+               return PTR_ERR(sgt);
+
+       return 0;
+}
+
+static void panthor_gem_evict_locked(struct panthor_gem_object *bo)
+{
+       dma_resv_assert_held(bo->base.resv);
+       lockdep_assert_held(&bo->base.gpuva.lock);
+
+       if (drm_WARN_ON_ONCE(bo->base.dev, drm_gem_is_imported(&bo->base)))
+               return;
+
+       if (drm_WARN_ON_ONCE(bo->base.dev, refcount_read(&bo->backing.pin_count)))
+               return;
+
+       if (drm_WARN_ON_ONCE(bo->base.dev, !bo->backing.pages))
+               return;
+
+       panthor_gem_dev_map_cleanup_locked(bo);
+       panthor_gem_backing_cleanup_locked(bo);
+       panthor_gem_update_reclaim_state_locked(bo, NULL);
+}
+
  static struct sg_table *panthor_gem_get_sg_table(struct drm_gem_object *obj)
  {
        struct panthor_gem_object *bo = to_panthor_bo(obj);
@@ -688,6 +833,10 @@ static vm_fault_t blocking_page_setup(struct vm_fault *vmf,
        } else {
                struct page *page = bo->backing.pages[page_offset];
+ mutex_lock(&bo->base.gpuva.lock);
+               panthor_gem_update_reclaim_state_locked(bo, NULL);
+               mutex_unlock(&bo->base.gpuva.lock);
+
                if (mmap_lock_held)
                        ret = insert_page(vmf, page);
                else
@@ -761,7 +910,9 @@ static void panthor_gem_vm_close(struct vm_area_struct *vma)
dma_resv_lock(bo->base.resv, NULL);
        if (refcount_dec_and_test(&bo->cmap.mmap_count)) {
-               /* Nothing to do, pages are reclaimed lazily. */
+               mutex_lock(&bo->base.gpuva.lock);
+               panthor_gem_update_reclaim_state_locked(bo, NULL);
+               mutex_unlock(&bo->base.gpuva.lock);
        }
        dma_resv_unlock(bo->base.resv);
@@ -798,6 +949,7 @@ panthor_gem_alloc_object(uint32_t flags)
        if (!bo)
                return ERR_PTR(-ENOMEM);
+ bo->reclaim_state = PANTHOR_GEM_UNRECLAIMABLE;
        bo->base.funcs = &panthor_gem_funcs;
        bo->flags = flags;
        mutex_init(&bo->label.lock);
@@ -956,6 +1108,7 @@ panthor_gem_sync(struct drm_gem_object *obj, u32 type,
        struct sg_table *sgt;
        struct scatterlist *sgl;
        unsigned int count;
+       int ret;
/* Make sure the range is in bounds. */
        if (offset + size < offset || offset + size > bo->base.size)
@@ -982,9 +1135,21 @@ panthor_gem_sync(struct drm_gem_object *obj, u32 type,
        if (size == 0)
                return 0;
- sgt = panthor_gem_get_dev_sgt(bo);
-       if (IS_ERR(sgt))
-               return PTR_ERR(sgt);
+       ret = dma_resv_lock_interruptible(bo->base.resv, NULL);
+       if (ret)
+               return ret;
+
+       /* If there are no pages, there's no point pulling them back; bail out early. */
+       if (!bo->backing.pages) {
+               ret = 0;
+               goto out_unlock;
+       }
+
+       sgt = panthor_gem_dev_map_get_sgt_locked(bo);
+       if (IS_ERR(sgt)) {
+               ret = PTR_ERR(sgt);
+               goto out_unlock;
+       }
for_each_sgtable_dma_sg(sgt, sgl, count) {
                if (size == 0)
@@ -1028,7 +1193,11 @@ panthor_gem_sync(struct drm_gem_object *obj, u32 type,
                        dma_sync_single_for_cpu(dma_dev, paddr, len, DMA_FROM_DEVICE);
        }
- return 0;
+       ret = 0;
+
+out_unlock:
+       dma_resv_unlock(bo->base.resv);
+       return ret;
  }
/**
@@ -1038,11 +1207,13 @@ panthor_gem_sync(struct drm_gem_object *obj, u32 type,
   */
  void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
  {
+       struct panthor_device *ptdev;
        struct panthor_vm *vm;
if (IS_ERR_OR_NULL(bo))
                return;
+ ptdev = container_of(bo->obj->dev, struct panthor_device, base);
        vm = bo->vm;
        panthor_kernel_bo_vunmap(bo);
@@ -1050,6 +1221,8 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
                    to_panthor_bo(bo->obj)->exclusive_vm_root_gem != panthor_vm_root_gem(vm));
        panthor_vm_unmap_range(vm, bo->va_node.start, bo->va_node.size);
        panthor_vm_free_va(vm, &bo->va_node);
+       if (vm == panthor_fw_vm(ptdev))
+               panthor_gem_unpin(to_panthor_bo(bo->obj));
        drm_gem_object_put(bo->obj);
        panthor_vm_put(vm);
        kfree(bo);
@@ -1098,6 +1271,12 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
        kbo->obj = &bo->base;

+       if (vm == panthor_fw_vm(ptdev)) {
+               ret = panthor_gem_pin(bo);
+               if (ret)
+                       goto err_put_obj;
+       }
+
        panthor_gem_kernel_bo_set_label(kbo, name);
/* The system and GPU MMU page size might differ, which becomes a
@@ -1109,7 +1288,7 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
        size = ALIGN(size, panthor_vm_page_size(vm));
        ret = panthor_vm_alloc_va(vm, gpu_va, size, &kbo->va_node);
        if (ret)
-               goto err_put_obj;
+               goto err_unpin;
ret = panthor_vm_map_bo_range(vm, bo, 0, size, kbo->va_node.start, vm_map_flags);
        if (ret)
@@ -1121,6 +1300,10 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
  err_free_va:
        panthor_vm_free_va(vm, &kbo->va_node);
+err_unpin:
+       if (vm == panthor_fw_vm(ptdev))
+               panthor_gem_unpin(bo);
+
  err_put_obj:
        drm_gem_object_put(&bo->base);
@@ -1129,6 +1312,231 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
        return ERR_PTR(ret);
  }
+static bool can_swap(void)
+{
+       return get_nr_swap_pages() > 0;
+}
+
+static bool can_block(struct shrink_control *sc)
+{
+       if (!(sc->gfp_mask & __GFP_DIRECT_RECLAIM))
+               return false;
+       return current_is_kswapd() || (sc->gfp_mask & __GFP_RECLAIM);
+}
+
+static unsigned long
+panthor_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
+{
+       struct panthor_device *ptdev = shrinker->private_data;
+       unsigned long count;
+
+       /* We currently don't have a flag to tell when the content of a
+        * BO can be discarded.
+        */
+       if (!can_swap())
+               return 0;
+
+       count = ptdev->reclaim.unused.count;
+       count += ptdev->reclaim.mmapped.count;
+
+       if (can_block(sc))
+               count += ptdev->reclaim.gpu_mapped_count;
+
+       return count ? count : SHRINK_EMPTY;
+}
+
+static bool should_wait(enum panthor_gem_reclaim_state reclaim_state)
+{
+       return reclaim_state == PANTHOR_GEM_GPU_MAPPED_PRIVATE ||
+              reclaim_state == PANTHOR_GEM_GPU_MAPPED_SHARED;
+}
+
+bool panthor_gem_try_evict(struct drm_gem_object *obj,
+                          struct ww_acquire_ctx *ticket)

I think this could be static - I don't see any reference outside this
file (other than the header).

+{
+       /*
+        * Track last locked entry for unwinding locks in error and
+        * success paths
+        */
+       struct panthor_gem_object *bo = to_panthor_bo(obj);
+       struct drm_gpuvm_bo *vm_bo, *last_locked = NULL;
+       enum panthor_gem_reclaim_state old_state;
+       int ret = 0;
+
+       /* To avoid a potential lock ordering issue between bo_gpuva and
+        * mapping->i_mmap_rwsem, unmap the pages from the CPU side before
+        * acquiring the bo_gpuva lock. As the bo_resv lock is held, the CPU
+        * page fault handler won't be able to map in the pages whilst
+        * eviction is in progress.
+        */
+       drm_vma_node_unmap(&bo->base.vma_node, bo->base.dev->anon_inode->i_mapping);

There might be an issue here - drm_gem_lru_scan() will have taken the
resv lock. drm_vma_node_unmap() could cause a callback to
panthor_vm_close(). If that ends up dropping the last reference to
bo->cmap.mmap_count, then we'll deadlock attempting to acquire the resv
lock again.

Actually drm_vma_node_unmap() would just invalidate the CPU PTEs.
The CPU mapping won't be removed and so panthor_vm_close() won't get called.
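
FWIW drm_vma_node_unmap() is just a thin wrapper around unmap_mapping_range(),
which zaps the PTEs but leaves the VMA (and thus mmap_count) alone. Quoting
the helper from memory, so worth double-checking against your tree:

/* include/drm/drm_vma_manager.h (from memory, check your tree) */
static inline void drm_vma_node_unmap(struct drm_vma_offset_node *node,
				      struct address_space *file_mapping)
{
	if (drm_mm_node_allocated(&node->vm_node))
		unmap_mapping_range(file_mapping,
				    drm_vma_node_offset_addr(node),
				    drm_vma_node_size(node) << PAGE_SHIFT, 1);
}

vm_ops->close() only runs when the VMA itself is torn down (munmap()/process
exit), not on a PTE zap, so the mmap_count drop shouldn't happen from this path.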


I'm not 100% sure on that, and sadly it seems my test setup has died, so I
can't test that out today.


We have tests that try to trigger an eviction for a CPU-mapped BO, and so far we haven't seen a deadlock problem.


Best regards
Akash


Thanks,
Steve

+
+       /* We take this lock when walking the list to prevent
