+Christian, Alyssa and Faith, as suggested by Sima. I'll make sure to Cc you on v4, but before that, I'd like to get your opinion on the drm-gem/drm-gem-shmem changes to see if sending a v4 is actually desirable or if I should go back to the drawing board.
On Fri, 4 Apr 2025 11:26:26 +0200
Boris Brezillon <boris.brezil...@collabora.com> wrote:

> This patch series is a proposal for implementing sparse page allocations
> for shmem objects. It was initially motivated by a kind of BO managed by
> the Panfrost/Panthor and Lima drivers, the tiler heap, which grows on
> demand every time the GPU faults on a virtual address within the heap BO
> range.
>
> Because keeping a struct page pointer array that can describe the entire
> virtual range is wasteful when only a few backing pages have been
> allocated, we thought a sparse allocation approach with xarrays was a
> more efficient choice.
>
> This sparse GEM shmem feature takes the form of a new optional
> drm_gem_shmem_sparse_backing object that can be attached to a
> drm_gem_shmem_object at creation time if the driver wants. When this
> sparse GEM shmem backing mode is enabled, the driver is allowed to
> partially populate the GEM pages, either at use time or at fault
> time. The page population can be done progressively as the need for
> more memory arises. The new APIs take explicit gfp_t flags to solve
> a long-standing issue reported by Sima a while ago: allocations
> happening in the fault handler path shouldn't block.
>
> We also introduce two new helpers at the drm_gem.{c,h} level to
> automate the partial xarray population which the GEM-SHMEM logic
> relies on to populate its sparse page array.
>
> A few details about the implementation now:
>
> - Sparse GEM backing locking is deferred to the driver, because
>   we can't take the resv locks in the GPU fault handler path, and
>   since the page population can come from there, we have to let
>   the driver decide how to lock.
> - The pin/get semantics for sparse GEM shmem objects are different,
>   in that they don't populate the pages, but just flag them as
>   being used/required by some GPU component. The page population
>   will happen explicitly at GEM creation time or when a GPU fault
>   happens.
>   Once pages have been populated, they won't disappear
>   until the GEM object is destroyed, purged or evicted. This means
>   you can grow, but not shrink. If we ever need to discard
>   BO ranges, the API can be extended to allow it, but it's not
>   something we needed for the Lima/Panthor/Panfrost case.
> - Panthor and Panfrost changes have been tested, but not extensively.
>   Lima changes have only been compile-tested.
>
> Changes from v2:
> - We decided to focus more on the DRM aspects and forget about the
>   sgt building optimizations (sgt helpers taking an xarray instead of
>   a plain array). If the xarray -> array copies happening in that
>   path ever become the bottleneck, we can resurrect those changes.
> - We decided to make the sparse GEM changes less invasive by making
>   them an extra layer on top of drm_gem_shmem_object. What this means
>   is that sparse GEM shmem objects can now be used as regular GEM shmem
>   objects (vmap, pin, export, ... all work as they would on a regular
>   GEM). When that happens, we just force all pages to be populated,
>   so we kinda lose the benefit of sparse GEMs, but that's fine, because
>   in practice, such objects shouldn't be manipulated as regular GEMs.
>   It just serves as a safety guard to limit the risk of regressions
>   introduced by these sparse GEM shmem changes.
>
> Discussion of previous revisions can be found here [2][3].
>
> For those used to reviewing gitlab MRs, here's a link [1] to a Draft
> MR grouping those changes, but I'm in no way asking that the review
> happens on gitlab.
>
> Regards,
>
> Boris
>
> [1] https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/16
> [2] https://lore.kernel.org/lkml/20250326021433.772196-1-adrian.laru...@collabora.com/T/
> [3] https://lore.kernel.org/dri-devel/20250218232552.3450939-1-adrian.laru...@collabora.com/
>
> Adrián Larumbe (1):
>   drm/gem: Add helpers to request a range of pages on a GEM
>
> Boris Brezillon (7):
>   drm/gem-shmem: Support sparse backing
>   drm/panfrost: Switch to sparse gem shmem to implement our
>     alloc-on-fault
>   drm/panthor: Add support for alloc-on-fault buffers
>   drm/panthor: Allow kernel BOs to pass DRM_PANTHOR_BO_ALLOC_ON_FAULT
>   drm/panthor: Add a panthor_vm_pre_fault_range() helper
>   drm/panthor: Make heap chunk allocation non-blocking
>   drm/lima: Use drm_gem_shmem_sparse_backing for heap buffers
>
>  drivers/gpu/drm/drm_gem.c               | 134 +++++++
>  drivers/gpu/drm/drm_gem_shmem_helper.c  | 404 +++++++++++++++++++-
>  drivers/gpu/drm/lima/lima_gem.c         |  89 ++---
>  drivers/gpu/drm/lima/lima_gem.h         |   1 +
>  drivers/gpu/drm/lima/lima_vm.c          |  48 ++-
>  drivers/gpu/drm/panfrost/panfrost_drv.c |   2 +-
>  drivers/gpu/drm/panfrost/panfrost_gem.c |  37 +-
>  drivers/gpu/drm/panfrost/panfrost_gem.h |   8 +-
>  drivers/gpu/drm/panfrost/panfrost_mmu.c |  98 ++---
>  drivers/gpu/drm/panfrost/panfrost_mmu.h |   2 +
>  drivers/gpu/drm/panthor/panthor_drv.c   |  20 +-
>  drivers/gpu/drm/panthor/panthor_fw.c    |   6 +-
>  drivers/gpu/drm/panthor/panthor_gem.c   |  18 +-
>  drivers/gpu/drm/panthor/panthor_gem.h   |  12 +-
>  drivers/gpu/drm/panthor/panthor_heap.c  | 222 ++++++-----
>  drivers/gpu/drm/panthor/panthor_mmu.c   | 481 ++++++++++++++++++------
>  drivers/gpu/drm/panthor/panthor_mmu.h   |   2 +
>  drivers/gpu/drm/panthor/panthor_sched.c |   6 +-
>  include/drm/drm_gem.h                   |  14 +
>  include/drm/drm_gem_shmem_helper.h      | 285 +++++++++++++-
>  include/uapi/drm/panthor_drm.h          |  19 +-
>  21 files changed, 1492 insertions(+), 416 deletions(-)
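Since the open question is whether the drm-gem/drm-gem-shmem changes are the right shape, here is roughly the driver-side fault path they are meant to enable. This is a sketch only: the helper name and signature are illustrative, drawn from the cover letter's description, and may not match what the patches actually expose.

```c
/*
 * Illustrative sketch, not actual patch content: how a driver's tiler
 * heap fault handler would grow the BO using a sparse-populate helper.
 */
static int tiler_heap_grow_on_fault(struct drm_gem_shmem_object *shmem,
				    pgoff_t first_page, unsigned int npages)
{
	/*
	 * GPU fault handler context: we can't take the resv lock and
	 * must not block on reclaim, hence the explicit non-blocking
	 * gfp flags the new APIs accept.
	 */
	return drm_gem_shmem_sparse_populate_range(shmem, first_page, npages,
						   GFP_NOWAIT | __GFP_NOWARN);
}
```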