On Mon, Jun 23, 2025 at 6:54 PM Christian König <[email protected]> wrote:
>
> On 6/19/25 09:20, Dave Airlie wrote:
> > From: Dave Airlie <[email protected]>
> >
> > While discussing memcg intergration with gpu memory allocations,
> > it was pointed out that there was no numa/system counters for
> > GPU memory allocations.
> >
> > With more integrated memory GPU server systems turning up, and
> > more requirements for memory tracking it seems we should start
> > closing the gap.
> >
> > Add two counters to track GPU per-node system memory allocations.
> >
> > The first is currently allocated to GPU objects, and the second
> > is for memory that is stored in GPU page pools that can be reclaimed,
> > by the shrinker.
> >
> > Cc: Christian Koenig <[email protected]>
> > Cc: Matthew Brost <[email protected]>
> > Cc: Johannes Weiner <[email protected]>
> > Cc: [email protected]
> > Cc: Andrew Morton <[email protected]>
> > Signed-off-by: Dave Airlie <[email protected]>
> >
> > ---
> >
> > v2: add more info to the documentation on this memory.
> >
> > I'd like to get acks to merge this via the drm tree, if possible,
> >
> > Dave.
> > ---
> >  Documentation/filesystems/proc.rst | 8 ++++++++
> >  drivers/base/node.c                | 5 +++++
> >  fs/proc/meminfo.c                  | 6 ++++++
> >  include/linux/mmzone.h             | 2 ++
> >  mm/show_mem.c                      | 9 +++++++--
> >  mm/vmstat.c                        | 2 ++
> >  6 files changed, 30 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst
> > b/Documentation/filesystems/proc.rst
> > index 5236cb52e357..7cc5a9185190 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1095,6 +1095,8 @@ Example output. You may not have all of these fields.
> >      CmaFree: 0 kB
> >      Unaccepted: 0 kB
> >      Balloon: 0 kB
> > +    GPUActive: 0 kB
> > +    GPUReclaim: 0 kB
>
> Active certainly makes sense, but I think we should rather disable the pool
> on newer CPUs than adding reclaimable memory here.

I'm not just concerned about newer platforms though. Even on Fedora 42 on
my test ryzen1+7900xt machine, with a desktop session running, I see:

nr_gpu_active 7473
nr_gpu_reclaim 6656

It's not an insignificant amount of memory.

I also think that if we get to some sort of discardable GTT objects with a
shrinker, they should probably be accounted in reclaim.

Dave.
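
For scale, the two counters quoted above can be converted from pages to kB. This is only a minimal sketch: it assumes the nr_gpu_* values are page counts (as other nr_* vmstat counters are) and that the page size is 4 KiB, neither of which is stated in the mail. `parse_vmstat` and `pages_to_kib` are hypothetical helper names, not part of the patch.

```python
def parse_vmstat(text):
    """Parse vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def pages_to_kib(pages, page_size=4096):
    """Convert a page count to KiB. The 4096-byte page size is an assumption."""
    return pages * page_size // 1024


# The two values quoted in the mail, in the same 'name value' layout.
sample = "nr_gpu_active 7473\nnr_gpu_reclaim 6656\n"

counters = parse_vmstat(sample)
for name in ("nr_gpu_active", "nr_gpu_reclaim"):
    kib = pages_to_kib(counters[name])
    print(f"{name}: {counters[name]} pages = {kib} kB ({kib / 1024:.1f} MiB)")
```

Under those assumptions the pool holds roughly 26 MiB of reclaimable memory next to about 29 MiB in active use, so the reclaimable share is nearly as large as the active one on this machine.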
