On 6/30/25 14:49, Dave Airlie wrote:
> Hi all,
>
> tl;dr: start using list_lru/numa/memcg in the GPU driver core and the amdgpu
> driver for now.
>
> This is a complete series of patches, some of which have been sent before
> and reviewed, but I want to give others the complete picture, and try to
> figure out how best to land this.
>
> There are 3 pieces to this:
> 01->02: add support for global gpu stat counters (previously posted; patch 2
>         is newer)
> 03->07: port ttm pools to list_lru for numa awareness
> 08->14: add memcg stats + gpu apis, then port ttm pools to a memcg-aware
>         list_lru and shrinker
> 15->17: enable amdgpu to use the new functionality.
>
> The biggest difference in the memcg code from previous versions is that I
> discovered what obj cgroups were designed for, and I'm reusing the
> page/objcg integration that already exists, to avoid reinventing that wheel
> right now.
>
> There are some igt-gpu-tools tests I've written at:
> https://gitlab.freedesktop.org/airlied/igt-gpu-tools/-/tree/amdgpu-cgroups?ref_type=heads
>
> One problem is that there is a lot of delayed action, which probably means
> the testing needs a bit more robustness, but the tests validate all the
> basic paths.
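[As context for readers outside the thread: the "memcg-aware list_lru and shrinker" wiring the series describes typically has the shape below in recent kernels. This is a hedged, non-runnable kernel-space sketch, not code from the series; all my_pool_* names are hypothetical, and the list_lru_walk_cb signature has varied across kernel versions.]

        /* Hypothetical pool LRU and shrinker; not from Dave's series. */
        static struct list_lru my_pool_lru;
        static struct shrinker *my_pool_shrinker;

        /* Walk callback: detach and free one pool item. The exact
         * signature (with or without a spinlock argument) depends on
         * kernel version. */
        static enum lru_status my_pool_isolate(struct list_head *item,
                                               struct list_lru_one *list,
                                               void *cb_arg)
        {
                /* driver-specific teardown of the pooled page(s) ... */
                list_lru_isolate(list, item);
                return LRU_REMOVED;
        }

        static unsigned long my_pool_count(struct shrinker *s,
                                           struct shrink_control *sc)
        {
                /* Counts only the NUMA node and memcg identified in sc,
                 * which is what makes reclaim per-node/per-cgroup. */
                return list_lru_shrink_count(&my_pool_lru, sc);
        }

        static unsigned long my_pool_scan(struct shrinker *s,
                                          struct shrink_control *sc)
        {
                return list_lru_shrink_walk(&my_pool_lru, sc,
                                            my_pool_isolate, NULL);
        }

        static int my_pool_init(void)
        {
                my_pool_shrinker = shrinker_alloc(SHRINKER_MEMCG_AWARE |
                                                  SHRINKER_NUMA_AWARE,
                                                  "my-pool");
                if (!my_pool_shrinker)
                        return -ENOMEM;
                my_pool_shrinker->count_objects = my_pool_count;
                my_pool_shrinker->scan_objects = my_pool_scan;
                /* Passing the shrinker lets memcg size per-cgroup LRU
                 * arrays for it. */
                list_lru_init_memcg(&my_pool_lru, my_pool_shrinker);
                shrinker_register(my_pool_shrinker);
                return 0;
        }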
Hi Dave,

memcg is designed to treat memory (RSS and page cache) as a single entity, so that users don't need to worry about the distinction between memory types; they only need to think about their overall memory utilization, with the ability to overcommit via swap as needed.

How does dmem fit into this picture? Is the cgroup integration designed to overcommit dmem, to limit it, or both? Is the programmer expected to know how much dmem the program will need?

Maybe this was answered and I missed it.

Balbir Singh
