On Wed, May 21, 2025 at 12:23:58PM +1000, Dave Airlie wrote:
> >
> > So in the GPU case, you'd charge on allocation, free objects into a
> > cgroup-specific pool, and shrink using a cgroup-specific LRU
> > list. Freed objects can be reused by this cgroup, but nobody else.
> > They're reclaimed through memory pressure inside the cgroup, not due
> > to the action of others. And all allocated memory is accounted for.
> >
> > I have to admit I'm pretty clueless about the gpu driver internals and
> > can't really judge how feasible this is. But from a cgroup POV, if you
> > want proper memory isolation between groups, it seems to me that's the
> > direction you'd have to take this in.
>
> I've been digging into this a bit today, to try and work out what
> various paths forward might look like and run into a few impedance
> mismatches.
>
> 1. TTM doesn't pool objects, it pools pages. TTM objects are varied in
> size, we don't need to keep any sort of special allocator that we
> would need if we cached sized objects (size buckets etc). list_lru
> doesn't work on pages, if we were pooling the ttm objects I can see
> being able to enable list_lru. But I'm seeing increased complexity for
> no major return, but I might dig a bit more into whether caching
> objects might help.
>
> 2. list_lru isn't suitable for pages, AFAICS we have to stick the page
> into another object to store it in the list_lru, which would mean we'd
> be allocating yet another wrapper object. Currently TTM uses the page
> LRU pointer to add it to the shrinker_list, which is simple and low
> overhead.
Why wouldn't you be able to use the page LRU list_head with list_lru?

	list_lru_add(&ttm_pool_lru, &page->lru, page_to_nid(page),
		     page_memcg(page));
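
To flesh that out a little, here is a rough, untested sketch of what the
pool side could look like. The names (ttm_pool_lru, ttm_pool_page_cached,
etc.) are made up for illustration, and whether page_memcg() or
folio_memcg(page_folio(p)) is the right accessor depends on the tree:

#include <linux/list_lru.h>
#include <linux/memcontrol.h>
#include <linux/mm.h>
#include <linux/shrinker.h>

/*
 * Would be set up with list_lru_init_memcg(&ttm_pool_lru, shrinker) so
 * the per-memcg, per-node lists actually get allocated.
 */
static struct list_lru ttm_pool_lru;

/*
 * Page goes back into the pool: link it through page->lru so the
 * shrinker can see it, under the memcg it was allocated in.
 */
static void ttm_pool_page_cached(struct page *p)
{
	list_lru_add(&ttm_pool_lru, &p->lru, page_to_nid(p),
		     page_memcg(p));
}

/* Page leaves the pool again (reused by the same cgroup, or freed). */
static void ttm_pool_page_uncached(struct page *p)
{
	list_lru_del(&ttm_pool_lru, &p->lru, page_to_nid(p),
		     page_memcg(p));
}

/*
 * Shrinker ->count_objects: sc->nid and sc->memcg scope the count, so
 * memcg reclaim only sees pages cached on behalf of that cgroup.
 */
static unsigned long ttm_pool_shrink_count(struct shrinker *shrink,
					   struct shrink_control *sc)
{
	return list_lru_shrink_count(&ttm_pool_lru, sc);
}

The scan side would then free through list_lru_shrink_walk(), the same
way the other memcg-aware caches do, which keeps reclaim scoped to the
cgroup that is actually under pressure. No wrapper object needed, since
page->lru is the linkage, just like the current shrinker_list.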