On Thu, 22 May 2025 at 00:43, Johannes Weiner <han...@cmpxchg.org> wrote:
>
> On Wed, May 21, 2025 at 12:23:58PM +1000, Dave Airlie wrote:
> > >
> > > So in the GPU case, you'd charge on allocation, free objects into a
> > > cgroup-specific pool, and shrink using a cgroup-specific LRU
> > > list. Freed objects can be reused by this cgroup, but nobody else.
> > > They're reclaimed through memory pressure inside the cgroup, not due
> > > to the action of others. And all allocated memory is accounted for.
> > >
> > > I have to admit I'm pretty clueless about the gpu driver internals and
> > > can't really judge how feasible this is. But from a cgroup POV, if you
> > > want proper memory isolation between groups, it seems to me that's the
> > > direction you'd have to take this in.
> >
> > I've been digging into this a bit today, to try and work out what
> > various paths forward might look like and run into a few impedance
> > mismatches.
> >
> > 1. TTM doesn't pool objects, it pools pages. TTM objects are varied in
> > size, we don't need to keep any sort of special allocator that we
> > would need if we cached sized objects (size buckets etc). list_lru
> > doesn't work on pages, if we were pooling the ttm objects I can see
> > being able to enable list_lru. But I'm seeing increased complexity for
> > no major return, but I might dig a bit more into whether caching
> > objects might help.
> >
> > 2. list_lru isn't suitable for pages, AFAICS we have to stick the page
> > into another object to store it in the list_lru, which would mean we'd
> > be allocating yet another wrapper object. Currently TTM uses the page
> > LRU pointer to add it to the shrinker_list, which is simple and low
> > overhead.
>
> Why wouldn't you be able to use the page LRU list_head with list_lru?
>
> list_lru_add(&ttm_pool_lru, &page->lru, page_to_nid(page),
>              page_memcg(page));
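For reference, a rough sketch of how that suggestion could wire into a pool shrinker. This is hypothetical, not existing TTM code: ttm_pool_lru, ttm_pool_page_park(), ttm_pool_shrink_page() and ttm_pool_shrink_scan() are illustrative names, and the list_lru walk callback signature has varied across kernel versions (recent kernels dropped the spinlock argument). It follows the four-argument list_lru_add() form quoted above.

```c
/* Hypothetical sketch, not existing TTM code. */
static struct list_lru ttm_pool_lru;

/* On free: park the page in the pool, keyed by NUMA node and memcg. */
static void ttm_pool_page_park(struct page *page)
{
	list_lru_add(&ttm_pool_lru, &page->lru, page_to_nid(page),
		     page_memcg(page));
}

/* Walk callback: isolate and free one pooled page. */
static enum lru_status ttm_pool_shrink_page(struct list_head *item,
					    struct list_lru_one *list,
					    void *cb_arg)
{
	struct page *page = list_entry(item, struct page, lru);

	list_lru_isolate(list, item);
	__free_page(page);	/* illustrative; real code would defer the free */
	return LRU_REMOVED;
}

/* Shrinker scan: only touches the LRUs matching sc->nid / sc->memcg,
 * so reclaim stays confined to the cgroup under pressure. */
static unsigned long ttm_pool_shrink_scan(struct shrinker *shrink,
					  struct shrink_control *sc)
{
	return list_lru_shrink_walk(&ttm_pool_lru, sc,
				    ttm_pool_shrink_page, NULL);
}
```

The point of the sketch is that no wrapper object is needed: page->lru is the list_head handed to the list_lru, and the per-node/per-memcg bucketing comes from the nid and memcg arguments at add time.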
I had for some reason got it into my head that list_lru entries weren't
plain list_heads, not sure why. I guess I'll spend next week exploring
this possibility.

Dave.