On 5/21/25 04:23, Dave Airlie wrote:
>>
>> So in the GPU case, you'd charge on allocation, free objects into a
>> cgroup-specific pool, and shrink using a cgroup-specific LRU
>> list. Freed objects can be reused by this cgroup, but nobody else.
>> They're reclaimed through memory pressure inside the cgroup, not due
>> to the action of others. And all allocated memory is accounted for.
>>
>> I have to admit I'm pretty clueless about the gpu driver internals and
>> can't really judge how feasible this is. But from a cgroup POV, if you
>> want proper memory isolation between groups, it seems to me that's the
>> direction you'd have to take this in.
>
> I've been digging into this a bit today, to try and work out what
> various paths forward might look like and run into a few impedance
> mismatches.
>
> 1. TTM doesn't pool objects, it pools pages. TTM objects are varied in
> size, we don't need to keep any sort of special allocator that we
> would need if we cached sized objects (size buckets etc). list_lru
> doesn't work on pages, if we were pooling the ttm objects I can see
> being able to enable list_lru. But I'm seeing increased complexity for
> no major return, but I might dig a bit more into whether caching
> objects might help.
>
> 2. list_lru isn't suitable for pages, AFAICS we have to stick the page
> into another object to store it in the list_lru, which would mean we'd
> be allocating yet another wrapper object. Currently TTM uses the page
> LRU pointer to add it to the shrinker_list, which is simple and low
> overhead.
>
> If we wanted to stick with keeping pages in the pool, I do feel moving
> the pool code closer to the mm core and having some sort of more
> tightly integrated reclaim to avoid the overheads. Now in an ideal
> world we'd get a page flag like PG_uncached, and we can keep an
> uncached inactive list per memcg/node and migrate pages off it, but I
> don't think anyone is willing to give us a page flag for this, so I
> think we do need to find a compromise that isn't ideal but works for
> us now. I've also played a bit with the idea of MEMCG_LOWOVERHEAD
> which adds a shrinker to start of shrinker list instead of end and
> registering TTM pool shrinker as one of those.
>
> Have I missed anything here that might make this easier?

Just for completeness of the picture, there is also the generic memory
pool allocator (see include/linux/genalloc.h), which is an alternative
to kmalloc() for uncached memory.

But we never used it because it isn't optimized for pages, but rather
for objects of a certain size (like kmalloc/vmalloc).

Apart from that, the general idea of having cgroup-specific pools sounds
bad to me. It doesn't stem from a technical need; it only exists because
we failed to integrate those device pools into the core memory
management.

Regards,
Christian.

>
> Dave.