On 5/19/25 08:18, Dave Airlie wrote:
> On Mon, 19 May 2025 at 02:28, Christian König <christian.koe...@amd.com> wrote:
>>
>> On 5/16/25 22:25, Dave Airlie wrote:
>>> On Sat, 17 May 2025 at 06:04, Johannes Weiner <han...@cmpxchg.org> wrote:
>>>>> The memory properties are similar to what GFP_DMA or GFP_DMA32
>>>>> provide.
>>>>>
>>>>> The reason we haven't moved this into the core memory management is
>>>>> that it is completely x86 specific and only used by a rather
>>>>> specific group of devices.
>>>>
>>>> I fully understand that. It's about memory properties.
>>>>
>>>> What I think you're also saying is that the best solution would be
>>>> that you could ask the core MM for pages with a specific property, and
>>>> it would hand you pages that were previously freed with those same
>>>> properties. Or, if no such pages are on the freelists, it would grab
>>>> free pages with different properties and convert them on the fly.
>>>>
>>>> For all intents and purposes, this free memory would then be trivially
>>>> fungible between drm use, non-drm use, and different cgroups - except
>>>> for a few CPU cycles when converting, but that's *probably* negligible?
>>>> And now you could get rid of the "hack" in drm and wouldn't have to
>>>> hang on to special-property pages and implement a shrinker at all.
>>>>
>>>> So far so good.
>>>>
>>>> But that just isn't the implementation of today. And the devil is very
>>>> much in the details with this:
>>>>
>>>> Your memory attribute conversions are currently tied to a *shrinker*.
>>>>
>>>> This means the conversion doesn't trivially happen in the allocator,
>>>> it happens from *reclaim context*.
>>
>> Ah! At least I now understand your concern here.
>>
>>>> Now *your* shrinker is fairly cheap to run, so I do understand when
>>>> you're saying in exasperation: We give this memory back if somebody
>>>> needs it for other purposes. What *is* the big deal?
>>>>
>>>> The *reclaim context* is the big deal. The problem is *all the other
>>>> shrinkers that run at this time as well*. Because you held onto those
>>>> pages long enough that they contributed to a bona fide, general memory
>>>> shortage situation. And *that* has consequences for other cgroups.
>>
>> No it doesn't, or at least not as much as you think.
>>
>> We have gone back and forth on this multiple times already when
>> discussing the shrinker implementations. See the DRM mailing list about
>> both the TTM and the GEM shared mem shrinkers.
>>
>> The TTM pool shrinker is basically just a nice-to-have feature which is
>> used to avoid denial-of-service attacks and lets the pool shrink when
>> use cases change, e.g. between installing software (gcc) and running
>> software (Blender, ROCm etc.).
>>
>> In other words, the TTM shrinker is not even optimized and spends tons
>> of extra CPU cycles, because the expectation is that it never really
>> triggers in practice.
>>
>>> I think this is where we have 2 options:
>>> (a) move this stuff into core mm and out of shrinker context
>>> (b) fix our shrinker to be cgroup aware and solve that first
>>
>> (c) give better priorities to the shrinker API.
>>
>> The shrinker API, for example, assumes that its users must scan the
>> pages to be able to clean them up.
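To illustrate what I mean by (c): all a shrinker can tell the core MM
today is how many objects it has and how to scan them; there is no way
to say "ask me first, my pages are basically free to hand back". A
minimal sketch against the current struct shrinker interface
(pool_page_count and pool_free_pages() are made-up placeholders for
whatever pool state a driver keeps):

#include <linux/shrinker.h>
#include <linux/atomic.h>

static atomic_long_t pool_page_count;   /* made-up pool state */

static unsigned long pool_shrink_count(struct shrinker *shrink,
                                       struct shrink_control *sc)
{
        unsigned long count = atomic_long_read(&pool_page_count);

        /* All we can report is a count; there is no priority hint. */
        return count ? count : SHRINK_EMPTY;
}

static unsigned long pool_shrink_scan(struct shrinker *shrink,
                                      struct shrink_control *sc)
{
        /* Called from reclaim context; free up to sc->nr_to_scan
         * pages and report back how many we actually freed. */
        return pool_free_pages(sc->nr_to_scan);
}

static int pool_shrinker_init(void)
{
        struct shrinker *shrinker = shrinker_alloc(0, "made-up-pool");

        if (!shrinker)
                return -ENOMEM;

        shrinker->count_objects = pool_shrink_count;
        shrinker->scan_objects = pool_shrink_scan;
        shrinker_register(shrinker);
        return 0;
}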
>
> Well, my again naive approach is to just add simpler low-overhead
> shrinkers to the start of the shrinker list; if they free up enough
> memory then we win, otherwise we were in reclaim anyway.
>
> However, this raises the question of whether just going into reclaim
> and having to touch any shrinkers at all is already bad. If the
> overhead of just doing that isn't acceptable, then we would need to
> come up with a better way, I suspect?

Yes, but in that case we would need to come up with a better way for the
non-cgroup case as well.

> Adding a single shrinker flag to put the TTM shrinker at the top of
> the list is pretty trivial.

I'm not 100% sure when the shrinkers actually start doing something. For
the TTM pool it would make sense to let memory management look into it
before it even looks into the page cache, in other words very early.
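If we went the flag route, the registration could look as simple as
this - to be clear, SHRINKER_EARLY is hypothetical, nothing like it
exists today, and reclaim would need to learn to walk flagged shrinkers
before it touches the page cache. The count/scan callbacks would stay
what ttm_pool.c already has:

#include <linux/shrinker.h>
#include <linux/bits.h>

/* Hypothetical flag, does not exist today; the bit is assumed free.
 * Meaning: walk this shrinker before page cache reclaim, because its
 * pages are essentially free to hand back. */
#define SHRINKER_EARLY  BIT(7)

static int ttm_pool_init_early_shrinker(void)
{
        struct shrinker *shrinker;

        shrinker = shrinker_alloc(SHRINKER_EARLY, "drm-ttm_pool");
        if (!shrinker)
                return -ENOMEM;

        /* Reuse the existing TTM pool callbacks. */
        shrinker->count_objects = ttm_pool_shrinker_count;
        shrinker->scan_objects = ttm_pool_shrinker_scan;
        shrinker_register(shrinker);
        return 0;
}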
> Thanks for the use-cases that probably matter, I can see the online
> gaming workloads being a useful overhead reduction.
>
> There probably isn't much appetite to just migrate the ttm pools into
> the core mm. I see a couple of other users like sound doing set_memory_*
> calls, but I doubt they are on the radar for how much it costs.
>
> Dave.

The long term plan was always to move the whole TTM pool into the DMA
API. This way we could not only abstract away all this architecture
specific stuff, but also remove other cases where people implemented
pools in drivers because the DMA API was just not fast enough at coming
up with coherent memory, for example.

It's just that there was never a technical need to look into this, so it
never hurt badly enough to actually clean it up. The GFP_DMA and
GFP_DMA32 flags are seen as kind of historical leftovers as well.

So moving the logic into the core memory management is most likely a
no-go upstream, but what we could do is move it into the DMA API and let
the DMA API then interact with the core memory management in a much
better way than what we can do now with the various shrinkers.

Regards,
Christian.
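P.S.: For the driver side, the DMA API already has a small pool
abstraction for coherent allocations, see dma_pool_create() and
friends; a TTM-style page pool in the DMA API would conceptually sit
next to it, just with the expensive x86 set_memory_wc()/set_memory_wb()
attribute handling amortized inside. Roughly what drivers do today for
small coherent blocks (the device, sizes and the "mydev-desc" name are
made up):

#include <linux/device.h>
#include <linux/dmapool.h>
#include <linux/gfp.h>

static int mydev_buf_demo(struct device *dev)
{
        struct dma_pool *pool;
        dma_addr_t dma_handle;
        void *buf;

        /* One-time setup: 4 KiB blocks, 4 KiB aligned, tied to dev. */
        pool = dma_pool_create("mydev-desc", dev, 4096, 4096, 0);
        if (!pool)
                return -ENOMEM;

        /* Fast path: hand out a pre-made coherent block instead of
         * going through the full allocator for every request. */
        buf = dma_pool_alloc(pool, GFP_KERNEL, &dma_handle);

        /* ... device would use buf via dma_handle here ... */

        if (buf)
                dma_pool_free(pool, buf, dma_handle);
        dma_pool_destroy(pool);
        return 0;
}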