Re: [PATCH] cgroup/dmem: implement dmem.high soft limit and throttling

Natalie Vock Thu, 21 May 2026 03:53:08 -0700

On 5/20/26 08:07, Qiliang Yuan wrote:

Introduce the "high" soft limit for the dmem cgroup v2 controller.
When a cgroup's device memory usage exceeds its high limit, tasks
belonging to that cgroup are throttled by being forced into a sleep
before returning to user space, instead of being failed outright
as with the "max" limit.


Key changes:
- Add high counter configuration to dmem_cgroup_pool.
- Add over-high check in the try_charge path and set TIF_NOTIFY_RESUME.
- Inject the dmem throttling handler into resume_user_mode_work.
- Implement the handler to perform a 100ms interruptible sleep for
   over-limit tasks.

Interesting proposal, but inserting sleeps on allocation is never a good idea and doesn't work like you might think it does. In graphics driver land, lots of random things may result in buffer allocation functions being called. Whenever TTM determines some buffer needs to be physically moved (most often during VRAM contention, but also as a result of pinning buffers for scanout, etc etc), dmem cgroup pools are charged/uncharged in accordance with the change in buffer residency. Sleeping in a charge/uncharge path means that in the worst case, a task will be put to sleep over and over again for exceeding its high limit just once.

Most critically, submit ioctls typically go over the task's entire working set and call ttm_bo_validate() to make sure the buffer is accessible by the GPU, since paging things in on fault is not available in many consumer GPUs. Your approach could lead to every single submission sleeping for at least 100ms, thus permanently destroying performance.

Maarten's suggestion of preferentially evicting memory that is over the high limit sounds like a better approach.

(Also, did you use AI for this? Please disclose your AI usage as per kernel guidelines if so.)


Best,
Natalie
