On 5/20/26 08:07, Qiliang Yuan wrote:
> Introduce the "high" soft limit for the dmem cgroup v2 controller.
> When a cgroup's device memory usage exceeds its high limit, tasks
> belonging to that cgroup are throttled by being forced into a sleep
> before returning to user space, instead of being failed outright
> as with the "max" limit.
>
> Key changes:
> - Add high counter configuration to dmem_cgroup_pool.
> - Add over-high check in the try_charge path and set TIF_NOTIFY_RESUME.
> - Inject the dmem throttling handler into resume_user_mode_work.
> - Implement the handler to perform a 100ms interruptible sleep for
>   over-limit tasks.

Interesting proposal, but inserting sleeps on allocation is never a good
idea and doesn't work like you might think it does. In graphics driver
land, lots of random things may result in buffer allocation functions
being called. Whenever TTM determines some buffer needs to be physically
moved (most often during VRAM contention, but also as a result of
pinning buffers for scanout, etc.), dmem cgroup pools are
charged/uncharged in accordance with the change in buffer residency.
Sleeping in a charge/uncharge path means that in the worst case, a task
will be put to sleep over and over again for exceeding its high limit
just once.
Most critically, submit ioctls typically walk the task's entire
working set and call ttm_bo_validate() on each buffer to make sure it is
accessible by the GPU, since paging things in on fault is not available
on many consumer GPUs. Your approach could lead to every single
submission sleeping for at least 100ms, thus permanently destroying
performance.
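
To make the failure mode concrete, here is a rough user-space toy model,
not kernel code. Every name in it (toy_pool, toy_charge, toy_submit, and
so on) is hypothetical, and it assumes the simplest possible case: a task
that sits over its high limit, and one buffer per submission that was
evicted under contention and has to be moved back in.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_THROTTLE_MS 100

struct toy_pool {
	long usage;	/* units currently charged to the pool */
	long high;	/* soft limit */
};

struct toy_task {
	bool throttle_pending;	/* stands in for TIF_NOTIFY_RESUME */
	long slept_ms;		/* total time spent throttled */
};

/* A charge always succeeds; being over "high" only flags the task. */
static void toy_charge(struct toy_pool *p, struct toy_task *t, long size)
{
	p->usage += size;
	if (p->usage > p->high)
		t->throttle_pending = true;
}

static void toy_uncharge(struct toy_pool *p, long size)
{
	p->usage -= size;
}

/* Runs once per syscall exit and sleeps if the flag was set. */
static void toy_return_to_user(struct toy_task *t)
{
	if (t->throttle_pending) {
		t->slept_ms += TOY_THROTTLE_MS;
		t->throttle_pending = false;
	}
}

/* One submission: an evicted buffer is validated back into VRAM. */
static void toy_submit(struct toy_pool *p, struct toy_task *t, long bo_size)
{
	toy_uncharge(p, bo_size);	/* buffer was evicted */
	toy_charge(p, t, bo_size);	/* validation moves it back */
	toy_return_to_user(t);
}

/* Total throttle time after exceeding "high" once, then submitting. */
long toy_total_sleep_ms(int submissions)
{
	struct toy_pool pool = { .usage = 0, .high = 64 };
	struct toy_task task = { .throttle_pending = false, .slept_ms = 0 };
	int i;

	toy_charge(&pool, &task, 96);	/* the single over-limit allocation */
	toy_return_to_user(&task);

	for (i = 0; i < submissions; i++)
		toy_submit(&pool, &task, 32);

	return task.slept_ms;
}
```

In this model the task allocates past its limit exactly once, yet
toy_total_sleep_ms(60) comes out to 6100: one 100ms sleep for the
allocation plus one for each of the 60 submissions, because every
re-validation charges the pool again while usage is still over "high".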
Maarten's suggestion of preferentially evicting memory that is over the
high limit sounds like a better approach.
(Also, did you use AI for this? Please disclose your AI usage as per
kernel guidelines if so.)
Best,
Natalie