On 6/6/26 00:44, Eric Chanudet wrote:
Accept only one "region value" pair entry for the dmem.max, dmem.min,
dmem.low files.>
This changes the UAPI that otherwise accepted multiple lines for setting
multiple entries in one write. No existing user is known to rely on
writing multiple regions in a single write.
Ugh, shoot.
For dmem.low specifically, there already are some userspace thingies
floating around that may write more than one region/value pairs.
These thingies all depend on that one patchset for dmemcg protection
that I should really get around to merging[1]. Since the userspace
utilities depend on not-yet-merged patches, they sort of have to expect
stuff changing under their belts, so I wouldn't really consider those
users a blocker by necessity.
As I see it, we could go down one of two paths:
1. We go ahead with the patch as proposed, and I make sure that the
users I know of adapt. Could be a bit icky wrt. "do not break userspace"
rules, but since the already use non-merged UAPIs in one place, you can
argue that these users kind of have to expect breakage.
2. We use the old handling allowing multiple lines for dmem.min and
dmem.low only. This preserves compatibility but uglifies the code by
quite a bit.
All things considered, I think I personally would prefer going with 1.
and taking the patch as proposed and just having one codepath handling
every limit file. Just highlighting this so we don't do it on accident.
[1] https://patchwork.freedesktop.org/series/163183/
Some more review comments inline.
Processing multiple regions in dmemcg_limit_write() could quietly change
first limits before failing on a later one and returning an error to the
writer, with no indication some changes occurred.
Signed-off-by: Eric Chanudet <[email protected]>
---
Follow up from discussions on a previous thread[1].
If Albert's series[2] lands, I can cleanup and prepare some kunits for
these as well.
[1]
https://lore.kernel.org/all/[email protected]/
[2]
https://lore.kernel.org/all/[email protected]/
---
kernel/cgroup/dmem.c | 70 +++++++++++++++++++---------------------------------
1 file changed, 26 insertions(+), 44 deletions(-)
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index
6430c7ce1e0372f59f1313163fb7630ce49ac1ef..113ee88e276296bccb4def546adf5cc175d7f0be
100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -734,57 +734,39 @@ static ssize_t dmemcg_limit_write(struct kernfs_open_file
*of,
void (*apply)(struct dmem_cgroup_pool_state *,
u64))
{
struct dmemcg_state *dmemcs = css_to_dmemcs(of_css(of));
- int err = 0;
-
- while (buf && !err) {
- struct dmem_cgroup_pool_state *pool = NULL;
- char *options, *region_name;
- struct dmem_cgroup_region *region;
- u64 new_limit;
-
- options = buf;
- buf = strchr(buf, '\n');
- if (buf)
- *buf++ = '\0';
-
- options = strstrip(options);
-
- /* eat empty lines */
- if (!options[0])
- continue;
-
- region_name = strsep(&options, " \t");
- if (!region_name[0])
- continue;
-
- if (!options || !*options)
- return -EINVAL;
+ struct dmem_cgroup_pool_state *pool;
+ struct dmem_cgroup_region *region;
+ char *region_name;
+ u64 new_limit;
+ int err;
- rcu_read_lock();
- region = dmemcg_get_region_by_name(region_name);
- rcu_read_unlock();
+ buf = strstrip(buf);
+ region_name = strsep(&buf, " \t");
+ if (!region_name[0] || !buf)
If buf is NULL, isn't strsep(&buf, ...) also NULL? region_name[0] would
therefore be a NULL pointer deref. Flipping the order of the logical or
should be enough to prevent this.
+ return -EINVAL;
- if (!region)
- return -EINVAL;
+ rcu_read_lock();
+ region = dmemcg_get_region_by_name(region_name);
+ rcu_read_unlock();
+ if (!region)
+ return -EINVAL;
- err = dmemcg_parse_limit(options, &new_limit);
- if (err < 0)
- goto out_put;
+ buf = strstrip(buf);
Do we start allowing extra spaces between region and limit as well?
Would also be fine by me, it doesn't break anything, just highlighting
that it's a change in behavior. Should perhaps be documented in the
commit message, too.
Also, you should be able to use skip_spaces() here for an equivalent
result. I'm not strongly opinionated on either way, but using
skip_spaces() indicates more clearly that this can only remove spaces at
the start.
Best,
Natalie
+ err = dmemcg_parse_limit(buf, &new_limit);
+ if (err < 0)
+ goto out_put;
- pool = get_cg_pool_unlocked(dmemcs, region);
- if (IS_ERR(pool)) {
- err = PTR_ERR(pool);
- goto out_put;
- }
+ pool = get_cg_pool_unlocked(dmemcs, region);
+ if (IS_ERR(pool)) {
+ err = PTR_ERR(pool);
+ goto out_put;
+ }
- /* And commit */
- apply(pool, new_limit);
- dmemcg_pool_put(pool);
+ apply(pool, new_limit);
+ dmemcg_pool_put(pool);
out_put:
- kref_put(®ion->ref, dmemcg_free_region);
- }
-
+ kref_put(®ion->ref, dmemcg_free_region);
return err ?: nbytes;
}
---
base-commit: 640c57d6ca1346a1c2363a3f473b405af979e046
change-id: 20260605-cgroup-dmem-write-single-region-9bf05b6d995d
Best regards,