On Sat, Jun 06, 2026 at 06:31:53PM +0200, Natalie Vock wrote:
> On 6/6/26 00:44, Eric Chanudet wrote:
> > Accept only one "region value" pair entry for the dmem.max, dmem.min,
> > dmem.low files.>
> > This changes the UAPI that otherwise accepted multiple lines for setting
> > multiple entries in one write. No existing user is known to rely on
> > writing multiple regions in a single write.
>
> Ugh, shoot.
>
> For dmem.low specifically, there already are some userspace thingies
> floating around that may write more than one region/value pairs.
>
> These thingies all depend on that one patchset for dmemcg protection that I
> should really get around to merging[1]. Since the userspace utilities depend
> on not-yet-merged patches, they sort of have to expect stuff changing under
> their belts, so I wouldn't really consider those users a blocker by
> necessity.
>
> As I see it, we could go down one of two paths:
> 1. We go ahead with the patch as proposed, and I make sure that the users I
> know of adapt. Could be a bit icky wrt. "do not break userspace" rules, but
> since the already use non-merged UAPIs in one place, you can argue that
> these users kind of have to expect breakage.
> 2. We use the old handling allowing multiple lines for dmem.min and dmem.low
> only. This preserves compatibility but uglifies the code by quite a bit.
>
> All things considered, I think I personally would prefer going with 1. and
> taking the patch as proposed and just having one codepath handling every
> limit file. Just highlighting this so we don't do it on accident.
>
> [1] https://patchwork.freedesktop.org/series/163183/
>
> Some more review comments inline.
>
> >
> > Processing multiple regions in dmemcg_limit_write() could quietly change
> > first limits before failing on a later one and returning an error to the
> > writer, with no indication some changes occurred.
> >
> > Signed-off-by: Eric Chanudet <[email protected]>
> > ---
> > Follow up from discussions on a previous thread[1].
> > If Albert's series[2] lands, I can cleanup and prepare some kunits for
> > these as well.
> > [1]
> > https://lore.kernel.org/all/[email protected]/
> > [2]
> > https://lore.kernel.org/all/[email protected]/
> > ---
> > kernel/cgroup/dmem.c | 70
> > +++++++++++++++++++---------------------------------
> > 1 file changed, 26 insertions(+), 44 deletions(-)
> >
> > diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
> > index
> > 6430c7ce1e0372f59f1313163fb7630ce49ac1ef..113ee88e276296bccb4def546adf5cc175d7f0be
> > 100644
> > --- a/kernel/cgroup/dmem.c
> > +++ b/kernel/cgroup/dmem.c
> > @@ -734,57 +734,39 @@ static ssize_t dmemcg_limit_write(struct
> > kernfs_open_file *of,
> > void (*apply)(struct dmem_cgroup_pool_state *,
> > u64))
> > {
> > struct dmemcg_state *dmemcs = css_to_dmemcs(of_css(of));
> > - int err = 0;
> > -
> > - while (buf && !err) {
> > - struct dmem_cgroup_pool_state *pool = NULL;
> > - char *options, *region_name;
> > - struct dmem_cgroup_region *region;
> > - u64 new_limit;
> > -
> > - options = buf;
> > - buf = strchr(buf, '\n');
> > - if (buf)
> > - *buf++ = '\0';
> > -
> > - options = strstrip(options);
> > -
> > - /* eat empty lines */
> > - if (!options[0])
> > - continue;
> > -
> > - region_name = strsep(&options, " \t");
> > - if (!region_name[0])
> > - continue;
> > -
> > - if (!options || !*options)
> > - return -EINVAL;
> > + struct dmem_cgroup_pool_state *pool;
> > + struct dmem_cgroup_region *region;
> > + char *region_name;
> > + u64 new_limit;
> > + int err;
> > - rcu_read_lock();
> > - region = dmemcg_get_region_by_name(region_name);
> > - rcu_read_unlock();
> > + buf = strstrip(buf);
> > + region_name = strsep(&buf, " \t");
> > + if (!region_name[0] || !buf)
>
> If buf is NULL, isn't strsep(&buf, ...) also NULL? region_name[0] would
> therefore be a NULL pointer deref. Flipping the order of the logical or
> should be enough to prevent this.
>
I can do a v2 with that today.
I added it if there are no delimiter found (e.g, if only the region name
is passed and strstrip() ate any trailing space). Although, buf can't be
NULL in the write callback iirc, it's either pre-allocated or
kmalloc'ed.
> > + return -EINVAL;
> > - if (!region)
> > - return -EINVAL;
> > + rcu_read_lock();
> > + region = dmemcg_get_region_by_name(region_name);
> > + rcu_read_unlock();
> > + if (!region)
> > + return -EINVAL;
> > - err = dmemcg_parse_limit(options, &new_limit);
> > - if (err < 0)
> > - goto out_put;
> > + buf = strstrip(buf);
>
> Do we start allowing extra spaces between region and limit as well? Would
> also be fine by me, it doesn't break anything, just highlighting that it's a
> change in behavior. Should perhaps be documented in the commit message, too.
>
> Also, you should be able to use skip_spaces() here for an equivalent result.
> I'm not strongly opinionated on either way, but using skip_spaces()
> indicates more clearly that this can only remove spaces at the start.
Same I can add to v2, I failed to notice it wasn't allowed in the
original logic.
Thank you for the review.
Best,
>
> Best,
> Natalie
>
> > + err = dmemcg_parse_limit(buf, &new_limit);
> > + if (err < 0)
> > + goto out_put;
> > - pool = get_cg_pool_unlocked(dmemcs, region);
> > - if (IS_ERR(pool)) {
> > - err = PTR_ERR(pool);
> > - goto out_put;
> > - }
> > + pool = get_cg_pool_unlocked(dmemcs, region);
> > + if (IS_ERR(pool)) {
> > + err = PTR_ERR(pool);
> > + goto out_put;
> > + }
> > - /* And commit */
> > - apply(pool, new_limit);
> > - dmemcg_pool_put(pool);
> > + apply(pool, new_limit);
> > + dmemcg_pool_put(pool);
> > out_put:
> > - kref_put(®ion->ref, dmemcg_free_region);
> > - }
> > -
> > + kref_put(®ion->ref, dmemcg_free_region);
> > return err ?: nbytes;
> > }
> >
> > ---
> > base-commit: 640c57d6ca1346a1c2363a3f473b405af979e046
> > change-id: 20260605-cgroup-dmem-write-single-region-9bf05b6d995d
> >
> > Best regards,
>
--
Eric Chanudet