On Wed, Feb 11, 2026 at 2:41 AM Kees Cook <[email protected]> wrote:
> On Wed, Feb 11, 2026 at 02:28:53AM +0100, Jann Horn wrote:
> > On Wed, Feb 11, 2026 at 1:48 AM Kees Cook <[email protected]> wrote:
> > > A building block of Use-After-Free heap memory corruption attacks is
> > > using userspace controllable kernel allocations to fill specifically sized
> > > kmalloc regions with specific contents. The most powerful of these kinds
> > > of primitives is arbitrarily controllable contents with arbitrary size.
> > > Keeping these kinds of allocations out of the general kmalloc buckets is
> > > needed to harden the kernel against such manipulations, so this is why
> > > these sorts of "copy data from userspace into kernel heap" situations are
> > > expected to use things like memdup_user(), which keeps the allocations
> > > in their own set of slab buckets. However, using memdup_user() is not
> > > always appropriate, so in those cases, kmem_buckets can used directly.
> > >
> > > Filenames used to be isolated in their own (fixed size) slab cache so
> > > they would not end up in the general kmalloc buckets (which made them
> > > unusable for the heap grooming method described above). After commit
> > > 8c888b31903c ("struct filename: saner handling of long names"), filenames
> > > were being copied into arbitrarily sized kmalloc regions in the general
> > > kmalloc buckets. Instead, like memdup_user(), use a dedicated set of
> > > kmem buckets for long filenames so we do not introduce a new way for
> > > attackers to place arbitrary contents into the general kmalloc buckets.
> > >
> > > Fixes: 8c888b31903c ("struct filename: saner handling of long names")
> > > Signed-off-by: Kees Cook <[email protected]>
> > > ---
> > > Also, from the same commit, is the loss of SLAB_HWCACHE_ALIGN|SLAB_PANIC
> > > for filename allocations relavant at all? It could be added back for
> > > these buckets if desired, but I left it default in this patch.
> > >
> > > Cc: Al Viro <[email protected]>
> > > Cc: Christian Brauner <[email protected]>
> > > Cc: Jan Kara <[email protected]>
> > > Cc: <[email protected]>
> > > ---
> > >  fs/namei.c | 10 +++++++---
> > >  1 file changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index 8e7792de0000..a901733380cd 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -128,6 +128,8 @@
> > >  /* SLAB cache for struct filename instances */
> > >  static struct kmem_cache *__names_cache __ro_after_init;
> > >  #define names_cache    runtime_const_ptr(__names_cache)
> > > +/* SLAB buckets for long names */
> > > +static kmem_buckets *names_buckets __ro_after_init;
> > >
> > >  void __init filename_init(void)
> > >  {
> > > @@ -135,6 +137,8 @@ void __init filename_init(void)
> > >                          SLAB_HWCACHE_ALIGN|SLAB_PANIC, offsetof(struct 
> > > filename, iname),
> > >                          EMBEDDED_NAME_MAX, NULL);
> > >         runtime_const_init(ptr, __names_cache);
> > > +
> > > +       names_buckets = kmem_buckets_create("names_bucket", 0, 0, 
> > > PATH_MAX, NULL);
> > >  }
> > >
> > >  static inline struct filename *alloc_filename(void)
> > > @@ -156,7 +160,7 @@ static inline void initname(struct filename *name)
> > >  static int getname_long(struct filename *name, const char __user 
> > > *filename)
> > >  {
> > >         int len;
> > > -       char *p __free(kfree) = kmalloc(PATH_MAX, GFP_KERNEL);
> > > +       char *p __free(kfree) = kmem_buckets_alloc(names_buckets, 
> > > PATH_MAX, GFP_KERNEL);
> > >         if (unlikely(!p))
> > >                 return -ENOMEM;
> >
> > I think this path, where we always do maximally-sized allocations, is
> > the normal case where we're handling paths coming from userspace...
>
> Actually, is there any reason we can't use strnlen_user() in
> do_getname(), and then just use strndup_user() in the long case?

I'm not an expert, but as far as I know, this path is supposed to be
really fast (because pretty much every syscall that operates on a path
will hit it), and doesn't care how much memory it allocates (because
these allocations are normally only alive for the duration of a
syscall). strnlen_user() would add another pass over the userspace
buffer, which I think would probably have negative performance impact?

> > > @@ -264,14 +268,14 @@ static struct filename *do_getname_kernel(const 
> > > char *filename, bool incomplete)
> > >
> > >         if (len <= EMBEDDED_NAME_MAX) {
> > >                 p = (char *)result->iname;
> > > -               memcpy(p, filename, len);
> > >         } else {
> > > -               p = kmemdup(filename, len, GFP_KERNEL);
> > > +               p = kmem_buckets_alloc(names_buckets, len, GFP_KERNEL);
> >
> > ... while this is kind of the exceptional case, where paths are coming
> > from kernelspace.
> >
> > So you might want to get rid of the bucketing and instead just create
> > a single kmem_cache for long paths.
>
> I wasn't sure if there was a controllable way to reach this case or not.

I don't understand your point. I'm suggesting that in both of these
two cases, you can just allocate from the same dedicated slab.

But yes, you can get controlled data into the filename passed to
do_getname_kernel() - for example, when you execute a script (a file
that starts with "#!"), the interpreter path after "#!" is passed from
load_script() to open_exec(), which uses CLASS(filename_kernel, ...).

> > By the way, did you know that "struct filename" is only used for
> > storing fairly-temporary stuff like paths supplied to open(), but not
> > for storing the names of directory entries in the dentry cache (which
> > are more long-lived)? My understanding is that this is also why the
> > kernel doesn't really try to optimize the size of "struct filename" -
> > almost all instances of it only exist for the duration of a syscall or
> > something like that.
>
> Right, but it was enough of a location change that I felt like it was
> worth fixing it up.
>
> > The dentry cache allocates long names as "struct external_name" in
> > reclaimable kmalloc slabs, see __d_alloc().
>
> Oh hey, look at that!
>
>                 struct external_name *p = kmalloc(size + name->len,
>                                                   GFP_KERNEL_ACCOUNT |
>                                                   __GFP_RECLAIMABLE);
>
> Yeah, let's put that into dedicated buckets instead?

Actually, looking around a bit, there really aren't that many
allocations with __GFP_RECLAIMABLE, so this probably isn't all that
useful for same-cache attacks. (To be clear: Anything with
__GFP_RECLAIMABLE goes in the special kmalloc-rcl-* slabs.) Looking
around, the only other kmalloc users of __GFP_RECLAIMABLE I see are:

 - alloc_buffer_data() in drivers/md/dm-bufio.c
 - fuse_dentry_init()

So I don't think there's anything to be done here, this is already
using quasi-dedicated buckets. Sorry for the distraction.

Reply via email to