On Tue, Mar 31, 2026 at 01:12:27PM +0200, Marco Elver wrote:
> Rework the general infrastructure around RANDOM_KMALLOC_CACHES into more
> flexible PARTITION_KMALLOC_CACHES, with the former being a partitioning
> mode of the latter.
> 
> Introduce a new mode, TYPED_KMALLOC_CACHES, which leverages a feature
> available in Clang 22 and later, called "allocation tokens" via
> __builtin_infer_alloc_token [1]. Unlike RANDOM_KMALLOC_CACHES, this mode
> deterministically assigns a slab cache to an allocation of type T,
> regardless of allocation site.
> 
> The builtin __builtin_infer_alloc_token(<malloc-args>, ...) instructs
> the compiler to infer an allocation type from arguments commonly passed
> to memory-allocating functions and returns a type-derived token ID. The
> implementation passes kmalloc-args to the builtin: the compiler performs
> best-effort type inference, and then recognizes common patterns such as
> `kmalloc(sizeof(T), ...)`, `kmalloc(sizeof(T) * n, ...)`, but also
> `(T *)kmalloc(...)`. Where the compiler fails to infer a type, the
> fallback token (default: 0) is chosen.
> 
> Note: the kmalloc_obj(..) APIs fix the pattern in which size and result
> type are expressed, and therefore ensure there's not much drift in
> which patterns the compiler needs to recognize. Specifically,
> kmalloc_obj() and friends expand to `(TYPE *)KMALLOC(__obj_size, GFP)`,
> which the compiler recognizes via the cast to TYPE*.
> 
> Clang's default token ID calculation is described as [1]:
> 
>    typehashpointersplit: This mode assigns a token ID based on the hash
>    of the allocated type's name, where the top half ID-space is reserved
>    for types that contain pointers and the bottom half for types that do
>    not contain pointers.
> 
> Separating pointer-containing objects from pointerless objects and data
> allocations can help mitigate certain classes of memory corruption
> exploits [2]: an attacker who gains a buffer overflow on a primitive
> buffer cannot use it to directly corrupt pointers or other critical
> metadata in an object residing in a different, isolated heap region.
> 
> It is important to note that heap isolation strategies are a
> best-effort mitigation and do not provide a 100% security guarantee,
> but they come at relatively low performance cost. Note that this also
> does not prevent cross-cache attacks; SLAB_VIRTUAL [3] should be used
> as a complementary mitigation (once available).
> 
> With all that, my kernel (x86 defconfig) shows me a histogram of slab
> cache object distribution per /proc/slabinfo (after boot):
> 
>   <slab cache>      <objs> <hist>
>   kmalloc-part-15    1537  +++++++++++++++
>   kmalloc-part-14    2996  +++++++++++++++++++++++++++++
>   kmalloc-part-13    1555  +++++++++++++++
>   kmalloc-part-12    1045  ++++++++++
>   kmalloc-part-11    1717  +++++++++++++++++
>   kmalloc-part-10    1489  ++++++++++++++
>   kmalloc-part-09     851  ++++++++
>   kmalloc-part-08     710  +++++++
>   kmalloc-part-07     100  +
>   kmalloc-part-06     217  ++
>   kmalloc-part-05     105  +
>   kmalloc-part-04    4047  ++++++++++++++++++++++++++++++++++++++++
>   kmalloc-part-03     276  ++
>   kmalloc-part-02     283  ++
>   kmalloc-part-01     316  +++
>   kmalloc            1599  +++++++++++++++
> 
> The above /proc/slabinfo snapshot shows 6943 allocated objects
> (slabs 00 - 07) that the compiler claims contain no pointers or whose
> type it could not infer, and 11900 objects that contain pointers
> (slabs 08 - 15). On the whole, this looks relatively sane.
> 
> Additionally, when I compile my kernel with -Rpass=alloc-token, which
> provides diagnostics where (after dead-code elimination) type inference
> failed, I see 179 allocation sites where the compiler failed to identify
> a type (down from 966 when I sent the RFC [4]). Some initial review
> confirms these are mostly variable-sized buffers, but they also
> include structs with trailing flexible-length arrays.
> 
> Link: https://clang.llvm.org/docs/AllocToken.html [1]
> Link: https://blog.dfsec.com/ios/2025/05/30/blasting-past-ios-18/ [2]
> Link: https://lwn.net/Articles/944647/ [3]
> Link: https://lore.kernel.org/all/[email protected]/ [4]
> Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434
> Signed-off-by: Marco Elver <[email protected]>
> ---
> Changelog:
> v1:
> * Rebase and switch to builtin name that was released in Clang 22.

> * Keep RANDOM_KMALLOC_CACHES the default.

Presumably because only the latest Clang supports it?

> RFC: https://lore.kernel.org/all/[email protected]/
> ---
>  Makefile                        |  5 ++
>  include/linux/percpu.h          |  2 +-
>  include/linux/slab.h            | 94 ++++++++++++++++++++-------------
>  kernel/configs/hardening.config |  2 +-
>  mm/Kconfig                      | 45 ++++++++++++----
>  mm/kfence/kfence_test.c         |  4 +-
>  mm/slab.h                       |  4 +-
>  mm/slab_common.c                | 48 ++++++++---------
>  mm/slub.c                       | 31 +++++------
>  9 files changed, 144 insertions(+), 91 deletions(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 15a60b501b95..c0bf00ee6025 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -864,10 +877,10 @@ unsigned int kmem_cache_sheaf_size(struct slab_sheaf *sheaf);
>   * with the exception of kunit tests
>   */
>  
> -void *__kmalloc_noprof(size_t size, gfp_t flags)
> +void *__kmalloc_noprof(size_t size, gfp_t flags, kmalloc_token_t token)
>                               __assume_kmalloc_alignment __alloc_size(1);
>  
> -void *__kmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
> +void *__kmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node, kmalloc_token_t token)
>                               __assume_kmalloc_alignment __alloc_size(1);

So the @token parameter is unused when CONFIG_PARTITION_KMALLOC_CACHES is
disabled but still increases the kernel size by a few kilobytes...
but yeah, I'm not sure if we can avoid it without hurting readability.

Just saying. (does anybody care?)

>  void *__kmalloc_cache_noprof(struct kmem_cache *s, gfp_t flags, size_t size)

> diff --git a/mm/Kconfig b/mm/Kconfig
> index ebd8ea353687..fa4ffc1fcb80 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -247,22 +247,47 @@ config SLUB_STATS
>         out which slabs are relevant to a particular load.
>         Try running: slabinfo -DA
>  
> -config RANDOM_KMALLOC_CACHES
> -     default n
> +config PARTITION_KMALLOC_CACHES
>       depends on !SLUB_TINY
> -     bool "Randomize slab caches for normal kmalloc"
> +     bool "Partitioned slab caches for normal kmalloc"
>       help
> -       A hardening feature that creates multiple copies of slab caches for
> -       normal kmalloc allocation and makes kmalloc randomly pick one based
> -       on code address, which makes the attackers more difficult to spray
> -       vulnerable memory objects on the heap for the purpose of exploiting
> -       memory vulnerabilities.
> +       A hardening feature that creates multiple isolated copies of slab
> +       caches for normal kmalloc allocations. This makes it more difficult
> +       to exploit memory-safety vulnerabilities by attacking vulnerable
> +       co-located memory objects. Several modes are provided.
>  
>         Currently the number of copies is set to 16, a reasonably large value
>         that effectively diverges the memory objects allocated for different
>         subsystems or modules into different caches, at the expense of a
> -       limited degree of memory and CPU overhead that relates to hardware and
> -       system workload.
> +       limited degree of memory and CPU overhead that relates to hardware
> +       and system workload.
> +
> +choice
> +     prompt "Partitioned slab cache mode"
> +     depends on PARTITION_KMALLOC_CACHES
> +     default RANDOM_KMALLOC_CACHES
> +     help
> +       Selects the slab cache partitioning mode.
> +
> +config RANDOM_KMALLOC_CACHES
> +     bool "Randomize slab caches for normal kmalloc"
> +     help
> +       Randomly pick a slab cache based on code address.
> +
> +config TYPED_KMALLOC_CACHES
> +     bool "Type based slab cache selection for normal kmalloc"
> +     depends on $(cc-option,-falloc-token-max=123)
> +     help
> +       Rely on Clang's allocation tokens to choose a slab cache, where token
> +       IDs are derived from the allocated type.
> +
> +       The current effectiveness of Clang's type inference can be judged by
> +       -Rpass=alloc-token, which provides diagnostics where (after dead-code
> +       elimination) type inference failed.
> +
> +       Requires Clang 22 or later.

Assuming not all people building the kernel are security experts (myself
included)... could you please add some insight/guidance on how to decide
between RANDOM_KMALLOC_CACHES and TYPED_KMALLOC_CACHES?

Something like what Florent wrote [1]:                                          
| One more perspective on this: in a data center environment, attackers
| typically get a first foothold by compromising a userspace network
| service. If they can do that once, they can do that a bunch of times,
| and gain code execution on different machines every time.
| 
| Before trying to exploit a kernel memory corruption to elevate
| privileges on a machine, they can test the SLAB properties of the
| running kernel to make sure it's as they wish (eg: with timing side
| channels like in the SLUBStick paper). So with RANDOM_KMALLOC_CACHES,
| attackers can just keep retrying their attacks until they land on a
| machine where the types T and S are collocated and only then proceed
| with their exploit.
| 
| With TYPED_KMALLOC_CACHES (and with SLAB_VIRTUAL hopefully someday),
| they are simply never able to cross the "objects without pointers" to
| "objects with pointers" boundary which really gets in the way of many
| exploitation techniques and feels at least to me like a much stronger
| security boundary.
| 
| This limit of RANDOM_KMALLOC_CACHES may not be as relevant in other
| deployments (eg: on a smartphone) but it makes me strongly prefer
| TYPED_KMALLOC_CACHES for server use cases at least.

[1] https://lore.kernel.org/all/calgbs4u6fox7swmdhfduawmowfqeqsxta1x_vqrxthpss-s...@mail.gmail.com

Otherwise the patch is really straightforward and looks good to me.

Thanks!

-- 
Cheers,
Harry / Hyeonggon
