Re: [PATCH] remove AND operation in choose_random_kstack_offset()

2024-06-17 Thread Kees Cook
On Mon, Jun 17, 2024 at 10:33:08PM +0200, Arnd Bergmann wrote:
> On Mon, Jun 17, 2024, at 20:22, Kees Cook wrote:
> > On Mon, Jun 17, 2024 at 04:52:15PM +0100, Mark Rutland wrote:
> >> On Mon, Jun 17, 2024 at 01:37:21PM +, Yuntao Liu wrote:
> >> > Since the offset would be bitwise ANDed with 0x3FF in
> >> > add_random_kstack_offset(), so just remove AND operation here.
> >> > 
> >> > Signed-off-by: Yuntao Liu 
> >> 
> >> The comments in arm64 and x86 say that they're deliberately capping the
> >> offset at fewer bits than the result of KSTACK_OFFSET_MAX() masking the
> >> value with 0x3FF.
> >> 
> >> Maybe it's ok to expand that, but if that's the case the commit message
> >> needs to explain why it's safe to add extra bits (2 on arm64, 3 on s390 and
> >> x86), and those comments need to be updated accordingly.
> >> 
> >> As-is, I do not think this patch is ok.
> >
> > Yeah, I agree: the truncation is intentional and tuned to the
> > architecture.
> 
> It may be intentional, but it's clearly nonsense: there is nothing
> inherent to the architecture that means we can go only 256
> bytes instead of 512 bytes into the 16KB available stack space.
> 
> As far as I can tell, any code just gets bloated to the point
> where it fills up the available memory, regardless of how
> much you give it. I'm sure one can find code paths today that
> exceed the 16KB, so there is no point pretending that 15.75KB
> is somehow safe to use while 15.00KB is not.
> 
> I'm definitely in favor of making this less architecture
> specific, we just need to pick a good value, and we may well
> end up deciding to use less than the default 1KB. We can also
> go the opposite way and make the limit 4KB but then increase
> the default stack size to 20KB for kernels that enable
> randomization.

I'm all for more entropy, but arch maintainers had wanted specific
control over this value, and given the years of bikeshedding over the
feature, I'm not inclined to dive back into that debate, but okay.

FWIW, here's the actual entropy (due to stack alignment enforced by
the compiler for the given arch ABI).

The standard cap is 0x3FF (10 bits).

arm64: capped at 0x1FF (9 bits), 5 bits effective
powerpc: uncapped (10 bits), 6 or 7 bits effective
riscv: uncapped (10 bits), 6 bits effective
x86: capped at 0xFF (8 bits), 5 (x86_64) or 6 (ia32) bits effective
s390: capped at 0xFF (8 bits), undocumented effective entropy

So if x86, arm64, and s390 maintainers are okay with it, we can try
dropping the masks on those architectures. They would gain 2, 1, and 2
bits respectively.
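For reference, the arm64 line above can be sketched as follows; the
macro and helpers here are illustrative stand-ins for the kernel's
actual code, with the per-arch mask and alignment values taken from the
table:

```c
/* Hedged sketch of the capping described above; KSTACK_OFFSET_MAX()
 * and the per-arch masks are paraphrased for illustration, not copied
 * from the kernel sources. */
#define KSTACK_OFFSET_MAX(x)	((x) & 0x3FF)	/* generic 10-bit cap */

/* arm64 currently stores only 9 bits of the random value. */
static unsigned int arm64_stored_offset(unsigned int rand)
{
	return rand & 0x1FF;
}

/* The AAPCS64 ABI keeps the stack pointer 16-byte aligned, so the
 * compiler discards the low 4 bits of any offset: 9 stored bits leave
 * 9 - 4 = 5 bits of effective entropy, matching the table above. */
static unsigned int arm64_effective_entropy_bits(void)
{
	return 9 - 4;
}
```

Dropping the extra arm64 mask would widen the stored value back to the
generic 10 bits, i.e. the one-bit gain mentioned above.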

-Kees

-- 
Kees Cook



Re: [PATCH v2] x86/traps: Enable UBSAN traps on x86

2024-06-17 Thread Kees Cook
On Tue, Jun 18, 2024 at 12:13:27AM +0200, Thomas Gleixner wrote:
> On Wed, Jun 12 2024 at 11:42, Kees Cook wrote:
> > On Tue, Jun 11, 2024 at 01:26:09PM -0700, Gatlin Newhouse wrote:
> >> It seems that is_valid_bugaddr() needs to be implemented on all 
> >> architectures
> >> and the function get_ud_type() replaces it here. So how should the patch 
> >> handle
> >> is_valid_bugaddr()? Should the function remain as-is in traps.c despite no
> >> longer being used?
> >
> > Yeah, this is why I'd suggested to Gatlin in early designs to reuse
> > is_valid_bugaddr()'s int value. It's a required function, so it seemed
> > sensible to just repurpose it from yes/no to no/type1/type2/type3/etc.
> 
> It's not sensible, it's just tasteless.
> 
> If is_valid_bugaddr() is globaly required in it's boolean form then it
> should just stay that way and not be abused just because it can be
> abused.
> 
> What's wrong with doing:
> 
> __always_inline u16 get_ud_type(unsigned long addr)
> {
> 
> }
> 
> int is_valid_bugaddr(unsigned long addr)
> {
>   return get_ud_type(addr) != BUG_UD_NONE;
> }
> 
> Hmm?
> 
> In fact is_valid_bugaddr() should be globally fixed up to return bool to
> match what the function name suggests.
> 
> The UD type information is x86 specific and has zero business in a
> generic architecture agnostic function return value.
> 
> It's a sad state of affairs that I have to explain this to people who
> care about code correctness. Readability and consistency are substantial
> parts of correctness, really.

Well, it's trade-offs. If get_ud_type() is in is_valid_bugaddr(), we
have to call it _again_ outside of is_valid_bugaddr(). That's suboptimal
as well. I was trying to find a reasonable way to avoid refactoring all
architectures and to avoid code duplication.

Looking at it all again, I actually think arch/x86/kernel/traps.c
shouldn't call is_valid_bugaddr() at all. That usage can continue to
stay in lib/bug.c, which is only ever used by x86 during very early
boot, according to the comments in early_fixup_exception(). So just a
direct replacement of is_valid_bugaddr() with the proposed get_ud_type()
should be fine in arch/x86/kernel/traps.c.
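A minimal userspace sketch of the wrapper arrangement discussed above;
the BUG_UD_* constants and the get_ud_type() body here are illustrative
stand-ins, not the actual x86 trap code:

```c
#include <stdbool.h>

/* Hypothetical UD type values; the real set lives in x86 headers. */
enum ud_type { BUG_UD_NONE, BUG_UD2, BUG_UD1 };

static enum ud_type get_ud_type(unsigned long addr)
{
	/* Real code would decode the instruction at addr; for this
	 * sketch, pretend even addresses hold a UD2 and odd ones
	 * nothing. */
	return (addr & 1) ? BUG_UD_NONE : BUG_UD2;
}

/* is_valid_bugaddr() keeps its boolean contract by delegating to
 * get_ud_type(), so generic callers never see the x86-specific type. */
static bool is_valid_bugaddr(unsigned long addr)
{
	return get_ud_type(addr) != BUG_UD_NONE;
}
```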

What do you think?

-- 
Kees Cook



Re: [PATCH] init/Kconfig: extend -Wno-array-bounds to gcc 13

2024-06-17 Thread Kees Cook
/* For each scheduling entry */
for (i = 0; i < sg->num_entries; i++) {
+   if (WARN_ON_ONCE(i >= ARRAY_SIZE(sg->gce)))
+   break;
gce = &sg->gce[i];
ips = sparx5_psfp_ipv_to_ips(gce->ipv);
/* hardware needs TimeInterval to be cumulative */


-- 
Kees Cook



Re: [PATCH v3 2/2] usercopy: Convert test_user_copy to KUnit test

2024-06-17 Thread Kees Cook
On Fri, Jun 14, 2024 at 09:50:05AM -0600, Shuah Khan wrote:
> On 6/12/24 13:59, Kees Cook wrote:
> > Convert the runtime tests of hardened usercopy to standard KUnit tests.
> > 
> > Additionally disable usercopy_test_invalid() for systems with separate
> > address spaces (or no MMU) since it's not sensible to test for address
> > confusion there (e.g. m68k).
> > 
> > Co-developed-by: Vitor Massaru Iha 
> > Signed-off-by: Vitor Massaru Iha 
> > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org
> > Tested-by: Ivan Orlov 
> > Reviewed-by: David Gow 
> > Signed-off-by: Kees Cook 
> > ---
> >   MAINTAINERS|   1 +
> >   lib/Kconfig.debug  |  21 +-
> >   lib/Makefile   |   2 +-
> >   lib/{test_user_copy.c => usercopy_kunit.c} | 282 ++---
> >   4 files changed, 151 insertions(+), 155 deletions(-)
> >   rename lib/{test_user_copy.c => usercopy_kunit.c} (46%)
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 8754ac2c259d..0cd171ec6010 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -11962,6 +11962,7 @@ F:  arch/*/configs/hardening.config
> >   F:include/linux/overflow.h
> >   F:include/linux/randomize_kstack.h
> >   F:kernel/configs/hardening.config
> > +F: lib/usercopy_kunit.c
> >   F:mm/usercopy.c
> >   K:\b(add|choose)_random_kstack_offset\b
> >   K:\b__check_(object_size|heap_object)\b
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 59b6765d86b8..561e346f5cb0 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -2505,18 +2505,6 @@ config TEST_VMALLOC
> >   If unsure, say N.
> > -config TEST_USER_COPY
> > -   tristate "Test user/kernel boundary protections"
> > -   depends on m
> > -   help
> > - This builds the "test_user_copy" module that runs sanity checks
> > - on the copy_to/from_user infrastructure, making sure basic
> > - user/kernel boundary testing is working. If it fails to load,
> > - a regression has been detected in the user/kernel memory boundary
> > - protections.
> > -
> > - If unsure, say N.
> > -
> >   config TEST_BPF
> > tristate "Test BPF filter functionality"
> > depends on m && NET
> > @@ -2814,6 +2802,15 @@ config SIPHASH_KUNIT_TEST
> >   This is intended to help people writing architecture-specific
> >   optimized versions.  If unsure, say N.
> > +config USERCOPY_KUNIT_TEST
> > +   tristate "KUnit Test for user/kernel boundary protections"
> > +   depends on KUNIT
> > +   default KUNIT_ALL_TESTS
> > +   help
> > + This builds the "usercopy_kunit" module that runs sanity checks
> > + on the copy_to/from_user infrastructure, making sure basic
> > + user/kernel boundary testing is working.
> > +
> 
> Please carry the following line forward as well to be complete assuming
> it is relevant.
> 
> If unsure, say N.

I've explicitly removed that because it would be repetitive if it were
included for all KUnit tests.

-- 
Kees Cook



Re: [PATCH] pstore: platform: add missing MODULE_DESCRIPTION() macro

2024-06-17 Thread Kees Cook
On Thu, 13 Jun 2024 21:39:42 -0700, Jeff Johnson wrote:
> With ARCH=csky, make allmodconfig && make W=1 C=1 reports:
> WARNING: modpost: missing MODULE_DESCRIPTION() in fs/pstore/pstore.o
> 
> Add the missing invocation of the MODULE_DESCRIPTION() macro.
> 
> 

Applied to for-next/pstore, thanks!

[1/1] pstore: platform: add missing MODULE_DESCRIPTION() macro
  https://git.kernel.org/kees/c/9b3c13c9ea4e

Take care,

-- 
Kees Cook




[PATCH] MAINTAINERS: Update entries for Kees Cook

2024-06-17 Thread Kees Cook
Update current email address for Kees Cook in the MAINTAINER file to
match the change from commit 4e173c825b19 ("mailmap: update entry for
Kees Cook").

Signed-off-by: Kees Cook 
---
 MAINTAINERS | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8754ac2c259d..f601a2fd1ebf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5296,7 +5296,7 @@ F:drivers/infiniband/hw/usnic/
 
 CLANG CONTROL FLOW INTEGRITY SUPPORT
 M: Sami Tolvanen 
-M: Kees Cook 
+M: Kees Cook 
 R: Nathan Chancellor 
 L: l...@lists.linux.dev
 S: Supported
@@ -8212,7 +8212,7 @@ F:rust/kernel/net/phy.rs
 
 EXEC & BINFMT API, ELF
 R: Eric Biederman 
-R: Kees Cook 
+R: Kees Cook 
 L: linux...@kvack.org
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
for-next/execve
@@ -8613,7 +8613,7 @@ S:Maintained
 F: drivers/net/ethernet/nvidia/*
 
 FORTIFY_SOURCE
-M: Kees Cook 
+M: Kees Cook 
 L: linux-hardening@vger.kernel.org
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
for-next/hardening
@@ -9103,7 +9103,7 @@ F:include/linux/mfd/gsc.h
 F: include/linux/platform_data/gsc_hwmon.h
 
 GCC PLUGINS
-M: Kees Cook 
+M: Kees Cook 
 L: linux-hardening@vger.kernel.org
 S: Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
for-next/hardening
@@ -9237,7 +9237,7 @@ S:Maintained
 F: drivers/input/touchscreen/resistive-adc-touch.c
 
 GENERIC STRING LIBRARY
-M: Kees Cook 
+M: Kees Cook 
 R: Andy Shevchenko 
 L: linux-hardening@vger.kernel.org
 S: Supported
@@ -11951,7 +11951,7 @@ F:  scripts/package/
 F: usr/
 
 KERNEL HARDENING (not covered by other areas)
-M: Kees Cook 
+M: Kees Cook 
 R: Gustavo A. R. Silva 
 L: linux-hardening@vger.kernel.org
 S: Supported
@@ -12479,7 +12479,7 @@ F:  drivers/scsi/53c700*
 
 LEAKING_ADDRESSES
 M: Tycho Andersen 
-R: Kees Cook 
+R: Kees Cook 
 L: linux-hardening@vger.kernel.org
 S: Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
for-next/hardening
@@ -12775,7 +12775,7 @@ F:  arch/powerpc/platforms/8xx/
 F: arch/powerpc/platforms/83xx/
 
 LINUX KERNEL DUMP TEST MODULE (LKDTM)
-M: Kees Cook 
+M: Kees Cook 
 S: Maintained
 F: drivers/misc/lkdtm/*
 F: tools/testing/selftests/lkdtm/*
@@ -12905,7 +12905,7 @@ Q:  
http://patchwork.linuxtv.org/project/linux-media/list/
 F: drivers/media/usb/dvb-usb-v2/lmedm04*
 
 LOADPIN SECURITY MODULE
-M: Kees Cook 
+M: Kees Cook 
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
for-next/hardening
 F: Documentation/admin-guide/LSM/LoadPin.rst
@@ -17998,7 +17998,7 @@ F:  tools/testing/selftests/proc/
 
 PROC SYSCTL
 M: Luis Chamberlain 
-M: Kees Cook 
+M: Kees Cook 
 M: Joel Granados 
 L: linux-ker...@vger.kernel.org
 L: linux-fsde...@vger.kernel.org
@@ -18054,7 +18054,7 @@ F:  Documentation/devicetree/bindings/net/pse-pd/
 F: drivers/net/pse-pd/
 
 PSTORE FILESYSTEM
-M: Kees Cook 
+M: Kees Cook 
 R: Tony Luck 
 R: Guilherme G. Piccoli 
 L: linux-hardening@vger.kernel.org
@@ -20060,7 +20060,7 @@ F:  drivers/media/cec/platform/seco/seco-cec.c
 F: drivers/media/cec/platform/seco/seco-cec.h
 
 SECURE COMPUTING
-M: Kees Cook 
+M: Kees Cook 
 R: Andy Lutomirski 
 R: Will Drewry 
 S: Supported
@@ -22974,7 +22974,7 @@ F:  drivers/block/ublk_drv.c
 F: include/uapi/linux/ublk_cmd.h
 
 UBSAN
-M: Kees Cook 
+M: Kees Cook 
 R: Marco Elver 
 R: Andrey Konovalov 
 R: Andrey Ryabinin 
@@ -24812,7 +24812,7 @@ F:  drivers/net/hamradio/yam*
 F: include/linux/yam.h
 
 YAMA SECURITY MODULE
-M: Kees Cook 
+M: Kees Cook 
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
for-next/hardening
 F: Documentation/admin-guide/LSM/Yama.rst
-- 
2.34.1




Re: [PATCH] remove AND operation in choose_random_kstack_offset()

2024-06-17 Thread Kees Cook
On Mon, Jun 17, 2024 at 04:52:15PM +0100, Mark Rutland wrote:
> On Mon, Jun 17, 2024 at 01:37:21PM +, Yuntao Liu wrote:
> > Since the offset would be bitwise ANDed with 0x3FF in
> > add_random_kstack_offset(), so just remove AND operation here.
> > 
> > Signed-off-by: Yuntao Liu 
> 
> The comments in arm64 and x86 say that they're deliberately capping the
> offset at fewer bits than the result of KSTACK_OFFSET_MAX() masking the
> value with 0x3FF.
> 
> Maybe it's ok to expand that, but if that's the case the commit message
> needs to explain why it's safe to add extra bits (2 on arm64, 3 on s390 and
> x86), and those comments need to be updated accordingly.
> 
> As-is, I do not think this patch is ok.

Yeah, I agree: the truncation is intentional and tuned to the
architecture.

-- 
Kees Cook



Re: [PATCH] powerpc/pseries: Whitelist dtl slub object for copying to userspace

2024-06-17 Thread Kees Cook
On Fri, Jun 14, 2024 at 11:08:44PM +0530, Anjali K wrote:
> Reading the dispatch trace log from /sys/kernel/debug/powerpc/dtl/cpu-*
> results in a BUG() when the config CONFIG_HARDENED_USERCOPY is enabled as
> shown below.
> 
> kernel BUG at mm/usercopy.c:102!
> Oops: Exception in kernel mode, sig: 5 [#1]
> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> Modules linked in: xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc
> scsi_transport_fc ibmveth pseries_wdt dm_multipath dm_mirror 
> dm_region_hash dm_log dm_mod fuse
> CPU: 27 PID: 1815 Comm: python3 Not tainted 6.10.0-rc3 #85
> Hardware name: IBM,9040-MRX POWER10 (raw) 0x800200 0xf06 
> of:IBM,FW1060.00 (NM1060_042) hv:phyp pSeries
> NIP:  c05d23d4 LR: c05d23d0 CTR: 006ee6f8
> REGS: c00120c078c0 TRAP: 0700   Not tainted  (6.10.0-rc3)
> MSR:  80029033   CR: 2828220f  XER: 000e
> CFAR: c01fdc80 IRQMASK: 0
> [ ... GPRs omitted ... ]
> NIP [c05d23d4] usercopy_abort+0x78/0xb0
> LR [c05d23d0] usercopy_abort+0x74/0xb0
> Call Trace:
>  usercopy_abort+0x74/0xb0 (unreliable)
>  __check_heap_object+0xf8/0x120
>  check_heap_object+0x218/0x240
>  __check_object_size+0x84/0x1a4
>  dtl_file_read+0x17c/0x2c4
>  full_proxy_read+0x8c/0x110
>  vfs_read+0xdc/0x3a0
>  ksys_read+0x84/0x144
>  system_call_exception+0x124/0x330
>  system_call_vectored_common+0x15c/0x2ec
> --- interrupt: 3000 at 0x7fff81f3ab34
> 
> Commit 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0")
> requires that only whitelisted areas in slab/slub objects can be copied to
> userspace when usercopy hardening is enabled using CONFIG_HARDENED_USERCOPY.
> Dtl contains hypervisor dispatch events which are expected to be read by
> privileged users. Hence mark this safe for user access.
> Specify useroffset=0 and usersize=DISPATCH_LOG_BYTES to whitelist the
> entire object.
> 
> Co-developed-by: Vishal Chourasia 
> Signed-off-by: Vishal Chourasia 
> Signed-off-by: Anjali K 
> ---
>  arch/powerpc/platforms/pseries/setup.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/setup.c 
> b/arch/powerpc/platforms/pseries/setup.c
> index 284a6fa04b0c..cba40d9d1284 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -343,8 +343,8 @@ static int alloc_dispatch_log_kmem_cache(void)
>  {
>   void (*ctor)(void *) = get_dtl_cache_ctor();
>  
> - dtl_cache = kmem_cache_create("dtl", DISPATCH_LOG_BYTES,
> - DISPATCH_LOG_BYTES, 0, ctor);
> + dtl_cache = kmem_cache_create_usercopy("dtl", DISPATCH_LOG_BYTES,
> + DISPATCH_LOG_BYTES, 0, 0, 
> DISPATCH_LOG_BYTES, ctor);
>   if (!dtl_cache) {
>   pr_warn("Failed to create dispatch trace log buffer cache\n");
>   pr_warn("Stolen time statistics will be unreliable\n");

Are you sure you want to universally expose this memory region? It
sounds like it's only exposed via a debug interface. Maybe it'd be
better to use a bounce buffer in the debug interface instead?


diff --git a/arch/powerpc/platforms/pseries/dtl.c 
b/arch/powerpc/platforms/pseries/dtl.c
index 3f1cdccebc9c..3adcff5cc4b2 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -257,6 +257,22 @@ static int dtl_file_release(struct inode *inode, struct 
file *filp)
return 0;
 }
 
+static inline int bounce_copy(char __user *buf, void *src, size_t size)
+{
+   u8 *bounce;
+   int rc;
+
+   bounce = kmalloc(size, GFP_KERNEL);
+   if (!bounce)
+   return -ENOMEM;
+
+   memcpy(bounce, src, size);
+   rc = copy_to_user(buf, bounce, size);
+
+   kfree(bounce);
+   return rc;
+}
+
 static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
loff_t *pos)
 {
@@ -300,7 +316,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user 
*buf, size_t len,
if (i + n_req > dtl->buf_entries) {
read_size = dtl->buf_entries - i;
 
-   rc = copy_to_user(buf, &dtl->buf[i],
+   rc = bounce_copy(buf, &dtl->buf[i],
read_size * sizeof(struct dtl_entry));
if (rc)
return -EFAULT;
@@ -312,7 +328,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user 
*buf, size_t len,
}
 
/* .. and now the head */
-   rc = copy_to_user(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
+   rc = bounce_copy(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
if (rc)
return -EFAULT;
 

-- 
Kees Cook



[PATCH] kunit/usercopy: Disable testing on !CONFIG_MMU

2024-06-17 Thread Kees Cook
Since arch_pick_mmap_layout() is an inline for non-MMU systems, disable
this test there.

Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202406160505.ubge6tmy-...@intel.com/
Signed-off-by: Kees Cook 
---
Cc: Brendan Higgins 
Cc: David Gow 
Cc: Rae Moar 
Cc: "Gustavo A. R. Silva" 
Cc: Andrew Morton 
Cc: linux-kselftest@vger.kernel.org
Cc: kunit-...@googlegroups.com
Cc: linux-harden...@vger.kernel.org
Cc: linux...@kvack.org
---
 lib/kunit/user_alloc.c | 4 
 lib/usercopy_kunit.c   | 5 +
 mm/util.c  | 2 ++
 3 files changed, 11 insertions(+)

diff --git a/lib/kunit/user_alloc.c b/lib/kunit/user_alloc.c
index 76d3d1345ed7..ae935df09a5e 100644
--- a/lib/kunit/user_alloc.c
+++ b/lib/kunit/user_alloc.c
@@ -30,6 +30,10 @@ static int kunit_attach_mm(void)
if (current->mm)
return 0;
 
+   /* arch_pick_mmap_layout() is only sane with MMU systems. */
+   if (!IS_ENABLED(CONFIG_MMU))
+   return -EINVAL;
+
mm = mm_alloc();
if (!mm)
return -ENOMEM;
diff --git a/lib/usercopy_kunit.c b/lib/usercopy_kunit.c
index 45f1e558c464..e819561a540d 100644
--- a/lib/usercopy_kunit.c
+++ b/lib/usercopy_kunit.c
@@ -290,6 +290,11 @@ static int usercopy_test_init(struct kunit *test)
struct usercopy_test_priv *priv;
unsigned long user_addr;
 
+   if (!IS_ENABLED(CONFIG_MMU)) {
+   kunit_skip(test, "Userspace allocation testing not available on 
non-MMU systems");
+   return 0;
+   }
+
priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv);
test->priv = priv;
diff --git a/mm/util.c b/mm/util.c
index df37c47d9374..e70e8e439258 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -484,7 +484,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct 
rlimit *rlim_stack)
clear_bit(MMF_TOPDOWN, &mm->flags);
 }
 #endif
+#ifdef CONFIG_MMU
 EXPORT_SYMBOL_IF_KUNIT(arch_pick_mmap_layout);
+#endif
 
 /**
  * __account_locked_vm - account locked pages to an mm's locked_vm
-- 
2.34.1




Re: [PATCH v4 0/3] Hardening perf subsystem

2024-06-17 Thread Kees Cook
On Sat, Jun 15, 2024 at 06:09:07PM +0200, Martin Uecker wrote:
> Am Freitag, dem 14.06.2024 um 12:17 +0200 schrieb Peter Zijlstra:
> > On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote:
> > > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote:
> > > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote:
> > > > > I'm happy to take patches. And for this bikeshed, this would be better
> > > > > named under the size_*() helpers which are trying to keep size_t
> > > > > calculations from overflowing (by saturating). i.e.:
> > > > > 
> > > > >   size_add_mult(sizeof(*p), sizeof(*p->member), num)
> > > > 
> > > > Fine I suppose, but what if we want something not size_t? Are we waiting
> > > > for the type system extension?
> > > 
> > > Because of C's implicit promotion/truncation, we can't do anything
> > > sanely with return values of arbitrary type size; we have to capture the
> > > lvalue type somehow so the checking can happen without C doing silent
> > > garbage.
> 
> What is the specific problem here?

This particular complaint is about not being able to "discover" the
lvalue type during an assignment, which means we can't create helper
macros that Just Work. For example, in:

void foo(some_type arg);

foo(add_mul(base, size, count));

There is no way to write "add_mul" so that it is checking for overflow
of the "some_type" type. The behavior of such a macro changes if it's
doing u8, int, size_t, etc.

The existing solution to this is to use macros (and builtins) that
cannot be composed:

add_mul(base, size, count, &result);
foo(result);

Here, the macro can examine the type of "result" and perform the correct
overflow handling for that type. But lots of developers (rightly)
dislike this style of "assignment side-effect".

Additionally, it can't be composed:

add(base, mul(size, count));

But, using type attributes we have much more flexibility. Hence, the
proposed "wraps" attribute:
https://github.com/llvm/llvm-project/pull/86618

> This is likely a stupid question, but making it signed 
> wouldn't work?   Or is a signed size_t too small  
> or some architectures? Or would this change break too much?

The ssize_t gets used in some places already, but since size_t is used
for address calculations too, I don't think we can universally switch
it.

-- 
Kees Cook



Re: [PATCH v4 0/3] Hardening perf subsystem

2024-06-17 Thread Kees Cook
On Fri, Jun 14, 2024 at 12:17:08PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote:
> > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote:
> > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote:
> > > > I'm happy to take patches. And for this bikeshed, this would be better
> > > > named under the size_*() helpers which are trying to keep size_t
> > > > calculations from overflowing (by saturating). i.e.:
> > > > 
> > > > size_add_mult(sizeof(*p), sizeof(*p->member), num)
> > > 
> > > Fine I suppose, but what if we want something not size_t? Are we waiting
> > > for the type system extension?
> > 
> > Because of C's implicit promotion/truncation, we can't do anything
> > sanely with return values of arbitrary type size; we have to capture the
> > lvalue type somehow so the checking can happen without C doing silent
> > garbage.
> 
> So sizeof() returns the native (built-in) size_t, right? If that type
> the nooverflow qualifier on, then:
> 
>   sizeof(*p) + num*sizeof(p->foo[0])
> 
> should all get the nooverflow semantics right? Because size_t is
> effectively 'nooverflow unsigned long' the multiplication should promote
> 'num' to some 'long'.

Hmmm. This is an interesting point. I'll see what Justin has found as
he's been working on limiting the overflow sanitizer to specific types.

It doesn't help this (unfortunately common) code pattern, though:

int size;

size = sizeof(*p) + num*sizeof(p->foo[0]);
p = kmalloc(size, GFP_KERNEL);

But that was going to be a problem either way.
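A minimal userspace illustration of why that pattern is hazardous; the
value is hypothetical, and conversion of an out-of-range value to int
is implementation-defined (GCC/Clang wrap modulo 2^32):

```c
#include <stddef.h>

/* The size_t arithmetic may be checked or saturated upstream, but
 * storing the result in a plain int silently truncates on 64-bit
 * targets, so an "impossible" small size can reach the allocator. */
static int truncated_size(size_t n)
{
	int size = (int)n;	/* implicit in the original pattern */

	return size;
}
```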

> Now, I've re-read the rules and I don't see qualifiers mentioned, so
> can't we state that the overflow/nooverflow qualifiers are to be
> preserved on (implicit) promotion and when nooverflow and overflow are
> combined the 'safe' nooverflow takes precedence?
> 
> I mean, when we're adding qualifiers we can make up rules about them
> too, right?

Yup, that is the design of the "wraps" attribute (though it is the
reverse: it _allows_ for wrap-around, since we want the default state
to be mitigation).

> If 'people' don't want to adorn the built-in size_t, we can always do
> something like:
> 
> #define sizeof(x) ((nooverflow unsigned long)(sizeof(x)))
> 
> and 'fix' it ourselves.

Right, though my hope is still that we get the result of "nooverflow" by
marking that which was expected to overflow.

-- 
Kees Cook



Re: [PATCH v3 2/2] pstore/ramoops: Add ramoops.mem_name= command line option

2024-06-13 Thread Kees Cook



On June 13, 2024 7:06:47 AM PDT, Ard Biesheuvel  wrote:
>On Thu, 13 Jun 2024 at 15:26, Steven Rostedt  wrote:
>>
>> On Thu, 13 Jun 2024 08:11:48 +0200
>> Ard Biesheuvel  wrote:
>> > >
>> > > I've added one more comment to v5, with that fixed I can take this.
>> > >
>> >
>> > So how is this supposed to work wrt to the rigid 'no user visible
>> > regressions' rule, given that this whole thing is a best effort thing
>>
>> This has nothing to do with user space. The kernel command line has
>> broken in the past. If you update the kernel, you can update the
>> command line. There's no "no user visible regressions" rule. It's
>> "Don't break user space". This has nothing to do with user space.
>>
>> > to begin with. This needs at least a huge disclaimer that this rule
>> > does not apply, and if this works today, there is no guarantee that it
>> > will keep working on newer kernels. Otherwise, you will be making the
>> > job of the people who work on the boot code significantly more
>> > difficult. And even then, I wonder whether Linus and #regzcop are
>> > going to honour such a disclaimer.
>>
>> Again, this has nothing to do with user space. The rule Linus talks
>> about is breaking user space. This is about kernel debugging. Something
>> *completely different*!
>>
>> >
>> > So this belongs downstream, unless some guarantees can be provided
>> > that this functionality is exempt from the usual regression policies.
>>
>> I disagree. kexec/kdump also has the same issues.
>>
>
>Fair enough. As long as it is documented that there is no guarantee
>that this will keep working over a kernel upgrade, then I have no
>objections.

Yeah, I should better document this for pstore as a whole, but I've already 
made the call that cross-kernel-version operation is best effort.

-Kees

-- 
Kees Cook



Re: [PATCH v3 0/2] usercopy: Convert test_user_copy to KUnit test

2024-06-12 Thread Kees Cook
On Thu, Jun 13, 2024 at 12:41:43PM +0800, David Gow wrote:
> On Thu, 13 Jun 2024 at 03:59, Kees Cook  wrote:
> >
> > Hi,
> >
> > This builds on the proposal[1] from Mark and lets me convert the
> > existing usercopy selftest to KUnit. Besides adding this basic test to
> > the KUnit collection, it also opens the door for execve testing (which
> > depends on having a functional current->mm), and should provide the
> > basic infrastructure for adding Mark's much more complete usercopy tests.
> >
> >  v3:
> >   - use MEMEQ KUnit helper (David)
> >   - exclude pathological address confusion test for systems with separate
> > address spaces, noticed by David
> >   - add KUnit-conditional exports for alloc_mm() and arch_pick_mmap_layout()
> > noticed by 0day
> >  v2: https://lore.kernel.org/lkml/20240610213055.it.075-k...@kernel.org/
> >  v1: https://lore.kernel.org/lkml/20240519190422.work.715-k...@kernel.org/
> >
> > -Kees
> >
> > [1] 
> > https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/
> 
> Thanks! This looks good to me (and passes everything here). Unless
> there's a compelling reason not to, I think we can take this via the
> KUnit tree.

That would be lovely, thank you! :)

-- 
Kees Cook



Re: [PATCH v4 0/3] Hardening perf subsystem

2024-06-12 Thread Kees Cook
On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote:
> On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote:
> > I'm happy to take patches. And for this bikeshed, this would be better
> > named under the size_*() helpers which are trying to keep size_t
> > calculations from overflowing (by saturating). i.e.:
> > 
> > size_add_mult(sizeof(*p), sizeof(*p->member), num)
> 
> Fine I suppose, but what if we want something not size_t? Are we waiting
> for the type system extension?

Because of C's implicit promotion/truncation, we can't do anything
sanely with return values of arbitrary type size; we have to capture the
lvalue type somehow so the checking can happen without C doing silent
garbage.

> The saturating thing is relying on the allocators never granting INT_MAX
> (or whatever size_t actually is) bytes?

The max of size_t is ULONG_MAX, and yes, most of the allocators will
refuse anything >INT_MAX. I think vmalloc()'s limit is higher, but it's
certainly not SIZE_MAX, which would be the entire virtual memory space. ;)

The saturating thing is two-fold: that we never wrap around SIZE_MAX,
and that the allocator will refuse a SIZE_MAX allocation.

> > LOL. It's basically doing compile-time (__builtin_object_size) and
> > run-time (__builtin_dynamic_object_size) bounds checking on destination
> > (and source) object sizes, mainly driven by the mentioned builtins:
> > https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
> 
> Right, I got that far. I also read most of:
> 
>   
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854

Oh wow, that's serious extra credit. :) It'll also probably be a while
before most of that stuff is even landed in Clang, much less implemented
in GCC. What we _do_ have is the "counted_by" attribute. This was added
to Clang a little while ago and just landed last week in GCC for GCC 15.

> But none of that is showing me generated asm for the various cases. As
> such, I don't consider myself informed enough.

Gotcha. For the compile-time stuff it's all just looking at
known-at-compile-time sizes. So for something like this, we get a
__compiletime_warning() emitted:

const char src[] = "Hello there";
char dst[10];

strscpy(dst, src); /* Compiler yells since src is bigger than dst. */

For run-time checks it's basically just using the regular WARN()
infrastructure with __builtin_dynamic_object_size(). Here's a simplified
userspace example with assert():

https://godbolt.org/z/zMrKnMxn5

The kernel's FORTIFY_SOURCE is much more complex in how it does the
checking, how it does the reporting (for helping people figure out what's
gone weird), etc.

> > Anyway! What about the patch that takes the 2 allocations down to 1?
> > That seems like an obvious improvement.
> 
> Separate it from the struct_size() nonsense and Cc the author of that
> code (Sandipan IIRC) and I might just apply it.

Okay, thanks!

-Kees

-- 
Kees Cook



[PATCH v3 0/2] usercopy: Convert test_user_copy to KUnit test

2024-06-12 Thread Kees Cook
Hi,

This builds on the proposal[1] from Mark and lets me convert the
existing usercopy selftest to KUnit. Besides adding this basic test to
the KUnit collection, it also opens the door for execve testing (which
depends on having a functional current->mm), and should provide the
basic infrastructure for adding Mark's much more complete usercopy tests.

 v3:
  - use MEMEQ KUnit helper (David)
  - exclude pathological address confusion test for systems with separate
address spaces, noticed by David
  - add KUnit-conditional exports for alloc_mm() and arch_pick_mmap_layout()
noticed by 0day
 v2: https://lore.kernel.org/lkml/20240610213055.it.075-k...@kernel.org/
 v1: https://lore.kernel.org/lkml/20240519190422.work.715-k...@kernel.org/

-Kees

[1] https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/

Kees Cook (2):
  kunit: test: Add vm_mmap() allocation resource manager
  usercopy: Convert test_user_copy to KUnit test

 MAINTAINERS|   1 +
 include/kunit/test.h   |  17 ++
 kernel/fork.c  |   3 +
 lib/Kconfig.debug  |  21 +-
 lib/Makefile   |   2 +-
 lib/kunit/Makefile |   1 +
 lib/kunit/user_alloc.c | 113 +
 lib/{test_user_copy.c => usercopy_kunit.c} | 282 ++---
 mm/util.c  |   3 +
 9 files changed, 288 insertions(+), 155 deletions(-)
 create mode 100644 lib/kunit/user_alloc.c
 rename lib/{test_user_copy.c => usercopy_kunit.c} (46%)

-- 
2.34.1




[PATCH v3 1/2] kunit: test: Add vm_mmap() allocation resource manager

2024-06-12 Thread Kees Cook
For tests that need to allocate using vm_mmap() (e.g. usercopy and
execve), provide the interface to have the allocation tracked by KUnit
itself. This requires bringing up a placeholder userspace mm.

This combines my earlier attempt at this with Mark Rutland's version[1].

Normally alloc_mm() and arch_pick_mmap_layout() aren't exported for
modules, so export these only for KUnit testing.

Link: 
https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ [1]
Co-developed-by: Mark Rutland 
Signed-off-by: Mark Rutland 
Reviewed-by: David Gow 
Signed-off-by: Kees Cook 
---
 include/kunit/test.h   |  17 +++
 kernel/fork.c  |   3 ++
 lib/kunit/Makefile |   1 +
 lib/kunit/user_alloc.c | 113 +
 mm/util.c  |   3 ++
 5 files changed, 137 insertions(+)
 create mode 100644 lib/kunit/user_alloc.c

diff --git a/include/kunit/test.h b/include/kunit/test.h
index e32b4cb7afa2..ec61cad6b71d 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -480,6 +480,23 @@ static inline void *kunit_kcalloc(struct kunit *test, 
size_t n, size_t size, gfp
return kunit_kmalloc_array(test, n, size, gfp | __GFP_ZERO);
 }
 
+/**
+ * kunit_vm_mmap() - Allocate KUnit-tracked vm_mmap() area
+ * @test: The test context object.
+ * @file: struct file pointer to map from, if any
+ * @addr: desired address, if any
+ * @len: how many bytes to allocate
+ * @prot: mmap PROT_* bits
+ * @flag: mmap flags
+ * @offset: offset into @file to start mapping from.
+ *
+ * See vm_mmap() for more information.
+ */
+unsigned long kunit_vm_mmap(struct kunit *test, struct file *file,
+   unsigned long addr, unsigned long len,
+   unsigned long prot, unsigned long flag,
+   unsigned long offset);
+
 void kunit_cleanup(struct kunit *test);
 
 void __printf(2, 3) kunit_log_append(struct string_stream *log, const char 
*fmt, ...);
diff --git a/kernel/fork.c b/kernel/fork.c
index 99076dbe27d8..cea203197136 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -115,6 +115,8 @@
 #define CREATE_TRACE_POINTS
 #include 
 
+#include 
+
 /*
  * Minimum number of threads to boot the kernel
  */
@@ -1334,6 +1336,7 @@ struct mm_struct *mm_alloc(void)
memset(mm, 0, sizeof(*mm));
return mm_init(mm, current, current_user_ns());
 }
+EXPORT_SYMBOL_IF_KUNIT(mm_alloc);
 
 static inline void __mmput(struct mm_struct *mm)
 {
diff --git a/lib/kunit/Makefile b/lib/kunit/Makefile
index 309659a32a78..56dd67dc6e57 100644
--- a/lib/kunit/Makefile
+++ b/lib/kunit/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_KUNIT) +=  kunit.o
 
 kunit-objs +=  test.o \
resource.o \
+   user_alloc.o \
static_stub.o \
string-stream.o \
assert.o \
diff --git a/lib/kunit/user_alloc.c b/lib/kunit/user_alloc.c
new file mode 100644
index ..76d3d1345ed7
--- /dev/null
+++ b/lib/kunit/user_alloc.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit userspace memory allocation resource management.
+ */
+#include 
+#include 
+#include 
+#include 
+
+struct kunit_vm_mmap_resource {
+   unsigned long addr;
+   size_t size;
+};
+
+/* vm_mmap() arguments */
+struct kunit_vm_mmap_params {
+   struct file *file;
+   unsigned long addr;
+   unsigned long len;
+   unsigned long prot;
+   unsigned long flag;
+   unsigned long offset;
+};
+
+/* Create and attach a new mm if it doesn't already exist. */
+static int kunit_attach_mm(void)
+{
+   struct mm_struct *mm;
+
+   if (current->mm)
+   return 0;
+
+   mm = mm_alloc();
+   if (!mm)
+   return -ENOMEM;
+
+   /* Define the task size. */
+   mm->task_size = TASK_SIZE;
+
+   /* Make sure we can allocate new VMAs. */
+   arch_pick_mmap_layout(mm, &current->signal->rlim[RLIMIT_STACK]);

+
+   /* Attach the mm. It will be cleaned up when the process dies. */
+   kthread_use_mm(mm);
+
+   return 0;
+}
+
+static int kunit_vm_mmap_init(struct kunit_resource *res, void *context)
+{
+   struct kunit_vm_mmap_params *p = context;
+   struct kunit_vm_mmap_resource vres;
+   int ret;
+
+   ret = kunit_attach_mm();
+   if (ret)
+   return ret;
+
+   vres.size = p->len;
+   vres.addr = vm_mmap(p->file, p->addr, p->len, p->prot, p->flag, 
p->offset);
+   if (!vres.addr)
+   return -ENOMEM;
+   res->data = kmemdup(&vres, sizeof(vres), GFP_KERNEL);
+   if (!res->data) {
+   vm_munmap(vres.addr, vres.size);
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
+static void kunit_vm_mmap_free(struct kunit_resource *res)
+{
+   struct k

[PATCH v3 2/2] usercopy: Convert test_user_copy to KUnit test

2024-06-12 Thread Kees Cook
Convert the runtime tests of hardened usercopy to standard KUnit tests.

Additionally disable usercopy_test_invalid() for systems with separate
address spaces (or no MMU) since it's not sensible to test for address
confusion there (e.g. m68k).

Co-developed-by: Vitor Massaru Iha 
Signed-off-by: Vitor Massaru Iha 
Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org
Tested-by: Ivan Orlov 
Reviewed-by: David Gow 
Signed-off-by: Kees Cook 
---
 MAINTAINERS|   1 +
 lib/Kconfig.debug  |  21 +-
 lib/Makefile   |   2 +-
 lib/{test_user_copy.c => usercopy_kunit.c} | 282 ++---
 4 files changed, 151 insertions(+), 155 deletions(-)
 rename lib/{test_user_copy.c => usercopy_kunit.c} (46%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8754ac2c259d..0cd171ec6010 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11962,6 +11962,7 @@ F:  arch/*/configs/hardening.config
 F: include/linux/overflow.h
 F: include/linux/randomize_kstack.h
 F: kernel/configs/hardening.config
+F: lib/usercopy_kunit.c
 F: mm/usercopy.c
 K: \b(add|choose)_random_kstack_offset\b
 K: \b__check_(object_size|heap_object)\b
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 59b6765d86b8..561e346f5cb0 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2505,18 +2505,6 @@ config TEST_VMALLOC
 
  If unsure, say N.
 
-config TEST_USER_COPY
-   tristate "Test user/kernel boundary protections"
-   depends on m
-   help
- This builds the "test_user_copy" module that runs sanity checks
- on the copy_to/from_user infrastructure, making sure basic
- user/kernel boundary testing is working. If it fails to load,
- a regression has been detected in the user/kernel memory boundary
- protections.
-
- If unsure, say N.
-
 config TEST_BPF
tristate "Test BPF filter functionality"
depends on m && NET
@@ -2814,6 +2802,15 @@ config SIPHASH_KUNIT_TEST
  This is intended to help people writing architecture-specific
  optimized versions.  If unsure, say N.
 
+config USERCOPY_KUNIT_TEST
+   tristate "KUnit Test for user/kernel boundary protections"
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+   help
+ This builds the "usercopy_kunit" module that runs sanity checks
+ on the copy_to/from_user infrastructure, making sure basic
+ user/kernel boundary testing is working.
+
 config TEST_UDELAY
tristate "udelay test driver"
help
diff --git a/lib/Makefile b/lib/Makefile
index 3b1769045651..fae5cc67b95a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -78,7 +78,6 @@ obj-$(CONFIG_TEST_LKM) += test_module.o
 obj-$(CONFIG_TEST_VMALLOC) += test_vmalloc.o
 obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o
 obj-$(CONFIG_TEST_SORT) += test_sort.o
-obj-$(CONFIG_TEST_USER_COPY) += test_user_copy.o
 obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_keys.o
 obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o
 obj-$(CONFIG_TEST_DYNAMIC_DEBUG) += test_dynamic_debug.o
@@ -388,6 +387,7 @@ CFLAGS_fortify_kunit.o += $(call cc-disable-warning, 
stringop-truncation)
 CFLAGS_fortify_kunit.o += $(DISABLE_STRUCTLEAK_PLUGIN)
 obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o
 obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o
+obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o
 
 obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
 
diff --git a/lib/test_user_copy.c b/lib/usercopy_kunit.c
similarity index 46%
rename from lib/test_user_copy.c
rename to lib/usercopy_kunit.c
index 5ff04d8fe971..45f1e558c464 100644
--- a/lib/test_user_copy.c
+++ b/lib/usercopy_kunit.c
@@ -15,7 +15,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 /*
  * Several 32-bit architectures support 64-bit {get,put}_user() calls.
@@ -31,26 +31,27 @@
 # define TEST_U64
 #endif
 
-#define test(condition, msg, ...)  \
-({ \
-   int cond = (condition); \
-   if (cond)   \
-   pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \
-   cond;   \
-})
+struct usercopy_test_priv {
+   char *kmem;
+   char __user *umem;
+   size_t size;
+};
 
 static bool is_zeroed(void *from, size_t size)
 {
return memchr_inv(from, 0x0, size) == NULL;
 }
 
-static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
+/* Test usage of check_nonzero_user(). */
+static void usercopy_test_check_nonzero_user(struct kunit *test)
 {
-   int ret = 0;
size_t start, end, i, zero_start, zero_end

Re: [PATCH v2 2/2] usercopy: Convert test_user_copy to KUnit test

2024-06-12 Thread Kees Cook
On Wed, Jun 12, 2024 at 09:21:52PM +0200, Geert Uytterhoeven wrote:
> Hi Kees,
> 
> On Wed, Jun 12, 2024 at 6:51 PM Kees Cook  wrote:
> > On Wed, Jun 12, 2024 at 05:13:39PM +0800, David Gow wrote:
> > > On Tue, 11 Jun 2024 at 05:33, Kees Cook  wrote:
> > > > Convert the runtime tests of hardened usercopy to standard KUnit tests.
> > > >
> > > > Co-developed-by: Vitor Massaru Iha 
> > > > Signed-off-by: Vitor Massaru Iha 
> > > > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org
> > > > Tested-by: Ivan Orlov 
> > > > Signed-off-by: Kees Cook 
> > > > ---
> > >
> > > This looks good, particularly with the x86 fix applied.
> > >
> > > It's still hanging on m68k -- I think at the 'illegal reversed
> > > copy_to_user passed' test -- but I'll admit to not having tried to
> > > debug it further.
> > >
> > > One other (set of) notes below about using KUNIT_EXPECT_MEMEQ_MSG(),
> > > otherwise (assuming the m68k stuff isn't actually a regression, which
> > > I haven't tested but I imagine is unlikely),
> >
> > I'm trying to debug a hang on m68k in the usercopy behavioral testing
> > routines. It's testing for the pathological case of having inverted
> > arguments to copy_to_user():
> >
> > user_addr = kunit_vm_mmap(test, NULL, 0, priv->size,
> > PROT_READ | PROT_WRITE | PROT_EXEC,
> > MAP_ANONYMOUS | MAP_PRIVATE, 0);
> > ...
> > bad_usermem = (char *)user_addr;
> > ...
> > KUNIT_EXPECT_NE_MSG(test, copy_to_user((char __user *)kmem, 
> > bad_usermem,
> >PAGE_SIZE), 0,
> > "illegal reversed copy_to_user passed");
> >
> > On other architectures, this immediate fails because the access_ok()
> > check rejects it. On m68k with CONFIG_ALTERNATE_USER_ADDRESS_SPACE,
> > access_ok() short-circuits to "true". I've tried reading
> > arch/m68k/include/asm/uaccess.h but I'm not sure what's happening under
> > CONFIG_CPU_HAS_ADDRESS_SPACES.
> 
> On m68k CPUs that support CPU_HAS_ADDRESS_SPACES (i.e. all traditional
> 680x0 that can run real Linux), the CPU has separate address spaces
> for kernel and user addresses.  Accessing userspace addresses is done
> using the special "moves" instruction, so we can just use the MMU to
> catch invalid accesses.

Okay, that's what I suspected. I think I'll need to just not test this
particular case for archs with separate address spaces, since it would
be meaningless.

> 
> > For now I've excluded that test for m68k, but I'm not sure what's
> > expected to happen here on m68k for this set of bad arguments. Can you
> > advise?
> 
> Perhaps the kernel address is actually a valid user address, or
> vice versa?

Right -- I think that's what's happened.

> 
> Does the test work on systems that use 4G/4G for kernel/userspace
> instead of the usual 1G/3G split?
> 
> /me runs the old test_user_copy.ko on ARAnyM
> Seems to take a while? Or it hangs, too?

Sounds like the same behavior.

> Related reading material
> https://lore.kernel.org/all/camuhmduzhwm5_tl7tnaof+uqhejnkgsqf+_vzqgrzb_3euf...@mail.gmail.com/
> https://lore.kernel.org/all/camuhmdvq93ihgcxubjpttahdpjxxlyvasar-m3twbjv0krs...@mail.gmail.com/

Thanks!

-Kees

-- 
Kees Cook



Re: [PATCH v4 0/3] Hardening perf subsystem

2024-06-12 Thread Kees Cook
On Tue, Jun 11, 2024 at 09:55:42AM +0200, Peter Zijlstra wrote:
> On Mon, Jun 10, 2024 at 02:46:09PM -0700, Kees Cook wrote:
> 
> > > I really detest this thing because it makes what was trivially readable
> > > into something opaque. Get me that type qualifier that traps on overflow
> > > and write plain C. All this __builtin_overflow garbage is just that,
> > > unreadable nonsense.
> > 
> > It's more readable than container_of(), 
> 
> Yeah, no. container_of() is absolutely trivial and very readable.
> container_of_const() a lot less so.

I mean, we have complex macros in the kernel. This isn't uncommon. Look
at cleanup.h. ;)

But that's why we write kern-doc:

 * struct_size() - Calculate size of structure with trailing flexible
 * array.
 * @p: Pointer to the structure.
 * @member: Name of the array member.
 * @count: Number of elements in the array.

> #define struct_size(p, member, num) \
>   mult_add_no_overflow(num, sizeof(p->member), sizeof(*p))
> 
> would be *FAR* more readable. And then I still think struct_size() is
> less readable than its expansion. ]]

I'm happy to take patches. And for this bikeshed, this would be better
named under the size_*() helpers which are trying to keep size_t
calculations from overflowing (by saturating). i.e.:

size_add_mult(sizeof(*p), sizeof(*p->member), num)

Generalized overflow handling helpers (check_add/sub/mul_overflow())
already exist and require a destination variable to determine type
sizes. It is not safe to allow C semantics to perform
promotion/truncation, so we have to be explicit.

> Some day I'll have to look at this FORTIFY_SOURCE and see what it
> actually does I suppose :/

LOL. It's basically doing compile-time (__builtin_object_size) and
run-time (__builtin_dynamic_object_size) bounds checking on destination
(and source) object sizes, mainly driven by the mentioned builtins:
https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html

> I couldn't quickly find a single instance in the code I care about. So
> nothing is sailing afaict.

I have to care about all of Linux's code. :P We generally don't have to
defend the kernel from perf, so the hardening changes tend to end up in
core APIs along with compiler improvements.

Anyway! What about the patch that takes the 2 allocations down to 1?
That seems like an obvious improvement.

-Kees

-- 
Kees Cook



Re: [PATCH v3 2/2] pstore/ramoops: Add ramoops.mem_name= command line option

2024-06-12 Thread Kees Cook
On Tue, Jun 11, 2024 at 10:49:13AM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> Add a method to find a region specified by reserve_mem=nn:align:name for
> ramoops. Adding a kernel command line parameter:
> 
>   reserve_mem=12M:4096:oops ramoops.mem_name=oops
> 
> Will use the size and location defined by the reserve_mem parameter where it
> finds the memory and labels it "oops". The "oops" in the ramoops option
> is used to search for it.
> 
> This allows for arbitrary RAM to be used for ramoops if it is known that
> the memory is not cleared on kernel crashes or soft reboots.
> 
> Signed-off-by: Steven Rostedt (Google) 

Acked-by: Kees Cook 

Let me know if this should go via the pstore tree, if you'd rather carry
it?

-- 
Kees Cook



Re: [PATCH v3] selftests: seccomp: fix format-zero-length warnings

2024-06-12 Thread Kees Cook
On Tue, Jun 11, 2024 at 05:16:08PM +0200, Amer Al Shanawany wrote:
> fix the following errors by using string format specifier and an empty
> parameter:
> 
> seccomp_benchmark.c:197:24: warning: zero-length gnu_printf format
>  string [-Wformat-zero-length]
>   197 | ksft_print_msg("");
>   |^~
> seccomp_benchmark.c:202:24: warning: zero-length gnu_printf format
>  string [-Wformat-zero-length]
>   202 | ksft_print_msg("");
>   |^~
> seccomp_benchmark.c:204:24: warning: zero-length gnu_printf format
>  string [-Wformat-zero-length]
>   204 | ksft_print_msg("");
>   |^~
> 
> Reported-by: kernel test robot 
> Closes: 
> https://lore.kernel.org/oe-kbuild-all/202312260235.uj5ug8k9-...@intel.com/
> Suggested-by: Kees Cook 
> Signed-off-by: Amer Al Shanawany 

Thanks!

Reviewed-by: Kees Cook 

-- 
Kees Cook



Re: [PATCH v2] x86/traps: Enable UBSAN traps on x86

2024-06-12 Thread Kees Cook
On Tue, Jun 11, 2024 at 01:26:09PM -0700, Gatlin Newhouse wrote:
> On Mon, Jun 03, 2024 at 06:13:53PM UTC, Thomas Gleixner wrote:
> > On Sat, Jun 01 2024 at 03:10, Gatlin Newhouse wrote:
> > 
> > > Bring x86 to parity with arm64, similar to commit 25b84002afb9
> > > ("arm64: Support Clang UBSAN trap codes for better reporting").
> > > Enable the output of UBSAN type information on x86 architectures
> > > compiled with clang when CONFIG_UBSAN_TRAP=y. Currently ARM
> > > architectures output which specific sanitizer caused the trap,
> > > via the encoded data in the trap instruction. Clang on x86
> > > currently encodes the same data in ud1 instructions but the x86
> > > handle_bug() and is_valid_bugaddr() functions currently only look
> > > at ud2s.
> > 
> > Please structure your change log properly instead of one paragraph of
> > unstructured word salad. See:
> > 
> >   
> > https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog
> >   
> > > +/*
> > > + * Check for UD1, UD2, with or without Address Size Override Prefixes 
> > > instructions.
> > > + */
> > >  __always_inline int is_valid_bugaddr(unsigned long addr)
> > >  {
> > >   if (addr < TASK_SIZE_MAX)
> > > @@ -88,7 +92,13 @@ __always_inline int is_valid_bugaddr(unsigned long 
> > > addr)
> > >* We got #UD, if the text isn't readable we'd have gotten
> > >* a different exception.
> > >*/
> > > - return *(unsigned short *)addr == INSN_UD2;
> > > + if (*(u16 *)addr == INSN_UD2)
> > > + return INSN_UD2;
> > > + if (*(u16 *)addr == INSN_UD1)
> > > + return INSN_UD1;
> > > + if (*(u8 *)addr == INSN_ASOP && *(u16 *)(addr + 1) == INSN_UD1)
> > 
> > s/1/LEN_ASOP/ ?
> > 
> > > + return INSN_ASOP;
> > > + return 0;
> > 
> > I'm not really a fan of the reuse of the INSN defines here. Especially
> > not about INSN_ASOP. Also 0 is just lame.
> > 
> > Neither does the function name make sense anymore. is_valid_bugaddr() is
> > clearly telling that it's a boolean check (despite the return value
> > being int for hysterical raisins). But now you turn it into a
> > non-boolean integer which returns a instruction encoding. That's
> > hideous. Programming should result in obvious code and that should be
> > pretty obvious to people who create tools to validate code.
> > 
> > Also all UBSAN cares about is the actual failure type and not the
> > instruction itself:
> > 
> > #define INSN_UD_MASK0xFFFF
> > #define INSN_ASOP_MASK  0x00FF
> > 
> > #define BUG_UD_NONE 0xFFFF
> > #define BUG_UD2 0xFFFE
> > 
> > __always_inline u16 get_ud_type(unsigned long addr)
> > {
> > u16 insn;
> > 
> > if (addr < TASK_SIZE_MAX)
> > return BUG_UD_NONE;
> > 
> > insn = *(u16 *)addr;
> > if ((insn & INSN_UD_MASK) == INSN_UD2)
> > return BUG_UD2;
> > 
> > if ((insn & INSN_ASOP_MASK) == INSN_ASOP)
> > insn = *(u16 *)(++addr);
> > 
> > // UBSAN encodes the failure type in the two bytes after UD1
> > if ((insn & INSN_UD_MASK) == INSN_UD1)
> > return *(u16 *)(addr + LEN_UD1);
> > 
> > return BUG_UD_NONE;
> > }
> > 
> > No?
> 
> Thanks for the feedback.
> 
> It seems that is_valid_bugaddr() needs to be implemented on all architectures
> and the function get_ud_type() replaces it here. So how should the patch 
> handle
> is_valid_bugaddr()? Should the function remain as-is in traps.c despite no
> longer being used?

Yeah, this is why I'd suggested to Gatlin in early designs to reuse
is_valid_bugaddr()'s int value. It's a required function, so it seemed
sensible to just repurpose it from yes/no to no/type1/type2/type3/etc.

-Kees

-- 
Kees Cook



Re: [PATCH v3] wifi: mac80211: Avoid address calculations via out of bounds array indexing

2024-06-12 Thread Kees Cook
On Wed, Jun 12, 2024 at 10:20:41AM +0200, Johannes Berg wrote:
> On Wed, 2024-06-05 at 11:22 -0400, Kenton Groombridge wrote:
> 
> 
> > Co-authored-by: Kees Cook 
> > Signed-off-by: Kenton Groombridge 
> > 
> 
> Wait ... I don't know what Kees did here, but seems then he should sign-
> off too.
> 
> Kees?

I had helped mainly in private emails with Kenton. I'm fine with
whatever tags people want. :)

Signed-off-by: Kees Cook 

and/or

Reviewed-by: Kees Cook 

(Though note the email address change just to keep checkpatch happy
about Co-authored-by vs S-o-b mismatches.)

Thanks!

-- 
Kees Cook



Re: [PATCH v2 2/2] usercopy: Convert test_user_copy to KUnit test

2024-06-12 Thread Kees Cook
On Wed, Jun 12, 2024 at 05:13:39PM +0800, David Gow wrote:
> On Tue, 11 Jun 2024 at 05:33, Kees Cook  wrote:
> >
> > Convert the runtime tests of hardened usercopy to standard KUnit tests.
> >
> > Co-developed-by: Vitor Massaru Iha 
> > Signed-off-by: Vitor Massaru Iha 
> > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org
> > Tested-by: Ivan Orlov 
> > Signed-off-by: Kees Cook 
> > ---
> 
> This looks good, particularly with the x86 fix applied.
> 
> It's still hanging on m68k -- I think at the 'illegal reversed
> copy_to_user passed' test -- but I'll admit to not having tried to
> debug it further.
> 
> One other (set of) notes below about using KUNIT_EXPECT_MEMEQ_MSG(),
> otherwise (assuming the m68k stuff isn't actually a regression, which
> I haven't tested but I imagine is unlikely),

Hi Geert,

I'm trying to debug a hang on m68k in the usercopy behavioral testing
routines. It's testing for the pathological case of having inverted
arguments to copy_to_user():

user_addr = kunit_vm_mmap(test, NULL, 0, priv->size,
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_ANONYMOUS | MAP_PRIVATE, 0);
...
bad_usermem = (char *)user_addr;
...
KUNIT_EXPECT_NE_MSG(test, copy_to_user((char __user *)kmem, bad_usermem,
   PAGE_SIZE), 0,
"illegal reversed copy_to_user passed");

On other architectures, this immediate fails because the access_ok()
check rejects it. On m68k with CONFIG_ALTERNATE_USER_ADDRESS_SPACE,
access_ok() short-circuits to "true". I've tried reading
arch/m68k/include/asm/uaccess.h but I'm not sure what's happening under
CONFIG_CPU_HAS_ADDRESS_SPACES.

For now I've excluded that test for m68k, but I'm not sure what's
expected to happen here on m68k for this set of bad arguments. Can you
advise?

Thanks!

-Kees

-- 
Kees Cook



Re: [PATCH v2 2/2] usercopy: Convert test_user_copy to KUnit test

2024-06-12 Thread Kees Cook
On Wed, Jun 12, 2024 at 05:13:39PM +0800, David Gow wrote:
> On Tue, 11 Jun 2024 at 05:33, Kees Cook  wrote:
> >
> > Convert the runtime tests of hardened usercopy to standard KUnit tests.
> >
> > Co-developed-by: Vitor Massaru Iha 
> > Signed-off-by: Vitor Massaru Iha 
> > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org
> > Tested-by: Ivan Orlov 
> > Signed-off-by: Kees Cook 
> > ---
> 
> This looks good, particularly with the x86 fix applied.

Thanks!

> It's still hanging on m68k -- I think at the 'illegal reversed
> copy_to_user passed' test -- but I'll admit to not having tried to
> debug it further.

For my own future reference, I have reproduced this with:

$ sudo apt install gcc-m68k-linux-gnu
$ tools/testing/kunit/kunit.py run --arch m68k --make_option 
CROSS_COMPILE=m68k-linux-gnu- usercopy

I'll figure it out...

> One other (set of) notes below about using KUNIT_EXPECT_MEMEQ_MSG(),
> otherwise (assuming the m68k stuff isn't actually a regression, which
> I haven't tested but I imagine is unlikely),

I should really read all the API docs every few releases. :) I will
switch to KUNIT_EXPECT_MEMEQ_MSG!

-- 
Kees Cook



Re: [PATCH v4 0/3] Hardening perf subsystem

2024-06-10 Thread Kees Cook
On Mon, Jun 10, 2024 at 10:05:44PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 10, 2024 at 10:28:52AM -0700, Kees Cook wrote:
> > On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote:
> > > Hi everyone,
> > > 
> > > This is an effort to get rid of all multiplications from allocation
> > > functions in order to prevent integer overflows [1][2].
> > 
> > I didn't actually see these 3 patches in this thread nor via lore.
> 
> He managed to break threading between 0/n and the rest.
> 
> > > In the first patch, the "struct amd_uncore_ctx" can be refactored to
> > > use a flex array for the "events" member. This way, the allocation/
> > > freeing of the memory can be simplified. Then, the struct_size()
> > > helper can be used to do the arithmetic calculation for the memory
> > > to be allocated.
> > 
> > I like this patch because it reduces the allocation from 2 to 1. This
> > isn't what Peter might see as "churn": this is an improvement in resource
> > utilization.
> 
> But then he went and used that struct_size() abomination :/
> 
> > I prefer this style, as it makes things unambiguous ("this will never
> > wrap around") without having to check the associated types and doesn't make
> > the resulting binary code different in the "can never overflow" case.
> > 
> > In this particular case:
> > 
> > int size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg);
> > 
> > "int numshared" comes from struct intel_uncore_type::num_shared_regs,
> > which is:
> > 
> > unsigned num_shared_regs:8;
> > 
> > And the struct sizes are:
> > 
> > $ pahole -C intel_uncore_box gcc-boot/vmlinux | grep size:
> > /* size: 488, cachelines: 8, members: 19 */
> > $ pahole -C intel_uncore_extra_reg gcc-boot/vmlinux | grep size:
> > /* size: 96, cachelines: 2, members: 5 */
> > 
> > So we have:
> > 
> > s32 size = 488 + u8 * 96
> > 
> > Max size here is 24968 so it can never overflow an s32, so I can see
> > why Peter views this as "churn".
> > 
> > I still think the patch is a coding style improvement, but okay.
> 
> I really detest this thing because it makes what was trivially readable
> into something opaque. Get me that type qualifier that traps on overflow
> and write plain C. All this __builtin_overflow garbage is just that,
> unreadable nonsense.

It's more readable than container_of(), IMO. "give me the struct size
for variable VAR, which has a flexible array MEMBER, when we have COUNT
many of them": struct_size(VAR, MEMBER, COUNT). It's more readable, more
robust, and provides saturation in the face of potential wrap-around.

> > This provides __counted_by coverage, and I think this is important to
> > gain in ever place we can. Given that this is part of a ring buffer
> > implementation that is arbitrarily sized, this is exactly the kind of
> > place I'd like to see __counted_by used. This is a runtime robustness
> > improvement, so I don't see this a "churn" at all.
> 
> Again, mixed in with that other crap. Anyway, remind me wth this
> __counted_by thing actually does?

It provides annotation for the compiler to perform run-time bounds
checking on dynamically sized arrays. i.e. CONFIG_FORTIFY_SOURCE and
CONFIG_UBSAN_BOUNDS can actually reason about annotated flexible arrays
instead of just saying "oh no a flexible array, I give up".

> > Peter, for patches 1 and 3, if you'd prefer not to carry them, I could
> > put them in the hardening tree to keep them out of your way. It seems
> > clear you don't want patch 2 at all.
> 
> I prefer to not have struct_size() anywhere at all. Please just write
> readable code.

That ship has sailed, and it has been keeping things at bay for a while
now. As we make progress on making the compiler able to do this more
naturally, we can work on replacing struct_size(), but it's in use
globally and it's useful both for catching runtime mistakes and for
catching compile-time mistakes (the flexible array has to match the
variable's struct).

-Kees

-- 
Kees Cook



[PATCH v2 1/2] kunit: test: Add vm_mmap() allocation resource manager

2024-06-10 Thread Kees Cook
For tests that need to allocate using vm_mmap() (e.g. usercopy and
execve), provide the interface to have the allocation tracked by KUnit
itself. This requires bringing up a placeholder userspace mm.

This combines my earlier attempt at this with Mark Rutland's version[1].

Link: 
https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ [1]
Co-developed-by: Mark Rutland 
Signed-off-by: Mark Rutland 
Signed-off-by: Kees Cook 
---
 include/kunit/test.h   |  17 +++
 lib/kunit/Makefile |   1 +
 lib/kunit/user_alloc.c | 111 +
 3 files changed, 129 insertions(+)
 create mode 100644 lib/kunit/user_alloc.c

diff --git a/include/kunit/test.h b/include/kunit/test.h
index e32b4cb7afa2..ec61cad6b71d 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -480,6 +480,23 @@ static inline void *kunit_kcalloc(struct kunit *test, 
size_t n, size_t size, gfp
return kunit_kmalloc_array(test, n, size, gfp | __GFP_ZERO);
 }
 
+/**
+ * kunit_vm_mmap() - Allocate KUnit-tracked vm_mmap() area
+ * @test: The test context object.
+ * @file: struct file pointer to map from, if any
+ * @addr: desired address, if any
+ * @len: how many bytes to allocate
+ * @prot: mmap PROT_* bits
+ * @flag: mmap flags
+ * @offset: offset into @file to start mapping from.
+ *
+ * See vm_mmap() for more information.
+ */
+unsigned long kunit_vm_mmap(struct kunit *test, struct file *file,
+   unsigned long addr, unsigned long len,
+   unsigned long prot, unsigned long flag,
+   unsigned long offset);
+
 void kunit_cleanup(struct kunit *test);
 
 void __printf(2, 3) kunit_log_append(struct string_stream *log, const char 
*fmt, ...);
diff --git a/lib/kunit/Makefile b/lib/kunit/Makefile
index 309659a32a78..56dd67dc6e57 100644
--- a/lib/kunit/Makefile
+++ b/lib/kunit/Makefile
@@ -2,6 +2,7 @@ obj-$(CONFIG_KUNIT) +=  kunit.o
 
 kunit-objs +=  test.o \
resource.o \
+   user_alloc.o \
static_stub.o \
string-stream.o \
assert.o \
diff --git a/lib/kunit/user_alloc.c b/lib/kunit/user_alloc.c
new file mode 100644
index ..d66f42282f43
--- /dev/null
+++ b/lib/kunit/user_alloc.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit userspace memory allocation resource management.
+ */
+#include 
+#include 
+#include 
+#include 
+
+struct kunit_vm_mmap_resource {
+   unsigned long addr;
+   size_t size;
+};
+
+/* vm_mmap() arguments */
+struct kunit_vm_mmap_params {
+   struct file *file;
+   unsigned long addr;
+   unsigned long len;
+   unsigned long prot;
+   unsigned long flag;
+   unsigned long offset;
+};
+
+/* Create and attach a new mm if it doesn't already exist. */
+static int kunit_attach_mm(void)
+{
+   struct mm_struct *mm;
+
+   if (current->mm)
+   return 0;
+
+   mm = mm_alloc();
+   if (!mm)
+   return -ENOMEM;
+
+   /* Define the task size. */
+   mm->task_size = TASK_SIZE;
+
+   /* Make sure we can allocate new VMAs. */
+   arch_pick_mmap_layout(mm, &current->signal->rlim[RLIMIT_STACK]);
+
+   /* Attach the mm. It will be cleaned up when the process dies. */
+   kthread_use_mm(mm);
+
+   return 0;
+}
+
+static int kunit_vm_mmap_init(struct kunit_resource *res, void *context)
+{
+   struct kunit_vm_mmap_params *p = context;
+   struct kunit_vm_mmap_resource vres;
+   int ret;
+
+   ret = kunit_attach_mm();
+   if (ret)
+   return ret;
+
+   vres.size = p->len;
+   vres.addr = vm_mmap(p->file, p->addr, p->len, p->prot, p->flag, 
p->offset);
+   if (!vres.addr)
+   return -ENOMEM;
+   res->data = kmemdup(&vres, sizeof(vres), GFP_KERNEL);
+   if (!res->data) {
+   vm_munmap(vres.addr, vres.size);
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
+static void kunit_vm_mmap_free(struct kunit_resource *res)
+{
+   struct kunit_vm_mmap_resource *vres = res->data;
+
+   /*
+* Since this is executed from the test monitoring process,
+* the test's mm has already been torn down. We don't need
+* to run vm_munmap(vres->addr, vres->size), only clean up
+* the vres.
+*/
+
+   kfree(vres);
+   res->data = NULL;
+}
+
+unsigned long kunit_vm_mmap(struct kunit *test, struct file *file,
+   unsigned long addr, unsigned long len,
+   unsigned long prot, unsigned long flag,
+   unsigned long offset)
+{
+   struct kunit_vm_mmap_params params = {
+   .file = file,
+   .addr = addr

[PATCH v2 0/2] usercopy: Convert test_user_copy to KUnit test

2024-06-10 Thread Kees Cook
Hi,

This builds on the proposal[1] from Mark and lets me convert the
existing usercopy selftest to KUnit. Besides adding this basic test to
the KUnit collection, it also opens the door for execve testing (which
depends on having a functional current->mm), and should provide the
basic infrastructure for adding Mark's much more complete usercopy tests.

 v2:
 - dropped "initial VMA", turns out it wasn't needed (Mark)
 - various cleanups in the test itself (Vitor)
 - moved new kunit resource to a separate file (David)
 v1: https://lore.kernel.org/all/20240519190422.work.715-k...@kernel.org/

-Kees

[1] https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/

Kees Cook (2):
  kunit: test: Add vm_mmap() allocation resource manager
  usercopy: Convert test_user_copy to KUnit test

 MAINTAINERS|   1 +
 include/kunit/test.h   |  17 ++
 lib/Kconfig.debug  |  21 +-
 lib/Makefile   |   2 +-
 lib/kunit/Makefile |   1 +
 lib/kunit/user_alloc.c | 111 +
 lib/{test_user_copy.c => usercopy_kunit.c} | 273 ++---
 7 files changed, 271 insertions(+), 155 deletions(-)
 create mode 100644 lib/kunit/user_alloc.c
 rename lib/{test_user_copy.c => usercopy_kunit.c} (47%)

-- 
2.34.1




[PATCH v2 2/2] usercopy: Convert test_user_copy to KUnit test

2024-06-10 Thread Kees Cook
Convert the runtime tests of hardened usercopy to standard KUnit tests.

Co-developed-by: Vitor Massaru Iha 
Signed-off-by: Vitor Massaru Iha 
Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org
Tested-by: Ivan Orlov 
Signed-off-by: Kees Cook 
---
 MAINTAINERS|   1 +
 lib/Kconfig.debug  |  21 +-
 lib/Makefile   |   2 +-
 lib/{test_user_copy.c => usercopy_kunit.c} | 273 ++---
 4 files changed, 142 insertions(+), 155 deletions(-)
 rename lib/{test_user_copy.c => usercopy_kunit.c} (47%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8754ac2c259d..0cd171ec6010 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11962,6 +11962,7 @@ F:  arch/*/configs/hardening.config
 F: include/linux/overflow.h
 F: include/linux/randomize_kstack.h
 F: kernel/configs/hardening.config
+F: lib/usercopy_kunit.c
 F: mm/usercopy.c
 K: \b(add|choose)_random_kstack_offset\b
 K: \b__check_(object_size|heap_object)\b
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 59b6765d86b8..561e346f5cb0 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2505,18 +2505,6 @@ config TEST_VMALLOC
 
  If unsure, say N.
 
-config TEST_USER_COPY
-   tristate "Test user/kernel boundary protections"
-   depends on m
-   help
- This builds the "test_user_copy" module that runs sanity checks
- on the copy_to/from_user infrastructure, making sure basic
- user/kernel boundary testing is working. If it fails to load,
- a regression has been detected in the user/kernel memory boundary
- protections.
-
- If unsure, say N.
-
 config TEST_BPF
tristate "Test BPF filter functionality"
depends on m && NET
@@ -2814,6 +2802,15 @@ config SIPHASH_KUNIT_TEST
  This is intended to help people writing architecture-specific
  optimized versions.  If unsure, say N.
 
+config USERCOPY_KUNIT_TEST
+   tristate "KUnit Test for user/kernel boundary protections"
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+   help
+ This builds the "usercopy_kunit" module that runs sanity checks
+ on the copy_to/from_user infrastructure, making sure basic
+ user/kernel boundary testing is working.
+
 config TEST_UDELAY
tristate "udelay test driver"
help
diff --git a/lib/Makefile b/lib/Makefile
index 3b1769045651..fae5cc67b95a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -78,7 +78,6 @@ obj-$(CONFIG_TEST_LKM) += test_module.o
 obj-$(CONFIG_TEST_VMALLOC) += test_vmalloc.o
 obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o
 obj-$(CONFIG_TEST_SORT) += test_sort.o
-obj-$(CONFIG_TEST_USER_COPY) += test_user_copy.o
 obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_keys.o
 obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o
 obj-$(CONFIG_TEST_DYNAMIC_DEBUG) += test_dynamic_debug.o
@@ -388,6 +387,7 @@ CFLAGS_fortify_kunit.o += $(call cc-disable-warning, 
stringop-truncation)
 CFLAGS_fortify_kunit.o += $(DISABLE_STRUCTLEAK_PLUGIN)
 obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o
 obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o
+obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o
 
 obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
 
diff --git a/lib/test_user_copy.c b/lib/usercopy_kunit.c
similarity index 47%
rename from lib/test_user_copy.c
rename to lib/usercopy_kunit.c
index 5ff04d8fe971..f1689f2c5c7b 100644
--- a/lib/test_user_copy.c
+++ b/lib/usercopy_kunit.c
@@ -15,7 +15,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 /*
  * Several 32-bit architectures support 64-bit {get,put}_user() calls.
@@ -31,26 +31,27 @@
 # define TEST_U64
 #endif
 
-#define test(condition, msg, ...)  \
-({ \
-   int cond = (condition); \
-   if (cond)   \
-   pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \
-   cond;   \
-})
+struct usercopy_test_priv {
+   char *kmem;
+   char __user *umem;
+   size_t size;
+};
 
 static bool is_zeroed(void *from, size_t size)
 {
return memchr_inv(from, 0x0, size) == NULL;
 }
 
-static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
+/* Test usage of check_nonzero_user(). */
+static void usercopy_test_check_nonzero_user(struct kunit *test)
 {
-   int ret = 0;
size_t start, end, i, zero_start, zero_end;
+   struct usercopy_test_priv *priv = test->priv;
+   char __user *umem = priv->umem;
+   char *kmem = priv->kmem;
+   size_t size = priv->size;
 
-   if (test(size < 

[PATCH] x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking

2024-06-10 Thread Kees Cook
When reworking the range checking for get_user(), the get_user_8() case
on 32-bit wasn't zeroing the high register. (The jump to bad_get_user_8
was accidentally dropped.) Restore the correct error handling
destination (and rename the jump to using the expected ".L" prefix).

While here, switch to using a named argument ("size") for the call
template ("%c4" to "%c[size]") as already used in the other call
templates in this file.

Found after moving the usercopy selftests to KUnit:

  # usercopy_test_invalid: EXPECTATION FAILED at
  lib/usercopy_kunit.c:278
  Expected val_u64 == 0, but
  val_u64 == -60129542144 (0xfffffff200000000)

Reported-by: David Gow 
Closes: 
https://lore.kernel.org/all/CABVgOSn=tb=lj9sxhut4_9mtjjkvxsq-ikdxc4kgho4cfkv...@mail.gmail.com
Fixes: b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and 
put_user()")
Signed-off-by: Kees Cook 
---
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
Cc: Sean Christopherson 
Cc: Peter Zijlstra 
Cc: Arnd Bergmann 
Cc: "Kirill A. Shutemov" 
Cc: Qiuxu Zhuo 
Cc: Nadav Amit 
Cc: Masahiro Yamada 
---
 arch/x86/include/asm/uaccess.h | 4 ++--
 arch/x86/lib/getuser.S | 6 +-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 0f9bab92a43d..3a7755c1a441 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -78,10 +78,10 @@ extern int __get_user_bad(void);
int __ret_gu;   \
register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);\
__chk_user_ptr(ptr);\
-   asm volatile("call __" #fn "_%c4"   \
+   asm volatile("call __" #fn "_%c[size]"  \
 : "=a" (__ret_gu), "=r" (__val_gu),\
ASM_CALL_CONSTRAINT \
-: "0" (ptr), "i" (sizeof(*(ptr))));\
+: "0" (ptr), [size] "i" (sizeof(*(ptr)))); \
instrument_get_user(__val_gu);  \
(x) = (__force __typeof__(*(ptr))) __val_gu;\
__builtin_expect(__ret_gu, 0);  \
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index 10d5ed8b5990..a1cb3a4e6742 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -44,7 +44,11 @@
or %rdx, %rax
 .else
cmp $TASK_SIZE_MAX-\size+1, %eax
+.if \size != 8
jae .Lbad_get_user
+.else
+   jae .Lbad_get_user_8
+.endif
sbb %edx, %edx  /* array_index_mask_nospec() */
and %edx, %eax
 .endif
@@ -154,7 +158,7 @@ SYM_CODE_END(__get_user_handle_exception)
 #ifdef CONFIG_X86_32
 SYM_CODE_START_LOCAL(__get_user_8_handle_exception)
ASM_CLAC
-bad_get_user_8:
+.Lbad_get_user_8:
xor %edx,%edx
xor %ecx,%ecx
mov $(-EFAULT),%_ASM_AX
-- 
2.34.1




Re: [PATCH 2/2] usercopy: Convert test_user_copy to KUnit test

2024-06-10 Thread Kees Cook
On Sat, Jun 08, 2024 at 04:44:10PM +0800, David Gow wrote:
> On Mon, 20 May 2024 at 03:12, Kees Cook  wrote:
> >
> > Convert the runtime tests of hardened usercopy to standard KUnit tests.
> >
> > Co-developed-by: Vitor Massaru Iha 
> > Signed-off-by: Vitor Massaru Iha 
> > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org
> > Signed-off-by: Kees Cook 
> > ---
> 
> This fails here on i386:
> >  # usercopy_test_invalid: EXPECTATION FAILED at lib/usercopy_kunit.c:278
> >  Expected val_u64 == 0, but
> >  val_u64 == -60129542144 (0xfffffff200000000)

Hunh. I can reproduce this with "--arch=i386" but not under UML with
SUBARCH=i386. But perhaps it's a difference in the get_user()
implementations between the two.

And this looks like a bug in the get_user() failure path on i386. I will
investigate...

> It also seems to be hanging somewhere in usercopy_test_invalid on my
> m68k/qemu setup:
> ./tools/testing/kunit/kunit.py run --build_dir=.kunit-m68k --arch m68k 
> usercopy

Oh, that's weird. I'll need to get an m68k environment set up...

> Otherwise, it looks fine. Maybe it'd make sense to split some of the
> tests up a bit more, but it's a matter of taste (and only really an
> advantage for debugging hangs where more detailed progress is nice).

Yeah. I can do this in follow-up patches, perhaps.

> With those architecture-specific hangs either fixed, or documented (if
> they're actual problems, not issues with the test), this is:
> 
> Reviewed-by: David Gow 

Thanks!

-- 
Kees Cook



Re: [PATCH 1/2] kunit: test: Add vm_mmap() allocation resource manager

2024-06-10 Thread Kees Cook
On Sat, Jun 08, 2024 at 04:44:16PM +0800, David Gow wrote:
> On Mon, 20 May 2024 at 03:12, Kees Cook  wrote:
> >
> > For tests that need to allocate using vm_mmap() (e.g. usercopy and
> > execve), provide the interface to have the allocation tracked by KUnit
> > itself. This requires bringing up a placeholder userspace mm.
> >
> > This combines my earlier attempt at this with Mark Rutland's version[1].
> >
> > Link: 
> > https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ 
> > [1]
> > Co-developed-by: Mark Rutland 
> > Signed-off-by: Mark Rutland 
> > Signed-off-by: Kees Cook 
> > ---
> 
> Thanks very much for this!
> 
> A few high-level thoughts:
> - Do we want to move this into a separate file? test.{c,h} is already
> getting pretty big, and this would probably fit more comfortably with
> some of the resource-management bits, which are in their own files.
> Not every test will need mm support.

I'm happy to do that -- I was just following where kunit_kmalloc() was
defined. I'll create a new file for it.

> - It'd be nice for there to be a way to explicitly teardown/reset
> this: I agree that this is made more awkward by KUnit cleanup normally
> running on a different thread, but I could definitely see why a test
> might want to unset/reset this, and it would be more consistent with
> other resources.

Yeah, it's weird, but it's naturally managed?

> Otherwise, I have a few small questions below, but nothing essential.
> There are a couple of test failures/hangs for the usercopy test (on
> i386 and m68k), which may have origins here: I've mentioned them
> there.

I'll look into this. I must have some 64/32 oversight...

> Reviewed-by: David Gow 

Thanks!

> > +/*
> > + * Arbitrarily chosen user address for the base allocation.
> > + */
> > +#define UBUF_ADDR_BASE SZ_2M
> 
> Are there any circumstances where we'd want a _different_ base address
> here? Could it conflict with something / could tests require something
> different?
> 
> I suspect it's fine to leave it like this until such a case actually shows up.

Yeah, it shouldn't be important, and as Mark has pointed out, it might
not be needed at all. I'll see what I can do.

> > +   vres = kunit_alloc_resource(test,
> > +   kunit_vm_mmap_init,
> > +   kunit_vm_mmap_free,
> > +   GFP_KERNEL,
> > +   &params);
> 
> It could be easier to use kunit_add_action() here, rather than
> kunit_alloc_resource(), as you wouldn't need the params struct to pass
> things through.
> 
> The advantage to keeping the separate resource is that we can more
> easily look it up later if we, for example, wanted to be able to make
> it current on other threads (is that something we'd ever want to do?).

I like having it follow the pattern of the other resource allocators,
but if there's not a strong reason to switch, I'll leave it as-is.

-- 
Kees Cook



Re: [PATCH 2/2] usercopy: Convert test_user_copy to KUnit test

2024-06-10 Thread Kees Cook
On Wed, May 29, 2024 at 01:17:35PM +0100, Ivan Orlov wrote:
> On 5/19/24 20:12, Kees Cook wrote:
> >   #define test(condition, msg, ...) \
> >   ({
> > \
> > int cond = (condition); \
> > if (cond)   \
> > -   pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \
> > +   KUNIT_EXPECT_FALSE_MSG(test, cond, msg, ##__VA_ARGS__); \
> > cond;   \
> >   })
> It looks like the 'test' macro is not used anymore, so probably it should be
> removed.

Oops, yes. Thanks!

> > +static int usercopy_test_init(struct kunit *test)
> > +{
> > +   struct usercopy_test_priv *priv;
> > +   unsigned long user_addr;
> > -   if (ret == 0) {
> > -   pr_info("tests passed.\n");
> > -   return 0;
> > -   }
> > +   priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL);
> > +   if (!priv)
> > +   return -ENOMEM;
> 
> Should the check be done with KUNIT_ASSERT_NOT_ERR_OR_NULL here as well, as
> it is done with priv->kmem?

Yes, that's much more idiomatic. I'll adjust this too.

> Other than that,
> 
> Tested-by: Ivan Orlov 

Thanks!

-- 
Kees Cook



Re: [PATCH 1/2] kunit: test: Add vm_mmap() allocation resource manager

2024-06-10 Thread Kees Cook
On Mon, May 20, 2024 at 10:29:06AM +0100, Mark Rutland wrote:
> On Sun, May 19, 2024 at 12:12:52PM -0700, Kees Cook wrote:
> > +/* Create and attach a new mm if it doesn't already exist. */
> > +static int kunit_attach_mm(void)
> > +{
> > +   struct vm_area_struct *vma;
> > +   struct mm_struct *mm;
> > +
> > +   if (current->mm)
> > +   return 0;
> 
> My tests deliberately created/destroyed the mm for each test; surely we
> don't want to inherit an MM in some arbitrary state? ... or is this just
> so the mm can be allocated lazily upon the first mmap() within a test?

It's for lazy creation and for supporting running the KUnit test as a
module (where a userspace would exist). The old usercopy test worked
against the existing userspace, so I'd want to continue to support that.

> 
> > +
> > +   mm = mm_alloc();
> > +   if (!mm)
> > +   return -ENOMEM;
> > +
> > +   if (mmap_write_lock_killable(mm))
> > +   goto out_free;
> > +
> > +   /* Define the task size. */
> > +   mm->task_size = TASK_SIZE;
> > +
> > +   /* Prepare the base VMA. */
> > +   vma = vm_area_alloc(mm);
> > +   if (!vma)
> > +   goto out_unlock;
> > +
> > +   vma_set_anonymous(vma);
> > +   vma->vm_start = UBUF_ADDR_BASE;
> > +   vma->vm_end = UBUF_ADDR_BASE + PAGE_SIZE;
> > +   vm_flags_init(vma, VM_READ | VM_MAYREAD | VM_WRITE | VM_MAYWRITE);
> > +   vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
> > +
> > +   if (insert_vm_struct(mm, vma))
> > +   goto out_free_vma;
> > +
> > +   mmap_write_unlock(mm);
> 
> Why do we need this VMA given you have kunit_vm_mmap()?

When I was originally testing this, it seemed like I couldn't perform a
vm_mmap() without an existing VMA.

> This existed in my uaccess tests because I didn't use vm_mmap(), and I
> wanted complete control over the addresses used.
> 
> Given you add kunit_vm_mmap(), I don't think we want this VMA -- it
> doesn't serve any real purpose to tests, and accesses can erroneously
> hit it, which is problematic.
> 
> UBUF_ADDR_BASE shouldn't be necessary either with kunit_vm_mmap(),
> unless you want to use fixed addresses. That was just arbitrarily chosen
> to be above NULL and the usual minimum mmap limit.

I'll recheck whether this is needed. I think I had to make some other
changes as well, so maybe something here ended up being redundant
without my noticing it the first time.

Thanks for looking this over!

-- 
Kees Cook



Re: [PATCH 2/2] can: mcp251xfd: decorate mcp251xfd_rx_ring.obj with __counted_by()

2024-06-10 Thread Kees Cook
On Sun, Jun 09, 2024 at 01:54:19PM +0900, Vincent Mailhol wrote:
> A new __counted_by() attribute was introduced in [1]. It makes the
> compiler's sanitizer aware of the actual size of a flexible array
> member, allowing for additional runtime checks.
> 
> Apply the __counted_by() attribute to the obj flexible array member of
> struct mcp251xfd_rx_ring.
> 
> Note that the mcp251xfd_rx_ring.obj member is polymorphic: it can be
> either of:
> 
>   * an array of struct mcp251xfd_hw_rx_obj_can
>   * an array of struct mcp251xfd_hw_rx_obj_canfd
> 
> The canfd type was chosen in the declaration by the original author to
> reflect the upper bound. We pursue the same logic here: the sanitizer
> will only see the accurate size of canfd frames. For classical can
> frames, it will see a size bigger than the reality, making the check
> incorrect but silent (false negative).
> 
> [1] commit dd06e72e68bc ("Compiler Attributes: Add __counted_by macro")
> Link: https://git.kernel.org/torvalds/c/dd06e72e68bc
> 
> CC: Kees Cook 
> Signed-off-by: Vincent Mailhol 
> ---
>  drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h 
> b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
> index 24510b3b8020..b7579fba9457 100644
> --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
> +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
> @@ -565,7 +565,7 @@ struct mcp251xfd_rx_ring {
>   union mcp251xfd_write_reg_buf uinc_buf;
>   union mcp251xfd_write_reg_buf uinc_irq_disable_buf;
>   struct spi_transfer uinc_xfer[MCP251XFD_FIFO_DEPTH];
> - struct mcp251xfd_hw_rx_obj_canfd obj[];
> + struct mcp251xfd_hw_rx_obj_canfd obj[] __counted_by(obj_num);
>  };
>  
>  struct __packed mcp251xfd_map_buf_nocrc {

This one seems safe:

rx_ring = kzalloc(sizeof(*rx_ring) + rx_obj_size * rx_obj_num,
  GFP_KERNEL);
...
rx_ring->obj_num = rx_obj_num;

But I would like to see the above allocation math replaced with
struct_size() usage:

rx_ring = kzalloc(struct_size(rx_ring, obj, rx_obj_num), 
GFP_KERNEL);

But that leaves me with a question about the use of rx_obj_size in the
original code. Why is this a variable size? rx_ring has only struct
mcp251xfd_hw_rx_obj_canfd objects in the flexible array...

I suspect that struct mcp251xfd_rx_ring needs to actually be using a
union for its flexible array, but I can't find anything that is casting
struct mcp251xfd_rx_ring::obj to struct mcp251xfd_hw_rx_obj_can.

I'm worried about __counted_by getting used here if the flexible array
isn't actually the right type.

-- 
Kees Cook



Re: [PATCH 1/2] can: peak_canfd: decorate pciefd_board.can with __counted_by()

2024-06-10 Thread Kees Cook
On Sun, Jun 09, 2024 at 01:54:18PM +0900, Vincent Mailhol wrote:
> A new __counted_by() attribute was introduced in [1]. It makes the
> compiler's sanitizer aware of the actual size of a flexible array
> member, allowing for additional runtime checks.
> 
> Move the end of line comments to the previous line to make room and
> apply the __counted_by() attribute to the can flexible array member of
> struct pciefd_board.
> 
> [1] commit dd06e72e68bc ("Compiler Attributes: Add __counted_by macro")
> Link: https://git.kernel.org/torvalds/c/dd06e72e68bc
> 
> CC: Kees Cook 
> Signed-off-by: Vincent Mailhol 
> ---
>  drivers/net/can/peak_canfd/peak_pciefd_main.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/can/peak_canfd/peak_pciefd_main.c 
> b/drivers/net/can/peak_canfd/peak_pciefd_main.c
> index 1df3c4b54f03..636102103a88 100644
> --- a/drivers/net/can/peak_canfd/peak_pciefd_main.c
> +++ b/drivers/net/can/peak_canfd/peak_pciefd_main.c
> @@ -190,8 +190,10 @@ struct pciefd_board {
>   void __iomem *reg_base;
>   struct pci_dev *pci_dev;
>   int can_count;
> - spinlock_t cmd_lock;/* 64-bits cmds must be atomic */
> - struct pciefd_can *can[];   /* array of network devices */
> + /* 64-bits cmds must be atomic */
> + spinlock_t cmd_lock;
> + /* array of network devices */
> + struct pciefd_can *can[] __counted_by(can_count);
>  };
>  
>  /* supported device ids. */

You'll need to adjust the code logic that manipulates "can_count", as
accesses to "can" will trap when they're seen as out of bounds. For
example:

pciefd = devm_kzalloc(&pdev->dev, struct_size(pciefd, can, can_count),
  GFP_KERNEL);
...
/* pciefd->can_count is "0" now */


while (pciefd->can_count < can_count) {
...
pciefd_can_probe(pciefd);
/* which does: */
pciefd->can[pciefd->can_count] = priv;  /// HERE
...
pciefd->can_count++;
}

The access at "HERE" above will trap: "can" is believed to have
"can_count" elements (0 on the first time through the loop).
This needs to be adjusted to increment "can_count" first.

Perhaps:


diff --git a/drivers/net/can/peak_canfd/peak_pciefd_main.c 
b/drivers/net/can/peak_canfd/peak_pciefd_main.c
index 1df3c4b54f03..df8304b2d291 100644
--- a/drivers/net/can/peak_canfd/peak_pciefd_main.c
+++ b/drivers/net/can/peak_canfd/peak_pciefd_main.c
@@ -676,7 +676,8 @@ static int pciefd_can_probe(struct pciefd_board *pciefd)
	spin_lock_init(&priv->tx_lock);
 
/* save the object address in the board structure */
-   pciefd->can[pciefd->can_count] = priv;
+   pciefd->can_count++;
+   pciefd->can[pciefd->can_count - 1] = priv;
 
	dev_info(&pciefd->pci_dev->dev, "%s at reg_base=0x%p irq=%d\n",
 ndev->name, priv->reg_base, ndev->irq);
@@ -800,8 +801,6 @@ static int peak_pciefd_probe(struct pci_dev *pdev,
err = pciefd_can_probe(pciefd);
if (err)
goto err_free_canfd;
-
-   pciefd->can_count++;
}
 
/* set system timestamps counter in RST mode */


-- 
Kees Cook



[PATCH] kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()

2024-06-10 Thread Kees Cook
When a flexible array structure has a __counted_by annotation, its use
with DEFINE_RAW_FLEX() will result in the count being zero-initialized.
This is expected since one doesn't want to use RAW with a counted_by
struct. Adjust the tests to check for the condition and for compiler
support.

Reported-by: Christian Schrefl 
Closes: 
https://lore.kernel.org/all/0bfc6b38-8bc5-4971-b6fb-dc642a73f...@gmail.com/
Suggested-by: Nathan Chancellor 
Signed-off-by: Kees Cook 
---
Cc: "Gustavo A. R. Silva" 
Cc: linux-hardening@vger.kernel.org
---
 lib/overflow_kunit.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/lib/overflow_kunit.c b/lib/overflow_kunit.c
index 4ef31b0bb74d..d305b0c054bb 100644
--- a/lib/overflow_kunit.c
+++ b/lib/overflow_kunit.c
@@ -1178,14 +1178,28 @@ struct foo {
s16 array[] __counted_by(counter);
 };
 
+struct bar {
+   int a;
+   u32 counter;
+   s16 array[];
+};
+
 static void DEFINE_FLEX_test(struct kunit *test)
 {
-   DEFINE_RAW_FLEX(struct foo, two, array, 2);
+   /* Using _RAW_ on a __counted_by struct will initialize "counter" to 
zero */
+   DEFINE_RAW_FLEX(struct foo, two_but_zero, array, 2);
+#if __has_attribute(__counted_by__)
+   int expected_raw_size = sizeof(struct foo);
+#else
+   int expected_raw_size = sizeof(struct foo) + 2 * sizeof(s16);
+#endif
+   /* Without annotation, it will always be on-stack size. */
+   DEFINE_RAW_FLEX(struct bar, two, array, 2);
DEFINE_FLEX(struct foo, eight, array, counter, 8);
DEFINE_FLEX(struct foo, empty, array, counter, 0);
 
-   KUNIT_EXPECT_EQ(test, __struct_size(two),
-   sizeof(struct foo) + sizeof(s16) + sizeof(s16));
+   KUNIT_EXPECT_EQ(test, __struct_size(two_but_zero), expected_raw_size);
+   KUNIT_EXPECT_EQ(test, __struct_size(two), sizeof(struct bar) + 2 * 
sizeof(s16));
KUNIT_EXPECT_EQ(test, __struct_size(eight), 24);
KUNIT_EXPECT_EQ(test, __struct_size(empty), sizeof(struct foo));
 }
-- 
2.34.1




Re: [PATCH v4 0/3] Hardening perf subsystem

2024-06-10 Thread Kees Cook
On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote:
> Hi everyone,
> 
> This is an effort to get rid of all multiplications from allocation
> functions in order to prevent integer overflows [1][2].

I didn't actually see these 3 patches in this thread nor via lore.

> In the first patch, the "struct amd_uncore_ctx" can be refactored to
> use a flex array for the "events" member. This way, the allocation/
> freeing of the memory can be simplified. Then, the struct_size()
> helper can be used to do the arithmetic calculation for the memory
> to be allocated.

I like this patch because it reduces the allocation from 2 to 1. This
isn't what Peter might see as "churn": this is an improvement in resource
utilization.

I think it's here:
https://lore.kernel.org/linux-hardening/as8pr02mb7237e4848b44a5226bd3551c8b...@as8pr02mb7237.eurprd02.prod.outlook.com/

> In the second patch, as the "struct intel_uncore_box" ends in a
> flexible array, the preferred way in the kernel is to use the
> struct_size() helper to do the arithmetic instead of the calculation
> "size + count * size" in the kzalloc_node() function.

This is
https://lore.kernel.org/linux-hardening/as8pr02mb7237f4d39bf6aa0ff40e91638b...@as8pr02mb7237.eurprd02.prod.outlook.com/

I prefer this style, as it makes things unambiguous ("this will never
wrap around") without having to check the associated types and doesn't make
the resulting binary code different in the "can never overflow" case.

In this particular case:

int size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg);

"int numshared" comes from struct intel_uncore_type::num_shared_regs,
which is:

unsigned num_shared_regs:8;

And the struct sizes are:

$ pahole -C intel_uncore_box gcc-boot/vmlinux | grep size:
/* size: 488, cachelines: 8, members: 19 */
$ pahole -C intel_uncore_extra_reg gcc-boot/vmlinux | grep size:
/* size: 96, cachelines: 2, members: 5 */

So we have:

s32 size = 488 + u8 * 96

Max size here is 24968 so it can never overflow an s32, so I can see
why Peter views this as "churn".

I still think the patch is a coding style improvement, but okay.

> In the third patch, as the "struct perf_buffer" also ends in a
> flexible array, the preferred way in the kernel is to use the
> struct_size() helper to do the arithmetic instead of the calculation
> "size + count * size" in the kzalloc_node() functions. At the same
> time, prepare for the coming implementation by GCC and Clang of the
> __counted_by attribute.

This is
https://lore.kernel.org/linux-hardening/as8pr02mb72379b4807f3951a1b926ba58b...@as8pr02mb7237.eurprd02.prod.outlook.com/

This provides __counted_by coverage, and I think this is important to
gain in every place we can. Given that this is part of a ring buffer
implementation that is arbitrarily sized, this is exactly the kind of
place I'd like to see __counted_by used. This is a runtime robustness
improvement, so I don't see this as "churn" at all.


Peter, for patches 1 and 3, if you'd prefer not to carry them, I could
put them in the hardening tree to keep them out of your way. It seems
clear you don't want patch 2 at all.

-Kees

-- 
Kees Cook



Re: [PATCH v2] selftests: seccomp: fix format-zero-length warnings

2024-06-10 Thread Kees Cook
On Fri, Jun 07, 2024 at 02:58:47PM -0600, Shuah Khan wrote:
> On 6/7/24 06:41, Amer Al Shanawany wrote:
> > fix the following errors by removing empty print statements:
> > seccomp_benchmark.c:197:24: warning: zero-length gnu_printf format
> >   string [-Wformat-zero-length]
> >197 | ksft_print_msg("");
> >|^~
> > seccomp_benchmark.c:202:24: warning: zero-length gnu_printf format
> >   string [-Wformat-zero-length]
> >202 | ksft_print_msg("");
> >|^~
> > seccomp_benchmark.c:204:24: warning: zero-length gnu_printf format
> >   string [-Wformat-zero-length]
> >204 | ksft_print_msg("");
> >|^~
> > 
> > Reported-by: kernel test robot 
> > Closes: 
> > https://lore.kernel.org/oe-kbuild-all/202312260235.uj5ug8k9-...@intel.com/
> > Signed-off-by: Amer Al Shanawany 
> > ---
> > Changes v1 -> v2:
> > removed empty print statements
> 
> Kees,
> 
> Is this change okay with you. I didn't see any use for
> these empty ksft_print_msg().
> 
> I will take this patch if you are okay with the change.

Dropping these means that the "#" marks go missing. Currently:

# Running on:
# Linux proton 6.5.0-25-generic #25~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Feb 
20 16:09:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

with the proposed patch:

# Running on:
Linux proton 6.5.0-25-generic #25~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Feb 20 
16:09:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

This breaks the TAP syntax for the test, so we should find a different
solution.

Perhaps:

ksft_print_msg("%s", "");

?

-Kees


> 
> > ---
> >   tools/testing/selftests/seccomp/seccomp_benchmark.c | 3 ---
> >   1 file changed, 3 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/seccomp/seccomp_benchmark.c 
> > b/tools/testing/selftests/seccomp/seccomp_benchmark.c
> > index b83099160fbc..6fe34be6c693 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_benchmark.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_benchmark.c
> > @@ -194,14 +194,11 @@ int main(int argc, char *argv[])
> > ksft_set_plan(7);
> > ksft_print_msg("Running on:\n");
> > -   ksft_print_msg("");
> > system("uname -a");
> > ksft_print_msg("Current BPF sysctl settings:\n");
> > /* Avoid using "sysctl" which may not be installed. */
> > -   ksft_print_msg("");
> > system("grep -H . /proc/sys/net/core/bpf_jit_enable");
> > -   ksft_print_msg("");
> > system("grep -H . /proc/sys/net/core/bpf_jit_harden");
> > affinity();
> 
> 
> thanks,
> -- Shuah

-- 
Kees Cook



Re: [PATCH v2 0/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up

2024-06-10 Thread Kees Cook
On Fri, Jun 07, 2024 at 04:54:41PM -0300, Guilherme G. Piccoli wrote:
> On 06/06/2024 12:01, Steven Rostedt wrote:
> > Reserve unspecified location of physical memory from kernel command line
> > [...]
> > Solution:
> > 
> > The solution I have come up with is to introduce a new "reserve_mem=" kernel
> > command line. This parameter takes the following format:
> > 
> >   reserve_mem=nn:align:label
> > 
> > Where nn is the size of memory to reserve, the align is the alignment of
> > that memory, and label is the way for other sub-systems to find that memory.
> > This way the kernel command line could have:
> > 
> >   reserve_mem=12M:4096:oops   ramoops.mem_name=oops
> > 
> > At boot up, the kernel will search for 12 megabytes in usable memory regions
> > with an alignment of 4096. It will start at the highest regions and work its
> > way down (for those old devices that want access to lower address DMA). When
> > it finds a region, it will save it off in a small table and mark it with the
> > "oops" label. Then the pstore ramoops sub-system could ask for that memory
> > and location, and it will map itself there.
> > 
> > This prototype allows for 8 different mappings (which may be overkill, 4 is
> > probably plenty) with 16 byte size to store the label.
> > 
> > I have tested this and it works for us to solve the above problem. We can
> > update the kernel and command line and increase the size of pstore without
> > needing to update the firmware, or knowing every memory layout of each
> > board. I only tested this locally, it has not been tested in the field.
> > 
> 
> Hi Steve, first of all, thanks for this work! This is much appreciated.
> The kdumpst tooling (Arch Linux) makes use of pstore when available, and
> the recommendation so far was to reserve memory somehow, like "mem=" or
> use kdump instead, if no free RAM area was available.
> 
> With your solution, things get way more "elegant". Also, I think we all
> know pstore is not 100% reliable, specially the RAM backend due to
> already mentioned reasons (like FW memory retraining, ECC memory, etc),
> but it's great we have a mechanism to **try it**. If it works, awesome -
> for statistical analysis, this is very useful; pstore has been used with
> success in the Steam Deck, for example.
> 
> With all that said, I've tested your patches on top of 6.10-rc2 in 2
> qemu VMs (one running legacy BIOS - seabios - and the other UEFI - using
> ovmf) and on Steam Deck, and it's working flawlessly. I've tested only
> using ramoops as module.
> 
> Some code review in the patches themselves (like a missing
> EXPORT_SYMBOL_GPL), but all in all, that's a great addition! Feel free
> to add my:
> 
> Tested-by: Guilherme G. Piccoli 

Yeah, I think this looks good as long as it's understood to be a "best
effort", and will radically simplify doing qemu testing, etc. I expect I
can take v3 into -next with the fixes Guilherme noted.

-Kees

-- 
Kees Cook



Re: [PATCH] mm/util: Swap kmemdup_array() arguments

2024-06-06 Thread Kees Cook
On Thu, Jun 06, 2024 at 08:48:37PM +0300, Andy Shevchenko wrote:
> On Thu, Jun 6, 2024 at 8:46 PM Kees Cook  wrote:
> >
> > On Thu, Jun 06, 2024 at 08:35:13PM +0300, Andy Shevchenko wrote:
> > > On Thu, Jun 6, 2024 at 6:56 PM Kees Cook  wrote:
> > > > On Thu, 06 Jun 2024 15:46:09 +0100, Jean-Philippe Brucker wrote:
> > >
> > > [...]
> > >
> > > > Applied to for-next/hardening, thanks!
> > >
> > > Btw, is it possible to get this for v6.10, so we may start enabling it
> > > for others?
> >
> > Which others do you mean?
> 
> There are a lot of users of kmemdup(x*y) which I want to convert
> sooner than later to kmemdup_array(x,y).

Ah-ha, I see what you mean. Well, I'm not sure we can do v6.10 for this
because rc2 is behind us, and that's what most subsystems merge to. I
can land the patch for rc3 so there will be no warnings in Linus's
tree/-next, but conversions in subsystem trees will gain warnings, I
think...

-- 
Kees Cook



Re: [PATCH] mm/util: Swap kmemdup_array() arguments

2024-06-06 Thread Kees Cook
On Thu, Jun 06, 2024 at 08:35:13PM +0300, Andy Shevchenko wrote:
> On Thu, Jun 6, 2024 at 6:56 PM Kees Cook  wrote:
> > On Thu, 06 Jun 2024 15:46:09 +0100, Jean-Philippe Brucker wrote:
> 
> [...]
> 
> > Applied to for-next/hardening, thanks!
> 
> Btw, is it possible to get this for v6.10, so we may start enabling it
> for others?

Which others do you mean?

-- 
Kees Cook



Re: [PATCH v2] pstore/ram: Replace of_node_put with __free() for automatic cleanup

2024-06-06 Thread Kees Cook
On Wed, Jun 05, 2024 at 09:49:44PM +, Abhinav Jain wrote:
> Add __free(device_node) to the parent_node struct declaration
> Add changes to incorporate the review comments from v1
> 
> Signed-off-by: Abhinav Jain 
> 
> PATCH v1 link:
> https://lore.kernel.org/all/20240415161409.8375-1-jain.abhinav...@gmail.com/
> 
> Changes since v1:
>  - Moved the variable definition back to the top of the function body

The history and version links should be below the "---" line.

But more importantly, please base your patch on the upstream tree,
rather than on your v1 patch. :) (i.e. squash your v1 and v2 together).

-Kees

> ---
>  fs/pstore/ram.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
> index 14f2f4864e48..f8258e4567c3 100644
> --- a/fs/pstore/ram.c
> +++ b/fs/pstore/ram.c
> @@ -644,6 +644,7 @@ static int ramoops_parse_dt(struct platform_device *pdev,
>   struct ramoops_platform_data *pdata)
>  {
>   struct device_node *of_node = pdev->dev.of_node;
> + struct device_node *parent_node __free(device_node) = 
> of_node_parent(of_node);
>   struct resource *res;
>   u32 value;
>   int ret;
> @@ -703,7 +704,6 @@ static int ramoops_parse_dt(struct platform_device *pdev,
>* we're not a child of "reserved-memory" and mimicking the
>* expected behavior.
>*/
> - struct device_node *parent_node __free(device_node) = 
> of_node_parent(of_node);
>   if (!of_node_name_eq(parent_node, "reserved-memory") &&
>   !pdata->console_size && !pdata->ftrace_size &&
>   !pdata->pmsg_size && !pdata->ecc_info.ecc_size) {
> -- 
> 2.34.1
> 

-- 
Kees Cook



Re: [PATCH] mm/util: Swap kmemdup_array() arguments

2024-06-06 Thread Kees Cook
On Thu, 06 Jun 2024 15:46:09 +0100, Jean-Philippe Brucker wrote:
> GCC 14.1 complains about the argument usage of kmemdup_array():
> 
>   drivers/soc/tegra/fuse/fuse-tegra.c:130:65: error: 'kmemdup_array' sizes 
> specified with 'sizeof' in the earlier argument and not in the later argument 
> [-Werror=calloc-transposed-args]
> 130 | fuse->lookups = kmemdup_array(fuse->soc->lookups, 
> sizeof(*fuse->lookups),
> | ^
>   drivers/soc/tegra/fuse/fuse-tegra.c:130:65: note: earlier argument should 
> specify number of elements, later size of each element
> 
> [...]

Applied to for-next/hardening, thanks!

[1/1] mm/util: Swap kmemdup_array() arguments
  https://git.kernel.org/kees/c/0ee14725471c

Take care,

-- 
Kees Cook




Re: [PATCH v4 4/6] mm/slab: Introduce kmem_buckets_create() and family

2024-06-04 Thread Kees Cook
On Tue, Jun 04, 2024 at 04:13:32PM -0600, Tycho Andersen wrote:
> On Tue, Jun 04, 2024 at 04:02:28PM +0100, Simon Horman wrote:
> > On Fri, May 31, 2024 at 12:14:56PM -0700, Kees Cook wrote:
> > > + for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) {
> > > + char *short_size, *cache_name;
> > > + unsigned int cache_useroffset, cache_usersize;
> > > + unsigned int size;
> > > +
> > > + if (!kmalloc_caches[KMALLOC_NORMAL][idx])
> > > + continue;
> > > +
> > > + size = kmalloc_caches[KMALLOC_NORMAL][idx]->object_size;
> > > + if (!size)
> > > + continue;
> > > +
> > > + short_size = strchr(kmalloc_caches[KMALLOC_NORMAL][idx]->name, 
> > > '-');
> > > + if (WARN_ON(!short_size))
> > > + goto fail;
> > > +
> > > + cache_name = kasprintf(GFP_KERNEL, "%s-%s", name, short_size + 
> > > 1);
> > > + if (WARN_ON(!cache_name))
> > > + goto fail;
> > > +
> > > + if (useroffset >= size) {
> > > + cache_useroffset = 0;
> > > + cache_usersize = 0;
> > > + } else {
> > > + cache_useroffset = useroffset;
> > > + cache_usersize = min(size - cache_useroffset, usersize);
> > > + }
> > > + (*b)[idx] = kmem_cache_create_usercopy(cache_name, size,
> > > + align, flags, cache_useroffset,
> > > + cache_usersize, ctor);
> > > + kfree(cache_name);
> > > + if (WARN_ON(!(*b)[idx]))
> > > + goto fail;
> > > + }
> > > +
> > > + return b;
> > > +
> > > +fail:
> > > + for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) {
> > > + if ((*b)[idx])
> > > + kmem_cache_destroy((*b)[idx]);
> > 
> > nit: I don't think it is necessary to guard this with a check for NULL.
> 
> Isn't it? What if a kasprintf() fails halfway through the loop?

He means that kmem_cache_destroy() already checks for NULL. Quite right!

void kmem_cache_destroy(struct kmem_cache *s)
{
int err = -EBUSY;
bool rcu_set;

if (unlikely(!s) || !kasan_check_byte(s))
return;


-- 
Kees Cook



Re: [PATCH] HID: usbhid: fix recurrent out-of-bounds bug in usbhid_parse()

2024-06-04 Thread Kees Cook
On Tue, Jun 04, 2024 at 10:09:43AM -0700, Nikita Zhandarovich wrote:
> Hi,
> 
> On 6/4/24 07:15, Jiri Kosina wrote:
> > On Tue, 4 Jun 2024, Kees Cook wrote:
> > 
> >> This isn't the right solution. The problem is that hid_class_descriptor 
> >> is a flexible array but was sized as a single element fake flexible 
> >> array:
> >>
> >> struct hid_descriptor {
> >>   __u8  bLength;
> >>   __u8  bDescriptorType;
> >>   __le16 bcdHID;
> >>   __u8  bCountryCode;
> >>   __u8  bNumDescriptors;
> >>
> >>   struct hid_class_descriptor desc[1];
> >> } __attribute__ ((packed));
> >>
> >> This likely needs to be: 
> >>
> >> struct hid_class_descriptor desc[] __counted_by(bNumDescriptors);
> >>
> >> And then check for any sizeof() uses of the struct that might have changed.
> > 
> > Ah, you are of course right, not sure what I was thinking. Thanks a lot 
> > for catching my brainfart.
> > 
> > I am dropping the patch for now; Nikita, will you please send a refreshed 
> > one?
> > 
> 
> Thanks for catching my mistake.
> 
> I'll gladly send a revised version, hoping to do it very soon.

I spent a little more time looking at this, and I'm not sure I
understand where the actual space for the descriptors comes from?
There's interface->extra that is being parsed, and effectively
hid_descriptor is being mapped into it, but it uses "sizeof(struct
hid_descriptor)" for the limit. Is more than 1 descriptor expected to
work correctly? Or is the limit being ignored? I'm a bit confused by
this code...

-- 
Kees Cook



Re: [PATCH] HID: usbhid: fix recurrent out-of-bounds bug in usbhid_parse()

2024-06-04 Thread Kees Cook



On June 4, 2024 1:15:35 AM PDT, Jiri Kosina  wrote:
>On Fri, 24 May 2024, Nikita Zhandarovich wrote:
>
>> Syzbot reports [1] a reemerging out-of-bounds bug regarding hid
>> descriptors possibly having incorrect bNumDescriptors values in
>> usbhid_parse().
>> 
>> Build on the previous fix in "HID: usbhid: fix out-of-bounds bug"
>> and run a sanity-check ensuring that number of descriptors doesn't
>> exceed the size of desc[] in struct hid_descriptor.
>> 
>> [1] Syzbot report:
>> Link: https://syzkaller.appspot.com/bug?extid=c52569baf0c843f35495
>> 
>> UBSAN: array-index-out-of-bounds in drivers/hid/usbhid/hid-core.c:1024:7
>> index 1 is out of range for type 'struct hid_class_descriptor[1]'
>> CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted 
>> 6.9.0-rc6-syzkaller-00290-gb9158815de52 #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
>> Google 03/27/2024
>> Workqueue: usb_hub_wq hub_event
>> Call Trace:
>>  
>>  __dump_stack lib/dump_stack.c:88 [inline]
>>  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
>>  ubsan_epilogue lib/ubsan.c:231 [inline]
>>  __ubsan_handle_out_of_bounds+0x121/0x150 lib/ubsan.c:429
>>  usbhid_parse+0x5a7/0xc80 drivers/hid/usbhid/hid-core.c:1024
>>  hid_add_device+0x132/0x520 drivers/hid/hid-core.c:2790
>>  usbhid_probe+0xb38/0xea0 drivers/hid/usbhid/hid-core.c:1429
>>  usb_probe_interface+0x645/0xbb0 drivers/usb/core/driver.c:399
>>  really_probe+0x2b8/0xad0 drivers/base/dd.c:656
>>  __driver_probe_device+0x1a2/0x390 drivers/base/dd.c:798
>>  driver_probe_device+0x50/0x430 drivers/base/dd.c:828
>>  __device_attach_driver+0x2d6/0x530 drivers/base/dd.c:956
>>  bus_for_each_drv+0x24e/0x2e0 drivers/base/bus.c:457
>>  __device_attach+0x333/0x520 drivers/base/dd.c:1028
>>  bus_probe_device+0x189/0x260 drivers/base/bus.c:532
>>  device_add+0x8ff/0xca0 drivers/base/core.c:3720
>>  usb_set_configuration+0x1976/0x1fb0 drivers/usb/core/message.c:2210
>>  usb_generic_driver_probe+0x88/0x140 drivers/usb/core/generic.c:254
>>  usb_probe_device+0x1b8/0x380 drivers/usb/core/driver.c:294
>> 
>> Reported-and-tested-by: syzbot+c52569baf0c843f35...@syzkaller.appspotmail.com
>> Fixes: f043bfc98c19 ("HID: usbhid: fix out-of-bounds bug")
>> Signed-off-by: Nikita Zhandarovich 
>
>Applied, thanks.

This isn't the right solution. The problem is that hid_class_descriptor is a 
flexible array but was sized as a single element fake flexible array: 

struct hid_descriptor {
   __u8  bLength;
   __u8  bDescriptorType;
   __le16 bcdHID;
   __u8  bCountryCode;
       __u8  bNumDescriptors;

   struct hid_class_descriptor desc[1];
} __attribute__ ((packed));

This likely needs to be: 

struct hid_class_descriptor desc[] __counted_by(bNumDescriptors);

And then check for any sizeof() uses of the struct that might have changed.



-- 
Kees Cook



Re: [PATCH 1/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up

2024-06-03 Thread Kees Cook



On June 3, 2024 4:33:31 PM PDT, Steven Rostedt  wrote:
>From: "Steven Rostedt (Google)" 
>
>In order to allow for requesting a memory region that can be used for
>things like pstore on multiple machines where the memory layout is not the
>same, add a new option to the kernel command line called "reserve_mem".
>
>The format is:  reserve_mem=nn:align:name
>
>Where it will find nn amount of memory at the given alignment of align.
>The name field is to allow another subsystem to retrieve where the memory
>was found. For example:
>
>  reserve_mem=12M:4096:oops ramoops.mem_name=oops

How does this interact with KASLR? It has chosen its physical location before 
this parsing happens, so I'd expect this to fail once in a while, unless the 
size/alignment is lucky enough that KASLR never uses that portion of the 
physical memory...

-Kees

>
>Where ramoops.mem_name will tell ramoops that memory was reserved for it
>via the reserve_mem option and it can find it by calling:
>
>  if (reserve_mem_find_by_name("oops", &start, &size)) {
>   // start holds the start address and size holds the size given
>
>Link: https://lore.kernel.org/all/zjjvnzux3nzig...@kernel.org/
>
>Suggested-by: Mike Rapoport 
>Signed-off-by: Steven Rostedt (Google) 
>---
> include/linux/mm.h |  2 +
> mm/memblock.c  | 97 ++
> 2 files changed, 99 insertions(+)
>
>diff --git a/include/linux/mm.h b/include/linux/mm.h
>index 9849dfda44d4..b4455cc02f2c 100644
>--- a/include/linux/mm.h
>+++ b/include/linux/mm.h
>@@ -4263,4 +4263,6 @@ static inline bool pfn_is_unaccepted_memory(unsigned 
>long pfn)
> void vma_pgtable_walk_begin(struct vm_area_struct *vma);
> void vma_pgtable_walk_end(struct vm_area_struct *vma);
> 
>+int reserve_mem_find_by_name(const char *name, unsigned long *start, unsigned 
>long *size);
>+
> #endif /* _LINUX_MM_H */
>diff --git a/mm/memblock.c b/mm/memblock.c
>index d09136e040d3..a8bf0ee9e2b4 100644
>--- a/mm/memblock.c
>+++ b/mm/memblock.c
>@@ -2244,6 +2244,103 @@ void __init memblock_free_all(void)
>   totalram_pages_add(pages);
> }
> 
>+/* Keep a table to reserve named memory */
>+#define RESERVE_MEM_MAX_ENTRIES   8
>+#define RESERVE_MEM_NAME_SIZE 16
>+struct reserve_mem_table {
>+  charname[RESERVE_MEM_NAME_SIZE];
>+  unsigned long   start;
>+  unsigned long   size;
>+};
>+static struct reserve_mem_table reserved_mem_table[RESERVE_MEM_MAX_ENTRIES];
>+static int reserved_mem_count;
>+
>+/* Add wildcard region with a lookup name */
>+static int __init reserved_mem_add(unsigned long start, unsigned long size,
>+ const char *name)
>+{
>+  struct reserve_mem_table *map;
>+
>+  if (!name || !name[0] || strlen(name) >= RESERVE_MEM_NAME_SIZE)
>+  return -EINVAL;
>+
>+  if (reserved_mem_count >= RESERVE_MEM_MAX_ENTRIES)
>+  return -1;
>+
>+  map = &reserved_mem_table[reserved_mem_count++];
>+  map->start = start;
>+  map->size = size;
>+  strscpy(map->name, name);
>+  return 0;
>+}
>+
>+/**
>+ * reserve_mem_find_by_name - Find reserved memory region with a given name
>+ * @name: The name that is attached to a reserved memory region
>+ * @start: If found, holds the start address
>+ * @size: If found, holds the size of the address.
>+ *
>+ * Returns: 1 if found or 0 if not found.
>+ */
>+int reserve_mem_find_by_name(const char *name, unsigned long *start, unsigned 
>long *size)
>+{
>+  struct reserve_mem_table *map;
>+  int i;
>+
>+  for (i = 0; i < reserved_mem_count; i++) {
>+  map = &reserved_mem_table[i];
>+  if (!map->size)
>+  continue;
>+  if (strcmp(name, map->name) == 0) {
>+  *start = map->start;
>+  *size = map->size;
>+  return 1;
>+  }
>+  }
>+  return 0;
>+}
>+
>+/*
>+ * Parse early_reserve_mem=nn:align:name
>+ */
>+static int __init reserve_mem(char *p)
>+{
>+  phys_addr_t start, size, align;
>+  char *oldp;
>+  int err;
>+
>+  if (!p)
>+  return -EINVAL;
>+
>+  oldp = p;
>+  size = memparse(p, &p);
>+  if (p == oldp)
>+  return -EINVAL;
>+
>+  if (*p != ':')
>+  return -EINVAL;
>+
>+  align = memparse(p+1, &p);
>+  if (*p != ':')
>+  return -EINVAL;
>+
>+  start = memblock_phys_alloc(size, align);
>+  if (!start)
>+  return -ENOMEM;
>+
>+  p++;
>+  err = reserved_mem_add(start, size, p);
>+  if (err) {
>+  memblock_phys_free(start, size);
>+  return err;
>+  }
>+
>+  p += strlen(p);
>+
>+  return *p == '\0' ? 0: -EINVAL;
>+}
>+__setup("reserve_mem=", reserve_mem);
>+
> #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_ARCH_KEEP_MEMBLOCK)
> static const char * const flagname[] = {
>   [ilog2(MEMBLOCK_HOTPLUG)] = "HOTPLUG",

-- 
Kees Cook



Re: [PATCH v4 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()

2024-06-03 Thread Kees Cook
On Mon, Jun 03, 2024 at 07:06:15PM +0200, Vlastimil Babka wrote:
> On 5/31/24 9:14 PM, Kees Cook wrote:
> > Introduce CONFIG_SLAB_BUCKETS which provides the infrastructure to
> > support separated kmalloc buckets (in the follow kmem_buckets_create()
> > patches and future codetag-based separation). Since this will provide
> > a mitigation for a very common case of exploits, enable it by default.
> 
> Are you sure? I thought there was a policy that nobody is special enough
> to have stuff enabled by default. Is it worth risking Linus shouting? :)

I think it's important to have this enabled given how common the
exploitation methodology is and how cheap this solution is. Regardless,
if you want it "default n", I can change it.

> I found this too verbose and tried a different approach, in the end rewrote
> everything to verify the idea works. So I'll just link to the result in git:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-buckets-v4-rewrite
> 
> It's also rebased on slab.git:slab/for-6.11/cleanups with some alloc_hooks()
> cleanups that would cause conflicts otherwkse.
> 
> But the crux of that approach is:
> 
> /*
>  * These macros allow declaring a kmem_buckets * parameter alongside size, 
> which
>  * can be compiled out with CONFIG_SLAB_BUCKETS=n so that a large number of 
> call
>  * sites don't have to pass NULL.
>  */
> #ifdef CONFIG_SLAB_BUCKETS
> #define DECL_BUCKET_PARAMS(_size, _b)   size_t (_size), kmem_buckets *(_b)
> #define PASS_BUCKET_PARAMS(_size, _b)   (_size), (_b)
> #define PASS_BUCKET_PARAM(_b)   (_b)
> #else
> #define DECL_BUCKET_PARAMS(_size, _b)   size_t (_size)
> #define PASS_BUCKET_PARAMS(_size, _b)   (_size)
> #define PASS_BUCKET_PARAM(_b)   NULL
> #endif
> 
> Then we have declaration e.g.
> 
> void *__kmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int 
> node)
> __assume_kmalloc_alignment __alloc_size(1);
> 
> and the function is called like (from code not using buckets)
> return __kmalloc_node_noprof(PASS_BUCKET_PARAMS(size, NULL), flags, node);
> 
> or (from code using buckets)
> #define kmem_buckets_alloc(_b, _size, _flags)   \
> alloc_hooks(__kmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, _b), 
> _flags, NUMA_NO_NODE))
> 
> And implementation looks like:
> 
> void *__kmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int 
> node)
> {
> return __do_kmalloc_node(size, PASS_BUCKET_PARAM(b), flags, node, 
> _RET_IP_);
> }
> 
> The size param is always the first, so the __alloc_size(1) doesn't need 
> tweaking.
> size is also used in the macros even if it's never mangled, because it's easy
> to pass one param instead of two, but not zero params instead of one, if we 
> want
> the ending comma not be part of the macro (which would look awkward).
> 
> Does it look ok to you? Of course names of the macros could be tweaked. 
> Anyway feel
> free to use the branch for the followup. Hopefully this way is also 
> compatible with
> the planned codetag based followup.

This looks really nice, thank you! This is well aligned with the codetag
followup, which also needs to have "size" be very easy to find (so the
macros can check for compile-time-constant or not).

I will go work from your branch...

Thanks!

-Kees

-- 
Kees Cook



Re: [GIT PULL] bcachefs updates fro 6.10-rc1

2024-06-01 Thread Kees Cook
On Sat, Jun 01, 2024 at 12:33:31PM +0100, Mark Brown wrote:
> On Mon, May 20, 2024 at 12:10:31PM -0400, James Bottomley wrote:
> > On Sun, 2024-05-19 at 23:52 -0400, Kent Overstreet wrote:
> 
> > > I also do (try to) post patches to the list that are doing something
> > > interesting and worth discussion; the vast majority this cycle has
> > > been boring syzbot crap...
> 
> > you still don't say what problem not posting most patches solves?  You
> > imply it would slow you down, but getting git-send-email to post to a
> > mailing list can actually be automated through a pre-push commit hook
> > with no slowdown in the awesome rate at which you apply patches to your
> > own tree.
> 
> > Linux kernel process exists because it's been found to work over time.
> > That's not to say it can't be changed, but it usually requires at least
> > some stab at a reason before that happens.
> 
> Even if no meaningful review ever happens on the actual posts there's
> still utility in having the patches on a list and findable in lore,
> since everything is normally on the list people end up with workflows
> that assume that they'll be able to find things there.  For example it's
> common for test people who identify which patch introduces an issue to
> grab the patch from lore in order to review any discussion of the patch,
> then report by replying to the patch to help with context for their
> report and get some help with figuring out a CC list.  Posting costs
> very little and makes people's lives easier.

Exactly. This is the standard workflow that everyone depends on.

So, for example, for my -next trees, I only ever add patches to them via
"b4 am lore-url-here...".

If I've got patches to add to -next from some devel tree, I don't
cherry-pick them to my -next tree: I send them to lore, and then pull
them back down.

But the point is: send your stuff to lore. :)

-- 
Kees Cook



Re: [PATCH v2] x86/traps: Enable UBSAN traps on x86

2024-06-01 Thread Kees Cook
On Sat, Jun 01, 2024 at 03:10:05AM +, Gatlin Newhouse wrote:
> +void handle_ubsan_failure(struct pt_regs *regs, int insn)
> +{
> + u32 type = 0;
> +
> + if (insn == INSN_ASOP) {
> + type = (*(u16 *)(regs->ip + LEN_ASOP + LEN_UD1));
> + if ((type & 0xFF) == 0x40)
> + type = (type >> 8) & 0xFF;
> + } else {
> + type = (*(u16 *)(regs->ip + LEN_UD1));
> + if ((type & 0xFF) == 0x40)
> + type = (type >> 8) & 0xFF;
> + }

The if/else code is repeated, but the only difference is the offset to
read from. Also, if the 0x40 is absent, we likely don't want to report
anything. So, perhaps:

u16 offset = LEN_UD1;
u32 type;

if (insn == INSN_ASOP)
	offset += LEN_ASOP;
type = *(u16 *)(regs->ip + offset);
if ((type & 0xFF) != 0x40)
	return;

type = (type >> 8) & 0xFF;
pr_crit("%s at %pS\n", report_ubsan_failure(regs, type), (void *)regs->ip);



-- 
Kees Cook



Re: [PATCH] x86/boot: add prototype for __fortify_panic()

2024-05-31 Thread Kees Cook
On Fri, May 31, 2024 at 11:20:09PM +0200, Borislav Petkov wrote:
> So I get an allergic reaction everytime we wag the dog - i.e., fix the
> code because some tool or option can't handle it even if it is
> a perfectly fine code. In that case it is an unused symbol.
> 
> And frankly, I'd prefer the silly warning to denote that fortify doesn't
> need to do any checking there vs shutting it up just because.

If we want to declare that x86 boot will never perform string handling
on strings with unknown lengths, we could just delete the boot/
implementation of __fortify_panic(), and make it a hard failure if such
cases are introduced in the future. This hasn't been a particularly
friendly solution in the past, though, as the fortify routines do tend
to grow additional coverage over time, so there may be future cases that
do trip the runtime checking...

-- 
Kees Cook



Re: [PATCH] x86/boot: add prototype for __fortify_panic()

2024-05-31 Thread Kees Cook
On Fri, May 31, 2024 at 10:49:47PM +0200, Borislav Petkov wrote:
> On Fri, May 31, 2024 at 01:46:37PM -0700, Kees Cook wrote:
> > Please do not do this. It still benefits from compile-time sanity
> > checking.
> 
> Care to elaborate how exactly it benefits?

Because when new code gets added that accidentally does improper string
handling, fortify will yell about it at compile time. E.g., if someone
typos something like:


#define BUF_LEN_FOO 16
...
#define BUF_LEN_BAR 10

struct foo {
...
char buf[BUF_LEN_FOO];
...
};

...

void process_stuff(struct foo *p)
{
...
char local_copy[BUF_LEN_BAR];
...

strcpy(local_copy, p->buf);
...
}

or refactors and forgets to change some name, etc. It's all for catching
bugs before they happen. And when source string lengths aren't known,
the runtime checking can kick in too. It happens that x86 boot doesn't
currently have any of those (good!), so __fortify_panic() goes unused
there. But that's a larger topic covered by stuff like
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION, etc.

-- 
Kees Cook



Re: [PATCH v3 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()

2024-05-31 Thread Kees Cook
On Fri, May 31, 2024 at 12:51:29PM -0400, Kent Overstreet wrote:
> On Fri, May 31, 2024 at 09:48:49AM -0700, Kees Cook wrote:
> > On Fri, May 24, 2024 at 11:01:40AM -0400, Kent Overstreet wrote:
> > > On Wed, Apr 24, 2024 at 02:40:59PM -0700, Kees Cook wrote:
> > > > To be able to choose which buckets to allocate from, make the buckets
> > > > available to the lower level kmalloc interfaces by adding them as the
> > > > first argument. Where the bucket is not available, pass NULL, which 
> > > > means
> > > > "use the default system kmalloc bucket set" (the prior existing 
> > > > behavior),
> > > > as implemented in kmalloc_slab().
> > > 
> > > I thought the plan was to use codetags for this? That would obviate the
> > > need for all this plumbing.
> > > 
> > > Add fields to the alloc tag for:
> > >  - allocation size (or 0 if it's not a compile time constant)
> > >  - union of kmem_cache, kmem_buckets, depending on whether the
> > >allocation size is constant or not
> > 
> > I want to provide "simple" (low-hanging fruit) coverage that can live
> > separately from the codetags-based coverage. The memory overhead for
> > this patch series is negligible, but I suspect the codetags expansion,
> > while not giant, will be more than some deployments will want. I want
> > to avoid an all-or-nothing solution -- which is why I had intended this
> > to be available "by default".
> 
> technically there's no reason for your thing to depend on
> CONFIG_CODETAGGING at all, that's the infrastructure for finding
> codetags for e.g. /proc/allocinfo. you'd just be using the alloc_hoos()
> macro and struct alloc_tag as a place to stash the kmem_buckets pointer.

It's the overhead of separate kmem_cache and kmem_buckets for every
allocation location that I meant. So I'd like the "simple" version for
gaining coverage over the currently-being-regularly-exploited cases, and
then allow for the "big hammer" solution too.

However, I do think I'll still need the codetag infra because of the
sections, etc. I think we'll need to pre-build the caches, but maybe
that could be avoided by adding some kind of per-site READ_ONCE/lock
thingy to create them on demand. We'll see! :)

-- 
Kees Cook



Re: [PATCH] x86/boot: add prototype for __fortify_panic()

2024-05-31 Thread Kees Cook
On Fri, May 31, 2024 at 09:08:16PM +0200, Borislav Petkov wrote:
> On Fri, May 31, 2024 at 09:53:28AM -0700, Kees Cook wrote:
> > Under CONFIG_FORTIFY_SOURCE, the boot code *does* still use
> > fortify-string.h. It lets us both catch mistakes we can discover at
> > compile time and catch egregious runtime mistakes, though the reporting
> > is much simpler in the boot code.
> 
> From where I'm standing, we're not catching anything in the
> decompressor:
> 
> $  objdump -D arch/x86/boot/compressed/vmlinux | grep __fortify_panic
> 01bec250 <__fortify_panic>:
> $
> 
> Sure, in vmlinux proper (allmodconfig) we do:
> 
> objdump -D vmlinux | grep __fortify_panic | wc -l
> 1417
> 
> but not in the decompressor which is special anyway.
> 
> So we can just as well disable CONFIG_FORTIFY_SOURCE in the decompressor
> and not do silly prototypes.

Please do not do this. It still benefits from compile-time sanity
checking.

-- 
Kees Cook



[PATCH v4 5/6] ipc, msg: Use dedicated slab buckets for alloc_msg()

2024-05-31 Thread Kees Cook
The msg subsystem is a common target for exploiting[1][2][3][4][5][6][7]
use-after-free type confusion flaws in the kernel for both read and write
primitives. Avoid having a user-controlled dynamically-sized allocation
share the global kmalloc cache by using a separate set of kmalloc buckets
via the kmem_buckets API.

Link: https://blog.hacktivesecurity.com/index.php/2022/06/13/linux-kernel-exploit-development-1day-case-study/ [1]
Link: https://hardenedvault.net/blog/2022-11-13-msg_msg-recon-mitigation-ved/ [2]
Link: https://www.willsroot.io/2021/08/corctf-2021-fire-of-salvation-writeup.html [3]
Link: https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html [4]
Link: https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html [5]
Link: https://zplin.me/papers/ELOISE.pdf [6]
Link: https://syst3mfailure.io/wall-of-perdition/ [7]
Signed-off-by: Kees Cook 
---
Cc: "GONG, Ruiqi" 
Cc: Xiu Jianfeng 
Cc: Suren Baghdasaryan 
Cc: Kent Overstreet 
Cc: Jann Horn 
Cc: Matteo Rizzo 
Cc: jvoisin 
---
 ipc/msgutil.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/ipc/msgutil.c b/ipc/msgutil.c
index d0a0e877cadd..f392f30a057a 100644
--- a/ipc/msgutil.c
+++ b/ipc/msgutil.c
@@ -42,6 +42,17 @@ struct msg_msgseg {
 #define DATALEN_MSG((size_t)PAGE_SIZE-sizeof(struct msg_msg))
 #define DATALEN_SEG((size_t)PAGE_SIZE-sizeof(struct msg_msgseg))
 
+static kmem_buckets *msg_buckets __ro_after_init;
+
+static int __init init_msg_buckets(void)
+{
+   msg_buckets = kmem_buckets_create("msg_msg", 0, SLAB_ACCOUNT,
+ sizeof(struct msg_msg),
+ DATALEN_MSG, NULL);
+
+   return 0;
+}
+subsys_initcall(init_msg_buckets);
 
 static struct msg_msg *alloc_msg(size_t len)
 {
@@ -50,7 +61,7 @@ static struct msg_msg *alloc_msg(size_t len)
size_t alen;
 
alen = min(len, DATALEN_MSG);
-   msg = kmalloc(sizeof(*msg) + alen, GFP_KERNEL_ACCOUNT);
+   msg = kmem_buckets_alloc(msg_buckets, sizeof(*msg) + alen, GFP_KERNEL);
if (msg == NULL)
return NULL;
 
-- 
2.34.1




[PATCH v4 6/6] mm/util: Use dedicated slab buckets for memdup_user()

2024-05-31 Thread Kees Cook
Both memdup_user() and vmemdup_user() handle allocations that are
regularly used for exploiting use-after-free type confusion flaws in
the kernel (e.g. prctl() PR_SET_VMA_ANON_NAME[1] and setxattr[2][3][4]
respectively).

Since both are designed for contents coming from userspace, they allow
for userspace-controlled allocation sizes. Use a dedicated set of kmalloc
buckets so these allocations do not share caches with the global kmalloc
buckets.

After a fresh boot under Ubuntu 23.10, we can see the caches are already
in active use:

 # grep ^memdup /proc/slabinfo
 memdup_user-8k         4      4    8192    4    8 : ...
 memdup_user-4k         8      8    4096    8    8 : ...
 memdup_user-2k        16     16    2048   16    8 : ...
 memdup_user-1k         0      0    1024   16    4 : ...
 memdup_user-512        0      0     512   16    2 : ...
 memdup_user-256        0      0     256   16    1 : ...
 memdup_user-128        0      0     128   32    1 : ...
 memdup_user-64       256    256      64   64    1 : ...
 memdup_user-32       512    512      32  128    1 : ...
 memdup_user-16      1024   1024      16  256    1 : ...
 memdup_user-8       2048   2048       8  512    1 : ...
 memdup_user-192        0      0     192   21    1 : ...
 memdup_user-96       168    168      96   42    1 : ...

Link: https://starlabs.sg/blog/2023/07-prctl-anon_vma_name-an-amusing-heap-spray/ [1]
Link: https://duasynt.com/blog/linux-kernel-heap-spray [2]
Link: https://etenal.me/archives/1336 [3]
Link: https://github.com/a13xp0p0v/kernel-hack-drill/blob/master/drill_exploit_uaf.c [4]
Signed-off-by: Kees Cook 
---
Cc: "GONG, Ruiqi" 
Cc: Xiu Jianfeng 
Cc: Suren Baghdasaryan 
Cc: Kent Overstreet 
Cc: Jann Horn 
Cc: Matteo Rizzo 
Cc: jvoisin 
Cc: linux...@kvack.org
---
 mm/util.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index 53f7fc5912bd..f30460c82641 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -198,6 +198,16 @@ char *kmemdup_nul(const char *s, size_t len, gfp_t gfp)
 }
 EXPORT_SYMBOL(kmemdup_nul);
 
+static kmem_buckets *user_buckets __ro_after_init;
+
+static int __init init_user_buckets(void)
+{
+   user_buckets = kmem_buckets_create("memdup_user", 0, 0, 0, INT_MAX, NULL);
+
+   return 0;
+}
+subsys_initcall(init_user_buckets);
+
 /**
  * memdup_user - duplicate memory region from user space
  *
@@ -211,7 +221,7 @@ void *memdup_user(const void __user *src, size_t len)
 {
void *p;
 
-   p = kmalloc_track_caller(len, GFP_USER | __GFP_NOWARN);
+   p = kmem_buckets_alloc_track_caller(user_buckets, len, GFP_USER | __GFP_NOWARN);
if (!p)
return ERR_PTR(-ENOMEM);
 
@@ -237,7 +247,7 @@ void *vmemdup_user(const void __user *src, size_t len)
 {
void *p;
 
-   p = kvmalloc(len, GFP_USER);
+   p = kmem_buckets_valloc(user_buckets, len, GFP_USER);
if (!p)
return ERR_PTR(-ENOMEM);
 
-- 
2.34.1




[PATCH v4 1/6] mm/slab: Introduce kmem_buckets typedef

2024-05-31 Thread Kees Cook
Encapsulate the concept of a single set of kmem_caches that are used
for the kmalloc size buckets. Redefine kmalloc_caches as an array of
these bucket sets, one per global kmalloc cache type.

Signed-off-by: Kees Cook 
---
Cc: Vlastimil Babka 
Cc: Christoph Lameter 
Cc: Pekka Enberg 
Cc: David Rientjes 
Cc: Joonsoo Kim 
Cc: jvoisin 
Cc: Andrew Morton 
Cc: Roman Gushchin 
Cc: Hyeonggon Yoo <42.hye...@gmail.com>
Cc: linux...@kvack.org
---
 include/linux/slab.h | 5 +++--
 mm/slab_common.c | 3 +--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 7247e217e21b..de2b7209cd05 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -426,8 +426,9 @@ enum kmalloc_cache_type {
NR_KMALLOC_TYPES
 };
 
-extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
+typedef struct kmem_cache * kmem_buckets[KMALLOC_SHIFT_HIGH + 1];
+
+extern kmem_buckets kmalloc_caches[NR_KMALLOC_TYPES];
 
 /*
  * Define gfp bits that should not be set for KMALLOC_NORMAL.
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 1560a1546bb1..e0b1c109bed2 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -653,8 +653,7 @@ static struct kmem_cache *__init create_kmalloc_cache(const char *name,
return s;
 }
 
-struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1] __ro_after_init =
+kmem_buckets kmalloc_caches[NR_KMALLOC_TYPES] __ro_after_init =
 { /* initialization for https://llvm.org/pr42570 */ };
 EXPORT_SYMBOL(kmalloc_caches);
 
-- 
2.34.1




[PATCH v4 4/6] mm/slab: Introduce kmem_buckets_create() and family

2024-05-31 Thread Kees Cook
Dedicated caches are available for fixed size allocations via
kmem_cache_alloc(), but for dynamically sized allocations there is only
the global kmalloc API's set of buckets available. This means it isn't
possible to separate specific sets of dynamically sized allocations into
a separate collection of caches.

This leads to a use-after-free exploitation weakness in the Linux
kernel since many heap memory spraying/grooming attacks depend on using
userspace-controllable dynamically sized allocations to collide with
fixed size allocations that end up in same cache.

While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
against these kinds of "type confusion" attacks, including for fixed
same-size heap objects, we can create a complementary deterministic
defense for dynamically sized allocations that are directly user
controlled. Addressing these cases is limited in scope, so isolating these
kinds of interfaces will not become an unbounded game of whack-a-mole. For
example, many pass through memdup_user(), making isolation there very
effective.

In order to isolate user-controllable dynamically-sized
allocations from the common system kmalloc allocations, introduce
kmem_buckets_create(), which behaves like kmem_cache_create(). Introduce
kmem_buckets_alloc(), which behaves like kmem_cache_alloc(). Introduce
kmem_buckets_alloc_track_caller() for where caller tracking is
needed. Introduce kmem_buckets_valloc() for cases where vmalloc fallback
is needed.

This can also be used in the future to extend allocation profiling's use
of code tagging to implement per-caller allocation cache isolation[1]
even for dynamic allocations.

Memory allocation pinning[2] is still needed to plug the Use-After-Free
cross-allocator weakness, but that is an existing and separate issue
which is complementary to this improvement. Development continues for
that feature via the SLAB_VIRTUAL[3] series (which could also provide
guard pages -- another complementary improvement).

Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1]
Link: https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html [2]
Link: https://lore.kernel.org/lkml/20230915105933.495735-1-matteori...@google.com/ [3]
Signed-off-by: Kees Cook 
---
Cc: Vlastimil Babka 
Cc: Christoph Lameter 
Cc: Pekka Enberg 
Cc: David Rientjes 
Cc: Joonsoo Kim 
Cc: jvoisin 
Cc: Andrew Morton 
Cc: Roman Gushchin 
Cc: Hyeonggon Yoo <42.hye...@gmail.com>
Cc: linux...@kvack.org
---
 include/linux/slab.h | 12 +++
 mm/slab_common.c | 80 
 2 files changed, 92 insertions(+)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 8853c6eb20b4..b48c50d90aae 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -552,6 +552,11 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, 
struct list_lru *lru,
 
 void kmem_cache_free(struct kmem_cache *s, void *objp);
 
+kmem_buckets *kmem_buckets_create(const char *name, unsigned int align,
+ slab_flags_t flags,
+ unsigned int useroffset, unsigned int 
usersize,
+ void (*ctor)(void *));
+
 /*
  * Bulk allocation and freeing operations. These are accelerated in an
  * allocator specific way to avoid taking locks repeatedly or building
@@ -675,6 +680,12 @@ static __always_inline __alloc_size(1) void 
*kmalloc_noprof(size_t size, gfp_t f
 }
 #define kmalloc(...)   
alloc_hooks(kmalloc_noprof(__VA_ARGS__))
 
+#define kmem_buckets_alloc(_b, _size, _flags)  \
+   alloc_hooks(__kmalloc_node_noprof(_b, _size, _flags, NUMA_NO_NODE))
+
+#define kmem_buckets_alloc_track_caller(_b, _size, _flags) \
+   alloc_hooks(kmalloc_node_track_caller_noprof(_b, _size, _flags, 
NUMA_NO_NODE, _RET_IP_))
+
 static __always_inline __alloc_size(1) void *kmalloc_node_noprof(size_t size, 
gfp_t flags, int node)
 {
if (__builtin_constant_p(size) && size) {
@@ -818,6 +829,7 @@ extern void *kvmalloc_buckets_node_noprof(size_t size, 
gfp_t flags, int node)
 #define kvzalloc(_size, _flags)kvmalloc(_size, 
(_flags)|__GFP_ZERO)
 
 #define kvzalloc_node(_size, _flags, _node)kvmalloc_node(_size, 
(_flags)|__GFP_ZERO, _node)
+#define kmem_buckets_valloc(_b, _size, _flags) kvmalloc_buckets_node(_b, 
_size, _flags, NUMA_NO_NODE)
 
 static inline __alloc_size(1, 2) void *
 kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index b5c879fa66bc..f42a98d368a9 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -392,6 +392,82 @@ kmem_cache_create(const char *name, unsigned int size, 
unsigned int align,
 }
 EXPORT_SYMBOL(kmem_cache_create);
 
+static struct kmem_cache *kmem_buckets_cache __ro_after_init;
+
+kmem_buckets *kmem_buckets_create(const char *name, unsigned int align,
+   

[PATCH v4 3/6] mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets argument

2024-05-31 Thread Kees Cook
Plumb kmem_buckets arguments through kvmalloc_node_noprof() so it is
possible to provide an API to perform kvmalloc-style allocations with
a particular set of buckets. Introduce kvmalloc_buckets_node() that takes a
kmem_buckets argument.

Signed-off-by: Kees Cook 
---
Cc: Vlastimil Babka 
Cc: Christoph Lameter 
Cc: Pekka Enberg 
Cc: David Rientjes 
Cc: Joonsoo Kim 
Cc: jvoisin 
Cc: Andrew Morton 
Cc: Roman Gushchin 
Cc: Hyeonggon Yoo <42.hye...@gmail.com>
Cc: linux...@kvack.org
---
 include/linux/slab.h | 19 +++
 lib/rhashtable.c |  2 +-
 mm/util.c| 13 +
 3 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index b1165b22cc6f..8853c6eb20b4 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -799,11 +799,22 @@ static inline __alloc_size(1) void *kzalloc_noprof(size_t 
size, gfp_t flags)
 #define kzalloc(...)   
alloc_hooks(kzalloc_noprof(__VA_ARGS__))
 #define kzalloc_node(_size, _flags, _node) kmalloc_node(_size, 
(_flags)|__GFP_ZERO, _node)
 
-extern void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) 
__alloc_size(1);
-#define kvmalloc_node(...) 
alloc_hooks(kvmalloc_node_noprof(__VA_ARGS__))
+#ifdef CONFIG_SLAB_BUCKETS
+extern void *kvmalloc_buckets_node_noprof(kmem_buckets *b, size_t size, gfp_t 
flags, int node)
+   __alloc_size(2);
+# define kvmalloc_node_noprof(b, size, flags, node)\
+   kvmalloc_buckets_node_noprof(b, size, flags, node)
+#else
+extern void *kvmalloc_buckets_node_noprof(size_t size, gfp_t flags, int node)
+   __alloc_size(1);
+# define kvmalloc_node_noprof(b, size, flags, node)\
+   kvmalloc_buckets_node_noprof(size, flags, node)
+#endif
+#define kvmalloc_buckets_node(...) 
alloc_hooks(kvmalloc_node_noprof(__VA_ARGS__))
+#define kvmalloc_node(...) kvmalloc_buckets_node(NULL, 
__VA_ARGS__)
 
 #define kvmalloc(_size, _flags)kvmalloc_node(_size, 
_flags, NUMA_NO_NODE)
-#define kvmalloc_noprof(_size, _flags) kvmalloc_node_noprof(_size, 
_flags, NUMA_NO_NODE)
+#define kvmalloc_noprof(_size, _flags) kvmalloc_node_noprof(NULL, 
_size, _flags, NUMA_NO_NODE)
 #define kvzalloc(_size, _flags)kvmalloc(_size, 
(_flags)|__GFP_ZERO)
 
 #define kvzalloc_node(_size, _flags, _node)kvmalloc_node(_size, 
(_flags)|__GFP_ZERO, _node)
@@ -816,7 +827,7 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t 
flags, int node)
if (unlikely(check_mul_overflow(n, size, )))
return NULL;
 
-   return kvmalloc_node_noprof(bytes, flags, node);
+   return kvmalloc_node_noprof(NULL, bytes, flags, node);
 }
 
 #define kvmalloc_array_noprof(...) 
kvmalloc_array_node_noprof(__VA_ARGS__, NUMA_NO_NODE)
diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index dbbed19f8fff..ef0f496e4aed 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -184,7 +184,7 @@ static struct bucket_table *bucket_table_alloc(struct 
rhashtable *ht,
static struct lock_class_key __key;
 
tbl = alloc_hooks_tag(ht->alloc_tag,
-   kvmalloc_node_noprof(struct_size(tbl, buckets, 
nbuckets),
+   kvmalloc_node_noprof(NULL, struct_size(tbl, buckets, 
nbuckets),
 gfp|__GFP_ZERO, NUMA_NO_NODE));
 
size = nbuckets;
diff --git a/mm/util.c b/mm/util.c
index 80430e5ba981..53f7fc5912bd 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -593,9 +593,11 @@ unsigned long vm_mmap(struct file *file, unsigned long 
addr,
 }
 EXPORT_SYMBOL(vm_mmap);
 
+#ifdef CONFIG_SLAB_BUCKETS
 /**
- * kvmalloc_node - attempt to allocate physically contiguous memory, but upon
+ * kvmalloc_buckets_node_noprof - attempt to allocate physically contiguous 
memory, but upon
  * failure, fall back to non-contiguous (vmalloc) allocation.
+ * @b: which set of kmalloc buckets to allocate from.
  * @size: size of the request.
  * @flags: gfp mask for the allocation - must be compatible (superset) with 
GFP_KERNEL.
  * @node: numa node to allocate from
@@ -609,7 +611,10 @@ EXPORT_SYMBOL(vm_mmap);
  *
  * Return: pointer to the allocated memory or %NULL in case of failure
  */
-void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node)
+void *kvmalloc_buckets_node_noprof(kmem_buckets *b, size_t size, gfp_t flags, 
int node)
+#else
+void *kvmalloc_buckets_node_noprof(size_t size, gfp_t flags, int node)
+#endif
 {
gfp_t kmalloc_flags = flags;
void *ret;
@@ -631,7 +636,7 @@ void *kvmalloc_node_noprof(size_t size, gfp_t flags, int 
node)
kmalloc_flags &= ~__GFP_NOFAIL;
}
 
-   ret = kmalloc_node_noprof(size, kmalloc_flags, node);
+   ret = __kmalloc_node_noprof(b, size, kmalloc_flags, node);
 
/*
 * It doesn't real

[PATCH v4 0/6] slab: Introduce dedicated bucket allocator

2024-05-31 Thread Kees Cook
Hi,

 v4:
  - Rebase to v6.10-rc1
  - Add CONFIG_SLAB_BUCKETS to turn off the feature
 v3: https://lore.kernel.org/lkml/20240424213019.make.366-k...@kernel.org/
 v2: https://lore.kernel.org/lkml/20240305100933.it.923-k...@kernel.org/
 v1: https://lore.kernel.org/lkml/20240304184252.work.496-k...@kernel.org/

For the cover letter, I'm repeating commit log for patch 4 here, which has
additional clarifications and rationale since v2:

Dedicated caches are available for fixed size allocations via
kmem_cache_alloc(), but for dynamically sized allocations there is only
the global kmalloc API's set of buckets available. This means it isn't
possible to separate specific sets of dynamically sized allocations into
a separate collection of caches.

This leads to a use-after-free exploitation weakness in the Linux
kernel since many heap memory spraying/grooming attacks depend on using
userspace-controllable dynamically sized allocations to collide with
fixed size allocations that end up in same cache.

While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
against these kinds of "type confusion" attacks, including for fixed
same-size heap objects, we can create a complementary deterministic
defense for dynamically sized allocations that are directly user
controlled. Addressing these cases is limited in scope, so isolating these
kinds of interfaces will not become an unbounded game of whack-a-mole. For
example, many pass through memdup_user(), making isolation there very
effective.

In order to isolate user-controllable sized allocations from system
allocations, introduce kmem_buckets_create(), which behaves like
kmem_cache_create(). Introduce kmem_buckets_alloc(), which behaves like
kmem_cache_alloc(). Introduce kmem_buckets_alloc_track_caller() for
where caller tracking is needed. Introduce kmem_buckets_valloc() for
cases where vmalloc fallback is needed.

Allows for confining allocations to a dedicated set of sized caches
(which have the same layout as the kmalloc caches).

This can also be used in the future to extend codetag allocation
annotations to implement per-caller allocation cache isolation[1] even
for dynamic allocations.

Memory allocation pinning[2] is still needed to plug the Use-After-Free
cross-allocator weakness, but that is an existing and separate issue
which is complementary to this improvement. Development continues for
that feature via the SLAB_VIRTUAL[3] series (which could also provide
guard pages -- another complementary improvement).

Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1]
Link: https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html [2]
Link: https://lore.kernel.org/lkml/20230915105933.495735-1-matteori...@google.com/ [3]

After the core implementation are 2 patches that cover the most heavily
abused "repeat offenders" used in exploits. Repeating those details here:

The msg subsystem is a common target for exploiting[1][2][3][4][5][6][7]
use-after-free type confusion flaws in the kernel for both read and
write primitives. Avoid having a user-controlled dynamically-sized
allocation share the global kmalloc cache by using a separate set of
kmalloc buckets.

Link: https://blog.hacktivesecurity.com/index.php/2022/06/13/linux-kernel-exploit-development-1day-case-study/ [1]
Link: https://hardenedvault.net/blog/2022-11-13-msg_msg-recon-mitigation-ved/ [2]
Link: https://www.willsroot.io/2021/08/corctf-2021-fire-of-salvation-writeup.html [3]
Link: https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html [4]
Link: https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html [5]
Link: https://zplin.me/papers/ELOISE.pdf [6]
Link: https://syst3mfailure.io/wall-of-perdition/ [7]

Both memdup_user() and vmemdup_user() handle allocations that are
regularly used for exploiting use-after-free type confusion flaws in
the kernel (e.g. prctl() PR_SET_VMA_ANON_NAME[1] and setxattr[2][3][4]
respectively).

Since both are designed for contents coming from userspace, they allow
for userspace-controlled allocation sizes. Use a dedicated set of kmalloc
buckets so these allocations do not share caches with the global kmalloc
buckets.

Link: https://starlabs.sg/blog/2023/07-prctl-anon_vma_name-an-amusing-heap-spray/ [1]
Link: https://duasynt.com/blog/linux-kernel-heap-spray [2]
Link: https://etenal.me/archives/1336 [3]
Link: https://github.com/a13xp0p0v/kernel-hack-drill/blob/master/drill_exploit_uaf.c [4]

Thanks!

-Kees


Kees Cook (6):
  mm/slab: Introduce kmem_buckets typedef
  mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
  mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets
argument
  mm/slab: Introduce

[PATCH v4 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()

2024-05-31 Thread Kees Cook
Introduce CONFIG_SLAB_BUCKETS which provides the infrastructure to
support separated kmalloc buckets (in the following kmem_buckets_create()
patches and future codetag-based separation). Since this will provide
a mitigation for a very common case of exploits, enable it by default.

To be able to choose which buckets to allocate from, make the buckets
available to the internal kmalloc interfaces by adding them as the
first argument, rather than depending on the buckets being chosen from
the fixed set of global buckets. Where the bucket is not available,
pass NULL, which means "use the default system kmalloc bucket set"
(the prior existing behavior), as implemented in kmalloc_slab().

To avoid adding the extra argument when !CONFIG_SLAB_BUCKETS, only the
top-level macros and static inlines use the buckets argument (where
they are stripped out and compiled out respectively). The actual extern
functions can then be built without the argument, and the internals
fall back to the global kmalloc buckets unconditionally.

Signed-off-by: Kees Cook 
---
Cc: Vlastimil Babka 
Cc: Christoph Lameter 
Cc: Pekka Enberg 
Cc: David Rientjes 
Cc: Joonsoo Kim 
Cc: jvoisin 
Cc: Andrew Morton 
Cc: Roman Gushchin 
Cc: Hyeonggon Yoo <42.hye...@gmail.com>
Cc: linux...@kvack.org
Cc: linux-hardening@vger.kernel.org
---
 include/linux/slab.h | 34 ++
 mm/Kconfig   | 15 +++
 mm/slab.h|  6 --
 mm/slab_common.c |  4 ++--
 mm/slub.c| 34 --
 mm/util.c|  2 +-
 6 files changed, 72 insertions(+), 23 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index de2b7209cd05..b1165b22cc6f 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -569,8 +569,17 @@ static __always_inline void kfree_bulk(size_t size, void 
**p)
kmem_cache_free_bulk(NULL, size, p);
 }
 
-void *__kmalloc_node_noprof(size_t size, gfp_t flags, int node) 
__assume_kmalloc_alignment
-__alloc_size(1);
+#ifdef CONFIG_SLAB_BUCKETS
+void *__kmalloc_buckets_node_noprof(kmem_buckets *b, size_t size, gfp_t flags, 
int node)
+   __assume_kmalloc_alignment __alloc_size(2);
+# define __kmalloc_node_noprof(b, size, flags, node)   \
+   __kmalloc_buckets_node_noprof(b, size, flags, node)
+#else
+void *__kmalloc_buckets_node_noprof(size_t size, gfp_t flags, int node)
+   __assume_kmalloc_alignment __alloc_size(1);
+# define __kmalloc_node_noprof(b, size, flags, node)   \
+   __kmalloc_buckets_node_noprof(size, flags, node)
+#endif
 #define __kmalloc_node(...)
alloc_hooks(__kmalloc_node_noprof(__VA_ARGS__))
 
 void *kmem_cache_alloc_node_noprof(struct kmem_cache *s, gfp_t flags,
@@ -679,7 +688,7 @@ static __always_inline __alloc_size(1) void 
*kmalloc_node_noprof(size_t size, gf
kmalloc_caches[kmalloc_type(flags, 
_RET_IP_)][index],
flags, node, size);
}
-   return __kmalloc_node_noprof(size, flags, node);
+   return __kmalloc_node_noprof(NULL, size, flags, node);
 }
 #define kmalloc_node(...)  
alloc_hooks(kmalloc_node_noprof(__VA_ARGS__))
 
@@ -730,10 +739,19 @@ static inline __realloc_size(2, 3) void * __must_check 
krealloc_array_noprof(voi
  */
 #define kcalloc(n, size, flags)kmalloc_array(n, size, (flags) 
| __GFP_ZERO)
 
-void *kmalloc_node_track_caller_noprof(size_t size, gfp_t flags, int node,
- unsigned long caller) __alloc_size(1);
+#ifdef CONFIG_SLAB_BUCKETS
+void *__kmalloc_node_track_caller_noprof(kmem_buckets *b, size_t size, gfp_t 
flags, int node,
+unsigned long caller) __alloc_size(2);
+# define kmalloc_node_track_caller_noprof(b, size, flags, node, caller)
\
+   __kmalloc_node_track_caller_noprof(b, size, flags, node, caller)
+#else
+void *__kmalloc_node_track_caller_noprof(size_t size, gfp_t flags, int node,
+unsigned long caller) __alloc_size(1);
+# define kmalloc_node_track_caller_noprof(b, size, flags, node, caller)
\
+   __kmalloc_node_track_caller_noprof(size, flags, node, caller)
+#endif
 #define kmalloc_node_track_caller(...) \
-   alloc_hooks(kmalloc_node_track_caller_noprof(__VA_ARGS__, _RET_IP_))
+   alloc_hooks(kmalloc_node_track_caller_noprof(NULL, __VA_ARGS__, 
_RET_IP_))
 
 /*
  * kmalloc_track_caller is a special version of kmalloc that records the
@@ -746,7 +764,7 @@ void *kmalloc_node_track_caller_noprof(size_t size, gfp_t 
flags, int node,
 #define kmalloc_track_caller(...)  
kmalloc_node_track_caller(__VA_ARGS__, NUMA_NO_NODE)
 
 #define kmalloc_track_caller_noprof(...)   \
-   kmalloc_node_track_caller_noprof(__VA_ARGS__, NUMA_NO

[PATCH] kunit/fortify: Remove __kmalloc_node() test

2024-05-31 Thread Kees Cook
__kmalloc_node() is considered an "internal" function of the slab
allocator, so drop it from explicit testing.

Signed-off-by: Kees Cook 
---
Cc: Vlastimil Babka 
Cc: linux...@kvack.org
Cc: linux-hardening@vger.kernel.org
---
 lib/fortify_kunit.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/lib/fortify_kunit.c b/lib/fortify_kunit.c
index 39da5b3bc649..f9cc467334ce 100644
--- a/lib/fortify_kunit.c
+++ b/lib/fortify_kunit.c
@@ -235,9 +235,6 @@ static void 
fortify_test_alloc_size_##allocator##_dynamic(struct kunit *test) \
kmalloc_array_node(alloc_size, 1, gfp, NUMA_NO_NODE),   \
kfree(p));  \
checker(expected_size, __kmalloc(alloc_size, gfp),  \
-   kfree(p));  \
-   checker(expected_size,  \
-   __kmalloc_node(alloc_size, gfp, NUMA_NO_NODE),  \
kfree(p));  \
\
orig = kmalloc(alloc_size, gfp);\
-- 
2.34.1




Re: [PATCH] x86/boot: add prototype for __fortify_panic()

2024-05-31 Thread Kees Cook
On Thu, May 30, 2024 at 06:46:39PM +0200, Borislav Petkov wrote:
> On May 30, 2024 6:23:36 PM GMT+02:00, Jeff Johnson 
>  wrote:
> >On 5/30/2024 8:42 AM, Nikolay Borisov wrote:
> >> 
> >> 
> >> On 29.05.24 г. 21:09 ч., Jeff Johnson wrote:
> >>> As discussed in [1] add a prototype for __fortify_panic() to fix the
> >>> 'make W=1 C=1' warning:
> >>>
> >>> arch/x86/boot/compressed/misc.c:535:6: warning: symbol '__fortify_panic' 
> >>> was not declared. Should it be static?
> >> 
> >> Actually doesn't it make sense to have this defined under ../string.h ? 
> >> Actually given that we don't have any string fortification under the 
> >> boot/  why have the fortify _* functions at all ?
> >
> >I'll let Kees answer these questions since I just took guidance from him :)
> 
> The more important question is how does the decompressor build even know of 
> this symbol? And then make it forget it again instead of adding silly 
> prototypes...

Under CONFIG_FORTIFY_SOURCE, the boot code *does* still use
fortify-string.h. It lets us both catch mistakes we can discover at
compile time and catch egregious runtime mistakes, though the reporting
is much simpler in the boot code.

-- 
Kees Cook



Re: [PATCH v3 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()

2024-05-31 Thread Kees Cook
On Fri, May 24, 2024 at 11:01:40AM -0400, Kent Overstreet wrote:
> On Wed, Apr 24, 2024 at 02:40:59PM -0700, Kees Cook wrote:
> > To be able to choose which buckets to allocate from, make the buckets
> > available to the lower level kmalloc interfaces by adding them as the
> > first argument. Where the bucket is not available, pass NULL, which means
> > "use the default system kmalloc bucket set" (the prior existing behavior),
> > as implemented in kmalloc_slab().
> 
> I thought the plan was to use codetags for this? That would obviate the
> need for all this plumbing.
> 
> Add fields to the alloc tag for:
>  - allocation size (or 0 if it's not a compile time constant)
>  - union of kmem_cache, kmem_buckets, depending on whether the
>allocation size is constant or not

I want to provide "simple" (low-hanging fruit) coverage that can live
separately from the codetags-based coverage. The memory overhead for
this patch series is negligible, but I suspect the codetags expansion,
while not giant, will be more than some deployments will want. I want
to avoid an all-or-nothing solution -- which is why I had intended this
to be available "by default".

But I will respin this with kmem_buckets under a Kconfig.

-- 
Kees Cook



Re: [PATCH v3 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()

2024-05-31 Thread Kees Cook
On Fri, May 24, 2024 at 03:38:58PM +0200, Vlastimil Babka wrote:
> On 4/24/24 11:40 PM, Kees Cook wrote:
> > To be able to choose which buckets to allocate from, make the buckets
> > available to the lower level kmalloc interfaces by adding them as the
> > first argument. Where the bucket is not available, pass NULL, which means
> > "use the default system kmalloc bucket set" (the prior existing behavior),
> > as implemented in kmalloc_slab().
> > 
> > Signed-off-by: Kees Cook 
> > ---
> > Cc: Vlastimil Babka 
> > Cc: Christoph Lameter 
> > Cc: Pekka Enberg 
> > Cc: David Rientjes 
> > Cc: Joonsoo Kim 
> > Cc: Andrew Morton 
> > Cc: Roman Gushchin 
> > Cc: Hyeonggon Yoo <42.hye...@gmail.com>
> > Cc: linux...@kvack.org
> > Cc: linux-hardening@vger.kernel.org
> > ---
> >  include/linux/slab.h | 16 
> >  lib/fortify_kunit.c  |  2 +-
> >  mm/slab.h|  6 --
> >  mm/slab_common.c |  4 ++--
> >  mm/slub.c| 14 +++---
> >  mm/util.c|  2 +-
> >  6 files changed, 23 insertions(+), 21 deletions(-)
> > 
> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > index c8164d5db420..07373b680894 100644
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -569,8 +569,8 @@ static __always_inline void kfree_bulk(size_t size, 
> > void **p)
> > kmem_cache_free_bulk(NULL, size, p);
> >  }
> >  
> > -void *__kmalloc_node_noprof(size_t size, gfp_t flags, int node) 
> > __assume_kmalloc_alignment
> > -__alloc_size(1);
> > +void *__kmalloc_node_noprof(kmem_buckets *b, size_t size, gfp_t flags, int 
> > node)
> > +   __assume_kmalloc_alignment __alloc_size(2);
> >  #define __kmalloc_node(...)
> > alloc_hooks(__kmalloc_node_noprof(__VA_ARGS__))
> >  
> >  void *kmem_cache_alloc_node_noprof(struct kmem_cache *s, gfp_t flags,
> > @@ -679,7 +679,7 @@ static __always_inline __alloc_size(1) void 
> > *kmalloc_node_noprof(size_t size, gf
> > kmalloc_caches[kmalloc_type(flags, 
> > _RET_IP_)][index],
> > flags, node, size);
> > }
> > -   return __kmalloc_node_noprof(size, flags, node);
> > +   return __kmalloc_node_noprof(NULL, size, flags, node);
> 
> This is not ideal as now every kmalloc_node() callsite will now have to add
> the NULL parameter even if this is not enabled. Could the new parameter be
> only added depending on the respective config?

I felt like it was much simpler to add an argument to the existing call
path than to create a duplicate API that had 1 extra argument. However,
if you want this behind a Kconfig option, I can redefine the argument
list based on that?

-- 
Kees Cook



Re: [PATCH v3 0/6] slab: Introduce dedicated bucket allocator

2024-05-31 Thread Kees Cook
On Fri, May 24, 2024 at 10:54:58AM -0400, Kent Overstreet wrote:
> On Wed, Apr 24, 2024 at 02:40:57PM -0700, Kees Cook wrote:
> > Hi,
> > 
> > Series change history:
> > 
> >  v3:
> >   - clarify rationale and purpose in commit log
> >   - rebase to -next (CONFIG_CODE_TAGGING)
> >   - simplify calling styles and split out bucket plumbing more cleanly
> >   - consolidate kmem_buckets_*() family introduction patches
> >  v2: https://lore.kernel.org/lkml/20240305100933.it.923-k...@kernel.org/
> >  v1: https://lore.kernel.org/lkml/20240304184252.work.496-k...@kernel.org/
> > 
> > For the cover letter, I'm repeating commit log for patch 4 here, which has
> > additional clarifications and rationale since v2:
> > 
> > Dedicated caches are available for fixed size allocations via
> > kmem_cache_alloc(), but for dynamically sized allocations there is only
> > the global kmalloc API's set of buckets available. This means it isn't
> > possible to separate specific sets of dynamically sized allocations into
> > a separate collection of caches.
> > 
> > This leads to a use-after-free exploitation weakness in the Linux
> > kernel since many heap memory spraying/grooming attacks depend on using
> > userspace-controllable dynamically sized allocations to collide with
> > fixed size allocations that end up in same cache.
> 
> This is going to increase internal fragmentation in the slab allocator,
> so we're going to need better, more visible numbers on the amount of
> memory stranded thusly, so users can easily see the effect this has.

Yes, but not significantly. It's less than the 16-buckets randomized
kmalloc implementation. The numbers will be visible in /proc/slabinfo
just like any other.

> Please also document this effect and point users in the documentation
> where to check, so that we devs can get feedback on this.

Okay, sure. In the commit log, or did you have somewhere else in mind?

-- 
Kees Cook



Re: [PATCH v3 4/6] mm/slab: Introduce kmem_buckets_create() and family

2024-05-31 Thread Kees Cook
On Fri, May 24, 2024 at 03:43:33PM +0200, Vlastimil Babka wrote:
> On 4/24/24 11:41 PM, Kees Cook wrote:
> > Dedicated caches are available for fixed size allocations via
> > kmem_cache_alloc(), but for dynamically sized allocations there is only
> > the global kmalloc API's set of buckets available. This means it isn't
> > possible to separate specific sets of dynamically sized allocations into
> > a separate collection of caches.
> > 
> > This leads to a use-after-free exploitation weakness in the Linux
> > kernel since many heap memory spraying/grooming attacks depend on using
> > userspace-controllable dynamically sized allocations to collide with
> > fixed size allocations that end up in same cache.
> > 
> > While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense
> > against these kinds of "type confusion" attacks, including for fixed
> > same-size heap objects, we can create a complementary deterministic
> > defense for dynamically sized allocations that are directly user
> > controlled. Addressing these cases is limited in scope, so isolating these
> > kinds of interfaces will not become an unbounded game of whack-a-mole. For
> > example, pass through memdup_user(), making isolation there very
> > effective.
> > 
> > In order to isolate user-controllable sized allocations from system
> > allocations, introduce kmem_buckets_create(), which behaves like
> > kmem_cache_create(). Introduce kmem_buckets_alloc(), which behaves like
> > kmem_cache_alloc(). Introduce kmem_buckets_alloc_track_caller() for
> > where caller tracking is needed. Introduce kmem_buckets_valloc() for
> > cases where vmalloc callback is needed.
> > 
> > Allows for confining allocations to a dedicated set of sized caches
> > (which have the same layout as the kmalloc caches).
> > 
> > This can also be used in the future to extend codetag allocation
> > annotations to implement per-caller allocation cache isolation[1] even
> > for dynamic allocations.
> > 
> > Memory allocation pinning[2] is still needed to plug the Use-After-Free
> > cross-allocator weakness, but that is an existing and separate issue
> > which is complementary to this improvement. Development continues for
> > that feature via the SLAB_VIRTUAL[3] series (which could also provide
> > guard pages -- another complementary improvement).
> > 
> > Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1]
> > Link: 
> > https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html
> >  [2]
> > Link: 
> > https://lore.kernel.org/lkml/20230915105933.495735-1-matteori...@google.com/
> >  [3]
> > Signed-off-by: Kees Cook 
> 
> So this seems like it's all unconditional and not depending on a config
> option? I'd rather see one, as has been done for all similar functionality
> before, as not everyone will want this trade-off.

Okay, sure, I can do that. Since it was changing some of the core APIs
to pass in the target buckets (instead of using the global buckets), it
seemed less complex to me to just make the simple changes unconditional.

> Also AFAIU every new user (patches 5, 6) will add new bunch of lines to
> /proc/slabinfo? And when you integrate alloc tagging, do you expect every

Yes, that's true, but that's the goal: they want to live in a separate
bucket collection. At present, it's only two, and arguably almost
everything that the manual isolation provides should be using the
memdup_user() interface anyway.

> callsite will do that as well? Any idea how many there would be and what
> kind of memory overhead it will have in the end?

I haven't measured the per-codetag-site overhead, but I don't expect it
to have giant overhead. If the buckets are created on demand, it should
be small.

But one step at a time. This provides the infrastructure needed to
immediately solve the low-hanging exploitable fruit and to pave the way
for the per-codetag-site automation.

-Kees

-- 
Kees Cook



Re: [PATCH] x86/boot: add prototype for __fortify_panic()

2024-05-31 Thread Kees Cook
On Thu, May 30, 2024 at 09:23:36AM -0700, Jeff Johnson wrote:
> On 5/30/2024 8:42 AM, Nikolay Borisov wrote:
> > 
> > 
> > On 29.05.24 г. 21:09 ч., Jeff Johnson wrote:
> >> As discussed in [1] add a prototype for __fortify_panic() to fix the
> >> 'make W=1 C=1' warning:
> >>
> >> arch/x86/boot/compressed/misc.c:535:6: warning: symbol '__fortify_panic' 
> >> was not declared. Should it be static?
> > 
> > Actually doesn't it make sense to have this defined under ../string.h ? 
> > Actually given that we don't have any string fortification under the 
> > boot/  why have the fortify _* functions at all ?
> 
> I'll let Kees answer these questions since I just took guidance from him :)

Ah-ha, I see what's happening. When not built with
CONFIG_FORTIFY_SOURCE, fortify-string.h isn't included. But since misc.c
has the function definition, we get a warning that the function
declaration was never seen. This is likely the better solution:


diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b70e4a21c15f..3f21a5e218f8 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -532,7 +532,9 @@ asmlinkage __visible void *extract_kernel(void *rmode, 
unsigned char *output)
return output + entry_offset;
 }
 
+#ifdef CONFIG_FORTIFY_SOURCE
 void __fortify_panic(const u8 reason, size_t avail, size_t size)
 {
error("detected buffer overflow");
 }
+#endif


Jeff, can you test this? (I still haven't been able to reproduce the
warning.)

-- 
Kees Cook



Re: [PATCH] x86/traps: Enable UBSAN traps on x86

2024-05-29 Thread Kees Cook
On Wed, May 29, 2024 at 01:36:39PM -0700, Gatlin Newhouse wrote:
> On Wed, May 29, 2024 at 08:30:20PM UTC, Marco Elver wrote:
> > On Wed, 29 May 2024 at 20:17, Gatlin Newhouse  
> > wrote:
> > >
> > > On Wed, May 29, 2024 at 09:25:21AM UTC, Marco Elver wrote:
> > > > On Wed, 29 May 2024 at 04:20, Gatlin Newhouse 
> > > >  wrote:
> > > > [...]
> > > > > if (regs->flags & X86_EFLAGS_IF)
> > > > > raw_local_irq_enable();
> > > > > -   if (report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN ||
> > > > > -   handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) {
> > > > > -   regs->ip += LEN_UD2;
> > > > > -   handled = true;
> > > > > +
> > > > > +   if (insn == INSN_UD2) {
> > > > > +   if (report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN 
> > > > > ||
> > > > > +   handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) {
> > > > > +   regs->ip += LEN_UD2;
> > > > > +   handled = true;
> > > > > +   }
> > > > > +   } else {
> > > > > +   if (handle_ubsan_failure(regs, insn) == 
> > > > > BUG_TRAP_TYPE_WARN) {
> > > >
> > > > handle_ubsan_failure currently only returns BUG_TRAP_TYPE_NONE?
> > > >
> > > > > +   if (insn == INSN_REX)
> > > > > +   regs->ip += LEN_REX;
> > > > > +   regs->ip += LEN_UD1;
> > > > > +   handled = true;
> > > > > +   }
> > > > > }
> > > > > if (regs->flags & X86_EFLAGS_IF)
> > > > > raw_local_irq_disable();
> > > > > diff --git a/arch/x86/kernel/ubsan.c b/arch/x86/kernel/ubsan.c
> > > > > new file mode 100644
> > > > > index ..6cae11f4fe23
> > > > > --- /dev/null
> > > > > +++ b/arch/x86/kernel/ubsan.c
> > > > > @@ -0,0 +1,32 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0
> > > > > +/*
> > > > > + * Clang Undefined Behavior Sanitizer trap mode support.
> > > > > + */
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +#include 
> > > > > +
> > > > > +/*
> > > > > + * Checks for the information embedded in the UD1 trap instruction
> > > > > + * for the UB Sanitizer in order to pass along debugging output.
> > > > > + */
> > > > > +enum bug_trap_type handle_ubsan_failure(struct pt_regs *regs, int 
> > > > > insn)
> > > > > +{
> > > > > +   u32 type = 0;
> > > > > +
> > > > > +   if (insn == INSN_REX) {
> > > > > +   type = (*(u16 *)(regs->ip + LEN_REX + LEN_UD1));
> > > > > +   if ((type & 0xFF) == 0x40)
> > > > > +   type = (type >> 8) & 0xFF;
> > > > > +   } else {
> > > > > +   type = (*(u16 *)(regs->ip + LEN_UD1));
> > > > > +   if ((type & 0xFF) == 0x40)
> > > > > +   type = (type >> 8) & 0xFF;
> > > > > +   }
> > > > > +   pr_crit("%s at %pS\n", report_ubsan_failure(regs, type), 
> > > > > (void *)regs->ip);
> > > > > +
> > > > > +   return BUG_TRAP_TYPE_NONE;
> > > > > +}
> > > >
> > > > Shouldn't this return BUG_TRAP_TYPE_WARN?
> > >
> > > So as far as I understand, UBSAN trap mode never warns. Perhaps it does on
> > > arm64, although it calls die() so I am unsure. Maybe the condition in
> > > handle_bug() should be rewritten in the case of UBSAN ud1s? Do you have 
> > > any
> > > suggestions?
> > 
> > AFAIK on arm64 it's basically a kernel OOPS.
> > 
> > The main thing I just wanted to point out though is that your newly added 
> > branch
> > 
> > > if (handle_ubsan_failure(regs, insn) == BUG_TRAP_TYPE_WARN) {
> > 
> > will never be taken, because I don't see where handle_ubsan_failure()
> > returns BUG_TRAP_TYPE_WARN.
> >
> 
> Initially I wrote this with some symmetry to the KCFI checks nearby, but I
> was unsure if this would be considered handled or not.

Yeah, that seemed like the right "style" to me too. Perhaps, since it
can never warn, we could just rewrite it so it's a void function and avoid
the checking, etc.

-- 
Kees Cook



Re: __fortify_panic() question

2024-05-29 Thread Kees Cook
On Wed, May 29, 2024 at 10:09:45AM -0700, Jeff Johnson wrote:
> On 5/29/2024 9:55 AM, Kees Cook wrote:
> > On Wed, May 29, 2024 at 07:36:25AM -0700, Jeff Johnson wrote:
> >> 'make W=1 C=1' on x86 gives the warning:
> >> arch/x86/boot/compressed/misc.c:535:6: warning: symbol '__fortify_panic' 
> >> was not declared. Should it be static?
> > 
> > Hm, I can't reproduce this currently (but yes, it looks like arm vs x86
> > is mismatched). What tree is this?
> 
> e0cce98fe279 (linus/master, linux-master) Merge tag 'tpmdd-next-6.10-rc2' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
> 
> > 
> >> Looking at this I see for ARM there is a prototype for __fortify_panic() in
> >> arch/arm/boot/compressed/misc.h
> >> And there is a matching implementation in arch/arm/boot/compressed/misc.c
> >>
> >> But for x86 there is only the implementation in
> >> arch/x86/boot/compressed/misc.c
> >> There is not a prototype in arch/x86/boot/compressed/misc.h.
> >>
> >> The easy fix for this would be to add a prototype to
> >> arch/x86/boot/compressed/misc.h.
> > 
> > Yeah, I think this is the right solution.
> 
> You want to do this, or should I?

Please feel free! I'd appreciate it. :)

-- 
Kees Cook



Re: __fortify_panic() question

2024-05-29 Thread Kees Cook
On Wed, May 29, 2024 at 07:36:25AM -0700, Jeff Johnson wrote:
> 'make W=1 C=1' on x86 gives the warning:
> arch/x86/boot/compressed/misc.c:535:6: warning: symbol '__fortify_panic' was 
> not declared. Should it be static?

Hm, I can't reproduce this currently (but yes, it looks like arm vs x86
is mismatched). What tree is this?

> Looking at this I see for ARM there is a prototype for __fortify_panic() in
> arch/arm/boot/compressed/misc.h
> And there is a matching implementation in arch/arm/boot/compressed/misc.c
> 
> But for x86 there is only the implementation in
> arch/x86/boot/compressed/misc.c
> There is not a prototype in arch/x86/boot/compressed/misc.h.
> 
> The easy fix for this would be to add a prototype to
> arch/x86/boot/compressed/misc.h.

Yeah, I think this is the right solution.

> But it seems strange to me to add a prototype to a header file that is only 
> for the benefit of the callee and is not the prototype/header used by the 
> caller, in this case the one in include/linux/fortify-string.h

The stuff in boot/ doesn't tend to include fortify-string.h (since it's
sort of "outside" the kernel), hence the need for additional prototypes.

-- 
Kees Cook



Re: [PATCH] drm/i915: 2 GiB of relocations ought to be enough for anybody*

2024-05-23 Thread Kees Cook
On Tue, May 21, 2024 at 11:12:01AM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> Kernel test robot reports i915 can hit a warn in kvmalloc_node which has
> a purpose of disallowing crazy size kernel allocations. This was added in
> 7661809d493b ("mm: don't allow oversized kvmalloc() calls"):
> 
>/* Don't even allow crazy sizes */
>if (WARN_ON_ONCE(size > INT_MAX))
>return NULL;
> 
> This would be kind of okay since i915 at one point dropped the need for
> making a shadow copy of the relocation list, but then it got re-added in
> fd1500fcd442 ("Revert "drm/i915/gem: Drop relocation slowpath".") a year
> after Linus added the above warning.
> 
> It is plausible that the issue was not seen until now because to trigger
> gem_exec_reloc test requires a combination of a relatively older
> generation hardware but with at least 8GiB of RAM installed. Probably even
> more depending on runtime checks.
> 
> Let's cap what we allow userspace to pass in using the matching limit.
> There should be no issue for real userspace since we are talking about
> "crazy" number of relocations which have no practical purpose.
> 
> *) Well IGT tests might get upset but they can be easily adjusted.
> 
> Signed-off-by: Tvrtko Ursulin 

Thanks for fixing this!

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v10 0/5] Introduce mseal

2024-05-23 Thread Kees Cook
On Tue, May 14, 2024 at 12:52:13PM -0700, Kees Cook wrote:
> On Tue, May 14, 2024 at 10:46:46AM -0700, Andrew Morton wrote:
> > On Mon, 15 Apr 2024 16:35:19 + jef...@chromium.org wrote:
> > 
> > > This patchset proposes a new mseal() syscall for the Linux kernel.
> > 
> > I have not moved this into mm-stable for a 6.10 merge.  Mainly because
> > of the total lack of Reviewed-by:s and Acked-by:s.
> 
> Oh, I thought I had already reviewed it. FWIW, please consider it:
> 
> Reviewed-by: Kees Cook 
> 
> > The code appears to be stable enough for a merge.
> 
> Agreed.
> 
> > It's awkward that we're in conference this week, but I ask people to
> > give consideration to the desirability of moving mseal() into mainline
> > sometime over the next week, please.
> 
> Yes please. :)

Is the plan still to land this for 6.10? With the testing it's had in
-next and Liam's review, I think we're good to go?

Thanks!

-Kees

-- 
Kees Cook



[PATCH] ext4: Use memtostr_pad() for s_volume_name

2024-05-23 Thread Kees Cook
As with the other strings in struct ext4_super_block, s_volume_name is
not NUL terminated. The other strings were marked in commit 072ebb3bffe6
("ext4: add nonstring annotations to ext4.h"). Using strscpy() isn't
the right replacement for strncpy(); it should use memtostr_pad()
instead.

Reported-by: syzbot+50835f73143cc2905...@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/19f4c00619192...@google.com/
Fixes: 744a56389f73 ("ext4: replace deprecated strncpy with alternatives")
Signed-off-by: Kees Cook 
---
Cc: "Theodore Ts'o" 
Cc: Justin Stitt 
Cc: Andreas Dilger 
Cc: linux-e...@vger.kernel.org
---
 fs/ext4/ext4.h  | 2 +-
 fs/ext4/ioctl.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 983dad8c07ec..efed7f09876d 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1347,7 +1347,7 @@ struct ext4_super_block {
 /*60*/ __le32  s_feature_incompat; /* incompatible feature set */
__le32  s_feature_ro_compat;/* readonly-compatible feature set */
 /*68*/ __u8s_uuid[16]; /* 128-bit uuid for volume */
-/*78*/ chars_volume_name[EXT4_LABEL_MAX];  /* volume name */
+/*78*/ chars_volume_name[EXT4_LABEL_MAX] __nonstring; /* volume name */
 /*88*/ chars_last_mounted[64] __nonstring; /* directory where last mounted 
*/
 /*C8*/ __le32  s_algorithm_usage_bitmap; /* For compression */
/*
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index dab7acd49709..e8bf5972dd47 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -1151,7 +1151,7 @@ static int ext4_ioctl_getlabel(struct ext4_sb_info *sbi, 
char __user *user_label
BUILD_BUG_ON(EXT4_LABEL_MAX >= FSLABEL_MAX);
 
lock_buffer(sbi->s_sbh);
-   strscpy_pad(label, sbi->s_es->s_volume_name);
+   memtostr_pad(label, sbi->s_es->s_volume_name);
unlock_buffer(sbi->s_sbh);
 
if (copy_to_user(user_label, label, sizeof(label)))
-- 
2.34.1




Re: [linux-next:master] [mm/slab] 7bd230a266: WARNING:at_mm/util.c:#kvmalloc_node_noprof

2024-05-19 Thread Kees Cook
On Sun, May 19, 2024 at 07:06:45PM -0400, Kent Overstreet wrote:
> this looks like an i915 bug

Yeah, agreed.

> On Wed, May 15, 2024 at 10:41:19AM +0800, kernel test robot wrote:
[...]
> > [test failed on linux-next/master 6ba6c795dc73c22ce2c86006f17c4aa802db2a60]
[...]
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new 
> > version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot 
> > | Closes: 
> > https://lore.kernel.org/oe-lkp/202405151008.6ddd1aaf-oliver.s...@intel.com
> > 
> > 
> > [  940.101700][ T5353] [ cut here ]
> > [ 940.107107][ T5353] WARNING: CPU: 1 PID: 5353 at mm/util.c:649 
> > kvmalloc_node_noprof (mm/util.c:649 (discriminator 1)) 

This is:

/* Don't even allow crazy sizes */
if (unlikely(size > INT_MAX)) {
WARN_ON_ONCE(!(flags & __GFP_NOWARN));


> > [  940.307791][ T5353] Call Trace:
[...]
> > [ 940.351795][ T5353] eb_copy_relocations 
> > (drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1685) i915

And this is:

const unsigned int nreloc = eb->exec[i].relocation_count;
...
size = nreloc * sizeof(*relocs);

relocs = kvmalloc_array(1, size, GFP_KERNEL);

So something isn't checking the "relocation_count" size that I assume is
coming in from the ioctl?

-Kees

-- 
Kees Cook



Re: [GIT PULL] bcachefs updates fro 6.10-rc1

2024-05-19 Thread Kees Cook
On Sun, May 19, 2024 at 12:14:34PM -0400, Kent Overstreet wrote:
> [...]
> bcachefs changes for 6.10-rc1
> [...]
>   bcachefs: bch2_btree_path_to_text()

Hi Kent,

I've asked after this before[1], but there continues to be a lot of
bcachefs development going on that is only visible when it appears in
-next or during the merge window. I cannot find the above commit on
any mailing list on lore.kernel.org[2]. The rules for -next are clear:
patches _must_ appear on a list _somewhere_ before they land in -next
(much less Linus's tree). The point is to get additional reviews, and
to serve as a focal point for any discussions that pop up over a given
change. Please adjust the bcachefs development workflow to address this.

Anyway, in reference to the above commit, please scrub bcachefs of its
%px format string uses. Neither it nor %p should be used[3][4] in new
code. (Which is, again, something checkpatch.pl will warn about.)

Thanks!

-Kees

[1] https://lore.kernel.org/all/202401101525.112E8234@keescook/
[2] https://lore.kernel.org/linux-bcachefs/?q=bch2_btree_path_to_text
[3] https://docs.kernel.org/process/deprecated.html#p-format-specifier
[4] 
https://lore.kernel.org/lkml/ca+55afwqed_d40g4mucssvrzzrfpujt74vc6pppb675hynx...@mail.gmail.com/

-- 
Kees Cook



[PATCH 0/2] exec: Add KUnit test for bprm_stack_limits()

2024-05-19 Thread Kees Cook
Hi,

This adds a first KUnit test to the core exec code. With the ability to
manipulate userspace memory from KUnit coming[1], I wanted to at least
get the KUnit framework in place in exec.c. Most of the coming tests
will likely be to binfmt_elf.c, but still, this serves as a reasonable
first step.

-Kees

[1] 
https://lore.kernel.org/linux-hardening/20240519190422.work.715-k...@kernel.org/

Kees Cook (2):
  exec: Add KUnit test for bprm_stack_limits()
  exec: Avoid pathological argc, envc, and bprm->p values

 MAINTAINERS   |   2 +
 fs/Kconfig.binfmt |   8 +++
 fs/exec.c |  24 +++-
 fs/exec_test.c| 137 ++
 4 files changed, 170 insertions(+), 1 deletion(-)
 create mode 100644 fs/exec_test.c

-- 
2.34.1




[PATCH 2/2] exec: Avoid pathological argc, envc, and bprm->p values

2024-05-19 Thread Kees Cook
Make sure nothing goes wrong with the string counters or the bprm's
belief about the stack pointer. Add checks and matching self-tests.

For 32-bit validation, this was run under 32-bit UML:
$ tools/testing/kunit/kunit.py run --make_options SUBARCH=i386 exec

Signed-off-by: Kees Cook 
---
Cc: Eric Biederman 
Cc: Justin Stitt 
Cc: Alexander Viro 
Cc: Christian Brauner 
Cc: Jan Kara 
Cc: linux-fsde...@vger.kernel.org
Cc: linux...@kvack.org
---
 fs/exec.c  | 11 ++-
 fs/exec_test.c | 26 +-
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 1d45e1a2d620..5dcdd115739e 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -503,6 +503,9 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
 * of argument strings even with small stacks
 */
limit = max_t(unsigned long, limit, ARG_MAX);
+   /* Reject totally pathological counts. */
+   if (bprm->argc < 0 || bprm->envc < 0)
+   return -E2BIG;
/*
 * We must account for the size of all the argv and envp pointers to
 * the argv and envp strings, since they will also take up space in
@@ -516,11 +519,17 @@ static int bprm_stack_limits(struct linux_binprm *bprm)
 * argc can never be 0, to keep them from walking envp by accident.
 * See do_execveat_common().
 */
-   ptr_size = (max(bprm->argc, 1) + bprm->envc) * sizeof(void *);
+   if (check_add_overflow(max(bprm->argc, 1), bprm->envc, _size) ||
+   check_mul_overflow(ptr_size, sizeof(void *), _size))
+   return -E2BIG;
if (limit <= ptr_size)
return -E2BIG;
limit -= ptr_size;
 
+   /* Avoid a pathological bprm->p. */
+   if (bprm->p < limit)
+   return -E2BIG;
+
bprm->argmin = bprm->p - limit;
return 0;
 }
diff --git a/fs/exec_test.c b/fs/exec_test.c
index 32a90c6f47e7..f2d4a80c861d 100644
--- a/fs/exec_test.c
+++ b/fs/exec_test.c
@@ -8,9 +8,32 @@ struct bprm_stack_limits_result {
 };
 
 static const struct bprm_stack_limits_result bprm_stack_limits_results[] = {
-   /* Giant values produce -E2BIG */
+   /* Negative argc/envc counts produce -E2BIG */
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = INT_MIN, .envc = INT_MIN }, .expected_rc = -E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = 5, .envc = -1 }, .expected_rc = -E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = -1, .envc = 10 }, .expected_rc = -E2BIG },
+   /* The max value of argc or envc is MAX_ARG_STRINGS. */
{ { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
.argc = INT_MAX, .envc = INT_MAX }, .expected_rc = -E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = MAX_ARG_STRINGS, .envc = MAX_ARG_STRINGS }, .expected_rc = 
-E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = 0, .envc = MAX_ARG_STRINGS }, .expected_rc = -E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = MAX_ARG_STRINGS, .envc = 0 }, .expected_rc = -E2BIG },
+   /*
+* On 32-bit system these argc and envc counts, while likely impossible
+* to represent within the associated TASK_SIZE, could overflow the
+* limit calculation, and bypass the ptr_size <= limit check.
+*/
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = 0x2001, .envc = 0x2001 }, .expected_rc = -E2BIG },
+   /* Make sure a pathological bprm->p doesn't cause an overflow. */
+   { { .p = sizeof(void *), .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = 10, .envc = 10 }, .expected_rc = -E2BIG },
/*
 * 0 rlim_stack will get raised to ARG_MAX. With 1 string pointer,
 * we should see p - ARG_MAX + sizeof(void *).
@@ -88,6 +111,7 @@ static void exec_test_bprm_stack_limits(struct kunit *test)
/* Double-check the constants. */
KUNIT_EXPECT_EQ(test, _STK_LIM, SZ_8M);
KUNIT_EXPECT_EQ(test, ARG_MAX, 32 * SZ_4K);
+   KUNIT_EXPECT_EQ(test, MAX_ARG_STRINGS, 0x7FFF);
 
for (int i = 0; i < ARRAY_SIZE(bprm_stack_limits_results); i++) {
const struct bprm_stack_limits_result *result = 
_stack_limits_results[i];
-- 
2.34.1




[PATCH 1/2] exec: Add KUnit test for bprm_stack_limits()

2024-05-19 Thread Kees Cook
Since bprm_stack_limits() operates with very limited side-effects, add
it as the first exec.c KUnit test. Add to Kconfig and adjust MAINTAINERS
file to include it.

Tested on 64-bit UML:
$ tools/testing/kunit/kunit.py run exec

Signed-off-by: Kees Cook 
---
Cc: Eric Biederman 
Cc: Justin Stitt 
Cc: Alexander Viro 
Cc: Christian Brauner 
Cc: Jan Kara 
Cc: linux-fsde...@vger.kernel.org
Cc: linux...@kvack.org
---
 MAINTAINERS   |   2 +
 fs/Kconfig.binfmt |   8 
 fs/exec.c |  13 ++
 fs/exec_test.c| 113 ++
 4 files changed, 136 insertions(+)
 create mode 100644 fs/exec_test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 7c121493f43d..845165dbb756 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8039,7 +8039,9 @@ S:Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
for-next/execve
 F: Documentation/userspace-api/ELF.rst
 F: fs/*binfmt_*.c
+F: fs/Kconfig.binfmt
 F: fs/exec.c
+F: fs/exec_test.c
 F: include/linux/binfmts.h
 F: include/linux/elf.h
 F: include/uapi/linux/binfmts.h
diff --git a/fs/Kconfig.binfmt b/fs/Kconfig.binfmt
index f5693164ca9a..58657f2d9719 100644
--- a/fs/Kconfig.binfmt
+++ b/fs/Kconfig.binfmt
@@ -176,4 +176,12 @@ config COREDUMP
  certainly want to say Y here. Not necessary on systems that never
  need debugging or only ever run flawless code.
 
+config EXEC_KUNIT_TEST
+   bool "Build execve tests" if !KUNIT_ALL_TESTS
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+   help
+ This builds the exec KUnit tests, which tests boundary conditions
+ of various aspects of the exec internals.
+
 endmenu
diff --git a/fs/exec.c b/fs/exec.c
index b3c40fbb325f..1d45e1a2d620 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -475,6 +475,15 @@ static int count_strings_kernel(const char *const *argv)
return i;
 }
 
+/*
+ * Calculate bprm->argmin from:
+ * - _STK_LIM
+ * - ARG_MAX
+ * - bprm->rlim_stack.rlim_cur
+ * - bprm->argc
+ * - bprm->envc
+ * - bprm->p
+ */
 static int bprm_stack_limits(struct linux_binprm *bprm)
 {
unsigned long limit, ptr_size;
@@ -2200,3 +2209,7 @@ static int __init init_fs_exec_sysctls(void)
 
 fs_initcall(init_fs_exec_sysctls);
 #endif /* CONFIG_SYSCTL */
+
+#ifdef CONFIG_EXEC_KUNIT_TEST
+#include "exec_test.c"
+#endif
diff --git a/fs/exec_test.c b/fs/exec_test.c
new file mode 100644
index ..32a90c6f47e7
--- /dev/null
+++ b/fs/exec_test.c
@@ -0,0 +1,113 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include 
+
+struct bprm_stack_limits_result {
+   struct linux_binprm bprm;
+   int expected_rc;
+   unsigned long expected_argmin;
+};
+
+static const struct bprm_stack_limits_result bprm_stack_limits_results[] = {
+   /* Giant values produce -E2BIG */
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX,
+   .argc = INT_MAX, .envc = INT_MAX }, .expected_rc = -E2BIG },
+   /*
+    * 0 rlim_stack will get raised to ARG_MAX. With 1 string pointer,
+    * we should see p - ARG_MAX + sizeof(void *).
+    */
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0,
+   .argc = 1, .envc = 0 }, .expected_argmin = ULONG_MAX - ARG_MAX + sizeof(void *)},
+   /* Validate that argc is always raised to a minimum of 1. */
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0,
+   .argc = 0, .envc = 0 }, .expected_argmin = ULONG_MAX - ARG_MAX + sizeof(void *)},
+   /*
+    * 0 rlim_stack will get raised to ARG_MAX. With pointers filling ARG_MAX,
+    * we should see -E2BIG. (Note argc is always raised to at least 1.)
+    */
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0,
+   .argc = ARG_MAX / sizeof(void *), .envc = 0 }, .expected_rc = -E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0,
+   .argc = 0, .envc = ARG_MAX / sizeof(void *) - 1 }, .expected_rc = -E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0,
+   .argc = ARG_MAX / sizeof(void *) + 1, .envc = 0 }, .expected_rc = -E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0,
+   .argc = 0, .envc = ARG_MAX / sizeof(void *) }, .expected_rc = -E2BIG },
+   /* And with one less, we see space for exactly 1 pointer. */
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0,
+   .argc = (ARG_MAX / sizeof(void *)) - 1, .envc = 0 },
+ .expected_argmin = ULONG_MAX - sizeof(void *) },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0,
+   .argc = 0, .envc = (ARG_MAX / sizeof(void *)) - 2, },
+ .expected_argmin = ULONG_MAX - sizeof(void *) },
+   /* If we raise rlim_stack / 4 to exactly ARG_MAX, nothing changes. */
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ARG_MAX * 4,
+   .argc = ARG_MAX / sizeof(void *), .envc = 0 }, .expected_rc = -E2BIG },
+   { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ARG_MA

[PATCH 2/2] usercopy: Convert test_user_copy to KUnit test

2024-05-19 Thread Kees Cook
Convert the runtime tests of hardened usercopy to standard KUnit tests.

Co-developed-by: Vitor Massaru Iha 
Signed-off-by: Vitor Massaru Iha 
Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org
Signed-off-by: Kees Cook 
---
 MAINTAINERS|   1 +
 lib/Kconfig.debug  |  21 +-
 lib/Makefile   |   2 +-
 lib/{test_user_copy.c => usercopy_kunit.c} | 252 ++---
 4 files changed, 133 insertions(+), 143 deletions(-)
 rename lib/{test_user_copy.c => usercopy_kunit.c} (52%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7c121493f43d..73995b807e5a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11761,6 +11761,7 @@ F:  arch/*/configs/hardening.config
 F: include/linux/overflow.h
 F: include/linux/randomize_kstack.h
 F: kernel/configs/hardening.config
+F: lib/usercopy_kunit.c
 F: mm/usercopy.c
 K: \b(add|choose)_random_kstack_offset\b
 K: \b__check_(object_size|heap_object)\b
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c63a5fbf1f1c..fd974480aa45 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2460,18 +2460,6 @@ config TEST_VMALLOC
 
  If unsure, say N.
 
-config TEST_USER_COPY
-   tristate "Test user/kernel boundary protections"
-   depends on m
-   help
- This builds the "test_user_copy" module that runs sanity checks
- on the copy_to/from_user infrastructure, making sure basic
- user/kernel boundary testing is working. If it fails to load,
- a regression has been detected in the user/kernel memory boundary
- protections.
-
- If unsure, say N.
-
 config TEST_BPF
tristate "Test BPF filter functionality"
depends on m && NET
@@ -2779,6 +2767,15 @@ config SIPHASH_KUNIT_TEST
  This is intended to help people writing architecture-specific
  optimized versions.  If unsure, say N.
 
+config USERCOPY_KUNIT_TEST
+   tristate "KUnit Test for user/kernel boundary protections"
+   depends on KUNIT
+   default KUNIT_ALL_TESTS
+   help
+ This builds the "usercopy_kunit" module that runs sanity checks
+ on the copy_to/from_user infrastructure, making sure basic
+ user/kernel boundary testing is working.
+
 config TEST_UDELAY
tristate "udelay test driver"
help
diff --git a/lib/Makefile b/lib/Makefile
index ffc6b2341b45..6287bd6be5d7 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -78,7 +78,6 @@ obj-$(CONFIG_TEST_LKM) += test_module.o
 obj-$(CONFIG_TEST_VMALLOC) += test_vmalloc.o
 obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o
 obj-$(CONFIG_TEST_SORT) += test_sort.o
-obj-$(CONFIG_TEST_USER_COPY) += test_user_copy.o
 obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_keys.o
 obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o
 obj-$(CONFIG_TEST_DYNAMIC_DEBUG) += test_dynamic_debug.o
@@ -406,6 +405,7 @@ obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o
 obj-$(CONFIG_STRCAT_KUNIT_TEST) += strcat_kunit.o
 obj-$(CONFIG_STRSCPY_KUNIT_TEST) += strscpy_kunit.o
 obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o
+obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o
 
 obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o
 
diff --git a/lib/test_user_copy.c b/lib/usercopy_kunit.c
similarity index 52%
rename from lib/test_user_copy.c
rename to lib/usercopy_kunit.c
index 5ff04d8fe971..515df08b3190 100644
--- a/lib/test_user_copy.c
+++ b/lib/usercopy_kunit.c
@@ -15,7 +15,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 /*
  * Several 32-bit architectures support 64-bit {get,put}_user() calls.
@@ -31,11 +31,17 @@
 # define TEST_U64
 #endif
 
+struct usercopy_test_priv {
+   char *kmem;
+   char __user *umem;
+   size_t size;
+};
+
 #define test(condition, msg, ...)  \
 ({ \
int cond = (condition); \
if (cond)   \
-   pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \
+   KUNIT_EXPECT_FALSE_MSG(test, cond, msg, ##__VA_ARGS__); \
cond;   \
 })
 
@@ -44,13 +50,16 @@ static bool is_zeroed(void *from, size_t size)
return memchr_inv(from, 0x0, size) == NULL;
 }
 
-static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size)
+/* Test usage of check_nonzero_user(). */
+static void usercopy_test_check_nonzero_user(struct kunit *test)
 {
-   int ret = 0;
size_t start, end, i, zero_start, zero_end;
+   struct usercopy_test_priv *priv = test->priv;
+   char __user *umem = priv->umem;
+   char *kmem = priv->kmem;
+   size_t size = p

[PATCH 0/2] usercopy: Convert test_user_copy to KUnit test

2024-05-19 Thread Kees Cook
Hi,

This builds on the proposal[1] from Mark and lets me convert the
existing usercopy selftest to KUnit. Besides adding this basic test to
the KUnit collection, it also opens the door for execve testing (which
depends on having a functional current->mm), and should provide the
basic infrastructure for adding Mark's much more complete usercopy tests.

-Kees

[1] https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/

Kees Cook (2):
  kunit: test: Add vm_mmap() allocation resource manager
  usercopy: Convert test_user_copy to KUnit test

 MAINTAINERS|   1 +
 include/kunit/test.h   |  17 ++
 lib/Kconfig.debug  |  21 +-
 lib/Makefile   |   2 +-
 lib/kunit/test.c   | 139 +++-
 lib/{test_user_copy.c => usercopy_kunit.c} | 252 ++---
 6 files changed, 288 insertions(+), 144 deletions(-)
 rename lib/{test_user_copy.c => usercopy_kunit.c} (52%)

-- 
2.34.1




[PATCH 1/2] kunit: test: Add vm_mmap() allocation resource manager

2024-05-19 Thread Kees Cook
For tests that need to allocate using vm_mmap() (e.g. usercopy and
execve), provide the interface to have the allocation tracked by KUnit
itself. This requires bringing up a placeholder userspace mm.

This combines my earlier attempt at this with Mark Rutland's version[1].

Link: https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ [1]
Co-developed-by: Mark Rutland 
Signed-off-by: Mark Rutland 
Signed-off-by: Kees Cook 
---
 include/kunit/test.h |  17 ++
 lib/kunit/test.c | 139 ++-
 2 files changed, 155 insertions(+), 1 deletion(-)

diff --git a/include/kunit/test.h b/include/kunit/test.h
index 61637ef32302..8c3835a6f282 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -478,6 +478,23 @@ static inline void *kunit_kcalloc(struct kunit *test, size_t n, size_t size, gfp
return kunit_kmalloc_array(test, n, size, gfp | __GFP_ZERO);
 }
 
+/**
+ * kunit_vm_mmap() - Allocate KUnit-tracked vm_mmap() area
+ * @test: The test context object.
+ * @file: struct file pointer to map from, if any
+ * @addr: desired address, if any
+ * @len: how many bytes to allocate
+ * @prot: mmap PROT_* bits
+ * @flag: mmap flags
+ * @offset: offset into @file to start mapping from.
+ *
+ * See vm_mmap() for more information.
+ */
+unsigned long kunit_vm_mmap(struct kunit *test, struct file *file,
+   unsigned long addr, unsigned long len,
+   unsigned long prot, unsigned long flag,
+   unsigned long offset);
+
 void kunit_cleanup(struct kunit *test);
 
void __printf(2, 3) kunit_log_append(struct string_stream *log, const char *fmt, ...);
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 1d1475578515..09194dbffb63 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -11,13 +11,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
-#include 
 
 #include "debugfs.h"
 #include "device-impl.h"
@@ -871,6 +872,142 @@ void kunit_kfree(struct kunit *test, const void *ptr)
 }
 EXPORT_SYMBOL_GPL(kunit_kfree);
 
+struct kunit_vm_mmap_resource {
+   unsigned long addr;
+   size_t size;
+};
+
+/* vm_mmap() arguments */
+struct kunit_vm_mmap_params {
+   struct file *file;
+   unsigned long addr;
+   unsigned long len;
+   unsigned long prot;
+   unsigned long flag;
+   unsigned long offset;
+};
+
+/*
+ * Arbitrarily chosen user address for the base allocation.
+ */
+#define UBUF_ADDR_BASE SZ_2M
+
+/* Create and attach a new mm if it doesn't already exist. */
+static int kunit_attach_mm(void)
+{
+   struct vm_area_struct *vma;
+   struct mm_struct *mm;
+
+   if (current->mm)
+   return 0;
+
+   mm = mm_alloc();
+   if (!mm)
+   return -ENOMEM;
+
+   if (mmap_write_lock_killable(mm))
+   goto out_free;
+
+   /* Define the task size. */
+   mm->task_size = TASK_SIZE;
+
+   /* Prepare the base VMA. */
+   vma = vm_area_alloc(mm);
+   if (!vma)
+   goto out_unlock;
+
+   vma_set_anonymous(vma);
+   vma->vm_start = UBUF_ADDR_BASE;
+   vma->vm_end = UBUF_ADDR_BASE + PAGE_SIZE;
+   vm_flags_init(vma, VM_READ | VM_MAYREAD | VM_WRITE | VM_MAYWRITE);
+   vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
+
+   if (insert_vm_struct(mm, vma))
+   goto out_free_vma;
+
+   mmap_write_unlock(mm);
+
+   /* Make sure we can allocate new VMAs. */
+   arch_pick_mmap_layout(mm, &current->signal->rlim[RLIMIT_STACK]);
+
+   /* Attach the mm. It will be cleaned up when the process dies. */
+   kthread_use_mm(mm);
+
+   return 0;
+
+out_free_vma:
+   vm_area_free(vma);
+out_unlock:
+   mmap_write_unlock(mm);
+out_free:
+   mmput(mm);
+   return -ENOMEM;
+}
+
+static int kunit_vm_mmap_init(struct kunit_resource *res, void *context)
+{
+   struct kunit_vm_mmap_params *p = context;
+   struct kunit_vm_mmap_resource vres;
+   int ret;
+
+   ret = kunit_attach_mm();
+   if (ret)
+   return ret;
+
+   vres.size = p->len;
+   vres.addr = vm_mmap(p->file, p->addr, p->len, p->prot, p->flag, p->offset);
+   if (!vres.addr)
+   return -ENOMEM;
+   res->data = kmemdup(&vres, sizeof(vres), GFP_KERNEL);
+   if (!res->data) {
+   vm_munmap(vres.addr, vres.size);
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
+static void kunit_vm_mmap_free(struct kunit_resource *res)
+{
+   struct kunit_vm_mmap_resource *vres = res->data;
+
+   /*
+* Since this is executed from the test monitoring process,
+* the test's mm has already been torn down. We don't need
+* to run vm_munmap(vres->addr, vres->size), only clean up
+* the vres.
+*/
+
+  

Re: [PATCH] efi: pstore: Return proper errors on UEFI failures

2024-05-19 Thread Kees Cook
On Sun, May 19, 2024 at 01:33:53PM -0300, Guilherme G. Piccoli wrote:
> Right now efi-pstore either returns 0 (success) or -EIO; but we
> do have a function to convert UEFI errors in different standard
> error codes, helping to narrow down potential issues more accurately.
> 
> So, let's use this helper here.
> 
> Signed-off-by: Guilherme G. Piccoli 

Ah yeah, good idea!

Reviewed-by: Kees Cook 

-- 
Kees Cook



Re: [PATCH] selftests: rtc: rtctest: Do not open-code TEST_HARNESS_MAIN

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 10:23:54PM +0200, Alexandre Belloni wrote:
> On 17/05/2024 17:16:58-0700, Kees Cook wrote:
> > Argument processing is specific to the test harness code. Any optional
> > information needs to be passed via environment variables. Move alternate
> > path to the RTC_DEV environment variable. Also do not open-code
> > TEST_HARNESS_MAIN because its definition may change.
> 
> Th main issue doing that is that this breaks the main use case of
> rtctest as /dev/rtc1 is usually the main target for those tests. Having
> the RTC_DEV environment variable only documented n this commit message
> is definitively not enough, I'm going to have to handle zillion of
> complaints that this is not working anymore.

Hm, maybe switch the default to /dev/rtc1? Maybe there's a better way to
integrate arguments into a test runner. Right now the core harness code
is doing the argument parsing...

-- 
Kees Cook



Re: [PATCH v2] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 11:29:23AM -0700, Guenter Roeck wrote:
> Trying to build parisc:allmodconfig with gcc 12.x or later results
> in the following build error.
> 
> drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd':
> drivers/gpu/drm/nouveau/nvif/object.c:161:9: error:
>   'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 
> overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict]
>   161 | memcpy(data, args->mthd.data, size);
>   | ^~~
> drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor':
> drivers/gpu/drm/nouveau/nvif/object.c:298:17: error:
>   'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 
> overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict]
>   298 | memcpy(data, args->new.data, size);
> 
> gcc assumes that 'sizeof(*args) + size' can overflow, which would result
> in the problem.
> 
> The problem is not new, only it is now no longer a warning but an error
> since W=1 has been enabled for the drm subsystem and since Werror is
> enabled for test builds.
> 
> Rearrange arithmetic and use check_add_overflow() for validating the
> allocation size to avoid the overflow.
> 
> Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the 
> subsystem")
> Cc: Javier Martinez Canillas 
> Cc: Jani Nikula 
> Cc: Thomas Zimmermann 
> Cc: Danilo Krummrich 
> Cc: Maxime Ripard 
> Cc: Kees Cook 
> Cc: Christophe JAILLET 
> Signed-off-by: Guenter Roeck 

Yeah, looks good to me. Thanks!

Reviewed-by: Kees Cook 

-- 
Kees Cook



[PATCH] kunit/fortify: Fix memcmp() test to be amplitude agnostic

2024-05-18 Thread Kees Cook
When memcmp() returns a non-zero value, only the signed bit has any
meaning. The actual value may differ between implementations.

Reported-by: Nathan Chancellor 
Closes: https://github.com/ClangBuiltLinux/linux/issues/2025
Tested-by: Nathan Chancellor 
Signed-off-by: Kees Cook 
---
Cc: linux-hardening@vger.kernel.org
---
 lib/fortify_kunit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/fortify_kunit.c b/lib/fortify_kunit.c
index d2377e00caab..39da5b3bc649 100644
--- a/lib/fortify_kunit.c
+++ b/lib/fortify_kunit.c
@@ -990,7 +990,7 @@ static void fortify_test_memcmp(struct kunit *test)
KUNIT_ASSERT_EQ(test, memcmp(one, two, one_len), 0);
KUNIT_EXPECT_EQ(test, fortify_read_overflows, 0);
/* Still in bounds, but no longer matching. */
-   KUNIT_ASSERT_EQ(test, memcmp(one, two, one_len + 1), -32);
+   KUNIT_ASSERT_LT(test, memcmp(one, two, one_len + 1), 0);
KUNIT_EXPECT_EQ(test, fortify_read_overflows, 0);
 
/* Catch too-large ranges. */
-- 
2.34.1




Re: [PATCH] dma-buf/fence-array: Add flex array to struct dma_fence_array

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 07:47:02PM +0200, Christophe JAILLET wrote:
> This is an effort to get rid of all multiplications from allocation
> functions in order to prevent integer overflows [1][2].
> 
> The "struct dma_fence_array" can be refactored to add a flex array in order
> to have the "callback structures allocated behind the array" be more
> explicit.
> 
> Do so:
>- makes the code more readable and safer.
>- allows using __counted_by() for additional checks
>- avoids some pointer arithmetic in dma_fence_array_enable_signaling()
> 
> Link: 
> https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
>  [1]
> Link: https://github.com/KSPP/linux/issues/160 [2]
> Signed-off-by: Christophe JAILLET 

Yes please! :)

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [WARNING] memcpy: detected field-spanning write (size 1005) of single field "&out_cmd->cmd.payload" at drivers/net/wireless/intel/iwlegacy/common.c:3173 (size 320)

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 11:29:39AM +0200, Stanislaw Gruszka wrote:
> Hi
> 
> On Fri, Apr 12, 2024 at 07:48:39PM +0200, Xose Vazquez Perez wrote:
> > Hi,
> > 
> > In Fedora kernel 6.8.5-301.fc40.x86_64, dmesg shows:
> > 
> > [ device: 03:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 
> > 4965 AG or AGN [Kedron] Network Connection [8086:4230] (rev 61) ]
> > 
> > Thanks.
> > 
> > [   53.407607] [ cut here ]
> > [   53.407622] memcpy: detected field-spanning write (size 1005) of single 
> > field "&out_cmd->cmd.payload" at 
> > drivers/net/wireless/intel/iwlegacy/common.c:3173 (size 320)
> > [   53.407721] WARNING: CPU: 1 PID: 1632 at 
> > drivers/net/wireless/intel/iwlegacy/common.c:3173 
> > il_enqueue_hcmd+0x477/0x560 [iwlegacy]
> 
> For CMD_SIZE_HUGE we have allocated 4k, so we do not do anything wrong.
> Except maybe code is convoluted, since we use same structure for
> huge and small il_device_cmd allocations.
> 
> But I'm thinking how to fix this fortify warning without refactoring and
> some extra runtime cost ...   
> 
> Xose, could you test below patch? I did not tested it, but I think
> it should make this particular warning gone and does not break
> anything. But maybe it will trigger some others fortify warnings.

tl;dr: the proposed patch should work. Refactoring will still be needed
in the future. :)

Long version:

struct il_device_cmd {
struct il_cmd_header hdr;   /* uCode API */
union {
u32 flags;
u8 val8;
u16 val16;
u32 val32;
struct il_tx_cmd tx;
u8 payload[DEF_CMD_PAYLOAD_SIZE];
} __packed cmd;
} __packed;

struct il_cmd_header {
...
/* command or response/notification data follows immediately */
u8 data[];
} __packed;

Yes, the proposed fix will silence the warning, but this struct is
certainly on Gustavo's list to fix for "flexible arrays not-at-end"
warnings[1].

This memcpy() is the perfect example of why we need to refactor these
kinds of structs: the object size is ambiguous for the compiler. It
could be as little as sizeof(struct il_device_cmd), but it could be larger,
because of the "hdr" member. Right now, it depends on how we happen to
address it (as your change is doing).

Regardless, thanks for tracking this down and fixing it!

-Kees

[1] https://lore.kernel.org/lkml/ZgsAFhl90kecIR00@neat/

> 
> Regards
> Stanislaw
> 
> diff --git a/drivers/net/wireless/intel/iwlegacy/common.c 
> b/drivers/net/wireless/intel/iwlegacy/common.c
> index 9d33a66a49b5..c4ccc5df6419 100644
> --- a/drivers/net/wireless/intel/iwlegacy/common.c
> +++ b/drivers/net/wireless/intel/iwlegacy/common.c
> @@ -3170,7 +3170,7 @@ il_enqueue_hcmd(struct il_priv *il, struct il_host_cmd 
> *cmd)
>   out_meta->callback = cmd->callback;
>  
>   out_cmd->hdr.cmd = cmd->id;
> - memcpy(&out_cmd->cmd.payload, cmd->data, cmd->len);
> + memcpy(&out_cmd->hdr.data, cmd->data, cmd->len);
>  
>   /* At this point, the out_cmd now has all of the incoming cmd
>* information */

-- 
Kees Cook



Re: [PATCH] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 06:54:36PM +0200, Christophe JAILLET wrote:
> (adding linux-harden...@vger.kernel.org)
> 
> 
> Le 18/05/2024 à 16:37, Guenter Roeck a écrit :
> > Trying to build parisc:allmodconfig with gcc 12.x or later results
> > in the following build error.
> > 
> > drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd':
> > drivers/gpu/drm/nouveau/nvif/object.c:161:9: error:
> > 'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 
> > overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict]
> >161 | memcpy(data, args->mthd.data, size);
> >| ^~~
> > drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor':
> > drivers/gpu/drm/nouveau/nvif/object.c:298:17: error:
> > 'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 
> > overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict]
> >298 | memcpy(data, args->new.data, size);
> > 
> > gcc assumes that 'sizeof(*args) + size' can overflow, which would result
> > in the problem.
> > 
> > The problem is not new, only it is now no longer a warning but an error 
> > since W=1
> > has been enabled for the drm subsystem and since Werror is enabled for test 
> > builds.
> > 
> > Rearrange arithmetic and add extra size checks to avoid the overflow.
> > 
> > Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the 
> > subsystem")
> > Cc: Javier Martinez Canillas 
> > 
> > Cc: Jani Nikula 
> > Cc: Thomas Zimmermann 
> > Cc: Danilo Krummrich 
> > Cc: Maxime Ripard 
> > Signed-off-by: Guenter Roeck 
> > ---
> > checkpatch complains about the line length in the description and the 
> > (pre-existing)
> > assignments in if conditions, but I did not want to split lines in the 
> > description
> > or rearrange the code further.
> > 
> > I don't know why I only see the problem with parisc builds (at least so 
> > far).
> > 
> >   drivers/gpu/drm/nouveau/nvif/object.c | 8 +---
> >   1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/nouveau/nvif/object.c 
> > b/drivers/gpu/drm/nouveau/nvif/object.c
> > index 4d1aaee8fe15..baf623a48874 100644
> > --- a/drivers/gpu/drm/nouveau/nvif/object.c
> > +++ b/drivers/gpu/drm/nouveau/nvif/object.c
> > @@ -145,8 +145,9 @@ nvif_object_mthd(struct nvif_object *object, u32 mthd, 
> > void *data, u32 size)
> > u8 stack[128];
> > int ret;
> > -   if (sizeof(*args) + size > sizeof(stack)) {
> > -   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
> > +   if (size > sizeof(stack) - sizeof(*args)) {
> > +   if (size > INT_MAX ||
> > +   !(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
> 
> Hi,
> 
> Would it be cleaner or better to use size_add(sizeof(*args), size)?

I think the INT_MAX test is actually better in this case because
nvif_object_ioctl()'s size argument is u32:

ret = nvif_object_ioctl(object, args, sizeof(*args) + size, NULL);

So that could wrap around, even though the allocation may not.

Better yet, since "sizeof(*args) + size" is repeated 3 times in the
function, I'd recommend:

...
u32 args_size;

if (check_add_overflow(sizeof(*args), size, &args_size))
return -ENOMEM;
if (args_size > sizeof(stack)) {
if (!(args = kmalloc(args_size, GFP_KERNEL)))
return -ENOMEM;
} else {
args = (void *)stack;
}
...
ret = nvif_object_ioctl(object, args, args_size, NULL);

This will catch the u32 overflow to nvif_object_ioctl(), catch an
allocation underflow on 32-bit systems, and make the code more
readable. :)

-Kees

-- 
Kees Cook



Re: [PATCH 1/2] selftests: harness: remove unneeded __constructor_order_last()

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 12:29:00PM +0900, Masahiro Yamada wrote:
> It will be set to "true" eventually,
> but __LIST_APPEND() still sees "false"
> on backward-order systems.

Ah, yes -- you are right. I looked through the commit history (I had
to go back to when the seccomp test, and the harness, was out of tree).
There was a time when the logic happened during the list walking, rather
than during list _creation_. I was remembering the former.

So, yes, let's make this change. As you say, it also solves for defining
TEST_HARNESS_MAIN before the tests. Thank you! I'd still like to replace
all the open-coded TEST_HARNESS_MAIN calls, though.

-- 
Kees Cook



[PATCH] selftests: drivers/s390x: Use SKIP() during FIXTURE_SETUP

2024-05-17 Thread Kees Cook
Instead of mixing selftest harness and ksft helpers, perform SKIP
testing from the FIXTURE_SETUPs. This also means TEST_HARNESS_MAIN does
not need to be open-coded.

Signed-off-by: Kees Cook 
---
Cc: Christian Borntraeger 
Cc: Janosch Frank 
Cc: Claudio Imbrenda 
Cc: David Hildenbrand 
Cc: Shuah Khan 
Cc: Masahiro Yamada 
Cc: k...@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
---
I wasn't able to build or run-test this, since it's very s390 specific, it seems?
---
 .../drivers/s390x/uvdevice/test_uvdevice.c| 32 ---
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c b/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c
index ea0cdc37b44f..9622fac42597 100644
--- a/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c
+++ b/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c
@@ -18,6 +18,16 @@
 
 #define UV_PATH  "/dev/uv"
 #define BUFFER_SIZE 0x200
+
+#define open_uv() \
+do { \
+   self->uv_fd = open(UV_PATH, O_ACCMODE); \
+   if (self->uv_fd < 0) \
+   SKIP(return, "No uv-device or cannot access " UV_PATH  "\n" \
+"Enable CONFIG_S390_UV_UAPI and check the access 
rights on " \
+UV_PATH ".\n"); \
+} while (0)
+
 FIXTURE(uvio_fixture) {
int uv_fd;
struct uvio_ioctl_cb uvio_ioctl;
@@ -37,7 +47,7 @@ FIXTURE_VARIANT_ADD(uvio_fixture, att) {
 
 FIXTURE_SETUP(uvio_fixture)
 {
-   self->uv_fd = open(UV_PATH, O_ACCMODE);
+   open_uv();
 
self->uvio_ioctl.argument_addr = (__u64)self->buffer;
self->uvio_ioctl.argument_len = variant->arg_size;
@@ -159,7 +169,7 @@ FIXTURE(attest_fixture) {
 
 FIXTURE_SETUP(attest_fixture)
 {
-   self->uv_fd = open(UV_PATH, O_ACCMODE);
+   open_uv();
 
+   self->uvio_ioctl.argument_addr = (__u64)&self->uvio_attest;
self->uvio_ioctl.argument_len = sizeof(self->uvio_attest);
@@ -257,20 +267,4 @@ TEST_F(attest_fixture, att_inval_addr)
+   att_inval_addr_test(&self->uvio_attest.meas_addr, _metadata, self);
 }
 
-static void __attribute__((constructor)) __constructor_order_last(void)
-{
-   if (!__constructor_order)
-   __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD;
-}
-
-int main(int argc, char **argv)
-{
-   int fd = open(UV_PATH, O_ACCMODE);
-
-   if (fd < 0)
-   ksft_exit_skip("No uv-device or cannot access " UV_PATH  "\n"
-  "Enable CONFIG_S390_UV_UAPI and check the access rights on "
-  UV_PATH ".\n");
-   close(fd);
-   return test_harness_run(argc, argv);
-}
+TEST_HARNESS_MAIN
-- 
2.34.1




[PATCH] selftests: hid: Do not open-code TEST_HARNESS_MAIN

2024-05-17 Thread Kees Cook
Avoid open-coding TEST_HARNESS_MAIN. (It might change, for example.)

Signed-off-by: Kees Cook 
---
Cc: Jiri Kosina 
Cc: Benjamin Tissoires 
Cc: Shuah Khan 
Cc: Masahiro Yamada 
Cc: linux-in...@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
---
 tools/testing/selftests/hid/hid_bpf.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/hid/hid_bpf.c b/tools/testing/selftests/hid/hid_bpf.c
index f825623e3edc..943fa62a4f78 100644
--- a/tools/testing/selftests/hid/hid_bpf.c
+++ b/tools/testing/selftests/hid/hid_bpf.c
@@ -961,17 +961,11 @@ static int libbpf_print_fn(enum libbpf_print_level level,
return 0;
 }
 
-static void __attribute__((constructor)) __constructor_order_last(void)
-{
-   if (!__constructor_order)
-   __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD;
-}
-
-int main(int argc, char **argv)
+static void __attribute__((constructor)) __one_time_init(void)
 {
/* Use libbpf 1.0 API mode */
libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
libbpf_set_print(libbpf_print_fn);
-
-   return test_harness_run(argc, argv);
 }
+
+TEST_HARNESS_MAIN
-- 
2.34.1




[PATCH] selftests: rtc: rtctest: Do not open-code TEST_HARNESS_MAIN

2024-05-17 Thread Kees Cook
Argument processing is specific to the test harness code, so any optional
information needs to be passed via environment variables. Move the
alternate device path to the RTC_DEV environment variable. Also, do not
open-code TEST_HARNESS_MAIN, because its definition may change.

Additionally, setup checking can be done in the FIXTURE_SETUP(). With
this adjustment, also improve the error reporting when the device cannot
be opened.

Signed-off-by: Kees Cook 
---
Cc: Alexandre Belloni 
Cc: Shuah Khan 
Cc: Masahiro Yamada 
Cc: linux-...@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
---
 tools/testing/selftests/rtc/Makefile  |  2 +-
 tools/testing/selftests/rtc/rtctest.c | 66 +--
 2 files changed, 13 insertions(+), 55 deletions(-)

diff --git a/tools/testing/selftests/rtc/Makefile b/tools/testing/selftests/rtc/Makefile
index 55198ecc04db..654f9d58da3c 100644
--- a/tools/testing/selftests/rtc/Makefile
+++ b/tools/testing/selftests/rtc/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-CFLAGS += -O3 -Wl,-no-as-needed -Wall
+CFLAGS += -O3 -Wl,-no-as-needed -Wall $(KHDR_INCLUDES)
 LDLIBS += -lrt -lpthread -lm
 
 TEST_GEN_PROGS = rtctest
diff --git a/tools/testing/selftests/rtc/rtctest.c b/tools/testing/selftests/rtc/rtctest.c
index 63ce02d1d5cc..41cfefcc20e1 100644
--- a/tools/testing/selftests/rtc/rtctest.c
+++ b/tools/testing/selftests/rtc/rtctest.c
@@ -30,7 +30,18 @@ FIXTURE(rtc) {
 };
 
 FIXTURE_SETUP(rtc) {
+   char *alternate = getenv("RTC_DEV");
+
+   if (alternate)
+   rtc_file = alternate;
+
self->fd = open(rtc_file, O_RDONLY);
+
+   if (self->fd == -1 && errno == ENOENT)
+   SKIP(return, "Skipping test since %s does not exist", rtc_file);
+   EXPECT_NE(-1, self->fd) {
+   TH_LOG("%s: %s\n", rtc_file, strerror(errno));
+   }
 }
 
 FIXTURE_TEARDOWN(rtc) {
@@ -41,10 +52,6 @@ TEST_F(rtc, date_read) {
int rc;
struct rtc_time rtc_tm;
 
-   if (self->fd == -1 && errno == ENOENT)
-   SKIP(return, "Skipping test since %s does not exist", rtc_file);
-   ASSERT_NE(-1, self->fd);
-
/* Read the RTC time/date */
rc = ioctl(self->fd, RTC_RD_TIME, _tm);
ASSERT_NE(-1, rc);
@@ -88,10 +95,6 @@ TEST_F_TIMEOUT(rtc, date_read_loop, READ_LOOP_DURATION_SEC + 2) {
struct rtc_time rtc_tm;
time_t start_rtc_read, prev_rtc_read;
 
-   if (self->fd == -1 && errno == ENOENT)
-   SKIP(return, "Skipping test since %s does not exist", rtc_file);
-   ASSERT_NE(-1, self->fd);
-
	TH_LOG("Continuously reading RTC time for %ds (with %dms breaks after every read).",
   READ_LOOP_DURATION_SEC, READ_LOOP_SLEEP_MS);
 
@@ -126,10 +129,6 @@ TEST_F_TIMEOUT(rtc, uie_read, NUM_UIE + 2) {
int i, rc, irq = 0;
unsigned long data;
 
-   if (self->fd == -1 && errno == ENOENT)
-   SKIP(return, "Skipping test since %s does not exist", rtc_file);
-   ASSERT_NE(-1, self->fd);
-
/* Turn on update interrupts */
rc = ioctl(self->fd, RTC_UIE_ON, 0);
if (rc == -1) {
@@ -155,10 +154,6 @@ TEST_F(rtc, uie_select) {
int i, rc, irq = 0;
unsigned long data;
 
-   if (self->fd == -1 && errno == ENOENT)
-   SKIP(return, "Skipping test since %s does not exist", rtc_file);
-   ASSERT_NE(-1, self->fd);
-
/* Turn on update interrupts */
rc = ioctl(self->fd, RTC_UIE_ON, 0);
if (rc == -1) {
@@ -198,10 +193,6 @@ TEST_F(rtc, alarm_alm_set) {
time_t secs, new;
int rc;
 
-   if (self->fd == -1 && errno == ENOENT)
-   SKIP(return, "Skipping test since %s does not exist", rtc_file);
-   ASSERT_NE(-1, self->fd);
-
rc = ioctl(self->fd, RTC_RD_TIME, );
ASSERT_NE(-1, rc);
 
@@ -256,10 +247,6 @@ TEST_F(rtc, alarm_wkalm_set) {
time_t secs, new;
int rc;
 
-   if (self->fd == -1 && errno == ENOENT)
-   SKIP(return, "Skipping test since %s does not exist", rtc_file);
-   ASSERT_NE(-1, self->fd);
-
rc = ioctl(self->fd, RTC_RD_TIME, );
ASSERT_NE(-1, rc);
 
@@ -308,10 +295,6 @@ TEST_F_TIMEOUT(rtc, alarm_alm_set_minute, 65) {
time_t secs, new;
int rc;
 
-   if (self->fd == -1 && errno == ENOENT)
-   SKIP(return, "Skipping test since %s does not exist", rtc_file);
-   ASSERT_NE(-1, self->fd);
-
rc = ioctl(self->fd, RTC_RD_TIME, );
ASSERT_NE(-1, rc);
 
@@ -366,10 +349,6 @@ TEST_F_TIMEOUT(rtc, alarm_wkalm_set_minute, 65) {
time_t secs, new;
int rc;
 
-   if (self->fd == -1 && errno == ENOENT)
-   SKIP(return, "Skipping test since

Re: [PATCH 1/2] selftests: harness: remove unneeded __constructor_order_last()

2024-05-17 Thread Kees Cook
On Fri, May 17, 2024 at 08:45:05PM +0900, Masahiro Yamada wrote:
> __constructor_order_last() is unneeded.
> 
> If __constructor_order_last() is not called on reverse-order systems,
> __constructor_order will remain 0 instead of being set to
> _CONSTRUCTOR_ORDER_BACKWARD (= -1).
> 
> __LIST_APPEND() will still take the 'else' branch, so there is no
> difference in the behavior.
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
>  .../selftests/drivers/s390x/uvdevice/test_uvdevice.c   |  6 --
>  tools/testing/selftests/hid/hid_bpf.c  |  6 --
>  tools/testing/selftests/kselftest_harness.h| 10 +-
>  tools/testing/selftests/rtc/rtctest.c  |  7 ---
>  4 files changed, 1 insertion(+), 28 deletions(-)
> 
> diff --git a/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c b/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c
> index ea0cdc37b44f..7ee7492138c6 100644
> --- a/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c
> +++ b/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c
> @@ -257,12 +257,6 @@ TEST_F(attest_fixture, att_inval_addr)
>   att_inval_addr_test(>uvio_attest.meas_addr, _metadata, self);
>  }
>  
> -static void __attribute__((constructor)) __constructor_order_last(void)
> -{
> - if (!__constructor_order)
> - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD;
> -}
> -
>  int main(int argc, char **argv)
>  {
>   int fd = open(UV_PATH, O_ACCMODE);
> diff --git a/tools/testing/selftests/hid/hid_bpf.c b/tools/testing/selftests/hid/hid_bpf.c
> index 2cf96f818f25..f47feef2aced 100644
> --- a/tools/testing/selftests/hid/hid_bpf.c
> +++ b/tools/testing/selftests/hid/hid_bpf.c
> @@ -853,12 +853,6 @@ static int libbpf_print_fn(enum libbpf_print_level level,
>   return 0;
>  }
>  
> -static void __attribute__((constructor)) __constructor_order_last(void)
> -{
> - if (!__constructor_order)
> - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD;
> -}
> -
>  int main(int argc, char **argv)
>  {
>   /* Use libbpf 1.0 API mode */
> diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
> index ba3ddeda24bf..60c1cf5b0f0d 100644
> --- a/tools/testing/selftests/kselftest_harness.h
> +++ b/tools/testing/selftests/kselftest_harness.h
> @@ -444,12 +444,6 @@
>   * Use once to append a main() to the test file.
>   */
>  #define TEST_HARNESS_MAIN \
> - static void __attribute__((constructor)) \
> - __constructor_order_last(void) \
> - { \
> - if (!__constructor_order) \
> - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD; \
> - } \
>   int main(int argc, char **argv) { \
>   return test_harness_run(argc, argv); \
>   }

This won't work. All constructors are executed, so we have to figure
out which is run _first_. Switching this to a boolean means we gain no
information about ordering: it'll always be set to "true". We need to
detect which constructor sets it first so that we can walk the lists
(that are built via all the constructors in between) in the correct
order.

> @@ -846,7 +840,6 @@ static struct __fixture_metadata *__fixture_list = &_fixture_global;
>  static int __constructor_order;
>  
>  #define _CONSTRUCTOR_ORDER_FORWARD   1
> -#define _CONSTRUCTOR_ORDER_BACKWARD -1
>  
>  static inline void __register_fixture(struct __fixture_metadata *f)
>  {
> @@ -1272,8 +1265,7 @@ static int test_harness_run(int argc, char **argv)
>  
>  static void __attribute__((constructor)) __constructor_order_first(void)
>  {
> - if (!__constructor_order)
> - __constructor_order = _CONSTRUCTOR_ORDER_FORWARD;
> + __constructor_order = _CONSTRUCTOR_ORDER_FORWARD;
>  }
>  
>  #endif  /* __KSELFTEST_HARNESS_H */
> diff --git a/tools/testing/selftests/rtc/rtctest.c b/tools/testing/selftests/rtc/rtctest.c
> index 63ce02d1d5cc..9647b14b47c5 100644
> --- a/tools/testing/selftests/rtc/rtctest.c
> +++ b/tools/testing/selftests/rtc/rtctest.c
> @@ -410,13 +410,6 @@ TEST_F_TIMEOUT(rtc, alarm_wkalm_set_minute, 65) {
>   ASSERT_EQ(new, secs);
>  }
>  
> -static void __attribute__((constructor))
> -__constructor_order_last(void)
> -{
> - if (!__constructor_order)
> - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD;
> -}
> -
>  int main(int argc, char **argv)
>  {
>   switch (argc) {

A better question is why these tests are open-coding the execution of
"main"...

-- 
Kees Cook


