Re: [PATCH] remove AND operation in choose_random_kstack_offset()
On Mon, Jun 17, 2024 at 10:33:08PM +0200, Arnd Bergmann wrote:
> On Mon, Jun 17, 2024, at 20:22, Kees Cook wrote:
> > On Mon, Jun 17, 2024 at 04:52:15PM +0100, Mark Rutland wrote:
> > > On Mon, Jun 17, 2024 at 01:37:21PM +, Yuntao Liu wrote:
> > > > Since the offset would be bitwise ANDed with 0x3FF in
> > > > add_random_kstack_offset(), so just remove AND operation here.
> > > >
> > > > Signed-off-by: Yuntao Liu
> > >
> > > The comments in arm64 and x86 say that they're deliberately capping the
> > > offset at fewer bits than the result of KSTACK_OFFSET_MAX() masking the
> > > value with 0x3FF.
> > >
> > > Maybe it's ok to expand that, but if that's the case the commit message
> > > needs to explain why it's safe to add extra bits (2 on arm64, 3 on s390
> > > and x86), and those comments need to be updated accordingly.
> > >
> > > As-is, I do not think this patch is ok.
> >
> > Yeah, I agree: the truncation is intentional and tuned to the
> > architecture.
>
> It may be intentional, but it's clearly nonsense: there is nothing
> inherent to the architecture that means we can go only 256
> bytes instead of 512 bytes into the 16KB available stack space.
>
> As far as I can tell, any code just gets bloated to the point
> where it fills up the available memory, regardless of how
> much you give it. I'm sure one can find code paths today that
> exceed the 16KB, so there is no point pretending that 15.75KB
> is somehow safe to use while 15.00KB is not.
>
> I'm definitely in favor of making this less architecture
> specific, we just need to pick a good value, and we may well
> end up deciding to use less than the default 1KB. We can also
> go the opposite way and make the limit 4KB but then increase
> the default stack size to 20KB for kernels that enable
> randomization.

I'm all for more entropy, but arch maintainers had wanted specific
control over this value, and given the years of bikeshedding over the
feature, I'm not inclined to dive back into that debate, but okay.
FWIW, here's the actual entropy (due to stack alignment enforced by the
compiler for the given arch ABI). The standard cap is 0x3FF (10 bits).

arm64:   capped at 0x1FF (9 bits), 5 bits effective
powerpc: uncapped (10 bits), 6 or 7 bits effective
riscv:   uncapped (10 bits), 6 bits effective
x86:     capped at 0xFF (8 bits), 5 (x86_64) or 6 (ia32) bits effective
s390:    capped at 0xFF (8 bits), undocumented effective entropy

So if x86, arm64, and s390 maintainers are okay with it, we can try
dropping the masks on those architectures. They would gain 2, 1, and 2
bits respectively.

-Kees

--
Kees Cook
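The relationship between the cap mask and the effective entropy in the table above can be sketched with a small helper. This is purely illustrative (not kernel code), and the stack alignment values used below (16 bytes for arm64/riscv, 8 for x86_64, 4 for ia32) are assumptions for the sketch, not taken from this thread:

```c
#include <stddef.h>

/* Hypothetical helper: count how many offset bits survive both the
 * per-arch cap mask and the stack alignment that the compiler
 * enforces for the arch ABI. */
static unsigned int effective_entropy_bits(unsigned int cap_mask,
					   unsigned int stack_align)
{
	/* bits below the alignment are discarded by the ABI */
	unsigned int usable = cap_mask & ~(stack_align - 1);
	unsigned int bits = 0;

	while (usable) {
		bits += usable & 1;
		usable >>= 1;
	}
	return bits;
}
```

With those assumed alignments, the helper reproduces the 5/6-bit effective values listed for arm64, x86_64, ia32, and riscv.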
Re: [PATCH v2] x86/traps: Enable UBSAN traps on x86
On Tue, Jun 18, 2024 at 12:13:27AM +0200, Thomas Gleixner wrote:
> On Wed, Jun 12 2024 at 11:42, Kees Cook wrote:
> > On Tue, Jun 11, 2024 at 01:26:09PM -0700, Gatlin Newhouse wrote:
> > > It seems that is_valid_bugaddr() needs to be implemented on all
> > > architectures and the function get_ud_type() replaces it here. So how
> > > should the patch handle is_valid_bugaddr()? Should the function remain
> > > as-is in traps.c despite no longer being used?
> >
> > Yeah, this is why I'd suggested to Gatlin in early designs to reuse
> > is_valid_bugaddr()'s int value. It's a required function, so it seemed
> > sensible to just repurpose it from yes/no to no/type1/type2/type3/etc.
>
> It's not sensible, it's just tasteless.
>
> If is_valid_bugaddr() is globally required in its boolean form then it
> should just stay that way and not be abused just because it can be
> abused.
>
> What's wrong with doing:
>
> __always_inline u16 get_ud_type(unsigned long addr)
> {
>
> }
>
> int is_valid_bugaddr(unsigned long addr)
> {
> 	return get_ud_type() != BUG_UD_NONE;
> }
>
> Hmm?
>
> In fact is_valid_bugaddr() should be globally fixed up to return bool to
> match what the function name suggests.
>
> The UD type information is x86 specific and has zero business in a
> generic architecture agnostic function return value.
>
> It's a sad state of affairs that I have to explain this to people who
> care about code correctness. Readability and consistency are substantial
> parts of correctness, really.

Well, it's trade-offs. If get_ud_type() is in is_valid_bugaddr(), we
have to call it _again_ outside of is_valid_bugaddr(). That's suboptimal
as well. I was trying to find a reasonable way to avoid refactoring all
architectures and to avoid duplicated code.

Looking at it all again, I actually think arch/x86/kernel/traps.c
shouldn't call is_valid_bugaddr() at all.
That usage can continue to stay in lib/bug.c, which is only ever used by x86 during very early boot, according to the comments in early_fixup_exception(). So just a direct replacement of is_valid_bugaddr() with the proposed get_ud_type() should be fine in arch/x86/kernel/traps.c. What do you think? -- Kees Cook
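The shape Thomas suggested can be sketched standalone. This is a hypothetical userspace sketch: the BUG_UD_* values mirror naming used in the thread, but decode_ud_stub() is a stand-in for the real x86 instruction decoder, which this sketch does not attempt to reproduce:

```c
#include <stdbool.h>

/* is_valid_bugaddr() stays boolean and is derived from the UD-type
 * lookup, so callers that want the type call get_ud_type() directly. */
enum ud_type { BUG_UD_NONE, BUG_UD0, BUG_UD1, BUG_UD2 };

/* stand-in for decoding the instruction at 'addr' */
static enum ud_type decode_ud_stub(unsigned long addr)
{
	switch (addr & 0x3) {
	case 1:  return BUG_UD0;
	case 2:  return BUG_UD1;
	case 3:  return BUG_UD2;
	default: return BUG_UD_NONE;
	}
}

static enum ud_type get_ud_type(unsigned long addr)
{
	return decode_ud_stub(addr);
}

static bool is_valid_bugaddr(unsigned long addr)
{
	return get_ud_type(addr) != BUG_UD_NONE;
}
```

The point of the split: the boolean wrapper keeps the cross-arch contract intact, while the x86-specific type information stays in the x86-only helper.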
Re: [PATCH] init/Kconfig: extend -Wno-array-bounds to gcc 13
 	/* For each scheduling entry */
 	for (i = 0; i < sg->num_entries; i++) {
+		if (WARN_ON_ONCE(i >= ARRAY_SIZE(sg->gce)))
+			break;
 		gce = &sg->gce[i];
 		ips = sparx5_psfp_ipv_to_ips(gce->ipv);

 		/* hardware needs TimeInterval to be cumulative */

--
Kees Cook
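The guard in the diff above is a general pattern: a loop bound that comes from hardware or firmware gets clamped so it can never index past a fixed-size array. A userspace sketch of the same idea, with illustrative names (this is not the sparx5 code, and the returned flag stands in for WARN_ON_ONCE()):

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Copy up to num_entries items, but never past the destination array.
 * Returns 1 if the externally-supplied count had to be clamped. */
static int fill_entries(int *dst, size_t dst_len,
			const int *src, size_t num_entries)
{
	size_t i;
	int clamped = 0;

	for (i = 0; i < num_entries; i++) {
		if (i >= dst_len) {	/* WARN_ON_ONCE() in the kernel */
			clamped = 1;
			break;
		}
		dst[i] = src[i];
	}
	return clamped;
}
```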
Re: [PATCH v3 2/2] usercopy: Convert test_user_copy to KUnit test
On Fri, Jun 14, 2024 at 09:50:05AM -0600, Shuah Khan wrote: > On 6/12/24 13:59, Kees Cook wrote: > > Convert the runtime tests of hardened usercopy to standard KUnit tests. > > > > Additionally disable usercopy_test_invalid() for systems with separate > > address spaces (or no MMU) since it's not sensible to test for address > > confusion there (e.g. m68k). > > > > Co-developed-by: Vitor Massaru Iha > > Signed-off-by: Vitor Massaru Iha > > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org > > Tested-by: Ivan Orlov > > Reviewed-by: David Gow > > Signed-off-by: Kees Cook > > --- > > MAINTAINERS| 1 + > > lib/Kconfig.debug | 21 +- > > lib/Makefile | 2 +- > > lib/{test_user_copy.c => usercopy_kunit.c} | 282 ++--- > > 4 files changed, 151 insertions(+), 155 deletions(-) > > rename lib/{test_user_copy.c => usercopy_kunit.c} (46%) > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 8754ac2c259d..0cd171ec6010 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -11962,6 +11962,7 @@ F: arch/*/configs/hardening.config > > F:include/linux/overflow.h > > F:include/linux/randomize_kstack.h > > F:kernel/configs/hardening.config > > +F: lib/usercopy_kunit.c > > F:mm/usercopy.c > > K:\b(add|choose)_random_kstack_offset\b > > K:\b__check_(object_size|heap_object)\b > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > > index 59b6765d86b8..561e346f5cb0 100644 > > --- a/lib/Kconfig.debug > > +++ b/lib/Kconfig.debug > > @@ -2505,18 +2505,6 @@ config TEST_VMALLOC > > If unsure, say N. > > -config TEST_USER_COPY > > - tristate "Test user/kernel boundary protections" > > - depends on m > > - help > > - This builds the "test_user_copy" module that runs sanity checks > > - on the copy_to/from_user infrastructure, making sure basic > > - user/kernel boundary testing is working. If it fails to load, > > - a regression has been detected in the user/kernel memory boundary > > - protections. > > - > > - If unsure, say N. 
> > - > > config TEST_BPF > > tristate "Test BPF filter functionality" > > depends on m && NET > > @@ -2814,6 +2802,15 @@ config SIPHASH_KUNIT_TEST > > This is intended to help people writing architecture-specific > > optimized versions. If unsure, say N. > > +config USERCOPY_KUNIT_TEST > > + tristate "KUnit Test for user/kernel boundary protections" > > + depends on KUNIT > > + default KUNIT_ALL_TESTS > > + help > > + This builds the "usercopy_kunit" module that runs sanity checks > > + on the copy_to/from_user infrastructure, making sure basic > > + user/kernel boundary testing is working. > > + > > Please carry the following line forward as well to be complete assuming > it is relevant. > > If unsure, say N. I've explicitly removed that because it would be repetitive if it were included for all KUnit tests. -- Kees Cook
Re: [PATCH] pstore: platform: add missing MODULE_DESCRIPTION() macro
On Thu, 13 Jun 2024 21:39:42 -0700, Jeff Johnson wrote:
> With ARCH=csky, make allmodconfig && make W=1 C=1 reports:
> WARNING: modpost: missing MODULE_DESCRIPTION() in fs/pstore/pstore.o
>
> Add the missing invocation of the MODULE_DESCRIPTION() macro.

Applied to for-next/pstore, thanks!

[1/1] pstore: platform: add missing MODULE_DESCRIPTION() macro
      https://git.kernel.org/kees/c/9b3c13c9ea4e

Take care,

--
Kees Cook
[PATCH] MAINTAINERS: Update entries for Kees Cook
Update current email address for Kees Cook in the MAINTAINER file to match the change from commit 4e173c825b19 ("mailmap: update entry for Kees Cook"). Signed-off-by: Kees Cook --- MAINTAINERS | 28 ++-- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 8754ac2c259d..f601a2fd1ebf 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5296,7 +5296,7 @@ F:drivers/infiniband/hw/usnic/ CLANG CONTROL FLOW INTEGRITY SUPPORT M: Sami Tolvanen -M: Kees Cook +M: Kees Cook R: Nathan Chancellor L: l...@lists.linux.dev S: Supported @@ -8212,7 +8212,7 @@ F:rust/kernel/net/phy.rs EXEC & BINFMT API, ELF R: Eric Biederman -R: Kees Cook +R: Kees Cook L: linux...@kvack.org S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/execve @@ -8613,7 +8613,7 @@ S:Maintained F: drivers/net/ethernet/nvidia/* FORTIFY_SOURCE -M: Kees Cook +M: Kees Cook L: linux-hardening@vger.kernel.org S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening @@ -9103,7 +9103,7 @@ F:include/linux/mfd/gsc.h F: include/linux/platform_data/gsc_hwmon.h GCC PLUGINS -M: Kees Cook +M: Kees Cook L: linux-hardening@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening @@ -9237,7 +9237,7 @@ S:Maintained F: drivers/input/touchscreen/resistive-adc-touch.c GENERIC STRING LIBRARY -M: Kees Cook +M: Kees Cook R: Andy Shevchenko L: linux-hardening@vger.kernel.org S: Supported @@ -11951,7 +11951,7 @@ F: scripts/package/ F: usr/ KERNEL HARDENING (not covered by other areas) -M: Kees Cook +M: Kees Cook R: Gustavo A. R. 
Silva L: linux-hardening@vger.kernel.org S: Supported @@ -12479,7 +12479,7 @@ F: drivers/scsi/53c700* LEAKING_ADDRESSES M: Tycho Andersen -R: Kees Cook +R: Kees Cook L: linux-hardening@vger.kernel.org S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening @@ -12775,7 +12775,7 @@ F: arch/powerpc/platforms/8xx/ F: arch/powerpc/platforms/83xx/ LINUX KERNEL DUMP TEST MODULE (LKDTM) -M: Kees Cook +M: Kees Cook S: Maintained F: drivers/misc/lkdtm/* F: tools/testing/selftests/lkdtm/* @@ -12905,7 +12905,7 @@ Q: http://patchwork.linuxtv.org/project/linux-media/list/ F: drivers/media/usb/dvb-usb-v2/lmedm04* LOADPIN SECURITY MODULE -M: Kees Cook +M: Kees Cook S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening F: Documentation/admin-guide/LSM/LoadPin.rst @@ -17998,7 +17998,7 @@ F: tools/testing/selftests/proc/ PROC SYSCTL M: Luis Chamberlain -M: Kees Cook +M: Kees Cook M: Joel Granados L: linux-ker...@vger.kernel.org L: linux-fsde...@vger.kernel.org @@ -18054,7 +18054,7 @@ F: Documentation/devicetree/bindings/net/pse-pd/ F: drivers/net/pse-pd/ PSTORE FILESYSTEM -M: Kees Cook +M: Kees Cook R: Tony Luck R: Guilherme G. Piccoli L: linux-hardening@vger.kernel.org @@ -20060,7 +20060,7 @@ F: drivers/media/cec/platform/seco/seco-cec.c F: drivers/media/cec/platform/seco/seco-cec.h SECURE COMPUTING -M: Kees Cook +M: Kees Cook R: Andy Lutomirski R: Will Drewry S: Supported @@ -22974,7 +22974,7 @@ F: drivers/block/ublk_drv.c F: include/uapi/linux/ublk_cmd.h UBSAN -M: Kees Cook +M: Kees Cook R: Marco Elver R: Andrey Konovalov R: Andrey Ryabinin @@ -24812,7 +24812,7 @@ F: drivers/net/hamradio/yam* F: include/linux/yam.h YAMA SECURITY MODULE -M: Kees Cook +M: Kees Cook S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening F: Documentation/admin-guide/LSM/Yama.rst -- 2.34.1
Re: [PATCH] remove AND operation in choose_random_kstack_offset()
On Mon, Jun 17, 2024 at 04:52:15PM +0100, Mark Rutland wrote:
> On Mon, Jun 17, 2024 at 01:37:21PM +, Yuntao Liu wrote:
> > Since the offset would be bitwise ANDed with 0x3FF in
> > add_random_kstack_offset(), so just remove AND operation here.
> >
> > Signed-off-by: Yuntao Liu
>
> The comments in arm64 and x86 say that they're deliberately capping the
> offset at fewer bits than the result of KSTACK_OFFSET_MAX() masking the
> value with 0x3FF.
>
> Maybe it's ok to expand that, but if that's the case the commit message
> needs to explain why it's safe to add extra bits (2 on arm64, 3 on s390
> and x86), and those comments need to be updated accordingly.
>
> As-is, I do not think this patch is ok.

Yeah, I agree: the truncation is intentional and tuned to the
architecture.

--
Kees Cook
Re: [PATCH] powerpc/pseries: Whitelist dtl slub object for copying to userspace
On Fri, Jun 14, 2024 at 11:08:44PM +0530, Anjali K wrote: > Reading the dispatch trace log from /sys/kernel/debug/powerpc/dtl/cpu-* > results in a BUG() when the config CONFIG_HARDENED_USERCOPY is enabled as > shown below. > > kernel BUG at mm/usercopy.c:102! > Oops: Exception in kernel mode, sig: 5 [#1] > LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries > Modules linked in: xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc > scsi_transport_fc ibmveth pseries_wdt dm_multipath dm_mirror > dm_region_hash dm_log dm_mod fuse > CPU: 27 PID: 1815 Comm: python3 Not tainted 6.10.0-rc3 #85 > Hardware name: IBM,9040-MRX POWER10 (raw) 0x800200 0xf06 > of:IBM,FW1060.00 (NM1060_042) hv:phyp pSeries > NIP: c05d23d4 LR: c05d23d0 CTR: 006ee6f8 > REGS: c00120c078c0 TRAP: 0700 Not tainted (6.10.0-rc3) > MSR: 80029033 CR: 2828220f XER: 000e > CFAR: c01fdc80 IRQMASK: 0 > [ ... GPRs omitted ... ] > NIP [c05d23d4] usercopy_abort+0x78/0xb0 > LR [c05d23d0] usercopy_abort+0x74/0xb0 > Call Trace: > usercopy_abort+0x74/0xb0 (unreliable) > __check_heap_object+0xf8/0x120 > check_heap_object+0x218/0x240 > __check_object_size+0x84/0x1a4 > dtl_file_read+0x17c/0x2c4 > full_proxy_read+0x8c/0x110 > vfs_read+0xdc/0x3a0 > ksys_read+0x84/0x144 > system_call_exception+0x124/0x330 > system_call_vectored_common+0x15c/0x2ec > --- interrupt: 3000 at 0x7fff81f3ab34 > > Commit 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0") > requires that only whitelisted areas in slab/slub objects can be copied to > userspace when usercopy hardening is enabled using CONFIG_HARDENED_USERCOPY. > Dtl contains hypervisor dispatch events which are expected to be read by > privileged users. Hence mark this safe for user access. > Specify useroffset=0 and usersize=DISPATCH_LOG_BYTES to whitelist the > entire object. 
> > Co-developed-by: Vishal Chourasia > Signed-off-by: Vishal Chourasia > Signed-off-by: Anjali K > --- > arch/powerpc/platforms/pseries/setup.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/setup.c > b/arch/powerpc/platforms/pseries/setup.c > index 284a6fa04b0c..cba40d9d1284 100644 > --- a/arch/powerpc/platforms/pseries/setup.c > +++ b/arch/powerpc/platforms/pseries/setup.c > @@ -343,8 +343,8 @@ static int alloc_dispatch_log_kmem_cache(void) > { > void (*ctor)(void *) = get_dtl_cache_ctor(); > > - dtl_cache = kmem_cache_create("dtl", DISPATCH_LOG_BYTES, > - DISPATCH_LOG_BYTES, 0, ctor); > + dtl_cache = kmem_cache_create_usercopy("dtl", DISPATCH_LOG_BYTES, > + DISPATCH_LOG_BYTES, 0, 0, > DISPATCH_LOG_BYTES, ctor); > if (!dtl_cache) { > pr_warn("Failed to create dispatch trace log buffer cache\n"); > pr_warn("Stolen time statistics will be unreliable\n"); Are you sure you want to universally expose this memory region? It sounds like it's only exposed via a debug interface. Maybe it'd be better to use a bounce buffer in the debug interface instead? 
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index 3f1cdccebc9c..3adcff5cc4b2 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -257,6 +257,22 @@ static int dtl_file_release(struct inode *inode, struct file *filp)
 	return 0;
 }

+static inline int bounce_copy(char __user *buf, void *src, size_t size)
+{
+	u8 *bounce;
+	int rc;
+
+	bounce = kmalloc(size, GFP_KERNEL);
+	if (!bounce)
+		return -ENOMEM;
+
+	memcpy(bounce, src, size);
+	rc = copy_to_user(buf, bounce, size);
+
+	kfree(bounce);
+	return rc;
+}
+
 static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 		loff_t *pos)
 {
@@ -300,7 +316,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 	if (i + n_req > dtl->buf_entries) {
 		read_size = dtl->buf_entries - i;
-		rc = copy_to_user(buf, &dtl->buf[i],
+		rc = bounce_copy(buf, &dtl->buf[i],
 				read_size * sizeof(struct dtl_entry));
 		if (rc)
 			return -EFAULT;
@@ -312,7 +328,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 	}

 	/* .. and now the head */
-	rc = copy_to_user(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
+	rc = bounce_copy(buf, &dtl->buf[i], n_req * sizeof(struct dtl_entry));
 	if (rc)
 		return -EFAULT;

--
Kees Cook
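The bounce-buffer pattern in the diff above can be sketched in userspace. This is a hedged illustration: copy_to_user() is stubbed with memcpy() (in the kernel it returns the count of bytes it failed to copy), and the point is only that hardened-usercopy checks end up seeing a plain, whitelisted heap allocation instead of the restricted cache object:

```c
#include <stdlib.h>
#include <string.h>

/* stand-in for copy_to_user(); 0 == everything copied */
static unsigned long copy_to_user_stub(void *buf, const void *src, size_t size)
{
	memcpy(buf, src, size);
	return 0;
}

/* Copy src through a freshly-allocated bounce buffer. */
static long bounce_copy(void *buf, const void *src, size_t size)
{
	unsigned char *bounce;
	long rc;

	bounce = malloc(size);
	if (!bounce)
		return -1; /* -ENOMEM in the kernel version */

	memcpy(bounce, src, size);
	rc = (long)copy_to_user_stub(buf, bounce, size);

	free(bounce);
	return rc;
}
```

The trade-off versus whitelisting the cache is an extra allocation and copy per read, which is usually acceptable for a debugfs-only path.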
[PATCH] kunit/usercopy: Disable testing on !CONFIG_MMU
Since arch_pick_mmap_layout() is an inline for non-MMU systems, disable this test there. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202406160505.ubge6tmy-...@intel.com/ Signed-off-by: Kees Cook --- Cc: Brendan Higgins Cc: David Gow Cc: Rae Moar Cc: "Gustavo A. R. Silva" Cc: Andrew Morton Cc: linux-kselftest@vger.kernel.org Cc: kunit-...@googlegroups.com Cc: linux-harden...@vger.kernel.org Cc: linux...@kvack.org --- lib/kunit/user_alloc.c | 4 lib/usercopy_kunit.c | 5 + mm/util.c | 2 ++ 3 files changed, 11 insertions(+) diff --git a/lib/kunit/user_alloc.c b/lib/kunit/user_alloc.c index 76d3d1345ed7..ae935df09a5e 100644 --- a/lib/kunit/user_alloc.c +++ b/lib/kunit/user_alloc.c @@ -30,6 +30,10 @@ static int kunit_attach_mm(void) if (current->mm) return 0; + /* arch_pick_mmap_layout() is only sane with MMU systems. */ + if (!IS_ENABLED(CONFIG_MMU)) + return -EINVAL; + mm = mm_alloc(); if (!mm) return -ENOMEM; diff --git a/lib/usercopy_kunit.c b/lib/usercopy_kunit.c index 45f1e558c464..e819561a540d 100644 --- a/lib/usercopy_kunit.c +++ b/lib/usercopy_kunit.c @@ -290,6 +290,11 @@ static int usercopy_test_init(struct kunit *test) struct usercopy_test_priv *priv; unsigned long user_addr; + if (!IS_ENABLED(CONFIG_MMU)) { + kunit_skip(test, "Userspace allocation testing not available on non-MMU systems"); + return 0; + } + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv); test->priv = priv; diff --git a/mm/util.c b/mm/util.c index df37c47d9374..e70e8e439258 100644 --- a/mm/util.c +++ b/mm/util.c @@ -484,7 +484,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack) clear_bit(MMF_TOPDOWN, >flags); } #endif +#ifdef CONFIG_MMU EXPORT_SYMBOL_IF_KUNIT(arch_pick_mmap_layout); +#endif /** * __account_locked_vm - account locked pages to an mm's locked_vm -- 2.34.1
Re: [PATCH v4 0/3] Hardening perf subsystem
On Sat, Jun 15, 2024 at 06:09:07PM +0200, Martin Uecker wrote:
> On Friday, 14 Jun 2024 at 12:17 +0200, Peter Zijlstra wrote:
> > On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote:
> > > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote:
> > > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote:
> > > > > I'm happy to take patches. And for this bikeshed, this would be
> > > > > better named under the size_*() helpers which are trying to keep
> > > > > size_t calculations from overflowing (by saturating). i.e.:
> > > > >
> > > > > size_add_mult(sizeof(*p), sizeof(*p->member), num)
> > > >
> > > > Fine I suppose, but what if we want something not size_t? Are we
> > > > waiting for the type system extension?
> > >
> > > Because of C's implicit promotion/truncation, we can't do anything
> > > sanely with return values of arbitrary type size; we have to capture
> > > the lvalue type somehow so the checking can happen without C doing
> > > silent garbage.
>
> What is the specific problem here?

This particular complaint is about not being able to "discover" the
lvalue type during an assignment, which means we can't create helper
macros that Just Work. For example, in:

	void foo(some_type arg);

	foo(add_mul(base, size, count));

There is no way to write "add_mul" so that it is checking for overflow
of the "some_type" type. The behavior of such a macro changes if it's
doing u8, int, size_t, etc. The existing solution to this is to use
macros (and builtins) that cannot be composed:

	add_mul(base, size, count, &result);
	foo(result);

Here, the macro can examine the type of "result" and perform the correct
overflow handling for that type. But lots of developers (rightly)
dislike this style of "assignment side-effect". Additionally, it can't
be composed:

	add(base, mul(size, count));

But, using type attributes we have much more flexibility.
Hence, the proposed "wraps" attribute:
https://github.com/llvm/llvm-project/pull/86618

> This is likely a stupid question, but making it signed
> wouldn't work? Or is a signed size_t too small
> on some architectures? Or would this change break too much?

The ssize_t gets used in some places already, but since size_t is used
for address calculations too, I don't think we can universally switch
it.

--
Kees Cook
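The saturating size_*() semantics discussed above can be sketched standalone. The kernel's real size_add()/size_mul() live in <linux/overflow.h> with these semantics, but this is an illustrative userspace version, and size_add_mult() is the hypothetical composition from the thread, not an existing kernel helper:

```c
#include <stdint.h>
#include <stddef.h>

/* Any overflow pins the result at SIZE_MAX, a size no allocator will
 * ever grant, so a later kmalloc(SIZE_MAX) simply fails. */
static size_t size_add(size_t a, size_t b)
{
	size_t r;

	if (__builtin_add_overflow(a, b, &r))
		return SIZE_MAX;
	return r;
}

static size_t size_mul(size_t a, size_t b)
{
	size_t r;

	if (__builtin_mul_overflow(a, b, &r))
		return SIZE_MAX;
	return r;
}

/* base + elem * num, saturating: SIZE_MAX in propagates SIZE_MAX out */
static size_t size_add_mult(size_t base, size_t elem, size_t num)
{
	return size_add(base, size_mul(elem, num));
}
```

Because SIZE_MAX propagates through both helpers, the calls compose freely, which is exactly what the pointer-result add_mul() style cannot do.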
Re: [PATCH v4 0/3] Hardening perf subsystem
On Fri, Jun 14, 2024 at 12:17:08PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 12, 2024 at 04:23:31PM -0700, Kees Cook wrote:
> > On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote:
> > > On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote:
> > > > I'm happy to take patches. And for this bikeshed, this would be better
> > > > named under the size_*() helpers which are trying to keep size_t
> > > > calculations from overflowing (by saturating). i.e.:
> > > >
> > > > size_add_mult(sizeof(*p), sizeof(*p->member), num)
> > >
> > > Fine I suppose, but what if we want something not size_t? Are we
> > > waiting for the type system extension?
> >
> > Because of C's implicit promotion/truncation, we can't do anything
> > sanely with return values of arbitrary type size; we have to capture the
> > lvalue type somehow so the checking can happen without C doing silent
> > garbage.
>
> So sizeof() returns the native (built-in) size_t, right? If that type
> has the nooverflow qualifier on it, then:
>
> 	sizeof(*p) + num*sizeof(p->foo[0])
>
> should all get the nooverflow semantics, right? Because size_t is
> effectively 'nooverflow unsigned long' the multiplication should promote
> 'num' to some 'long'.

Hmmm. This is an interesting point. I'll see what Justin has found as
he's been working on limiting the overflow sanitizer to specific types.

It doesn't help this (unfortunately common) code pattern, though:

	int size;

	size = sizeof(*p) + num*sizeof(p->foo[0]);
	p = kmalloc(size, GFP_KERNEL);

But that was going to be a problem either way.

> Now, I've re-read the rules and I don't see qualifiers mentioned, so
> can't we state that the overflow/nooverflow qualifiers are to be
> preserved on (implicit) promotion and when nooverflow and overflow are
> combined the 'safe' nooverflow takes precedence?
>
> I mean, when we're adding qualifiers we can make up rules about them
> too, right?
Yup, that is the design of the "wraps" attribute (though it is the
reverse: it _allows_ for wrap-around, since we want the default state
to be mitigation).

> If 'people' don't want to adorn the built-in size_t, we can always do
> something like:
>
> 	#define sizeof(x) ((nooverflow unsigned long)(sizeof(x)))
>
> and 'fix' it ourselves.

Right, though my hope is still that we get the result of "nooverflow"
by marking that which was expected to overflow.

-Kees

--
Kees Cook
Re: [PATCH v3 2/2] pstore/ramoops: Add ramoops.mem_name= command line option
On June 13, 2024 7:06:47 AM PDT, Ard Biesheuvel wrote:
> On Thu, 13 Jun 2024 at 15:26, Steven Rostedt wrote:
> >
> > On Thu, 13 Jun 2024 08:11:48 +0200
> > Ard Biesheuvel wrote:
> > > > I've added one more comment to v5, with that fixed I can take this.
> > >
> > > So how is this supposed to work wrt to the rigid 'no user visible
> > > regressions' rule, given that this whole thing is a best effort thing
> >
> > This has nothing to do with user space. The kernel command line has
> > broken in the past. If you update the kernel, you can update the
> > command line. There's no "no user visible regressions" rule. It's
> > "Don't break user space". This has nothing to do with user space.
> >
> > > to begin with. This needs at least a huge disclaimer that this rule
> > > does not apply, and if this works today, there is no guarantee that it
> > > will keep working on newer kernels. Otherwise, you will be making the
> > > job of the people who work on the boot code significantly more
> > > difficult. And even then, I wonder whether Linus and #regzcop are
> > > going to honour such a disclaimer.
> >
> > Again, this has nothing to do with user space. The rule Linus talks
> > about is breaking user space. This is about kernel debugging. Something
> > *completely different*!
> >
> > > So this belongs downstream, unless some guarantees can be provided
> > > that this functionality is exempt from the usual regression policies.
> >
> > I disagree. kexec/kdump also has the same issues.
>
> Fair enough. As long as it is documented that there is no guarantee
> that this will keep working over a kernel upgrade, then I have no
> objections.

Yeah, I should better document this for pstore as a whole, but I've
already made the call that cross-kernel-version operation is best
effort.

-Kees

--
Kees Cook
Re: [PATCH v3 0/2] usercopy: Convert test_user_copy to KUnit test
On Thu, Jun 13, 2024 at 12:41:43PM +0800, David Gow wrote: > On Thu, 13 Jun 2024 at 03:59, Kees Cook wrote: > > > > Hi, > > > > This builds on the proposal[1] from Mark and lets me convert the > > existing usercopy selftest to KUnit. Besides adding this basic test to > > the KUnit collection, it also opens the door for execve testing (which > > depends on having a functional current->mm), and should provide the > > basic infrastructure for adding Mark's much more complete usercopy tests. > > > > v3: > > - use MEMEQ KUnit helper (David) > > - exclude pathological address confusion test for systems with separate > > address spaces, noticed by David > > - add KUnit-conditional exports for alloc_mm() and arch_pick_mmap_layout() > > noticed by 0day > > v2: https://lore.kernel.org/lkml/20240610213055.it.075-k...@kernel.org/ > > v1: https://lore.kernel.org/lkml/20240519190422.work.715-k...@kernel.org/ > > > > -Kees > > > > [1] > > https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ > > Thanks! This looks good to me (and passes everything here). Unless > there's a compelling reason not to, I think we can take this via the > KUnit tree. That would be lovely, thank you! :) -- Kees Cook
Re: [PATCH v4 0/3] Hardening perf subsystem
On Thu, Jun 13, 2024 at 12:08:21AM +0200, Peter Zijlstra wrote:
> On Wed, Jun 12, 2024 at 12:01:19PM -0700, Kees Cook wrote:
> > I'm happy to take patches. And for this bikeshed, this would be better
> > named under the size_*() helpers which are trying to keep size_t
> > calculations from overflowing (by saturating). i.e.:
> >
> > size_add_mult(sizeof(*p), sizeof(*p->member), num)
>
> Fine I suppose, but what if we want something not size_t? Are we waiting
> for the type system extension?

Because of C's implicit promotion/truncation, we can't do anything
sanely with return values of arbitrary type size; we have to capture the
lvalue type somehow so the checking can happen without C doing silent
garbage.

> The saturating thing is relying on the allocators never granting INT_MAX
> (or whatever size_t actually is) bytes?

The max of size_t is ULONG_MAX, and yes, most of the allocators will
refuse >INT_MAX. I think vmalloc()'s limit is higher, but certainly not
SIZE_MAX, which is the entire virtual memory space. ;)

The saturating thing is two-fold: that we never wrap around SIZE_MAX,
and that the allocator will refuse a SIZE_MAX allocation.

> > LOL. It's basically doing compile-time (__builtin_object_size) and
> > run-time (__builtin_dynamic_object_size) bounds checking on destination
> > (and source) object sizes, mainly driven by the mentioned builtins:
> > https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
>
> Right, I got that far. I also read most of:
>
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854

Oh wow, that's serious extra credit. :) It'll also probably be a while
before most of that stuff is even landed in Clang, much less implemented
in GCC. What we _do_ have is the "counted_by" attribute. This was added
to Clang a little while ago and just landed last week in GCC for GCC 15.

> But none of that is showing me generated asm for the various cases. As
> such, I don't consider myself informed enough.

Gotcha.
For the compile-time stuff it's all just looking at known-at-compile-time sizes. So for something like this, we get a __compiletime_warning() emitted: const char src[] = "Hello there"; char dst[10]; strscpy(dst, src); /* Compiler yells since src is bigger than dst. */ For run-time checks it's basically just using the regular WARN() infrastructure with __builtin_dynamic_object_size(). Here's a simplified userspace example with assert(): https://godbolt.org/z/zMrKnMxn5 The kernel's FORTIFY_SOURCE is much more complex in how it does the checking, how it does the reporting (for helping people figure out what's gone weird), etc. > > Anyway! What about the patch that takes the 2 allocations down to 1? > > That seems like an obvious improvement. > > Separate it from the struct_size() nonsense and Cc the author of that > code (Sandipan IIRC) and I might just apply it. Okay, thanks! -Kees -- Kees Cook
[PATCH v3 0/2] usercopy: Convert test_user_copy to KUnit test
Hi, This builds on the proposal[1] from Mark and lets me convert the existing usercopy selftest to KUnit. Besides adding this basic test to the KUnit collection, it also opens the door for execve testing (which depends on having a functional current->mm), and should provide the basic infrastructure for adding Mark's much more complete usercopy tests. v3: - use MEMEQ KUnit helper (David) - exclude pathological address confusion test for systems with separate address spaces, noticed by David - add KUnit-conditional exports for mm_alloc() and arch_pick_mmap_layout(), noticed by 0day v2: https://lore.kernel.org/lkml/20240610213055.it.075-k...@kernel.org/ v1: https://lore.kernel.org/lkml/20240519190422.work.715-k...@kernel.org/ -Kees [1] https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ Kees Cook (2): kunit: test: Add vm_mmap() allocation resource manager usercopy: Convert test_user_copy to KUnit test MAINTAINERS| 1 + include/kunit/test.h | 17 ++ kernel/fork.c | 3 + lib/Kconfig.debug | 21 +- lib/Makefile | 2 +- lib/kunit/Makefile | 1 + lib/kunit/user_alloc.c | 113 + lib/{test_user_copy.c => usercopy_kunit.c} | 282 ++--- mm/util.c | 3 + 9 files changed, 288 insertions(+), 155 deletions(-) create mode 100644 lib/kunit/user_alloc.c rename lib/{test_user_copy.c => usercopy_kunit.c} (46%) -- 2.34.1
[PATCH v3 1/2] kunit: test: Add vm_mmap() allocation resource manager
For tests that need to allocate using vm_mmap() (e.g. usercopy and execve), provide the interface to have the allocation tracked by KUnit itself. This requires bringing up a placeholder userspace mm. This combines my earlier attempt at this with Mark Rutland's version[1]. Normally mm_alloc() and arch_pick_mmap_layout() aren't exported for modules, so export these only for KUnit testing. Link: https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ [1] Co-developed-by: Mark Rutland Signed-off-by: Mark Rutland Reviewed-by: David Gow Signed-off-by: Kees Cook --- include/kunit/test.h | 17 +++ kernel/fork.c | 3 ++ lib/kunit/Makefile | 1 + lib/kunit/user_alloc.c | 113 + mm/util.c | 3 ++ 5 files changed, 137 insertions(+) create mode 100644 lib/kunit/user_alloc.c diff --git a/include/kunit/test.h b/include/kunit/test.h index e32b4cb7afa2..ec61cad6b71d 100644 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@ -480,6 +480,23 @@ static inline void *kunit_kcalloc(struct kunit *test, size_t n, size_t size, gfp return kunit_kmalloc_array(test, n, size, gfp | __GFP_ZERO); } +/** + * kunit_vm_mmap() - Allocate KUnit-tracked vm_mmap() area + * @test: The test context object. + * @file: struct file pointer to map from, if any + * @addr: desired address, if any + * @len: how many bytes to allocate + * @prot: mmap PROT_* bits + * @flag: mmap flags + * @offset: offset into @file to start mapping from. + * + * See vm_mmap() for more information. 
+ */ +unsigned long kunit_vm_mmap(struct kunit *test, struct file *file, + unsigned long addr, unsigned long len, + unsigned long prot, unsigned long flag, + unsigned long offset); + void kunit_cleanup(struct kunit *test); void __printf(2, 3) kunit_log_append(struct string_stream *log, const char *fmt, ...); diff --git a/kernel/fork.c b/kernel/fork.c index 99076dbe27d8..cea203197136 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -115,6 +115,8 @@ #define CREATE_TRACE_POINTS #include +#include + /* * Minimum number of threads to boot the kernel */ @@ -1334,6 +1336,7 @@ struct mm_struct *mm_alloc(void) memset(mm, 0, sizeof(*mm)); return mm_init(mm, current, current_user_ns()); } +EXPORT_SYMBOL_IF_KUNIT(mm_alloc); static inline void __mmput(struct mm_struct *mm) { diff --git a/lib/kunit/Makefile b/lib/kunit/Makefile index 309659a32a78..56dd67dc6e57 100644 --- a/lib/kunit/Makefile +++ b/lib/kunit/Makefile @@ -2,6 +2,7 @@ obj-$(CONFIG_KUNIT) += kunit.o kunit-objs += test.o \ resource.o \ + user_alloc.o \ static_stub.o \ string-stream.o \ assert.o \ diff --git a/lib/kunit/user_alloc.c b/lib/kunit/user_alloc.c new file mode 100644 index ..76d3d1345ed7 --- /dev/null +++ b/lib/kunit/user_alloc.c @@ -0,0 +1,113 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KUnit userspace memory allocation resource management. + */ +#include +#include +#include +#include + +struct kunit_vm_mmap_resource { + unsigned long addr; + size_t size; +}; + +/* vm_mmap() arguments */ +struct kunit_vm_mmap_params { + struct file *file; + unsigned long addr; + unsigned long len; + unsigned long prot; + unsigned long flag; + unsigned long offset; +}; + +/* Create and attach a new mm if it doesn't already exist. */ +static int kunit_attach_mm(void) +{ + struct mm_struct *mm; + + if (current->mm) + return 0; + + mm = mm_alloc(); + if (!mm) + return -ENOMEM; + + /* Define the task size. */ + mm->task_size = TASK_SIZE; + + /* Make sure we can allocate new VMAs. 
*/ + arch_pick_mmap_layout(mm, &current->signal->rlim[RLIMIT_STACK]); + + /* Attach the mm. It will be cleaned up when the process dies. */ + kthread_use_mm(mm); + + return 0; +} + +static int kunit_vm_mmap_init(struct kunit_resource *res, void *context) +{ + struct kunit_vm_mmap_params *p = context; + struct kunit_vm_mmap_resource vres; + int ret; + + ret = kunit_attach_mm(); + if (ret) + return ret; + + vres.size = p->len; + vres.addr = vm_mmap(p->file, p->addr, p->len, p->prot, p->flag, p->offset); + if (!vres.addr) + return -ENOMEM; + res->data = kmemdup(&vres, sizeof(vres), GFP_KERNEL); + if (!res->data) { + vm_munmap(vres.addr, vres.size); + return -ENOMEM; + } + + return 0; +} + +static void kunit_vm_mmap_free(struct kunit_resource *res) +{ + struct k
[PATCH v3 2/2] usercopy: Convert test_user_copy to KUnit test
Convert the runtime tests of hardened usercopy to standard KUnit tests. Additionally disable usercopy_test_invalid() for systems with separate address spaces (or no MMU) since it's not sensible to test for address confusion there (e.g. m68k). Co-developed-by: Vitor Massaru Iha Signed-off-by: Vitor Massaru Iha Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org Tested-by: Ivan Orlov Reviewed-by: David Gow Signed-off-by: Kees Cook --- MAINTAINERS| 1 + lib/Kconfig.debug | 21 +- lib/Makefile | 2 +- lib/{test_user_copy.c => usercopy_kunit.c} | 282 ++--- 4 files changed, 151 insertions(+), 155 deletions(-) rename lib/{test_user_copy.c => usercopy_kunit.c} (46%) diff --git a/MAINTAINERS b/MAINTAINERS index 8754ac2c259d..0cd171ec6010 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11962,6 +11962,7 @@ F: arch/*/configs/hardening.config F: include/linux/overflow.h F: include/linux/randomize_kstack.h F: kernel/configs/hardening.config +F: lib/usercopy_kunit.c F: mm/usercopy.c K: \b(add|choose)_random_kstack_offset\b K: \b__check_(object_size|heap_object)\b diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 59b6765d86b8..561e346f5cb0 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -2505,18 +2505,6 @@ config TEST_VMALLOC If unsure, say N. -config TEST_USER_COPY - tristate "Test user/kernel boundary protections" - depends on m - help - This builds the "test_user_copy" module that runs sanity checks - on the copy_to/from_user infrastructure, making sure basic - user/kernel boundary testing is working. If it fails to load, - a regression has been detected in the user/kernel memory boundary - protections. - - If unsure, say N. - config TEST_BPF tristate "Test BPF filter functionality" depends on m && NET @@ -2814,6 +2802,15 @@ config SIPHASH_KUNIT_TEST This is intended to help people writing architecture-specific optimized versions. If unsure, say N. 
+config USERCOPY_KUNIT_TEST + tristate "KUnit Test for user/kernel boundary protections" + depends on KUNIT + default KUNIT_ALL_TESTS + help + This builds the "usercopy_kunit" module that runs sanity checks + on the copy_to/from_user infrastructure, making sure basic + user/kernel boundary testing is working. + config TEST_UDELAY tristate "udelay test driver" help diff --git a/lib/Makefile b/lib/Makefile index 3b1769045651..fae5cc67b95a 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -78,7 +78,6 @@ obj-$(CONFIG_TEST_LKM) += test_module.o obj-$(CONFIG_TEST_VMALLOC) += test_vmalloc.o obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o obj-$(CONFIG_TEST_SORT) += test_sort.o -obj-$(CONFIG_TEST_USER_COPY) += test_user_copy.o obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_keys.o obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o obj-$(CONFIG_TEST_DYNAMIC_DEBUG) += test_dynamic_debug.o @@ -388,6 +387,7 @@ CFLAGS_fortify_kunit.o += $(call cc-disable-warning, stringop-truncation) CFLAGS_fortify_kunit.o += $(DISABLE_STRUCTLEAK_PLUGIN) obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o +obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o diff --git a/lib/test_user_copy.c b/lib/usercopy_kunit.c similarity index 46% rename from lib/test_user_copy.c rename to lib/usercopy_kunit.c index 5ff04d8fe971..45f1e558c464 100644 --- a/lib/test_user_copy.c +++ b/lib/usercopy_kunit.c @@ -15,7 +15,7 @@ #include #include #include -#include +#include /* * Several 32-bit architectures support 64-bit {get,put}_user() calls. @@ -31,26 +31,27 @@ # define TEST_U64 #endif -#define test(condition, msg, ...) 
\ -({ \ - int cond = (condition); \ - if (cond) \ - pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \ - cond; \ -}) +struct usercopy_test_priv { + char *kmem; + char __user *umem; + size_t size; +}; static bool is_zeroed(void *from, size_t size) { return memchr_inv(from, 0x0, size) == NULL; } -static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size) +/* Test usage of check_nonzero_user(). */ +static void usercopy_test_check_nonzero_user(struct kunit *test) { - int ret = 0; size_t start, end, i, zero_start, zero_end
Re: [PATCH v2 2/2] usercopy: Convert test_user_copy to KUnit test
On Wed, Jun 12, 2024 at 09:21:52PM +0200, Geert Uytterhoeven wrote: > Hi Kees, > > On Wed, Jun 12, 2024 at 6:51 PM Kees Cook wrote: > > On Wed, Jun 12, 2024 at 05:13:39PM +0800, David Gow wrote: > > > On Tue, 11 Jun 2024 at 05:33, Kees Cook wrote: > > > > Convert the runtime tests of hardened usercopy to standard KUnit tests. > > > > > > > > Co-developed-by: Vitor Massaru Iha > > > > Signed-off-by: Vitor Massaru Iha > > > > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org > > > > Tested-by: Ivan Orlov > > > > Signed-off-by: Kees Cook > > > > --- > > > > > > This looks good, particularly with the x86 fix applied. > > > > > > It's still hanging on m68k -- I think at the 'illegal reversed > > > copy_to_user passed' test -- but I'll admit to not having tried to > > > debug it further. > > > > > > One other (set of) notes below about using KUNIT_EXPECT_MEMEQ_MSG(), > > > otherwise (assuming the m68k stuff isn't actually a regression, which > > > I haven't tested but I imagine is unlikely), > > > > I'm trying to debug a hang on m68k in the usercopy behavioral testing > > routines. It's testing for the pathological case of having inverted > > arguments to copy_to_user(): > > > > user_addr = kunit_vm_mmap(test, NULL, 0, priv->size, > > PROT_READ | PROT_WRITE | PROT_EXEC, > > MAP_ANONYMOUS | MAP_PRIVATE, 0); > > ... > > bad_usermem = (char *)user_addr; > > ... > > KUNIT_EXPECT_NE_MSG(test, copy_to_user((char __user *)kmem, > > bad_usermem, > >PAGE_SIZE), 0, > > "illegal reversed copy_to_user passed"); > > > > On other architectures, this immediately fails because the access_ok() > > check rejects it. On m68k with CONFIG_ALTERNATE_USER_ADDRESS_SPACE, > > access_ok() short-circuits to "true". I've tried reading > > arch/m68k/include/asm/uaccess.h but I'm not sure what's happening under > > CONFIG_CPU_HAS_ADDRESS_SPACES. > > On m68k CPUs that support CPU_HAS_ADDRESS_SPACES (i.e. 
all traditional > 680x0 that can run real Linux), the CPU has separate address spaces > for kernel and user addresses. Accessing userspace addresses is done > using the special "moves" instruction, so we can just use the MMU to > catch invalid accesses. Okay, that's what I suspected. I think I'll need to just not test this particular case for archs with separate address spaces, since it would be meaningless. > > > For now I've excluded that test for m68k, but I'm not sure what's > > expected to happen here on m68k for this set of bad arguments. Can you > > advise? > > Perhaps the kernel address is actually a valid user address, or > vice versa? Right -- I think that's what's happened. > > Does the test work on systems that use 4G/4G for kernel/userspace > instead of the usual 1G/3G split? > > /me runs the old test_user_copy.ko on ARAnyM > Seems to take a while? Or it hangs, too? Sounds like the same behavior. > Related reading material > https://lore.kernel.org/all/camuhmduzhwm5_tl7tnaof+uqhejnkgsqf+_vzqgrzb_3euf...@mail.gmail.com/ > https://lore.kernel.org/all/camuhmdvq93ihgcxubjpttahdpjxxlyvasar-m3twbjv0krs...@mail.gmail.com/ Thanks! -Kees -- Kees Cook
Re: [PATCH v4 0/3] Hardening perf subsystem
On Tue, Jun 11, 2024 at 09:55:42AM +0200, Peter Zijlstra wrote: > On Mon, Jun 10, 2024 at 02:46:09PM -0700, Kees Cook wrote: > > > > I really detest this thing because it makes what was trivially readable > > > into something opaque. Get me that type qualifier that traps on overflow > > > and write plain C. All this __builtin_overflow garbage is just that, > > > unreadable nonsense. > > > > It's more readable than container_of(), > > Yeah, no. container_of() is absolutely trivial and very readable. > container_of_const() a lot less so. I mean, we have complex macros in the kernel. This isn't uncommon. Look at cleanup.h. ;) But that's why we write kern-doc: * struct_size() - Calculate size of structure with trailing flexible * array. * @p: Pointer to the structure. * @member: Name of the array member. * @count: Number of elements in the array. > #define struct_size(p, member, num) \ > mult_add_no_overflow(num, sizeof(p->member), sizeof(*p)) > > would be *FAR* more readable. And then I still think struct_size() is > less readable than its expansion. I'm happy to take patches. And for this bikeshed, this would be better named under the size_*() helpers which are trying to keep size_t calculations from overflowing (by saturating). i.e.: size_add_mult(sizeof(*p), sizeof(*p->member), num) Generalized overflow handling (check_add/sub/mul_overflow()) helpers already exist and require a destination variable to determine type sizes. It is not safe to allow C semantics to perform promotion/truncation, so we have to be explicit. > Some day I'll have to look at this FORTIFY_SOURCE and see what it > actually does I suppose :/ LOL. It's basically doing compile-time (__builtin_object_size) and run-time (__builtin_dynamic_object_size) bounds checking on destination (and source) object sizes, mainly driven by the mentioned builtins: https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html > I couldn't quickly find a single instance in the code I care about. 
So > nothing is sailing afaict. I have to care about all of Linux's code. :P We generally don't have to defend the kernel from perf, so the hardening changes tend to end up in core APIs along with compiler improvements. Anyway! What about the patch that takes the 2 allocations down to 1? That seems like an obvious improvement. -Kees -- Kees Cook
Re: [PATCH v3 2/2] pstore/ramoops: Add ramoops.mem_name= command line option
On Tue, Jun 11, 2024 at 10:49:13AM -0400, Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > Add a method to find a region specified by reserve_mem=nn:align:name for > ramoops. Adding a kernel command line parameter: > > reserve_mem=12M:4096:oops ramoops.mem_name=oops > > Will use the size and location defined by the memmap parameter where it > finds the memory and labels it "oops". The "oops" in the ramoops option > is used to search for it. > > This allows for arbitrary RAM to be used for ramoops if it is known that > the memory is not cleared on kernel crashes or soft reboots. > > Signed-off-by: Steven Rostedt (Google) Acked-by: Kees Cook Let me know if this should go via the pstore tree, if you'd rather carry it? -- Kees Cook
Re: [PATCH v3] selftests: seccomp: fix format-zero-length warnings
On Tue, Jun 11, 2024 at 05:16:08PM +0200, Amer Al Shanawany wrote: > fix the following errors by using string format specifier and an empty > parameter: > > seccomp_benchmark.c:197:24: warning: zero-length gnu_printf format > string [-Wformat-zero-length] > 197 | ksft_print_msg(""); > |^~ > seccomp_benchmark.c:202:24: warning: zero-length gnu_printf format > string [-Wformat-zero-length] > 202 | ksft_print_msg(""); > |^~ > seccomp_benchmark.c:204:24: warning: zero-length gnu_printf format > string [-Wformat-zero-length] > 204 | ksft_print_msg(""); > |^~ > > Reported-by: kernel test robot > Closes: > https://lore.kernel.org/oe-kbuild-all/202312260235.uj5ug8k9-...@intel.com/ > Suggested-by: Kees Cook > Signed-off-by: Amer Al Shanawany Thanks! Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v2] x86/traps: Enable UBSAN traps on x86
On Tue, Jun 11, 2024 at 01:26:09PM -0700, Gatlin Newhouse wrote: > On Mon, Jun 03, 2024 at 06:13:53PM UTC, Thomas Gleixner wrote: > > On Sat, Jun 01 2024 at 03:10, Gatlin Newhouse wrote: > > > > > Bring x86 to parity with arm64, similar to commit 25b84002afb9 > > > ("arm64: Support Clang UBSAN trap codes for better reporting"). > > > Enable the output of UBSAN type information on x86 architectures > > > compiled with clang when CONFIG_UBSAN_TRAP=y. Currently ARM > > > architectures output which specific sanitizer caused the trap, > > > via the encoded data in the trap instruction. Clang on x86 > > > currently encodes the same data in ud1 instructions but the x86 > > > handle_bug() and is_valid_bugaddr() functions currently only look > > > at ud2s. > > > > Please structure your change log properly instead of one paragraph of > > unstructured word salad. See: > > > > > > https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog > > > > > +/* > > > + * Check for UD1, UD2, with or without Address Size Override Prefixes > > > instructions. > > > + */ > > > __always_inline int is_valid_bugaddr(unsigned long addr) > > > { > > > if (addr < TASK_SIZE_MAX) > > > @@ -88,7 +92,13 @@ __always_inline int is_valid_bugaddr(unsigned long > > > addr) > > >* We got #UD, if the text isn't readable we'd have gotten > > >* a different exception. > > >*/ > > > - return *(unsigned short *)addr == INSN_UD2; > > > + if (*(u16 *)addr == INSN_UD2) > > > + return INSN_UD2; > > > + if (*(u16 *)addr == INSN_UD1) > > > + return INSN_UD1; > > > + if (*(u8 *)addr == INSN_ASOP && *(u16 *)(addr + 1) == INSN_UD1) > > > > s/1/LEN_ASOP/ ? > > > > > + return INSN_ASOP; > > > + return 0; > > > > I'm not really a fan of the reuse of the INSN defines here. Especially > > not about INSN_ASOP. Also 0 is just lame. > > > > Neither does the function name make sense anymore. 
is_valid_bugaddr() is > > clearly telling that it's a boolean check (despite the return value > > being int for hysterical raisins). But now you turn it into a > > non-boolean integer which returns a instruction encoding. That's > > hideous. Programming should result in obvious code and that should be > > pretty obvious to people who create tools to validate code. > > > > Also all UBSAN cares about is the actual failure type and not the > > instruction itself: > > > > #define INSN_UD_MASK0x > > #define INSN_ASOP_MASK 0x00FF > > > > #define BUG_UD_NONE 0x > > #define BUG_UD2 0xFFFE > > > > __always_inline u16 get_ud_type(unsigned long addr) > > { > > u16 insn; > > > > if (addr < TASK_SIZE_MAX) > > return BUD_UD_NONE; > > > > insn = *(u16 *)addr; > > if ((insn & INSN_UD_MASK) == INSN_UD2) > > return BUG_UD2; > > > > if ((insn & INSN_ASOP_MASK) == INSN_ASOP) > > insn = *(u16 *)(++addr); > > > > // UBSAN encodes the failure type in the two bytes after UD1 > > if ((insn & INSN_UD_MASK) == INSN_UD1) > > return *(u16 *)(addr + LEN_UD1); > > > > return BUG_UD_NONE; > > } > > > > No? > > Thanks for the feedback. > > It seems that is_valid_bugaddr() needs to be implemented on all architectures > and the function get_ud_type() replaces it here. So how should the patch > handle > is_valid_bugaddr()? Should the function remain as-is in traps.c despite no > longer being used? Yeah, this is why I'd suggested to Gatlin in early designs to reuse is_valid_bugaddr()'s int value. It's a required function, so it seemed sensible to just repurpose it from yes/no to no/type1/type2/type3/etc. -Kees -- Kees Cook
Re: [PATCH v3] wifi: mac80211: Avoid address calculations via out of bounds array indexing
On Wed, Jun 12, 2024 at 10:20:41AM +0200, Johannes Berg wrote: > On Wed, 2024-06-05 at 11:22 -0400, Kenton Groombridge wrote: > > > > Co-authored-by: Kees Cook > > Signed-off-by: Kenton Groombridge > > > > Wait ... I don't know what Kees did here, but seems then he should sign- > off too. > > Kees? I had helped mainly in private emails with Kenton. I'm fine with whatever tags people want. :) Signed-off-by: Kees Cook and/or Reviewed-by: Kees Cook (Though note the email address change just to keep checkpatch happy about Co-authored-by vs S-o-b mismatches.) Thanks! -- Kees Cook
Re: [PATCH v2 2/2] usercopy: Convert test_user_copy to KUnit test
On Wed, Jun 12, 2024 at 05:13:39PM +0800, David Gow wrote: > On Tue, 11 Jun 2024 at 05:33, Kees Cook wrote: > > > > Convert the runtime tests of hardened usercopy to standard KUnit tests. > > > > Co-developed-by: Vitor Massaru Iha > > Signed-off-by: Vitor Massaru Iha > > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org > > Tested-by: Ivan Orlov > > Signed-off-by: Kees Cook > > --- > > This looks good, particularly with the x86 fix applied. > > It's still hanging on m68k -- I think at the 'illegal reversed > copy_to_user passed' test -- but I'll admit to not having tried to > debug it further. > > One other (set of) notes below about using KUNIT_EXPECT_MEMEQ_MSG(), > otherwise (assuming the m68k stuff isn't actually a regression, which > I haven't tested but I imagine is unlikely), Hi Geert, I'm trying to debug a hang on m68k in the usercopy behavioral testing routines. It's testing for the pathological case of having inverted arguments to copy_to_user(): user_addr = kunit_vm_mmap(test, NULL, 0, priv->size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, 0); ... bad_usermem = (char *)user_addr; ... KUNIT_EXPECT_NE_MSG(test, copy_to_user((char __user *)kmem, bad_usermem, PAGE_SIZE), 0, "illegal reversed copy_to_user passed"); On other architectures, this immediately fails because the access_ok() check rejects it. On m68k with CONFIG_ALTERNATE_USER_ADDRESS_SPACE, access_ok() short-circuits to "true". I've tried reading arch/m68k/include/asm/uaccess.h but I'm not sure what's happening under CONFIG_CPU_HAS_ADDRESS_SPACES. For now I've excluded that test for m68k, but I'm not sure what's expected to happen here on m68k for this set of bad arguments. Can you advise? Thanks! -Kees -- Kees Cook
Re: [PATCH v2 2/2] usercopy: Convert test_user_copy to KUnit test
On Wed, Jun 12, 2024 at 05:13:39PM +0800, David Gow wrote: > On Tue, 11 Jun 2024 at 05:33, Kees Cook wrote: > > > > Convert the runtime tests of hardened usercopy to standard KUnit tests. > > > > Co-developed-by: Vitor Massaru Iha > > Signed-off-by: Vitor Massaru Iha > > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org > > Tested-by: Ivan Orlov > > Signed-off-by: Kees Cook > > --- > > This looks good, particularly with the x86 fix applied. Thanks! > It's still hanging on m68k -- I think at the 'illegal reversed > copy_to_user passed' test -- but I'll admit to not having tried to > debug it further. For my own future reference, I have reproduced this with: $ sudo apt install gcc-m68k-linux-gnu $ tools/testing/kunit/kunit.py run --arch m68k --make_option CROSS_COMPILE=m68k-linux-gnu- usercopy I'll figure it out... > One other (set of) notes below about using KUNIT_EXPECT_MEMEQ_MSG(), > otherwise (assuming the m68k stuff isn't actually a regression, which > I haven't tested but I imagine is unlikely), I should really read all the API docs every few releases. :) I will switch to KUNIT_EXPECT_MEMEQ_MSG! -- Kees Cook
Re: [PATCH v4 0/3] Hardening perf subsystem
On Mon, Jun 10, 2024 at 10:05:44PM +0200, Peter Zijlstra wrote: > On Mon, Jun 10, 2024 at 10:28:52AM -0700, Kees Cook wrote: > > On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote: > > > Hi everyone, > > > > > > This is an effort to get rid of all multiplications from allocation > > > functions in order to prevent integer overflows [1][2]. > > > > I didn't actually see these 3 patches in this thread nor via lore. > > He managed to break threading between 0/n and the rest. > > > > In the first patch, the "struct amd_uncore_ctx" can be refactored to > > > use a flex array for the "events" member. This way, the allocation/ > > > freeing of the memory can be simplified. Then, the struct_size() > > > helper can be used to do the arithmetic calculation for the memory > > > to be allocated. > > > > I like this patch because it reduces the allocation from 2 to 1. This > > isn't what Peter might see as "churn": this is an improvement in resource > > utilization. > > But then he went and used that struct_size() abomination :/ > > > I prefer this style, as it makes things unambiguous ("this will never > > wrap around") without having to check the associated types and doesn't make > > the resulting binary code different in the "can never overflow" case. > > > > In this particular case: > > > > int size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg); > > > > "int numshared" comes from struct intel_uncore_type::num_shared_regs, > > which is: > > > > unsigned num_shared_regs:8; > > > > And the struct sizes are: > > > > $ pahole -C intel_uncore_box gcc-boot/vmlinux | grep size: > > /* size: 488, cachelines: 8, members: 19 */ > > $ pahole -C intel_uncore_extra_reg gcc-boot/vmlinux | grep size: > > /* size: 96, cachelines: 2, members: 5 */ > > > > So we have: > > > > s32 size = 488 + u8 * 96 > > > > Max size here is 24968 so it can never overflow an s32, so I can see > > why Peter views this as "churn". 
> > > > I still think the patch is a coding style improvement, but okay. > > I really detest this thing because it makes what was trivially readable > into something opaque. Get me that type qualifier that traps on overflow > and write plain C. All this __builtin_overflow garbage is just that, > unreadable nonsense. It's more readable than container_of(), IMO. "give me the struct size for variable VAR, which has a flexible array MEMBER, when we have COUNT many of them": struct_size(VAR, MEMBER, COUNT). It's more readable, more robust, and provides saturation in the face of potential wrap-around. > > This provides __counted_by coverage, and I think this is important to > > gain in every place we can. Given that this is part of a ring buffer > > implementation that is arbitrarily sized, this is exactly the kind of > > place I'd like to see __counted_by used. This is a runtime robustness > > improvement, so I don't see this as "churn" at all. > > Again, mixed in with that other crap. Anyway, remind me wth this > __counted_by thing actually does? It provides annotation for the compiler to perform run-time bounds checking on dynamically sized arrays. i.e. CONFIG_FORTIFY_SOURCE and CONFIG_UBSAN_BOUNDS can actually reason about annotated flexible arrays instead of just saying "oh no a flexible array, I give up". > > Peter, for patches 1 and 3, if you'd prefer not to carry them, I could > > put them in the hardening tree to keep them out of your way. It seems > > clear you don't want patch 2 at all. > > I prefer to not have struct_size() anywhere at all. Please just write > readable code. That ship has sailed, and it has been keeping things at bay for a while now. As we make progress on making the compiler able to do this more naturally, we can work on replacing struct_size(), but it's in use globally and it's useful both for catching runtime mistakes and for catching compile-time mistakes (the flexible array has to match the variable's struct). -Kees -- Kees Cook
[PATCH v2 1/2] kunit: test: Add vm_mmap() allocation resource manager
For tests that need to allocate using vm_mmap() (e.g. usercopy and execve), provide the interface to have the allocation tracked by KUnit itself. This requires bringing up a placeholder userspace mm. This combines my earlier attempt at this with Mark Rutland's version[1]. Link: https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ [1] Co-developed-by: Mark Rutland Signed-off-by: Mark Rutland Signed-off-by: Kees Cook --- include/kunit/test.h | 17 +++ lib/kunit/Makefile | 1 + lib/kunit/user_alloc.c | 111 + 3 files changed, 129 insertions(+) create mode 100644 lib/kunit/user_alloc.c diff --git a/include/kunit/test.h b/include/kunit/test.h index e32b4cb7afa2..ec61cad6b71d 100644 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@ -480,6 +480,23 @@ static inline void *kunit_kcalloc(struct kunit *test, size_t n, size_t size, gfp return kunit_kmalloc_array(test, n, size, gfp | __GFP_ZERO); } +/** + * kunit_vm_mmap() - Allocate KUnit-tracked vm_mmap() area + * @test: The test context object. + * @file: struct file pointer to map from, if any + * @addr: desired address, if any + * @len: how many bytes to allocate + * @prot: mmap PROT_* bits + * @flag: mmap flags + * @offset: offset into @file to start mapping from. + * + * See vm_mmap() for more information. 
*/ +unsigned long kunit_vm_mmap(struct kunit *test, struct file *file, + unsigned long addr, unsigned long len, + unsigned long prot, unsigned long flag, + unsigned long offset); + void kunit_cleanup(struct kunit *test); void __printf(2, 3) kunit_log_append(struct string_stream *log, const char *fmt, ...); diff --git a/lib/kunit/Makefile b/lib/kunit/Makefile index 309659a32a78..56dd67dc6e57 100644 --- a/lib/kunit/Makefile +++ b/lib/kunit/Makefile @@ -2,6 +2,7 @@ obj-$(CONFIG_KUNIT) += kunit.o kunit-objs += test.o \ resource.o \ + user_alloc.o \ static_stub.o \ string-stream.o \ assert.o \ diff --git a/lib/kunit/user_alloc.c b/lib/kunit/user_alloc.c new file mode 100644 index ..d66f42282f43 --- /dev/null +++ b/lib/kunit/user_alloc.c @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KUnit userspace memory allocation resource management. + */ +#include +#include +#include +#include + +struct kunit_vm_mmap_resource { + unsigned long addr; + size_t size; +}; + +/* vm_mmap() arguments */ +struct kunit_vm_mmap_params { + struct file *file; + unsigned long addr; + unsigned long len; + unsigned long prot; + unsigned long flag; + unsigned long offset; +}; + +/* Create and attach a new mm if it doesn't already exist. */ +static int kunit_attach_mm(void) +{ + struct mm_struct *mm; + + if (current->mm) + return 0; + + mm = mm_alloc(); + if (!mm) + return -ENOMEM; + + /* Define the task size. */ + mm->task_size = TASK_SIZE; + + /* Make sure we can allocate new VMAs. */ + arch_pick_mmap_layout(mm, &current->signal->rlim[RLIMIT_STACK]); + + /* Attach the mm. It will be cleaned up when the process dies. 
*/ + kthread_use_mm(mm); + + return 0; +} + +static int kunit_vm_mmap_init(struct kunit_resource *res, void *context) +{ + struct kunit_vm_mmap_params *p = context; + struct kunit_vm_mmap_resource vres; + int ret; + + ret = kunit_attach_mm(); + if (ret) + return ret; + + vres.size = p->len; + vres.addr = vm_mmap(p->file, p->addr, p->len, p->prot, p->flag, p->offset); + if (!vres.addr) + return -ENOMEM; + res->data = kmemdup(&vres, sizeof(vres), GFP_KERNEL); + if (!res->data) { + vm_munmap(vres.addr, vres.size); + return -ENOMEM; + } + + return 0; +} + +static void kunit_vm_mmap_free(struct kunit_resource *res) +{ + struct kunit_vm_mmap_resource *vres = res->data; + + /* +* Since this is executed from the test monitoring process, +* the test's mm has already been torn down. We don't need +* to run vm_munmap(vres->addr, vres->size), only clean up +* the vres. +*/ + + kfree(vres); + res->data = NULL; +} + +unsigned long kunit_vm_mmap(struct kunit *test, struct file *file, + unsigned long addr, unsigned long len, + unsigned long prot, unsigned long flag, + unsigned long offset) +{ + struct kunit_vm_mmap_params params = { + .file = file, + .addr = addr
[PATCH v2 0/2] usercopy: Convert test_user_copy to KUnit test
Hi, This builds on the proposal[1] from Mark and lets me convert the existing usercopy selftest to KUnit. Besides adding this basic test to the KUnit collection, it also opens the door for execve testing (which depends on having a functional current->mm), and should provide the basic infrastructure for adding Mark's much more complete usercopy tests. v2: - dropped "initial VMA", turns out it wasn't needed (Mark) - various cleanups in the test itself (Vitor) - moved new kunit resource to a separate file (David) v1: https://lore.kernel.org/all/20240519190422.work.715-k...@kernel.org/ -Kees [1] https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ Kees Cook (2): kunit: test: Add vm_mmap() allocation resource manager usercopy: Convert test_user_copy to KUnit test MAINTAINERS| 1 + include/kunit/test.h | 17 ++ lib/Kconfig.debug | 21 +- lib/Makefile | 2 +- lib/kunit/Makefile | 1 + lib/kunit/user_alloc.c | 111 + lib/{test_user_copy.c => usercopy_kunit.c} | 273 ++--- 7 files changed, 271 insertions(+), 155 deletions(-) create mode 100644 lib/kunit/user_alloc.c rename lib/{test_user_copy.c => usercopy_kunit.c} (47%) -- 2.34.1
[PATCH v2 2/2] usercopy: Convert test_user_copy to KUnit test
Convert the runtime tests of hardened usercopy to standard KUnit tests. Co-developed-by: Vitor Massaru Iha Signed-off-by: Vitor Massaru Iha Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org Tested-by: Ivan Orlov Signed-off-by: Kees Cook --- MAINTAINERS| 1 + lib/Kconfig.debug | 21 +- lib/Makefile | 2 +- lib/{test_user_copy.c => usercopy_kunit.c} | 273 ++--- 4 files changed, 142 insertions(+), 155 deletions(-) rename lib/{test_user_copy.c => usercopy_kunit.c} (47%) diff --git a/MAINTAINERS b/MAINTAINERS index 8754ac2c259d..0cd171ec6010 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11962,6 +11962,7 @@ F: arch/*/configs/hardening.config F: include/linux/overflow.h F: include/linux/randomize_kstack.h F: kernel/configs/hardening.config +F: lib/usercopy_kunit.c F: mm/usercopy.c K: \b(add|choose)_random_kstack_offset\b K: \b__check_(object_size|heap_object)\b diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 59b6765d86b8..561e346f5cb0 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -2505,18 +2505,6 @@ config TEST_VMALLOC If unsure, say N. -config TEST_USER_COPY - tristate "Test user/kernel boundary protections" - depends on m - help - This builds the "test_user_copy" module that runs sanity checks - on the copy_to/from_user infrastructure, making sure basic - user/kernel boundary testing is working. If it fails to load, - a regression has been detected in the user/kernel memory boundary - protections. - - If unsure, say N. - config TEST_BPF tristate "Test BPF filter functionality" depends on m && NET @@ -2814,6 +2802,15 @@ config SIPHASH_KUNIT_TEST This is intended to help people writing architecture-specific optimized versions. If unsure, say N. 
+config USERCOPY_KUNIT_TEST + tristate "KUnit Test for user/kernel boundary protections" + depends on KUNIT + default KUNIT_ALL_TESTS + help + This builds the "usercopy_kunit" module that runs sanity checks + on the copy_to/from_user infrastructure, making sure basic + user/kernel boundary testing is working. + config TEST_UDELAY tristate "udelay test driver" help diff --git a/lib/Makefile b/lib/Makefile index 3b1769045651..fae5cc67b95a 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -78,7 +78,6 @@ obj-$(CONFIG_TEST_LKM) += test_module.o obj-$(CONFIG_TEST_VMALLOC) += test_vmalloc.o obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o obj-$(CONFIG_TEST_SORT) += test_sort.o -obj-$(CONFIG_TEST_USER_COPY) += test_user_copy.o obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_keys.o obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o obj-$(CONFIG_TEST_DYNAMIC_DEBUG) += test_dynamic_debug.o @@ -388,6 +387,7 @@ CFLAGS_fortify_kunit.o += $(call cc-disable-warning, stringop-truncation) CFLAGS_fortify_kunit.o += $(DISABLE_STRUCTLEAK_PLUGIN) obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o +obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o diff --git a/lib/test_user_copy.c b/lib/usercopy_kunit.c similarity index 47% rename from lib/test_user_copy.c rename to lib/usercopy_kunit.c index 5ff04d8fe971..f1689f2c5c7b 100644 --- a/lib/test_user_copy.c +++ b/lib/usercopy_kunit.c @@ -15,7 +15,7 @@ #include #include #include -#include +#include /* * Several 32-bit architectures support 64-bit {get,put}_user() calls. @@ -31,26 +31,27 @@ # define TEST_U64 #endif -#define test(condition, msg, ...) 
\ -({ \ - int cond = (condition); \ - if (cond) \ - pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \ - cond; \ -}) +struct usercopy_test_priv { + char *kmem; + char __user *umem; + size_t size; +}; static bool is_zeroed(void *from, size_t size) { return memchr_inv(from, 0x0, size) == NULL; } -static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size) +/* Test usage of check_nonzero_user(). */ +static void usercopy_test_check_nonzero_user(struct kunit *test) { - int ret = 0; size_t start, end, i, zero_start, zero_end; + struct usercopy_test_priv *priv = test->priv; + char __user *umem = priv->umem; + char *kmem = priv->kmem; + size_t size = priv->size; - if (test(size <
[PATCH] x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking
When reworking the range checking for get_user(), the get_user_8() case on 32-bit wasn't zeroing the high register. (The jump to bad_get_user_8 was accidentally dropped.) Restore the correct error handling destination (and rename the jump to using the expected ".L" prefix). While here, switch to using a named argument ("size") for the call template ("%c4" to "%c[size]") as already used in the other call templates in this file. Found after moving the usercopy selftests to KUnit: # usercopy_test_invalid: EXPECTATION FAILED at lib/usercopy_kunit.c:278 Expected val_u64 == 0, but val_u64 == -60129542144 (0xfffffff200000000) Reported-by: David Gow Closes: https://lore.kernel.org/all/CABVgOSn=tb=lj9sxhut4_9mtjjkvxsq-ikdxc4kgho4cfkv...@mail.gmail.com Fixes: b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and put_user()") Signed-off-by: Kees Cook --- Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: x...@kernel.org Cc: "H. Peter Anvin" Cc: Sean Christopherson Cc: Peter Zijlstra Cc: Arnd Bergmann Cc: "Kirill A.
Shutemov" Cc: Qiuxu Zhuo Cc: Nadav Amit Cc: Masahiro Yamada --- arch/x86/include/asm/uaccess.h | 4 ++-- arch/x86/lib/getuser.S | 6 +- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 0f9bab92a43d..3a7755c1a441 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -78,10 +78,10 @@ extern int __get_user_bad(void); int __ret_gu; \ register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);\ __chk_user_ptr(ptr);\ - asm volatile("call __" #fn "_%c4" \ + asm volatile("call __" #fn "_%c[size]" \ : "=a" (__ret_gu), "=r" (__val_gu),\ ASM_CALL_CONSTRAINT \ -: "0" (ptr), "i" (sizeof(*(ptr))));\ +: "0" (ptr), [size] "i" (sizeof(*(ptr)))); \ instrument_get_user(__val_gu); \ (x) = (__force __typeof__(*(ptr))) __val_gu;\ __builtin_expect(__ret_gu, 0); \ diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S index 10d5ed8b5990..a1cb3a4e6742 100644 --- a/arch/x86/lib/getuser.S +++ b/arch/x86/lib/getuser.S @@ -44,7 +44,11 @@ or %rdx, %rax .else cmp $TASK_SIZE_MAX-\size+1, %eax +.if \size != 8 jae .Lbad_get_user +.else + jae .Lbad_get_user_8 +.endif sbb %edx, %edx /* array_index_mask_nospec() */ and %edx, %eax .endif @@ -154,7 +158,7 @@ SYM_CODE_END(__get_user_handle_exception) #ifdef CONFIG_X86_32 SYM_CODE_START_LOCAL(__get_user_8_handle_exception) ASM_CLAC -bad_get_user_8: +.Lbad_get_user_8: xor %edx,%edx xor %ecx,%ecx mov $(-EFAULT),%_ASM_AX -- 2.34.1
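The reported value is telling: −60129542144 is 0xfffffff200000000, and its high 32 bits, 0xfffffff2, are exactly −EFAULT (−14). In other words, the failure path left the error code in the un-zeroed high register while only the low half was cleared. A toy userspace model of the contract (illustrative names, not the kernel implementation) makes the bug and the fix concrete:

```c
#include <assert.h>
#include <stdint.h>

#define EFAULT 14

/* Buggy path: only the low 32-bit register is zeroed on fault, so the
 * high half of the 64-bit result leaks whatever was in that register
 * (here, the -EFAULT error code, matching the observed failure). */
static int toy_get_user_u64_buggy(uint64_t *dst)
{
	uint32_t lo = 0;
	uint32_t hi = (uint32_t)-EFAULT;	/* high register not zeroed */

	*dst = ((uint64_t)hi << 32) | lo;
	return -EFAULT;
}

/* Fixed path: the whole destination is zeroed before returning the
 * error, which is what callers of get_user() are entitled to assume. */
static int toy_get_user_u64_fixed(uint64_t *dst)
{
	*dst = 0;
	return -EFAULT;
}
```

This is why the KUnit expectation `val_u64 == 0` is the right check: get_user() guarantees the destination is zeroed on failure.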
Re: [PATCH 2/2] usercopy: Convert test_user_copy to KUnit test
On Sat, Jun 08, 2024 at 04:44:10PM +0800, David Gow wrote: > On Mon, 20 May 2024 at 03:12, Kees Cook wrote: > > > > Convert the runtime tests of hardened usercopy to standard KUnit tests. > > > > Co-developed-by: Vitor Massaru Iha > > Signed-off-by: Vitor Massaru Iha > > Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org > > Signed-off-by: Kees Cook > > --- > > This fails here on i386: > > # usercopy_test_invalid: EXPECTATION FAILED at lib/usercopy_kunit.c:278 > > Expected val_u64 == 0, but > > val_u64 == -60129542144 (0xfffffff200000000) Hunh. I can reproduce this with "--arch=i386" but not under UML with SUBARCH=i386. But perhaps it's a difference in the get_user() implementations between the two. And this looks like a bug in the get_user() failure path on i386. I will investigate... > It also seems to be hanging somewhere in usercopy_test_invalid on my > m68k/qemu setup: > ./tools/testing/kunit/kunit.py run --build_dir=.kunit-m68k --arch m68k > usercopy Oh, that's weird. I'll need to get an m68k environment set up... > Otherwise, it looks fine. Maybe it'd make sense to split some of the > tests up a bit more, but it's a matter of taste (and only really an > advantage for debugging hangs where more detailed progress is nice). Yeah. I can do this in follow-up patches, perhaps. > With those architecture-specific hangs either fixed, or documented (if > they're actual problems, not issues with the test), this is: > > Reviewed-by: David Gow Thanks! -- Kees Cook
Re: [PATCH 1/2] kunit: test: Add vm_mmap() allocation resource manager
On Sat, Jun 08, 2024 at 04:44:16PM +0800, David Gow wrote: > On Mon, 20 May 2024 at 03:12, Kees Cook wrote: > > > > For tests that need to allocate using vm_mmap() (e.g. usercopy and > > execve), provide the interface to have the allocation tracked by KUnit > > itself. This requires bringing up a placeholder userspace mm. > > > > This combines my earlier attempt at this with Mark Rutland's version[1]. > > > > Link: > > https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ > > [1] > > Co-developed-by: Mark Rutland > > Signed-off-by: Mark Rutland > > Signed-off-by: Kees Cook > > --- > > Thanks very much for this! > > A few high-level thoughts: > - Do we want to move this into a separate file? test.{c,h} is already > getting pretty big, and this would probably fit more comfortably with > some of the resource-management bits, which are in their own files. > Not every test will need mm support. I'm happy to do that -- I was just following where kunit_kmalloc() was defined. I'll create a new file for it. > - It'd be nice for there to be a way to explicitly teardown/reset > this: I agree that this is made more awkward by KUnit cleanup normally > running on a different thread, but I could definitely see why a test > might want to unset/reset this, and it would be more consistent with > other resources. Yeah, it's weird, but it's naturally managed? > Otherwise, I have a few small questions below, but nothing essential. > There are a couple of test failures/hangs for the usercopy test (on > i386 and m68k), which may have origins here: I've mentioned them > there. I'll look into this. I must have some 64/32 oversight... > Reviewed-by: David Gow Thanks! > > +/* > > + * Arbitrarily chosen user address for the base allocation. > > + */ > > +#define UBUF_ADDR_BASE SZ_2M > > Are there any circumstances where we'd want a _different_ base address > here? Could it conflict with something / could tests require something > different? 
> > I suspect it's fine to leave it like this until such a case actually shows up. Yeah, it shouldn't be important, and as Mark has pointed out, it might not be needed at all. I'll see what I can do. > > + vres = kunit_alloc_resource(test, > > + kunit_vm_mmap_init, > > + kunit_vm_mmap_free, > > + GFP_KERNEL, > > + &params); > > It could be easier to use kunit_add_action() here, rather than > kunit_alloc_resource(), as you wouldn't need the params struct to pass > things through. > > The advantage to keeping the separate resource is that we can more > easily look it up later if we, for example, wanted to be able to make > it current on other threads (is that something we'd ever want to do?). I like having it follow the pattern of the other resource allocators, but if there's not a strong reason to switch, I'll leave it as-is. -- Kees Cook
Re: [PATCH 2/2] usercopy: Convert test_user_copy to KUnit test
On Wed, May 29, 2024 at 01:17:35PM +0100, Ivan Orlov wrote: > On 5/19/24 20:12, Kees Cook wrote: > > #define test(condition, msg, ...) \ > > ({ > > \ > > int cond = (condition); \ > > if (cond) \ > > - pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \ > > + KUNIT_EXPECT_FALSE_MSG(test, cond, msg, ##__VA_ARGS__); \ > > cond; \ > > }) > It looks like the 'test' macro is not used anymore, so probably it should be > removed. Oops, yes. Thanks! > > +static int usercopy_test_init(struct kunit *test) > > +{ > > + struct usercopy_test_priv *priv; > > + unsigned long user_addr; > > - if (ret == 0) { > > - pr_info("tests passed.\n"); > > - return 0; > > - } > > + priv = kunit_kzalloc(test, sizeof(*priv), GFP_KERNEL); > > + if (!priv) > > + return -ENOMEM; > > Should the check be done with KUNIT_ASSERT_NOT_ERR_OR_NULL here as well, as > it is done with priv->kmem? Yes, that's much more idiomatic. I'll adjust this too. > Other than that, > > Tested-by: Ivan Orlov Thanks! -- Kees Cook
Re: [PATCH 1/2] kunit: test: Add vm_mmap() allocation resource manager
On Mon, May 20, 2024 at 10:29:06AM +0100, Mark Rutland wrote: > On Sun, May 19, 2024 at 12:12:52PM -0700, Kees Cook wrote: > > +/* Create and attach a new mm if it doesn't already exist. */ > > +static int kunit_attach_mm(void) > > +{ > > + struct vm_area_struct *vma; > > + struct mm_struct *mm; > > + > > + if (current->mm) > > + return 0; > > My tests deliberately created/destroyed the mm for each test; surely we > don't want to inherit an MM in some arbitrary state? ... or is this just > so the mm can be allocated lazily upon the first mmap() within a test? It's for lazily creation and for supporting running the KUnit test as a module (where a userspace would exist). The old usercopy test worked against the existing userspace, so I'd want to continue to support that. > > > + > > + mm = mm_alloc(); > > + if (!mm) > > + return -ENOMEM; > > + > > + if (mmap_write_lock_killable(mm)) > > + goto out_free; > > + > > + /* Define the task size. */ > > + mm->task_size = TASK_SIZE; > > + > > + /* Prepare the base VMA. */ > > + vma = vm_area_alloc(mm); > > + if (!vma) > > + goto out_unlock; > > + > > + vma_set_anonymous(vma); > > + vma->vm_start = UBUF_ADDR_BASE; > > + vma->vm_end = UBUF_ADDR_BASE + PAGE_SIZE; > > + vm_flags_init(vma, VM_READ | VM_MAYREAD | VM_WRITE | VM_MAYWRITE); > > + vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); > > + > > + if (insert_vm_struct(mm, vma)) > > + goto out_free_vma; > > + > > + mmap_write_unlock(mm); > > Why do we need this VMA given you have kunit_vm_mmap()? When I was originally testing this, it seemed like I couldn't perform a vm_mmap() without an existing VMA. > This existed in my uaccess tests because I didn't use vm_mmap(), and I > wanted complete control over the addresses used. > > Given you add kunit_vm_mmap(), I don't think we want this VMA -- it > doesn't serve any real purpose to tests, and accesses can erroneously > hit it, which is problematic. 
> > UBUF_ADDR_BASE shouldn't be necessary either with kunit_vm_mmap(), > unless you want to use fixed addresses. That was just arbitrarily chosen > to be above NULL and the usual minimum mmap limit. I'll recheck whether this is needed. I think I had to make some other changes as well, so maybe something here ended up being redundant without my noticing it the first time. Thanks for looking this over! -- Kees Cook
Re: [PATCH 2/2] can: mcp251xfd: decorate mcp251xfd_rx_ring.obj with __counted_by()
On Sun, Jun 09, 2024 at 01:54:19PM +0900, Vincent Mailhol wrote: > A new __counted_by() attribute was introduced in [1]. It makes the > compiler's sanitizer aware of the actual size of a flexible array > member, allowing for additional runtime checks. > > Apply the __counted_by() attribute to the obj flexible array member of > struct mcp251xfd_rx_ring. > > Note that the mcp251xfd_rx_ring.obj member is polymorphic: it can be > either of: > > * an array of struct mcp251xfd_hw_rx_obj_can > * an array of struct mcp251xfd_hw_rx_obj_canfd > > The canfd type was chosen in the declaration by the original author to > reflect the upper bound. We pursue the same logic here: the sanitizer > will only see the accurate size of canfd frames. For classical can > frames, it will see a size bigger than the reality, making the check > incorrect but silent (false negative). > > [1] commit dd06e72e68bc ("Compiler Attributes: Add __counted_by macro") > Link: https://git.kernel.org/torvalds/c/dd06e72e68bc > > CC: Kees Cook > Signed-off-by: Vincent Mailhol > --- > drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h > b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h > index 24510b3b8020..b7579fba9457 100644 > --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h > +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h > @@ -565,7 +565,7 @@ struct mcp251xfd_rx_ring { > union mcp251xfd_write_reg_buf uinc_buf; > union mcp251xfd_write_reg_buf uinc_irq_disable_buf; > struct spi_transfer uinc_xfer[MCP251XFD_FIFO_DEPTH]; > - struct mcp251xfd_hw_rx_obj_canfd obj[]; > + struct mcp251xfd_hw_rx_obj_canfd obj[] __counted_by(obj_num); > }; > > struct __packed mcp251xfd_map_buf_nocrc { This one seems safe: rx_ring = kzalloc(sizeof(*rx_ring) + rx_obj_size * rx_obj_num, GFP_KERNEL); ... 
rx_ring->obj_num = rx_obj_num; But I would like to see the above allocation math replaced with struct_size() usage: rx_ring = kzalloc(struct_size(rx_ring, obj, rx_obj_num), GFP_KERNEL); But that leaves me with a question about the use of rx_obj_size in the original code. Why is this a variable size? rx_ring has only struct mcp251xfd_hw_rx_obj_canfd objects in the flexible array... I suspect that struct mcp251xfd_rx_ring needs to actually be using a union for its flexible array, but I can't find anything that is casting struct mcp251xfd_rx_ring::obj to struct mcp251xfd_hw_rx_obj_can. I'm worried about __counted_by getting used here if the flexible array isn't actually the right type. -- Kees Cook
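The reason struct_size() is preferred over open-coded `sizeof(*p) + n * sizeof(elem)` is that it checks the arithmetic and saturates to SIZE_MAX on overflow, so the allocator refuses the request instead of handing back a wrapped-around, too-small buffer. A userspace sketch of those semantics, using the compiler overflow builtins (the kernel's real implementation lives in include/linux/overflow.h; this is only a model):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating base + count * elem, modeling struct_size() semantics:
 * any overflow yields SIZE_MAX, which no allocator can satisfy, so the
 * allocation fails cleanly rather than succeeding with a short buffer. */
static size_t sat_struct_size(size_t base, size_t count, size_t elem)
{
	size_t bytes, total;

	if (__builtin_mul_overflow(count, elem, &bytes) ||
	    __builtin_add_overflow(base, bytes, &total))
		return SIZE_MAX;
	return total;
}
```

With the open-coded form, an attacker-influenced count can wrap the multiplication silently; with the saturating form the failure is visible at the allocation site.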
Re: [PATCH 1/2] can: peak_canfd: decorate pciefd_board.can with __counted_by()
On Sun, Jun 09, 2024 at 01:54:18PM +0900, Vincent Mailhol wrote: > A new __counted_by() attribute was introduced in [1]. It makes the > compiler's sanitizer aware of the actual size of a flexible array > member, allowing for additional runtime checks. > > Move the end of line comments to the previous line to make room and > apply the __counted_by() attribute to the can flexible array member of > struct pciefd_board. > > [1] commit dd06e72e68bc ("Compiler Attributes: Add __counted_by macro") > Link: https://git.kernel.org/torvalds/c/dd06e72e68bc > > CC: Kees Cook > Signed-off-by: Vincent Mailhol > --- > drivers/net/can/peak_canfd/peak_pciefd_main.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/can/peak_canfd/peak_pciefd_main.c > b/drivers/net/can/peak_canfd/peak_pciefd_main.c > index 1df3c4b54f03..636102103a88 100644 > --- a/drivers/net/can/peak_canfd/peak_pciefd_main.c > +++ b/drivers/net/can/peak_canfd/peak_pciefd_main.c > @@ -190,8 +190,10 @@ struct pciefd_board { > void __iomem *reg_base; > struct pci_dev *pci_dev; > int can_count; > - spinlock_t cmd_lock;/* 64-bits cmds must be atomic */ > - struct pciefd_can *can[]; /* array of network devices */ > + /* 64-bits cmds must be atomic */ > + spinlock_t cmd_lock; > + /* array of network devices */ > + struct pciefd_can *can[] __counted_by(can_count); > }; > > /* supported device ids. */ You'll need to adjust the code logic that manipulates "can_count", as accesses to "can" will trap when they're seen as out of bounds. For example: pciefd = devm_kzalloc(&pdev->dev, struct_size(pciefd, can, can_count), GFP_KERNEL); ... /* pciefd->can_count is "0" now */ while (pciefd->can_count < can_count) { ... pciefd_can_probe(pciefd); /* which does: */ pciefd->can[pciefd->can_count] = priv; /// HERE ... pciefd->can_count++; } The access at "HERE" above will trap: "can" is believed to have "can_count" elements (0 on the first time through the loop).
This needs to be adjusted to increment "can_count" first. Perhaps: diff --git a/drivers/net/can/peak_canfd/peak_pciefd_main.c b/drivers/net/can/peak_canfd/peak_pciefd_main.c index 1df3c4b54f03..df8304b2d291 100644 --- a/drivers/net/can/peak_canfd/peak_pciefd_main.c +++ b/drivers/net/can/peak_canfd/peak_pciefd_main.c @@ -676,7 +676,8 @@ static int pciefd_can_probe(struct pciefd_board *pciefd) spin_lock_init(&priv->tx_lock); /* save the object address in the board structure */ - pciefd->can[pciefd->can_count] = priv; + pciefd->can_count++; + pciefd->can[pciefd->can_count - 1] = priv; dev_info(&pciefd->pci_dev->dev, "%s at reg_base=0x%p irq=%d\n", ndev->name, priv->reg_base, ndev->irq); @@ -800,8 +801,6 @@ static int peak_pciefd_probe(struct pci_dev *pdev, err = pciefd_can_probe(pciefd); if (err) goto err_free_canfd; - - pciefd->can_count++; } /* set system timestamps counter in RST mode */ -- Kees Cook
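The ordering rule in this review generalizes: with __counted_by(), a store to element i is only in bounds when i < count at the moment of the access, so the counter has to cover the slot before it is written. A userspace sketch with an explicit bounds check standing in for the sanitizer trap (a fixed-size array stands in for the flexible array; all names here are illustrative):

```c
#include <assert.h>
#include <stddef.h>

struct board {
	size_t can_count;	/* plays the role of the __counted_by bound */
	void *can[8];		/* fixed size stands in for the flex array */
};

/* Model of the sanitizer's view: a store to can[i] is only legal when
 * i is already inside the counted bound; otherwise the kernel traps. */
static int checked_store(struct board *b, size_t i, void *priv)
{
	if (i >= b->can_count)
		return -1;	/* this is where the sanitizer would trap */
	b->can[i] = priv;
	return 0;
}

/* Probe ordering from the review: grow the count first, then store. */
static int probe_one(struct board *b, void *priv)
{
	b->can_count++;
	return checked_store(b, b->can_count - 1, priv);
}
```

The original driver stored first and incremented after, which is exactly the `checked_store()`-before-increment ordering that fails below.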
[PATCH] kunit/overflow: Adjust for __counted_by with DEFINE_RAW_FLEX()
When a flexible array structure has a __counted_by annotation, its use with DEFINE_RAW_FLEX() will result in the count being zero-initialized. This is expected since one doesn't want to use RAW with a counted_by struct. Adjust the tests to check for the condition and for compiler support. Reported-by: Christian Schrefl Closes: https://lore.kernel.org/all/0bfc6b38-8bc5-4971-b6fb-dc642a73f...@gmail.com/ Suggested-by: Nathan Chancellor Signed-off-by: Kees Cook --- Cc: "Gustavo A. R. Silva" Cc: linux-hardening@vger.kernel.org --- lib/overflow_kunit.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/lib/overflow_kunit.c b/lib/overflow_kunit.c index 4ef31b0bb74d..d305b0c054bb 100644 --- a/lib/overflow_kunit.c +++ b/lib/overflow_kunit.c @@ -1178,14 +1178,28 @@ struct foo { s16 array[] __counted_by(counter); }; +struct bar { + int a; + u32 counter; + s16 array[]; +}; + static void DEFINE_FLEX_test(struct kunit *test) { - DEFINE_RAW_FLEX(struct foo, two, array, 2); + /* Using _RAW_ on a __counted_by struct will initialize "counter" to zero */ + DEFINE_RAW_FLEX(struct foo, two_but_zero, array, 2); +#if __has_attribute(__counted_by__) + int expected_raw_size = sizeof(struct foo); +#else + int expected_raw_size = sizeof(struct foo) + 2 * sizeof(s16); +#endif + /* Without annotation, it will always be on-stack size. */ + DEFINE_RAW_FLEX(struct bar, two, array, 2); DEFINE_FLEX(struct foo, eight, array, counter, 8); DEFINE_FLEX(struct foo, empty, array, counter, 0); - KUNIT_EXPECT_EQ(test, __struct_size(two), - sizeof(struct foo) + sizeof(s16) + sizeof(s16)); + KUNIT_EXPECT_EQ(test, __struct_size(two_but_zero), expected_raw_size); + KUNIT_EXPECT_EQ(test, __struct_size(two), sizeof(struct bar) + 2 * sizeof(s16)); KUNIT_EXPECT_EQ(test, __struct_size(eight), 24); KUNIT_EXPECT_EQ(test, __struct_size(empty), sizeof(struct foo)); } -- 2.34.1
Re: [PATCH v4 0/3] Hardening perf subsystem
On Sat, Jun 01, 2024 at 06:56:15PM +0200, Erick Archer wrote: > Hi everyone, > > This is an effort to get rid of all multiplications from allocation > functions in order to prevent integer overflows [1][2]. I didn't actually see these 3 patches in this thread nor via lore. > In the first patch, the "struct amd_uncore_ctx" can be refactored to > use a flex array for the "events" member. This way, the allocation/ > freeing of the memory can be simplified. Then, the struct_size() > helper can be used to do the arithmetic calculation for the memory > to be allocated. I like this patch because it reduces the allocation from 2 to 1. This isn't what Peter might see as "churn": this is an improvement in resource utilization. I think it's here: https://lore.kernel.org/linux-hardening/as8pr02mb7237e4848b44a5226bd3551c8b...@as8pr02mb7237.eurprd02.prod.outlook.com/ > In the second patch, as the "struct intel_uncore_box" ends in a > flexible array, the preferred way in the kernel is to use the > struct_size() helper to do the arithmetic instead of the calculation > "size + count * size" in the kzalloc_node() function. This is https://lore.kernel.org/linux-hardening/as8pr02mb7237f4d39bf6aa0ff40e91638b...@as8pr02mb7237.eurprd02.prod.outlook.com/ I prefer this style, as it makes things unambiguous ("this will never wrap around") without having to check the associated types and doesn't make the resulting binary code different in the "can never overflow" case. 
In this particular case: int size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg); "int numshared" comes from struct intel_uncore_type::num_shared_regs, which is: unsigned num_shared_regs:8; And the struct sizes are: $ pahole -C intel_uncore_box gcc-boot/vmlinux | grep size: /* size: 488, cachelines: 8, members: 19 */ $ pahole -C intel_uncore_extra_reg gcc-boot/vmlinux | grep size: /* size: 96, cachelines: 2, members: 5 */ So we have: s32 size = 488 + u8 * 96 Max size here is 24968 so it can never overflow an s32, so I can see why Peter views this as "churn". I still think the patch is a coding style improvement, but okay. > In the third patch, as the "struct perf_buffer" also ends in a > flexible array, the preferred way in the kernel is to use the > struct_size() helper to do the arithmetic instead of the calculation > "size + count * size" in the kzalloc_node() functions. At the same > time, prepare for the coming implementation by GCC and Clang of the > __counted_by attribute. This is https://lore.kernel.org/linux-hardening/as8pr02mb72379b4807f3951a1b926ba58b...@as8pr02mb7237.eurprd02.prod.outlook.com/ This provides __counted_by coverage, and I think this is important to gain in ever place we can. Given that this is part of a ring buffer implementation that is arbitrarily sized, this is exactly the kind of place I'd like to see __counted_by used. This is a runtime robustness improvement, so I don't see this a "churn" at all. Peter, for patches 1 and 3, if you'd prefer not to carry them, I could put them in the hardening tree to keep them out of your way. It seems clear you don't want patch 2 at all. -Kees -- Kees Cook
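The "can never overflow" bound above can be checked mechanically: num_shared_regs is an 8-bit bitfield, so 255 is the worst case, and the sum stays far below INT_MAX. A trivial check using the sizes quoted from pahole in the thread:

```c
#include <assert.h>
#include <limits.h>

/* Struct sizes as reported by pahole earlier in this message. */
#define BOX_SIZE	488	/* sizeof(struct intel_uncore_box) */
#define EXTRA_SIZE	96	/* sizeof(struct intel_uncore_extra_reg) */

/* num_shared_regs is "unsigned num_shared_regs:8", so 255 max. */
static long worst_case_size(void)
{
	return BOX_SIZE + 255L * EXTRA_SIZE;
}
```

So the s32 truly cannot overflow here, which is the basis for treating the struct_size() conversion as a style improvement rather than a bug fix.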
Re: [PATCH v2] selftests: seccomp: fix format-zero-length warnings
On Fri, Jun 07, 2024 at 02:58:47PM -0600, Shuah Khan wrote: > On 6/7/24 06:41, Amer Al Shanawany wrote: > > fix the following errors by removing empty print statements: > > seccomp_benchmark.c:197:24: warning: zero-length gnu_printf format > > string [-Wformat-zero-length] > >197 | ksft_print_msg(""); > >|^~ > > seccomp_benchmark.c:202:24: warning: zero-length gnu_printf format > > string [-Wformat-zero-length] > >202 | ksft_print_msg(""); > >|^~ > > seccomp_benchmark.c:204:24: warning: zero-length gnu_printf format > > string [-Wformat-zero-length] > >204 | ksft_print_msg(""); > >|^~ > > > > Reported-by: kernel test robot > > Closes: > > https://lore.kernel.org/oe-kbuild-all/202312260235.uj5ug8k9-...@intel.com/ > > Signed-off-by: Amer Al Shanawany > > --- > > Changes v1 -> v2: > > removed empty print statements > > Kees, > > Is this change okay with you. I didn't see any use for > these empty ksft_print_msg(). > > I will take this patch if you are okay with the change. Dropping these means that the "#" marks go missing. Currently: # Running on: # Linux proton 6.5.0-25-generic #25~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Feb 20 16:09:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux with the proposed patch: # Running on: Linux proton 6.5.0-25-generic #25~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Feb 20 16:09:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux This breaks the TAP syntax for the test, so we should find a different solution. Perhaps: ksft_print_msg("%s", ""); ? 
-Kees > > > --- > > tools/testing/selftests/seccomp/seccomp_benchmark.c | 3 --- > > 1 file changed, 3 deletions(-) > > > > diff --git a/tools/testing/selftests/seccomp/seccomp_benchmark.c > > b/tools/testing/selftests/seccomp/seccomp_benchmark.c > > index b83099160fbc..6fe34be6c693 100644 > > --- a/tools/testing/selftests/seccomp/seccomp_benchmark.c > > +++ b/tools/testing/selftests/seccomp/seccomp_benchmark.c > > @@ -194,14 +194,11 @@ int main(int argc, char *argv[]) > > ksft_set_plan(7); > > ksft_print_msg("Running on:\n"); > > - ksft_print_msg(""); > > system("uname -a"); > > ksft_print_msg("Current BPF sysctl settings:\n"); > > /* Avoid using "sysctl" which may not be installed. */ > > - ksft_print_msg(""); > > system("grep -H . /proc/sys/net/core/bpf_jit_enable"); > > - ksft_print_msg(""); > > system("grep -H . /proc/sys/net/core/bpf_jit_harden"); > > affinity(); > > > thanks, > -- Shuah -- Kees Cook
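The reason the empty ksft_print_msg() calls matter is that each one emits the TAP comment marker "# " without a trailing newline, so the following system() output lands on a line that TAP parsers treat as a comment. A minimal userspace stand-in (writing into a buffer for inspection; this models the prefixing behavior, not the real kselftest implementation) shows that the suggested `"%s", ""` form keeps the marker while avoiding -Wformat-zero-length:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for ksft_print_msg(): prefix every message with the
 * TAP comment marker "# ", writing into buf so the result can be checked. */
static void msg_to_buf(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n = snprintf(buf, len, "# ");

	va_start(ap, fmt);
	vsnprintf(buf + n, len - n, fmt, ap);
	va_end(ap);
}
```

With `msg_to_buf(buf, sizeof(buf), "%s", "")` the output is just the bare "# " prefix, which is exactly what the benchmark needs in front of the `uname -a` and sysctl output, and the format string is no longer zero-length.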
Re: [PATCH v2 0/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up
On Fri, Jun 07, 2024 at 04:54:41PM -0300, Guilherme G. Piccoli wrote: > On 06/06/2024 12:01, Steven Rostedt wrote: > > Reserve unspecified location of physical memory from kernel command line > > [...] > > Solution: > > > > The solution I have come up with is to introduce a new "reserve_mem=" kernel > > command line. This parameter takes the following format: > > > > reserve_mem=nn:align:label > > > > Where nn is the size of memory to reserve, the align is the alignment of > > that memory, and label is the way for other sub-systems to find that memory. > > This way the kernel command line could have: > > > > reserve_mem=12M:4096:oops ramoops.mem_name=oops > > > > At boot up, the kernel will search for 12 megabytes in usable memory regions > > with an alignment of 4096. It will start at the highest regions and work its > > way down (for those old devices that want access to lower address DMA). When > > it finds a region, it will save it off in a small table and mark it with the > > "oops" label. Then the pstore ramoops sub-system could ask for that memory > > and location, and it will map itself there. > > > > This prototype allows for 8 different mappings (which may be overkill, 4 is > > probably plenty) with 16 byte size to store the label. > > > > I have tested this and it works for us to solve the above problem. We can > > update the kernel and command line and increase the size of pstore without > > needing to update the firmware, or knowing every memory layout of each > > board. I only tested this locally, it has not been tested in the field. > > > > Hi Steve, first of all, thanks for this work! This is much appreciated. > The kdumpst tooling (Arch Linux) makes use of pstore when available, and > the recommendation so far was to reserve memory somehow, like "mem=" or > use kdump instead, if no free RAM area was available. > > With your solution, things get way more "elegant". 
Also, I think we all > know pstore is not 100% reliable, especially the RAM backend due to > already mentioned reasons (like FW memory retraining, ECC memory, etc), > but it's great we have a mechanism to **try it**. If it works, awesome - > for statistical analysis, this is very useful; pstore has been used with > success in the Steam Deck, for example. > > With all that said, I've tested your patches on top of 6.10-rc2 in 2 > qemu VMs (one running legacy BIOS - seabios - and the other UEFI - using > ovmf) and on Steam Deck, and it's working flawlessly. I've tested only > using ramoops as module. > > Some code review in the patches themselves (like a missing > EXPORT_SYMBOL_GPL), but all in all, that's a great addition! Feel free > to add my: > > Tested-by: Guilherme G. Piccoli Yeah, I think this looks good as long as it's understood to be a "best effort", and will radically simplify doing qemu testing, etc. I expect I can take v3 into -next with the fixes Guilherme noted. -Kees -- Kees Cook
Re: [PATCH] mm/util: Swap kmemdup_array() arguments
On Thu, Jun 06, 2024 at 08:48:37PM +0300, Andy Shevchenko wrote: > On Thu, Jun 6, 2024 at 8:46 PM Kees Cook wrote: > > > > On Thu, Jun 06, 2024 at 08:35:13PM +0300, Andy Shevchenko wrote: > > > On Thu, Jun 6, 2024 at 6:56 PM Kees Cook wrote: > > > > On Thu, 06 Jun 2024 15:46:09 +0100, Jean-Philippe Brucker wrote: > > > > > > [...] > > > > > > > Applied to for-next/hardening, thanks! > > > > > > Btw, is it possible to get this for v6.10, so we may start enabling it > > > for others? > > > > Which others do you mean? > > There are a lot of users of kmemdup(x*y) which I want to convert > sooner than later to kmemdup_array(x,y). Ah-ha, I see what you mean. Well, I'm not sure we can do v6.10 for this because rc2 is behind us, and that's what most subsystems merge to. I can land the patch for rc3 so there will be no warnings in Linus's tree/-next, but conversions in subsystem trees will gain warnings, I think... -- Kees Cook
Re: [PATCH] mm/util: Swap kmemdup_array() arguments
On Thu, Jun 06, 2024 at 08:35:13PM +0300, Andy Shevchenko wrote: > On Thu, Jun 6, 2024 at 6:56 PM Kees Cook wrote: > > On Thu, 06 Jun 2024 15:46:09 +0100, Jean-Philippe Brucker wrote: > > [...] > > > Applied to for-next/hardening, thanks! > > Btw, is it possible to get this for v6.10, so we may start enabling it > for others? Which others do you mean? -- Kees Cook
Re: [PATCH v2] pstore/ram: Replace of_node_put with __free() for automatic cleanup
On Wed, Jun 05, 2024 at 09:49:44PM +, Abhinav Jain wrote: > Add __free(device_node) to the parent_node struct declaration > Add changes to incorporate the review comments from v1 > > Signed-off-by: Abhinav Jain > > PATCH v1 link: > https://lore.kernel.org/all/20240415161409.8375-1-jain.abhinav...@gmail.com/ > > Changes since v1: > - Moved the variable definition back to the top of the function body The history and version links should be below the "---" line. But more importantly, please base your patch on the upstream tree, rather than on your v1 patch. :) (i.e. squash your v1 and v2 together). -Kees > --- > fs/pstore/ram.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c > index 14f2f4864e48..f8258e4567c3 100644 > --- a/fs/pstore/ram.c > +++ b/fs/pstore/ram.c > @@ -644,6 +644,7 @@ static int ramoops_parse_dt(struct platform_device *pdev, > struct ramoops_platform_data *pdata) > { > struct device_node *of_node = pdev->dev.of_node; > + struct device_node *parent_node __free(device_node) = > of_node_parent(of_node); > struct resource *res; > u32 value; > int ret; > @@ -703,7 +704,6 @@ static int ramoops_parse_dt(struct platform_device *pdev, >* we're not a child of "reserved-memory" and mimicking the >* expected behavior. >*/ > - struct device_node *parent_node __free(device_node) = > of_node_parent(of_node); > if (!of_node_name_eq(parent_node, "reserved-memory") && > !pdata->console_size && !pdata->ftrace_size && > !pdata->pmsg_size && !pdata->ecc_info.ecc_size) { > -- > 2.34.1 > -- Kees Cook
Re: [PATCH] mm/util: Swap kmemdup_array() arguments
On Thu, 06 Jun 2024 15:46:09 +0100, Jean-Philippe Brucker wrote: > GCC 14.1 complains about the argument usage of kmemdup_array(): > > drivers/soc/tegra/fuse/fuse-tegra.c:130:65: error: 'kmemdup_array' sizes > specified with 'sizeof' in the earlier argument and not in the later argument > [-Werror=calloc-transposed-args] > 130 | fuse->lookups = kmemdup_array(fuse->soc->lookups, > sizeof(*fuse->lookups), > | ^ > drivers/soc/tegra/fuse/fuse-tegra.c:130:65: note: earlier argument should > specify number of elements, later size of each element > > [...] Applied to for-next/hardening, thanks! [1/1] mm/util: Swap kmemdup_array() arguments https://git.kernel.org/kees/c/0ee14725471c Take care, -- Kees Cook
Re: [PATCH v4 4/6] mm/slab: Introduce kmem_buckets_create() and family
On Tue, Jun 04, 2024 at 04:13:32PM -0600, Tycho Andersen wrote: > On Tue, Jun 04, 2024 at 04:02:28PM +0100, Simon Horman wrote: > > On Fri, May 31, 2024 at 12:14:56PM -0700, Kees Cook wrote: > > > + for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) { > > > + char *short_size, *cache_name; > > > + unsigned int cache_useroffset, cache_usersize; > > > + unsigned int size; > > > + > > > + if (!kmalloc_caches[KMALLOC_NORMAL][idx]) > > > + continue; > > > + > > > + size = kmalloc_caches[KMALLOC_NORMAL][idx]->object_size; > > > + if (!size) > > > + continue; > > > + > > > + short_size = strchr(kmalloc_caches[KMALLOC_NORMAL][idx]->name, > > > '-'); > > > + if (WARN_ON(!short_size)) > > > + goto fail; > > > + > > > + cache_name = kasprintf(GFP_KERNEL, "%s-%s", name, short_size + > > > 1); > > > + if (WARN_ON(!cache_name)) > > > + goto fail; > > > + > > > + if (useroffset >= size) { > > > + cache_useroffset = 0; > > > + cache_usersize = 0; > > > + } else { > > > + cache_useroffset = useroffset; > > > + cache_usersize = min(size - cache_useroffset, usersize); > > > + } > > > + (*b)[idx] = kmem_cache_create_usercopy(cache_name, size, > > > + align, flags, cache_useroffset, > > > + cache_usersize, ctor); > > > + kfree(cache_name); > > > + if (WARN_ON(!(*b)[idx])) > > > + goto fail; > > > + } > > > + > > > + return b; > > > + > > > +fail: > > > + for (idx = 0; idx < ARRAY_SIZE(kmalloc_caches[KMALLOC_NORMAL]); idx++) { > > > + if ((*b)[idx]) > > > + kmem_cache_destroy((*b)[idx]); > > > > nit: I don't think it is necessary to guard this with a check for NULL. > > Isn't it? What if a kasprintf() fails halfway through the loop? He means that kmem_cache_destroy() already checks for NULL. Quite right! void kmem_cache_destroy(struct kmem_cache *s) { int err = -EBUSY; bool rcu_set; if (unlikely(!s) || !kasan_check_byte(s)) return; -- Kees Cook
Re: [PATCH] HID: usbhid: fix recurrent out-of-bounds bug in usbhid_parse()
On Tue, Jun 04, 2024 at 10:09:43AM -0700, Nikita Zhandarovich wrote: > Hi, > > On 6/4/24 07:15, Jiri Kosina wrote: > > On Tue, 4 Jun 2024, Kees Cook wrote: > > > >> This isn't the right solution. The problem is that hid_class_descriptor > >> is a flexible array but was sized as a single element fake flexible > >> array: > >> > >> struct hid_descriptor { > >> __u8 bLength; > >> __u8 bDescriptorType; > >> __le16 bcdHID; > >> __u8 bCountryCode; > >> __u8 bNumDescriptors; > >> > >> struct hid_class_descriptor desc[1]; > >> } __attribute__ ((packed)); > >> > >> This likely needs to be: > >> > >> struct hid_class_descriptor desc[] __counted_by(bNumDescriptors); > >> > >> And then check for any sizeof() uses of the struct that might have changed. > > > > Ah, you are of course right, not sure what I was thinking. Thanks a lot > > for catching my brainfart. > > > > I am dropping the patch for now; Nikita, will you please send a refreshed > > one? > > > > Thanks for catching my mistake. > > I'll gladly send a revised version, hoping to do it very soon. I spent a little more time looking at this, and I'm not sure I understand where the actual space for the descriptors comes from? There's interface->extra that is being parsed, and effectively hid_descriptor is being mapped into it, but it uses "sizeof(struct hid_descriptor)" for the limit. Is more than 1 descriptor expected to work correctly? Or is the limit being ignored? I'm a bit confused by this code... -- Kees Cook
Re: [PATCH] HID: usbhid: fix recurrent out-of-bounds bug in usbhid_parse()
On June 4, 2024 1:15:35 AM PDT, Jiri Kosina wrote: >On Fri, 24 May 2024, Nikita Zhandarovich wrote: > >> Syzbot reports [1] a reemerging out-of-bounds bug regarding hid >> descriptors possibly having incorrect bNumDescriptors values in >> usbhid_parse(). >> >> Build on the previous fix in "HID: usbhid: fix out-of-bounds bug" >> and run a sanity-check ensuring that number of descriptors doesn't >> exceed the size of desc[] in struct hid_descriptor. >> >> [1] Syzbot report: >> Link: https://syzkaller.appspot.com/bug?extid=c52569baf0c843f35495 >> >> UBSAN: array-index-out-of-bounds in drivers/hid/usbhid/hid-core.c:1024:7 >> index 1 is out of range for type 'struct hid_class_descriptor[1]' >> CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted >> 6.9.0-rc6-syzkaller-00290-gb9158815de52 #0 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >> Google 03/27/2024 >> Workqueue: usb_hub_wq hub_event >> Call Trace: >> >> __dump_stack lib/dump_stack.c:88 [inline] >> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 >> ubsan_epilogue lib/ubsan.c:231 [inline] >> __ubsan_handle_out_of_bounds+0x121/0x150 lib/ubsan.c:429 >> usbhid_parse+0x5a7/0xc80 drivers/hid/usbhid/hid-core.c:1024 >> hid_add_device+0x132/0x520 drivers/hid/hid-core.c:2790 >> usbhid_probe+0xb38/0xea0 drivers/hid/usbhid/hid-core.c:1429 >> usb_probe_interface+0x645/0xbb0 drivers/usb/core/driver.c:399 >> really_probe+0x2b8/0xad0 drivers/base/dd.c:656 >> __driver_probe_device+0x1a2/0x390 drivers/base/dd.c:798 >> driver_probe_device+0x50/0x430 drivers/base/dd.c:828 >> __device_attach_driver+0x2d6/0x530 drivers/base/dd.c:956 >> bus_for_each_drv+0x24e/0x2e0 drivers/base/bus.c:457 >> __device_attach+0x333/0x520 drivers/base/dd.c:1028 >> bus_probe_device+0x189/0x260 drivers/base/bus.c:532 >> device_add+0x8ff/0xca0 drivers/base/core.c:3720 >> usb_set_configuration+0x1976/0x1fb0 drivers/usb/core/message.c:2210 >> usb_generic_driver_probe+0x88/0x140 drivers/usb/core/generic.c:254 >> usb_probe_device+0x1b8/0x380 
drivers/usb/core/driver.c:294 >> >> Reported-and-tested-by: syzbot+c52569baf0c843f35...@syzkaller.appspotmail.com >> Fixes: f043bfc98c19 ("HID: usbhid: fix out-of-bounds bug") >> Signed-off-by: Nikita Zhandarovich > >Applied, thanks. This isn't the right solution. The problem is that hid_class_descriptor is a flexible array but was sized as a single element fake flexible array: struct hid_descriptor { __u8 bLength; __u8 bDescriptorType; __le16 bcdHID; __u8 bCountryCode; __u8 bNumDescriptors; struct hid_class_descriptor desc[1]; } __attribute__ ((packed)); This likely needs to be: struct hid_class_descriptor desc[] __counted_by(bNumDescriptors); And then check for any sizeof() uses of the struct that might have changed. -- Kees Cook
Re: [PATCH 1/2] mm/memblock: Add "reserve_mem" to reserved named memory at boot up
On June 3, 2024 4:33:31 PM PDT, Steven Rostedt wrote: >From: "Steven Rostedt (Google)" > >In order to allow for requesting a memory region that can be used for >things like pstore on multiple machines where the memory layout is not the >same, add a new option to the kernel command line called "reserve_mem". > >The format is: reserve_mem=nn:align:name > >Where it will find nn amount of memory at the given alignment of align. >The name field is to allow another subsystem to retrieve where the memory >was found. For example: > > reserve_mem=12M:4096:oops ramoops.mem_name=oops How does this interact with KASLR? It has chosen its physical location before this parsing happens, so I'd expect this to fail once in a while, unless the size/alignment is lucky enough that KASLR never uses that portion of the physical memory... -Kees > >Where ramoops.mem_name will tell ramoops that memory was reserved for it >via the reserve_mem option and it can find it by calling: > > if (reserve_mem_find_by_name("oops", &start, &size)) { > // start holds the start address and size holds the size given > >Link: https://lore.kernel.org/all/zjjvnzux3nzig...@kernel.org/ > >Suggested-by: Mike Rapoport >Signed-off-by: Steven Rostedt (Google) >--- > include/linux/mm.h | 2 + > mm/memblock.c | 97 ++ > 2 files changed, 99 insertions(+) > >diff --git a/include/linux/mm.h b/include/linux/mm.h >index 9849dfda44d4..b4455cc02f2c 100644 >--- a/include/linux/mm.h >+++ b/include/linux/mm.h >@@ -4263,4 +4263,6 @@ static inline bool pfn_is_unaccepted_memory(unsigned >long pfn) > void vma_pgtable_walk_begin(struct vm_area_struct *vma); > void vma_pgtable_walk_end(struct vm_area_struct *vma); > >+int reserve_mem_find_by_name(const char *name, unsigned long *start, unsigned >long *size); >+ > #endif /* _LINUX_MM_H */ >diff --git a/mm/memblock.c b/mm/memblock.c >index d09136e040d3..a8bf0ee9e2b4 100644 >--- a/mm/memblock.c >+++ b/mm/memblock.c >@@ -2244,6 +2244,103 @@ void __init memblock_free_all(void) >
totalram_pages_add(pages); > } > >+/* Keep a table to reserve named memory */ >+#define RESERVE_MEM_MAX_ENTRIES 8 >+#define RESERVE_MEM_NAME_SIZE 16 >+struct reserve_mem_table { >+ char name[RESERVE_MEM_NAME_SIZE]; >+ unsigned long start; >+ unsigned long size; >+}; >+static struct reserve_mem_table reserved_mem_table[RESERVE_MEM_MAX_ENTRIES]; >+static int reserved_mem_count; >+ >+/* Add wildcard region with a lookup name */ >+static int __init reserved_mem_add(unsigned long start, unsigned long size, >+ const char *name) >+{ >+ struct reserve_mem_table *map; >+ >+ if (!name || !name[0] || strlen(name) >= RESERVE_MEM_NAME_SIZE) >+ return -EINVAL; >+ >+ if (reserved_mem_count >= RESERVE_MEM_MAX_ENTRIES) >+ return -1; >+ >+ map = &reserved_mem_table[reserved_mem_count++]; >+ map->start = start; >+ map->size = size; >+ strscpy(map->name, name); >+ return 0; >+} >+ >+/** >+ * reserve_mem_find_by_name - Find reserved memory region with a given name >+ * @name: The name that is attached to a reserved memory region >+ * @start: If found, holds the start address >+ * @size: If found, holds the size of the address. >+ * >+ * Returns: 1 if found or 0 if not found. 
>+ */ >+int reserve_mem_find_by_name(const char *name, unsigned long *start, unsigned >long *size) >+{ >+ struct reserve_mem_table *map; >+ int i; >+ >+ for (i = 0; i < reserved_mem_count; i++) { >+ map = &reserved_mem_table[i]; >+ if (!map->size) >+ continue; >+ if (strcmp(name, map->name) == 0) { >+ *start = map->start; >+ *size = map->size; >+ return 1; >+ } >+ } >+ return 0; >+} >+ >+/* >+ * Parse early_reserve_mem=nn:align:name >+ */ >+static int __init reserve_mem(char *p) >+{ >+ phys_addr_t start, size, align; >+ char *oldp; >+ int err; >+ >+ if (!p) >+ return -EINVAL; >+ >+ oldp = p; >+ size = memparse(p, &p); >+ if (p == oldp) >+ return -EINVAL; >+ >+ if (*p != ':') >+ return -EINVAL; >+ >+ align = memparse(p+1, &p); >+ if (*p != ':') >+ return -EINVAL; >+ >+ start = memblock_phys_alloc(size, align); >+ if (!start) >+ return -ENOMEM; >+ >+ p++; >+ err = reserved_mem_add(start, size, p); >+ if (err) { >+ memblock_phys_free(start, size); >+ return err; >+ } >+ >+ p += strlen(p); >+ >+ return *p == '\0' ? 0: -EINVAL; >+} >+__setup("reserve_mem=", reserve_mem); >+ > #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_ARCH_KEEP_MEMBLOCK) > static const char * const flagname[] = { > [ilog2(MEMBLOCK_HOTPLUG)] = "HOTPLUG", -- Kees Cook
Re: [PATCH v4 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
On Mon, Jun 03, 2024 at 07:06:15PM +0200, Vlastimil Babka wrote: > On 5/31/24 9:14 PM, Kees Cook wrote: > > Introduce CONFIG_SLAB_BUCKETS which provides the infrastructure to > > support separated kmalloc buckets (in the follow-on kmem_buckets_create() > > patches and future codetag-based separation). Since this will provide > > a mitigation for a very common case of exploits, enable it by default. > > Are you sure? I thought there was a policy that nobody is special enough > to have stuff enabled by default. Is it worth risking Linus shouting? :) I think it's important to have this enabled given how common the exploitation methodology is and how cheap this solution is. Regardless, if you want it "default n", I can change it. > I found this too verbose and tried a different approach, in the end rewrote > everything to verify the idea works. So I'll just link to the result in git: > > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=slab-buckets-v4-rewrite > > It's also rebased on slab.git:slab/for-6.11/cleanups with some alloc_hooks() > cleanups that would cause conflicts otherwise. > > But the crux of that approach is: > > /* > * These macros allow declaring a kmem_buckets * parameter alongside size, > which > * can be compiled out with CONFIG_SLAB_BUCKETS=n so that a large number of > call > * sites don't have to pass NULL. > */ > #ifdef CONFIG_SLAB_BUCKETS > #define DECL_BUCKET_PARAMS(_size, _b) size_t (_size), kmem_buckets *(_b) > #define PASS_BUCKET_PARAMS(_size, _b) (_size), (_b) > #define PASS_BUCKET_PARAM(_b) (_b) > #else > #define DECL_BUCKET_PARAMS(_size, _b) size_t (_size) > #define PASS_BUCKET_PARAMS(_size, _b) (_size) > #define PASS_BUCKET_PARAM(_b) NULL > #endif > > Then we have declaration e.g. 
> > void *__kmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int > node) > __assume_kmalloc_alignment __alloc_size(1); > > and the function is called like (from code not using buckets) > return __kmalloc_node_noprof(PASS_BUCKET_PARAMS(size, NULL), flags, node); > > or (from code using buckets) > #define kmem_buckets_alloc(_b, _size, _flags) \ > alloc_hooks(__kmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, _b), > _flags, NUMA_NO_NODE)) > > And implementation looks like: > > void *__kmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int > node) > { > return __do_kmalloc_node(size, PASS_BUCKET_PARAM(b), flags, node, > _RET_IP_); > } > > The size param is always the first, so the __alloc_size(1) doesn't need > tweaking. > size is also used in the macros even if it's never mangled, because it's easy > to pass one param instead of two, but not zero params instead of one, if we > want > the ending comma not be part of the macro (which would look awkward). > > Does it look ok to you? Of course names of the macros could be tweaked. > Anyway feel > free to use the branch for the followup. Hopefully this way is also > compatible with > the planned codetag based followup. This looks really nice, thank you! This is well aligned with the codetag followup, which also needs to have "size" be very easy to find (so the macros can check for compile-time-constant or not). I will go work from your branch... Thanks! -Kees -- Kees Cook
Re: [GIT PULL] bcachefs updates for 6.10-rc1
On Sat, Jun 01, 2024 at 12:33:31PM +0100, Mark Brown wrote: > On Mon, May 20, 2024 at 12:10:31PM -0400, James Bottomley wrote: > > On Sun, 2024-05-19 at 23:52 -0400, Kent Overstreet wrote: > > > > I also do (try to) post patches to the list that are doing something > > > interesting and worth discussion; the vast majority this cycle has > > > been boring syzbot crap... > > > you still don't say what problem not posting most patches solves? You > > imply it would slow you down, but getting git-send-email to post to a > > mailing list can actually be automated through a pre-push commit hook > > with no slowdown in the awesome rate at which you apply patches to your > > own tree. > > > Linux kernel process exists because it's been found to work over time. > > That's not to say it can't be changed, but it usually requires at least > > some stab at a reason before that happens. > > Even if no meaningful review ever happens on the actual posts there's > still utility in having the patches on a list and findable in lore, > since everything is normally on the list people end up with workflows > that assume that they'll be able to find things there. For example it's > common for test people who identify which patch introduces an issue to > grab the patch from lore in order to review any discussion of the patch, > then report by replying to the patch to help with context for their > report and get some help with figuring out a CC list. Posting costs > very little and makes people's lives easier. Exactly. This is the standard workflow that everyone depends on. So, for example, for my -next trees, I only ever add patches to them via "b4 am lore-url-here...". If I've got patches to add to -next from some devel tree, I don't cherry-pick them to my -next tree: I send them to lore, and then pull them back down. But the point is: send your stuff to lore. :) -- Kees Cook
Re: [PATCH v2] x86/traps: Enable UBSAN traps on x86
On Sat, Jun 01, 2024 at 03:10:05AM +, Gatlin Newhouse wrote: > +void handle_ubsan_failure(struct pt_regs *regs, int insn) > +{ > + u32 type = 0; > + > + if (insn == INSN_ASOP) { > + type = (*(u16 *)(regs->ip + LEN_ASOP + LEN_UD1)); > + if ((type & 0xFF) == 0x40) > + type = (type >> 8) & 0xFF; > + } else { > + type = (*(u16 *)(regs->ip + LEN_UD1)); > + if ((type & 0xFF) == 0x40) > + type = (type >> 8) & 0xFF; > + } The if/else code is repeated, but the only difference is the offset to read from. Also, if the 0x40 is absent, we likely don't want to report anything. So, perhaps: u16 offset = LEN_UD1; u32 type; if (insn == INSN_ASOP) offset += LEN_ASOP; type = *(u16 *)(regs->ip + offset); if ((type & 0xFF) != 0x40) return; type = (type >> 8) & 0xFF; pr_crit("%s at %pS\n", report_ubsan_failure(regs, type), (void *)regs->ip); -- Kees Cook
Re: [PATCH] x86/boot: add prototype for __fortify_panic()
On Fri, May 31, 2024 at 11:20:09PM +0200, Borislav Petkov wrote: > So I get an allergic reaction every time we wag the dog - i.e., fix the > code because some tool or option can't handle it even if it is > a perfectly fine code. In that case it is an unused symbol. > > And frankly, I'd prefer the silly warning to denote that fortify doesn't > need to do any checking there vs shutting it up just because. If we want to declare that x86 boot will never perform string handling on strings with unknown lengths, we could just delete the boot/ implementation of __fortify_panic(), and make it a hard failure if such cases are introduced in the future. This hasn't been a particularly friendly solution in the past, though, as the fortify routines do tend to grow additional coverage over time, so there may be future cases that do trip the runtime checking... -- Kees Cook
Re: [PATCH] x86/boot: add prototype for __fortify_panic()
On Fri, May 31, 2024 at 10:49:47PM +0200, Borislav Petkov wrote: > On Fri, May 31, 2024 at 01:46:37PM -0700, Kees Cook wrote: > > Please do not do this. It still benefits from compile-time sanity > > checking. > > Care to elaborate how exactly it benefits? Because when new code gets added that accidentally does improper string handling, fortify will yell about it at compile time. e.g, if someone typos something like: #define BUF_LEN_FOO 16 ... #define BUF_LEN_BAR 10 struct foo { ... char buf[BUF_LEN_FOO]; ... }; ... void process_stuff(struct foo *p) { ... char local_copy[BUF_LEN_BAR]; ... strcpy(local_copy, p->buf); ... } or refactors and forgets to change some name, etc. It's all for catching bugs before they happen, etc. And when source string lengths aren't known, the runtime checking can kick in too. It happens x86 boot doesn't have any of those (good!) so __fortify_panic() goes unused there. But that's a larger topic covered by stuff like CONFIG_LD_DEAD_CODE_DATA_ELIMINATION, etc. -- Kees Cook
Re: [PATCH v3 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
On Fri, May 31, 2024 at 12:51:29PM -0400, Kent Overstreet wrote: > On Fri, May 31, 2024 at 09:48:49AM -0700, Kees Cook wrote: > > On Fri, May 24, 2024 at 11:01:40AM -0400, Kent Overstreet wrote: > > > On Wed, Apr 24, 2024 at 02:40:59PM -0700, Kees Cook wrote: > > > > To be able to choose which buckets to allocate from, make the buckets > > > > available to the lower level kmalloc interfaces by adding them as the > > > > first argument. Where the bucket is not available, pass NULL, which > > > > means > > > > "use the default system kmalloc bucket set" (the prior existing > > > > behavior), > > > > as implemented in kmalloc_slab(). > > > > > > I thought the plan was to use codetags for this? That would obviate the > > > need for all this plumbing. > > > > > > Add fields to the alloc tag for: > > > - allocation size (or 0 if it's not a compile time constant) > > > - union of kmem_cache, kmem_buckets, depending on whether the > > >allocation size is constant or not > > > > I want to provide "simple" (low-hanging fruit) coverage that can live > > separately from the codetags-based coverage. The memory overhead for > > this patch series is negligible, but I suspect the codetags expansion, > > while not giant, will be more than some deployments will want. I want > > to avoid an all-or-nothing solution -- which is why I had intended this > > to be available "by default". > > technically there's no reason for your thing to depend on > CONFIG_CODETAGGING at all, that's the infrastructure for finding > codetags for e.g. /proc/allocinfo. you'd just be using the alloc_hooks() > macro and struct alloc_tag as a place to stash the kmem_buckets pointer. It's the overhead of separate kmem_cache and kmem_buckets for every allocation location that I meant. So I'd like the "simple" version for gaining coverage over the currently-being-regularly-exploited cases, and then allow for the "big hammer" solution too. 
However, I do think I'll still need the codetag infra because of the sections, etc. I think we'll need to pre-build the caches, but maybe that could be avoided by adding some kind of per-site READ_ONCE/lock thingy to create them on demand. We'll see! :) -- Kees Cook
Re: [PATCH] x86/boot: add prototype for __fortify_panic()
On Fri, May 31, 2024 at 09:08:16PM +0200, Borislav Petkov wrote: > On Fri, May 31, 2024 at 09:53:28AM -0700, Kees Cook wrote: > > Under CONFIG_FORTIFY_SOURCE, the boot code *does* still uses > > fortify-string.h. It lets us both catch mistakes we can discover at > > compile and will catch egregious runtime mistakes, though the reporting > > is much simpler in the boot code. > > From where I'm standing, we're not catching anything in the > decompressor: > > $ objdump -D arch/x86/boot/compressed/vmlinux | grep __fortify_panic > 01bec250 <__fortify_panic>: > $ > > Sure, in vmlinux proper (allmodconfig) we do: > > objdump -D vmlinux | grep __fortify_panic | wc -l > 1417 > > but not in the decompressor which is special anyway. > > So we can just as well disable CONFIG_FORTIFY_SOURCE in the decompressor > and not do silly prototypes. Please do not do this. It still benefits from compile-time sanity checking. -- Kees Cook
[PATCH v4 5/6] ipc, msg: Use dedicated slab buckets for alloc_msg()
The msg subsystem is a common target for exploiting[1][2][3][4][5][6][7] use-after-free type confusion flaws in the kernel for both read and write primitives. Avoid having a user-controlled dynamically-sized allocation share the global kmalloc cache by using a separate set of kmalloc buckets via the kmem_buckets API. Link: https://blog.hacktivesecurity.com/index.php/2022/06/13/linux-kernel-exploit-development-1day-case-study/ [1] Link: https://hardenedvault.net/blog/2022-11-13-msg_msg-recon-mitigation-ved/ [2] Link: https://www.willsroot.io/2021/08/corctf-2021-fire-of-salvation-writeup.html [3] Link: https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html [4] Link: https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html [5] Link: https://zplin.me/papers/ELOISE.pdf [6] Link: https://syst3mfailure.io/wall-of-perdition/ [7] Signed-off-by: Kees Cook --- Cc: "GONG, Ruiqi" Cc: Xiu Jianfeng Cc: Suren Baghdasaryan Cc: Kent Overstreet Cc: Jann Horn Cc: Matteo Rizzo Cc: jvoisin --- ipc/msgutil.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/ipc/msgutil.c b/ipc/msgutil.c index d0a0e877cadd..f392f30a057a 100644 --- a/ipc/msgutil.c +++ b/ipc/msgutil.c @@ -42,6 +42,17 @@ struct msg_msgseg { #define DATALEN_MSG ((size_t)PAGE_SIZE-sizeof(struct msg_msg)) #define DATALEN_SEG ((size_t)PAGE_SIZE-sizeof(struct msg_msgseg)) +static kmem_buckets *msg_buckets __ro_after_init; + +static int __init init_msg_buckets(void) +{ + msg_buckets = kmem_buckets_create("msg_msg", 0, SLAB_ACCOUNT, + sizeof(struct msg_msg), + DATALEN_MSG, NULL); + + return 0; +} +subsys_initcall(init_msg_buckets); static struct msg_msg *alloc_msg(size_t len) { @@ -50,7 +61,7 @@ static struct msg_msg *alloc_msg(size_t len) size_t alen; alen = min(len, DATALEN_MSG); - msg = kmalloc(sizeof(*msg) + alen, GFP_KERNEL_ACCOUNT); + msg = kmem_buckets_alloc(msg_buckets, sizeof(*msg) + alen, GFP_KERNEL); if (msg == NULL) return NULL; -- 2.34.1
[PATCH v4 6/6] mm/util: Use dedicated slab buckets for memdup_user()
Both memdup_user() and vmemdup_user() handle allocations that are regularly used for exploiting use-after-free type confusion flaws in the kernel (e.g. prctl() PR_SET_VMA_ANON_NAME[1] and setxattr[2][3][4] respectively). Since both are designed for contents coming from userspace, it allows for userspace-controlled allocation sizes. Use a dedicated set of kmalloc buckets so these allocations do not share caches with the global kmalloc buckets. After a fresh boot under Ubuntu 23.10, we can see the caches are already in active use:
 # grep ^memdup /proc/slabinfo
 memdup_user-8k         4      4   8192    4    8 : ...
 memdup_user-4k         8      8   4096    8    8 : ...
 memdup_user-2k        16     16   2048   16    8 : ...
 memdup_user-1k         0      0   1024   16    4 : ...
 memdup_user-512        0      0    512   16    2 : ...
 memdup_user-256        0      0    256   16    1 : ...
 memdup_user-128        0      0    128   32    1 : ...
 memdup_user-64       256    256     64   64    1 : ...
 memdup_user-32       512    512     32  128    1 : ...
 memdup_user-16      1024   1024     16  256    1 : ...
 memdup_user-8       2048   2048      8  512    1 : ...
 memdup_user-192        0      0    192   21    1 : ...
 memdup_user-96       168    168     96   42    1 : ...
Link: https://starlabs.sg/blog/2023/07-prctl-anon_vma_name-an-amusing-heap-spray/ [1] Link: https://duasynt.com/blog/linux-kernel-heap-spray [2] Link: https://etenal.me/archives/1336 [3] Link: https://github.com/a13xp0p0v/kernel-hack-drill/blob/master/drill_exploit_uaf.c [4] Signed-off-by: Kees Cook --- Cc: "GONG, Ruiqi" Cc: Xiu Jianfeng Cc: Suren Baghdasaryan Cc: Kent Overstreet Cc: Jann Horn Cc: Matteo Rizzo Cc: jvoisin Cc: linux...@kvack.org --- mm/util.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/mm/util.c b/mm/util.c index 53f7fc5912bd..f30460c82641 100644 --- a/mm/util.c +++ b/mm/util.c @@ -198,6 +198,16 @@ char *kmemdup_nul(const char *s, size_t len, gfp_t gfp) } EXPORT_SYMBOL(kmemdup_nul); +static kmem_buckets *user_buckets __ro_after_init; + +static int __init init_user_buckets(void) +{ + user_buckets = kmem_buckets_create("memdup_user", 0, 0, 0, INT_MAX, NULL); + + return 0; +} +subsys_initcall(init_user_buckets); + /** * memdup_user - duplicate memory region from user space * @@ -211,7 +221,7 @@ void *memdup_user(const void __user *src, size_t len) { void *p; - p = kmalloc_track_caller(len, GFP_USER | __GFP_NOWARN); + p = kmem_buckets_alloc_track_caller(user_buckets, len, GFP_USER | __GFP_NOWARN); if (!p) return ERR_PTR(-ENOMEM); @@ -237,7 +247,7 @@ void *vmemdup_user(const void __user *src, size_t len) { void *p; - p = kvmalloc(len, GFP_USER); + p = kmem_buckets_valloc(user_buckets, len, GFP_USER); if (!p) return ERR_PTR(-ENOMEM); -- 2.34.1
[PATCH v4 1/6] mm/slab: Introduce kmem_buckets typedef
Encapsulate the concept of a single set of kmem_caches that are used for the kmalloc size buckets. Redefine kmalloc_caches as an array of these buckets (for the different global cache buckets). Signed-off-by: Kees Cook --- Cc: Vlastimil Babka Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: jvoisin Cc: Andrew Morton Cc: Roman Gushchin Cc: Hyeonggon Yoo <42.hye...@gmail.com> Cc: linux...@kvack.org --- include/linux/slab.h | 5 +++-- mm/slab_common.c | 3 +-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index 7247e217e21b..de2b7209cd05 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -426,8 +426,9 @@ enum kmalloc_cache_type { NR_KMALLOC_TYPES }; -extern struct kmem_cache * -kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1]; +typedef struct kmem_cache * kmem_buckets[KMALLOC_SHIFT_HIGH + 1]; + +extern kmem_buckets kmalloc_caches[NR_KMALLOC_TYPES]; /* * Define gfp bits that should not be set for KMALLOC_NORMAL. diff --git a/mm/slab_common.c b/mm/slab_common.c index 1560a1546bb1..e0b1c109bed2 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -653,8 +653,7 @@ static struct kmem_cache *__init create_kmalloc_cache(const char *name, return s; } -struct kmem_cache * -kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1] __ro_after_init = +kmem_buckets kmalloc_caches[NR_KMALLOC_TYPES] __ro_after_init = { /* initialization for https://llvm.org/pr42570 */ }; EXPORT_SYMBOL(kmalloc_caches); -- 2.34.1
[PATCH v4 4/6] mm/slab: Introduce kmem_buckets_create() and family
Dedicated caches are available for fixed size allocations via kmem_cache_alloc(), but for dynamically sized allocations there is only the global kmalloc API's set of buckets available. This means it isn't possible to separate specific sets of dynamically sized allocations into a separate collection of caches. This leads to a use-after-free exploitation weakness in the Linux kernel since many heap memory spraying/grooming attacks depend on using userspace-controllable dynamically sized allocations to collide with fixed size allocations that end up in the same cache. While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense against these kinds of "type confusion" attacks, including for fixed same-size heap objects, we can create a complementary deterministic defense for dynamically sized allocations that are directly user controlled. Addressing these cases is limited in scope, so isolating these kinds of interfaces will not become an unbounded game of whack-a-mole. For example, many pass through memdup_user(), making isolation there very effective. In order to isolate user-controllable dynamically-sized allocations from the common system kmalloc allocations, introduce kmem_buckets_create(), which behaves like kmem_cache_create(). Introduce kmem_buckets_alloc(), which behaves like kmem_cache_alloc(). Introduce kmem_buckets_alloc_track_caller() for where caller tracking is needed. Introduce kmem_buckets_valloc() for cases where vmalloc fallback is needed. This can also be used in the future to extend allocation profiling's use of code tagging to implement per-caller allocation cache isolation[1] even for dynamic allocations. Memory allocation pinning[2] is still needed to plug the Use-After-Free cross-allocator weakness, but that is an existing and separate issue which is complementary to this improvement. Development continues for that feature via the SLAB_VIRTUAL[3] series (which could also provide guard pages -- another complementary improvement). 
Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1] Link: https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html [2] Link: https://lore.kernel.org/lkml/20230915105933.495735-1-matteori...@google.com/ [3] Signed-off-by: Kees Cook --- Cc: Vlastimil Babka Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: jvoisin Cc: Andrew Morton Cc: Roman Gushchin Cc: Hyeonggon Yoo <42.hye...@gmail.com> Cc: linux...@kvack.org --- include/linux/slab.h | 12 +++ mm/slab_common.c | 80 2 files changed, 92 insertions(+) diff --git a/include/linux/slab.h b/include/linux/slab.h index 8853c6eb20b4..b48c50d90aae 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -552,6 +552,11 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru, void kmem_cache_free(struct kmem_cache *s, void *objp); +kmem_buckets *kmem_buckets_create(const char *name, unsigned int align, + slab_flags_t flags, + unsigned int useroffset, unsigned int usersize, + void (*ctor)(void *)); + /* * Bulk allocation and freeing operations. These are accelerated in an * allocator specific way to avoid taking locks repeatedly or building @@ -675,6 +680,12 @@ static __always_inline __alloc_size(1) void *kmalloc_noprof(size_t size, gfp_t f } #define kmalloc(...) 
alloc_hooks(kmalloc_noprof(__VA_ARGS__)) +#define kmem_buckets_alloc(_b, _size, _flags) \ + alloc_hooks(__kmalloc_node_noprof(_b, _size, _flags, NUMA_NO_NODE)) + +#define kmem_buckets_alloc_track_caller(_b, _size, _flags) \ + alloc_hooks(kmalloc_node_track_caller_noprof(_b, _size, _flags, NUMA_NO_NODE, _RET_IP_)) + static __always_inline __alloc_size(1) void *kmalloc_node_noprof(size_t size, gfp_t flags, int node) { if (__builtin_constant_p(size) && size) { @@ -818,6 +829,7 @@ extern void *kvmalloc_buckets_node_noprof(size_t size, gfp_t flags, int node) #define kvzalloc(_size, _flags)kvmalloc(_size, (_flags)|__GFP_ZERO) #define kvzalloc_node(_size, _flags, _node)kvmalloc_node(_size, (_flags)|__GFP_ZERO, _node) +#define kmem_buckets_valloc(_b, _size, _flags) kvmalloc_buckets_node(_b, _size, _flags, NUMA_NO_NODE) static inline __alloc_size(1, 2) void * kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) diff --git a/mm/slab_common.c b/mm/slab_common.c index b5c879fa66bc..f42a98d368a9 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -392,6 +392,82 @@ kmem_cache_create(const char *name, unsigned int size, unsigned int align, } EXPORT_SYMBOL(kmem_cache_create); +static struct kmem_cache *kmem_buckets_cache __ro_after_init; + +kmem_buckets *kmem_buckets_create(const char *name, unsigned int align, +
[PATCH v4 3/6] mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets argument
Plumb kmem_buckets arguments through kvmalloc_node_noprof() so it is possible to provide an API to perform kvmalloc-style allocations with a particular set of buckets. Introduce kvmalloc_buckets_node() that takes a kmem_buckets argument. Signed-off-by: Kees Cook --- Cc: Vlastimil Babka Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: jvoisin Cc: Andrew Morton Cc: Roman Gushchin Cc: Hyeonggon Yoo <42.hye...@gmail.com> Cc: linux...@kvack.org --- include/linux/slab.h | 19 +++ lib/rhashtable.c | 2 +- mm/util.c| 13 + 3 files changed, 25 insertions(+), 9 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index b1165b22cc6f..8853c6eb20b4 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -799,11 +799,22 @@ static inline __alloc_size(1) void *kzalloc_noprof(size_t size, gfp_t flags) #define kzalloc(...) alloc_hooks(kzalloc_noprof(__VA_ARGS__)) #define kzalloc_node(_size, _flags, _node) kmalloc_node(_size, (_flags)|__GFP_ZERO, _node) -extern void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) __alloc_size(1); -#define kvmalloc_node(...) alloc_hooks(kvmalloc_node_noprof(__VA_ARGS__)) +#ifdef CONFIG_SLAB_BUCKETS +extern void *kvmalloc_buckets_node_noprof(kmem_buckets *b, size_t size, gfp_t flags, int node) + __alloc_size(2); +# define kvmalloc_node_noprof(b, size, flags, node)\ + kvmalloc_buckets_node_noprof(b, size, flags, node) +#else +extern void *kvmalloc_buckets_node_noprof(size_t size, gfp_t flags, int node) + __alloc_size(1); +# define kvmalloc_node_noprof(b, size, flags, node)\ + kvmalloc_buckets_node_noprof(size, flags, node) +#endif +#define kvmalloc_buckets_node(...) alloc_hooks(kvmalloc_node_noprof(__VA_ARGS__)) +#define kvmalloc_node(...) 
kvmalloc_buckets_node(NULL, __VA_ARGS__) #define kvmalloc(_size, _flags)kvmalloc_node(_size, _flags, NUMA_NO_NODE) -#define kvmalloc_noprof(_size, _flags) kvmalloc_node_noprof(_size, _flags, NUMA_NO_NODE) +#define kvmalloc_noprof(_size, _flags) kvmalloc_node_noprof(NULL, _size, _flags, NUMA_NO_NODE) #define kvzalloc(_size, _flags)kvmalloc(_size, (_flags)|__GFP_ZERO) #define kvzalloc_node(_size, _flags, _node)kvmalloc_node(_size, (_flags)|__GFP_ZERO, _node) @@ -816,7 +827,7 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) if (unlikely(check_mul_overflow(n, size, ))) return NULL; - return kvmalloc_node_noprof(bytes, flags, node); + return kvmalloc_node_noprof(NULL, bytes, flags, node); } #define kvmalloc_array_noprof(...) kvmalloc_array_node_noprof(__VA_ARGS__, NUMA_NO_NODE) diff --git a/lib/rhashtable.c b/lib/rhashtable.c index dbbed19f8fff..ef0f496e4aed 100644 --- a/lib/rhashtable.c +++ b/lib/rhashtable.c @@ -184,7 +184,7 @@ static struct bucket_table *bucket_table_alloc(struct rhashtable *ht, static struct lock_class_key __key; tbl = alloc_hooks_tag(ht->alloc_tag, - kvmalloc_node_noprof(struct_size(tbl, buckets, nbuckets), + kvmalloc_node_noprof(NULL, struct_size(tbl, buckets, nbuckets), gfp|__GFP_ZERO, NUMA_NO_NODE)); size = nbuckets; diff --git a/mm/util.c b/mm/util.c index 80430e5ba981..53f7fc5912bd 100644 --- a/mm/util.c +++ b/mm/util.c @@ -593,9 +593,11 @@ unsigned long vm_mmap(struct file *file, unsigned long addr, } EXPORT_SYMBOL(vm_mmap); +#ifdef CONFIG_SLAB_BUCKETS /** - * kvmalloc_node - attempt to allocate physically contiguous memory, but upon + * kvmalloc_buckets_node_noprof - attempt to allocate physically contiguous memory, but upon * failure, fall back to non-contiguous (vmalloc) allocation. + * @b: which set of kmalloc buckets to allocate from. * @size: size of the request. * @flags: gfp mask for the allocation - must be compatible (superset) with GFP_KERNEL. 
* @node: numa node to allocate from @@ -609,7 +611,10 @@ EXPORT_SYMBOL(vm_mmap); * * Return: pointer to the allocated memory of %NULL in case of failure */ -void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) +void *kvmalloc_buckets_node_noprof(kmem_buckets *b, size_t size, gfp_t flags, int node) +#else +void *kvmalloc_buckets_node_noprof(size_t size, gfp_t flags, int node) +#endif { gfp_t kmalloc_flags = flags; void *ret; @@ -631,7 +636,7 @@ void *kvmalloc_node_noprof(size_t size, gfp_t flags, int node) kmalloc_flags &= ~__GFP_NOFAIL; } - ret = kmalloc_node_noprof(size, kmalloc_flags, node); + ret = __kmalloc_node_noprof(b, size, kmalloc_flags, node); /* * It doesn't real
[PATCH v4 0/6] slab: Introduce dedicated bucket allocator
Hi, v4: - Rebase to v6.10-rc1 - Add CONFIG_SLAB_BUCKETS to turn off the feature v3: https://lore.kernel.org/lkml/20240424213019.make.366-k...@kernel.org/ v2: https://lore.kernel.org/lkml/20240305100933.it.923-k...@kernel.org/ v1: https://lore.kernel.org/lkml/20240304184252.work.496-k...@kernel.org/ For the cover letter, I'm repeating commit log for patch 4 here, which has additional clarifications and rationale since v2: Dedicated caches are available for fixed size allocations via kmem_cache_alloc(), but for dynamically sized allocations there is only the global kmalloc API's set of buckets available. This means it isn't possible to separate specific sets of dynamically sized allocations into a separate collection of caches. This leads to a use-after-free exploitation weakness in the Linux kernel since many heap memory spraying/grooming attacks depend on using userspace-controllable dynamically sized allocations to collide with fixed size allocations that end up in the same cache. While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense against these kinds of "type confusion" attacks, including for fixed same-size heap objects, we can create a complementary deterministic defense for dynamically sized allocations that are directly user controlled. Addressing these cases is limited in scope, so isolating these kinds of interfaces will not become an unbounded game of whack-a-mole. For example, many pass through memdup_user(), making isolation there very effective. In order to isolate user-controllable sized allocations from system allocations, introduce kmem_buckets_create(), which behaves like kmem_cache_create(). Introduce kmem_buckets_alloc(), which behaves like kmem_cache_alloc(). Introduce kmem_buckets_alloc_track_caller() for where caller tracking is needed. Introduce kmem_buckets_valloc() for cases where vmalloc fallback is needed. Allows for confining allocations to a dedicated set of sized caches (which have the same layout as the kmalloc caches). 
This can also be used in the future to extend codetag allocation annotations to implement per-caller allocation cache isolation[1] even for dynamic allocations. Memory allocation pinning[2] is still needed to plug the Use-After-Free cross-allocator weakness, but that is an existing and separate issue which is complementary to this improvement. Development continues for that feature via the SLAB_VIRTUAL[3] series (which could also provide guard pages -- another complementary improvement). Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1] Link: https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html [2] Link: https://lore.kernel.org/lkml/20230915105933.495735-1-matteori...@google.com/ [3] After the core implementation come 2 patches that cover the most heavily abused "repeat offenders" used in exploits. Repeating those details here: The msg subsystem is a common target for exploiting[1][2][3][4][5][6][7] use-after-free type confusion flaws in the kernel for both read and write primitives. Avoid having a user-controlled size cache share the global kmalloc allocator by using a separate set of kmalloc buckets. Link: https://blog.hacktivesecurity.com/index.php/2022/06/13/linux-kernel-exploit-development-1day-case-study/ [1] Link: https://hardenedvault.net/blog/2022-11-13-msg_msg-recon-mitigation-ved/ [2] Link: https://www.willsroot.io/2021/08/corctf-2021-fire-of-salvation-writeup.html [3] Link: https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html [4] Link: https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html [5] Link: https://zplin.me/papers/ELOISE.pdf [6] Link: https://syst3mfailure.io/wall-of-perdition/ [7] Both memdup_user() and vmemdup_user() handle allocations that are regularly used for exploiting use-after-free type confusion flaws in the kernel (e.g. prctl() PR_SET_VMA_ANON_NAME[1] and setxattr[2][3][4] respectively). 
Since both are designed for contents coming from userspace, it allows for userspace-controlled allocation sizes. Use a dedicated set of kmalloc buckets so these allocations do not share caches with the global kmalloc buckets. Link: https://starlabs.sg/blog/2023/07-prctl-anon_vma_name-an-amusing-heap-spray/ [1] Link: https://duasynt.com/blog/linux-kernel-heap-spray [2] Link: https://etenal.me/archives/1336 [3] Link: https://github.com/a13xp0p0v/kernel-hack-drill/blob/master/drill_exploit_uaf.c [4] Thanks! -Kees Kees Cook (6): mm/slab: Introduce kmem_buckets typedef mm/slab: Plumb kmem_buckets into __do_kmalloc_node() mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets argument mm/slab: Introduce
[PATCH v4 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
Introduce CONFIG_SLAB_BUCKETS which provides the infrastructure to support separated kmalloc buckets (in the following kmem_buckets_create() patches and future codetag-based separation). Since this will provide a mitigation for a very common case of exploits, enable it by default. To be able to choose which buckets to allocate from, make the buckets available to the internal kmalloc interfaces by adding them as the first argument, rather than depending on the buckets being chosen from the fixed set of global buckets. Where the bucket is not available, pass NULL, which means "use the default system kmalloc bucket set" (the prior existing behavior), as implemented in kmalloc_slab(). To avoid adding the extra argument when !CONFIG_SLAB_BUCKETS, only the top-level macros and static inlines use the buckets argument (where they are stripped out and compiled out respectively). The actual extern functions can then be built without the argument, and the internals fall back to the global kmalloc buckets unconditionally. 
Signed-off-by: Kees Cook --- Cc: Vlastimil Babka Cc: Christoph Lameter Cc: Pekka Enberg Cc: David Rientjes Cc: Joonsoo Kim Cc: jvoisin Cc: Andrew Morton Cc: Roman Gushchin Cc: Hyeonggon Yoo <42.hye...@gmail.com> Cc: linux...@kvack.org Cc: linux-hardening@vger.kernel.org --- include/linux/slab.h | 34 ++ mm/Kconfig | 15 +++ mm/slab.h| 6 -- mm/slab_common.c | 4 ++-- mm/slub.c| 34 -- mm/util.c| 2 +- 6 files changed, 72 insertions(+), 23 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index de2b7209cd05..b1165b22cc6f 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -569,8 +569,17 @@ static __always_inline void kfree_bulk(size_t size, void **p) kmem_cache_free_bulk(NULL, size, p); } -void *__kmalloc_node_noprof(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment -__alloc_size(1); +#ifdef CONFIG_SLAB_BUCKETS +void *__kmalloc_buckets_node_noprof(kmem_buckets *b, size_t size, gfp_t flags, int node) + __assume_kmalloc_alignment __alloc_size(2); +# define __kmalloc_node_noprof(b, size, flags, node) \ + __kmalloc_buckets_node_noprof(b, size, flags, node) +#else +void *__kmalloc_buckets_node_noprof(size_t size, gfp_t flags, int node) + __assume_kmalloc_alignment __alloc_size(1); +# define __kmalloc_node_noprof(b, size, flags, node) \ + __kmalloc_buckets_node_noprof(size, flags, node) +#endif #define __kmalloc_node(...) alloc_hooks(__kmalloc_node_noprof(__VA_ARGS__)) void *kmem_cache_alloc_node_noprof(struct kmem_cache *s, gfp_t flags, @@ -679,7 +688,7 @@ static __always_inline __alloc_size(1) void *kmalloc_node_noprof(size_t size, gf kmalloc_caches[kmalloc_type(flags, _RET_IP_)][index], flags, node, size); } - return __kmalloc_node_noprof(size, flags, node); + return __kmalloc_node_noprof(NULL, size, flags, node); } #define kmalloc_node(...) 
alloc_hooks(kmalloc_node_noprof(__VA_ARGS__)) @@ -730,10 +739,19 @@ static inline __realloc_size(2, 3) void * __must_check krealloc_array_noprof(voi */ #define kcalloc(n, size, flags)kmalloc_array(n, size, (flags) | __GFP_ZERO) -void *kmalloc_node_track_caller_noprof(size_t size, gfp_t flags, int node, - unsigned long caller) __alloc_size(1); +#ifdef CONFIG_SLAB_BUCKETS +void *__kmalloc_node_track_caller_noprof(kmem_buckets *b, size_t size, gfp_t flags, int node, +unsigned long caller) __alloc_size(2); +# define kmalloc_node_track_caller_noprof(b, size, flags, node, caller) \ + __kmalloc_node_track_caller_noprof(b, size, flags, node, caller) +#else +void *__kmalloc_node_track_caller_noprof(size_t size, gfp_t flags, int node, +unsigned long caller) __alloc_size(1); +# define kmalloc_node_track_caller_noprof(b, size, flags, node, caller) \ + __kmalloc_node_track_caller_noprof(size, flags, node, caller) +#endif #define kmalloc_node_track_caller(...) \ - alloc_hooks(kmalloc_node_track_caller_noprof(__VA_ARGS__, _RET_IP_)) + alloc_hooks(kmalloc_node_track_caller_noprof(NULL, __VA_ARGS__, _RET_IP_)) /* * kmalloc_track_caller is a special version of kmalloc that records the @@ -746,7 +764,7 @@ void *kmalloc_node_track_caller_noprof(size_t size, gfp_t flags, int node, #define kmalloc_track_caller(...) kmalloc_node_track_caller(__VA_ARGS__, NUMA_NO_NODE) #define kmalloc_track_caller_noprof(...) \ - kmalloc_node_track_caller_noprof(__VA_ARGS__, NUMA_NO
[PATCH] kunit/fortify: Remove __kmalloc_node() test
__kmalloc_node() is considered an "internal" function to the Slab, so drop it from explicit testing. Signed-off-by: Kees Cook --- Cc: Vlastimil Babka Cc: linux...@kvack.org Cc: linux-hardening@vger.kernel.org --- lib/fortify_kunit.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/lib/fortify_kunit.c b/lib/fortify_kunit.c index 39da5b3bc649..f9cc467334ce 100644 --- a/lib/fortify_kunit.c +++ b/lib/fortify_kunit.c @@ -235,9 +235,6 @@ static void fortify_test_alloc_size_##allocator##_dynamic(struct kunit *test) \ kmalloc_array_node(alloc_size, 1, gfp, NUMA_NO_NODE), \ kfree(p)); \ checker(expected_size, __kmalloc(alloc_size, gfp), \ - kfree(p)); \ - checker(expected_size, \ - __kmalloc_node(alloc_size, gfp, NUMA_NO_NODE), \ kfree(p)); \ \ orig = kmalloc(alloc_size, gfp);\ -- 2.34.1
Re: [PATCH] x86/boot: add prototype for __fortify_panic()
On Thu, May 30, 2024 at 06:46:39PM +0200, Borislav Petkov wrote: > On May 30, 2024 6:23:36 PM GMT+02:00, Jeff Johnson > wrote: > >On 5/30/2024 8:42 AM, Nikolay Borisov wrote: > >> > >> > >> On 29.05.24 г. 21:09 ч., Jeff Johnson wrote: > >>> As discussed in [1] add a prototype for __fortify_panic() to fix the > >>> 'make W=1 C=1' warning: > >>> > >>> arch/x86/boot/compressed/misc.c:535:6: warning: symbol '__fortify_panic' > >>> was not declared. Should it be static? > >> > >> Actually doesn't it make sense to have this defined under ../string.h ? > >> Actually given that we don't have any string fortification under the > >> boot/ why have the fortify _* functions at all ? > > > >I'll let Kees answer these questions since I just took guidance from him :) > > The more important question is how does the decompressor build even know of > this symbol? And then make it forget it again instead of adding silly > prototypes... Under CONFIG_FORTIFY_SOURCE, the boot code *does* still use fortify-string.h. It lets us both catch mistakes we can discover at compile time and catch egregious runtime mistakes, though the reporting is much simpler in the boot code. -- Kees Cook
Re: [PATCH v3 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
On Fri, May 24, 2024 at 11:01:40AM -0400, Kent Overstreet wrote: > On Wed, Apr 24, 2024 at 02:40:59PM -0700, Kees Cook wrote: > > To be able to choose which buckets to allocate from, make the buckets > > available to the lower level kmalloc interfaces by adding them as the > > first argument. Where the bucket is not available, pass NULL, which means > > "use the default system kmalloc bucket set" (the prior existing behavior), > > as implemented in kmalloc_slab(). > > I thought the plan was to use codetags for this? That would obviate the > need for all this plumbing. > > Add fields to the alloc tag for: > - allocation size (or 0 if it's not a compile time constant) > - union of kmem_cache, kmem_buckets, depending on whether the >allocation size is constant or not I want to provide "simple" (low-hanging fruit) coverage that can live separately from the codetags-based coverage. The memory overhead for this patch series is negligible, but I suspect the codetags expansion, while not giant, will be more than some deployments will want. I want to avoid an all-or-nothing solution -- which is why I had intended this to be available "by default". But I will respin this with kmem_buckets under a Kconfig. -- Kees Cook
Re: [PATCH v3 2/6] mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
On Fri, May 24, 2024 at 03:38:58PM +0200, Vlastimil Babka wrote: > On 4/24/24 11:40 PM, Kees Cook wrote: > > To be able to choose which buckets to allocate from, make the buckets > > available to the lower level kmalloc interfaces by adding them as the > > first argument. Where the bucket is not available, pass NULL, which means > > "use the default system kmalloc bucket set" (the prior existing behavior), > > as implemented in kmalloc_slab(). > > > > Signed-off-by: Kees Cook > > --- > > Cc: Vlastimil Babka > > Cc: Christoph Lameter > > Cc: Pekka Enberg > > Cc: David Rientjes > > Cc: Joonsoo Kim > > Cc: Andrew Morton > > Cc: Roman Gushchin > > Cc: Hyeonggon Yoo <42.hye...@gmail.com> > > Cc: linux...@kvack.org > > Cc: linux-hardening@vger.kernel.org > > --- > > include/linux/slab.h | 16 > > lib/fortify_kunit.c | 2 +- > > mm/slab.h| 6 -- > > mm/slab_common.c | 4 ++-- > > mm/slub.c| 14 +++--- > > mm/util.c| 2 +- > > 6 files changed, 23 insertions(+), 21 deletions(-) > > > > diff --git a/include/linux/slab.h b/include/linux/slab.h > > index c8164d5db420..07373b680894 100644 > > --- a/include/linux/slab.h > > +++ b/include/linux/slab.h > > @@ -569,8 +569,8 @@ static __always_inline void kfree_bulk(size_t size, > > void **p) > > kmem_cache_free_bulk(NULL, size, p); > > } > > > > -void *__kmalloc_node_noprof(size_t size, gfp_t flags, int node) > > __assume_kmalloc_alignment > > -__alloc_size(1); > > +void *__kmalloc_node_noprof(kmem_buckets *b, size_t size, gfp_t flags, int > > node) > > + __assume_kmalloc_alignment __alloc_size(2); > > #define __kmalloc_node(...) 
> > alloc_hooks(__kmalloc_node_noprof(__VA_ARGS__)) > > > > void *kmem_cache_alloc_node_noprof(struct kmem_cache *s, gfp_t flags, > > @@ -679,7 +679,7 @@ static __always_inline __alloc_size(1) void > > *kmalloc_node_noprof(size_t size, gf > > kmalloc_caches[kmalloc_type(flags, > > _RET_IP_)][index], > > flags, node, size); > > } > > - return __kmalloc_node_noprof(size, flags, node); > > + return __kmalloc_node_noprof(NULL, size, flags, node); > > This is not ideal as now every kmalloc_node() callsite will now have to add > the NULL parameter even if this is not enabled. Could the new parameter be > only added depending on the respective config? I felt like it was much simpler to add an argument to the existing call path than to create a duplicate API that had 1 extra argument. However, if you want this behind a Kconfig option, I can redefine the argument list based on that? -- Kees Cook
Re: [PATCH v3 0/6] slab: Introduce dedicated bucket allocator
On Fri, May 24, 2024 at 10:54:58AM -0400, Kent Overstreet wrote: > On Wed, Apr 24, 2024 at 02:40:57PM -0700, Kees Cook wrote: > > Hi, > > > > Series change history: > > > > v3: > > - clarify rationale and purpose in commit log > > - rebase to -next (CONFIG_CODE_TAGGING) > > - simplify calling styles and split out bucket plumbing more cleanly > > - consolidate kmem_buckets_*() family introduction patches > > v2: https://lore.kernel.org/lkml/20240305100933.it.923-k...@kernel.org/ > > v1: https://lore.kernel.org/lkml/20240304184252.work.496-k...@kernel.org/ > > > > For the cover letter, I'm repeating commit log for patch 4 here, which has > > additional clarifications and rationale since v2: > > > > Dedicated caches are available for fixed size allocations via > > kmem_cache_alloc(), but for dynamically sized allocations there is only > > the global kmalloc API's set of buckets available. This means it isn't > > possible to separate specific sets of dynamically sized allocations into > > a separate collection of caches. > > > > This leads to a use-after-free exploitation weakness in the Linux > > kernel since many heap memory spraying/grooming attacks depend on using > > userspace-controllable dynamically sized allocations to collide with > > fixed size allocations that end up in same cache. > > This is going to increase internal fragmentation in the slab allocator, > so we're going to need better, more visible numbers on the amount of > memory stranded thusly, so users can easily see the effect this has. Yes, but not significantly. It's less than the 16-buckets randomized kmalloc implementation. The numbers will be visible in /proc/slabinfo just like any other. > Please also document this effect and point users in the documentation > where to check, so that we devs can get feedback on this. Okay, sure. In the commit log, or did you have somewhere else in mind? -- Kees Cook
Re: [PATCH v3 4/6] mm/slab: Introduce kmem_buckets_create() and family
On Fri, May 24, 2024 at 03:43:33PM +0200, Vlastimil Babka wrote: > On 4/24/24 11:41 PM, Kees Cook wrote: > > Dedicated caches are available for fixed size allocations via > > kmem_cache_alloc(), but for dynamically sized allocations there is only > > the global kmalloc API's set of buckets available. This means it isn't > > possible to separate specific sets of dynamically sized allocations into > > a separate collection of caches. > > > > This leads to a use-after-free exploitation weakness in the Linux > > kernel since many heap memory spraying/grooming attacks depend on using > > userspace-controllable dynamically sized allocations to collide with > > fixed size allocations that end up in same cache. > > > > While CONFIG_RANDOM_KMALLOC_CACHES provides a probabilistic defense > > against these kinds of "type confusion" attacks, including for fixed > > same-size heap objects, we can create a complementary deterministic > > defense for dynamically sized allocations that are directly user > > controlled. Addressing these cases is limited in scope, so isolation these > > kinds of interfaces will not become an unbounded game of whack-a-mole. For > > example, pass through memdup_user(), making isolation there very > > effective. > > > > In order to isolate user-controllable sized allocations from system > > allocations, introduce kmem_buckets_create(), which behaves like > > kmem_cache_create(). Introduce kmem_buckets_alloc(), which behaves like > > kmem_cache_alloc(). Introduce kmem_buckets_alloc_track_caller() for > > where caller tracking is needed. Introduce kmem_buckets_valloc() for > > cases where vmalloc callback is needed. > > > > Allows for confining allocations to a dedicated set of sized caches > > (which have the same layout as the kmalloc caches). > > > > This can also be used in the future to extend codetag allocation > > annotations to implement per-caller allocation cache isolation[1] even > > for dynamic allocations. 
> > > > Memory allocation pinning[2] is still needed to plug the Use-After-Free > > cross-allocator weakness, but that is an existing and separate issue > > which is complementary to this improvement. Development continues for > > that feature via the SLAB_VIRTUAL[3] series (which could also provide > > guard pages -- another complementary improvement). > > > > Link: https://lore.kernel.org/lkml/202402211449.401382D2AF@keescook [1] > > Link: > > https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html > > [2] > > Link: > > https://lore.kernel.org/lkml/20230915105933.495735-1-matteori...@google.com/ > > [3] > > Signed-off-by: Kees Cook > > So this seems like it's all unconditional and not depending on a config > option? I'd rather see one, as has been done for all similar functionality > before, as not everyone will want this trade-off. Okay, sure, I can do that. Since it was changing some of the core APIs to pass in the target buckets (instead of using the global buckets), it seemed less complex to me to just make the simple changes unconditional. > Also AFAIU every new user (patches 5, 6) will add new bunch of lines to > /proc/slabinfo? And when you integrate alloc tagging, do you expect every Yes, that's true, but that's the goal: they want to live in a separate bucket collection. At present, it's only two, and arguably almost everything that the manual isolation provides should be using the memdup_user() interface anyway. > callsite will do that as well? Any idea how many there would be and what > kind of memory overhead it will have in the end? I haven't measured the per-codetag-site overhead, but I don't expect it to have giant overhead. If the buckets are created on demand, it should be small. But one step at a time. This provides the infrastructure needed to immediately solve the low-hanging exploitable fruit and to pave the way for the per-codetag-site automation. -Kees -- Kees Cook
Re: [PATCH] x86/boot: add prototype for __fortify_panic()
On Thu, May 30, 2024 at 09:23:36AM -0700, Jeff Johnson wrote: > On 5/30/2024 8:42 AM, Nikolay Borisov wrote: > > > > > > On 29.05.24 г. 21:09 ч., Jeff Johnson wrote: > >> As discussed in [1] add a prototype for __fortify_panic() to fix the > >> 'make W=1 C=1' warning: > >> > >> arch/x86/boot/compressed/misc.c:535:6: warning: symbol '__fortify_panic' > >> was not declared. Should it be static? > > > > Actually doesn't it make sense to have this defined under ../string.h ? > > Actually given that we don't have any string fortification under the > > boot/ why have the fortify _* functions at all ? > > I'll let Kees answer these questions since I just took guidance from him :) Ah-ha, I see what's happening. When not built with CONFIG_FORTIFY_SOURCE, fortify-string.h isn't included. But since misc.c has the function definition, we get a warning that the function declaration was never seen. This is likely the better solution: diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index b70e4a21c15f..3f21a5e218f8 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -532,7 +532,9 @@ asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output) return output + entry_offset; } +#ifdef CONFIG_FORTIFY_SOURCE void __fortify_panic(const u8 reason, size_t avail, size_t size) { error("detected buffer overflow"); } +#endif Jeff, can you test this? (I still haven't been able to reproduce the warning.) -- Kees Cook
Re: [PATCH] x86/traps: Enable UBSAN traps on x86
On Wed, May 29, 2024 at 01:36:39PM -0700, Gatlin Newhouse wrote: > On Wed, May 29, 2024 at 08:30:20PM UTC, Marco Elver wrote: > > On Wed, 29 May 2024 at 20:17, Gatlin Newhouse > > wrote: > > > > > > On Wed, May 29, 2024 at 09:25:21AM UTC, Marco Elver wrote: > > > > On Wed, 29 May 2024 at 04:20, Gatlin Newhouse > > > > wrote: > > > > [...] > > > > > if (regs->flags & X86_EFLAGS_IF) > > > > > raw_local_irq_enable(); > > > > > - if (report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN || > > > > > - handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) { > > > > > - regs->ip += LEN_UD2; > > > > > - handled = true; > > > > > + > > > > > + if (insn == INSN_UD2) { > > > > > + if (report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN > > > > > || > > > > > + handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN) { > > > > > + regs->ip += LEN_UD2; > > > > > + handled = true; > > > > > + } > > > > > + } else { > > > > > + if (handle_ubsan_failure(regs, insn) == > > > > > BUG_TRAP_TYPE_WARN) { > > > > > > > > handle_ubsan_failure currently only returns BUG_TRAP_TYPE_NONE? > > > > > > > > > + if (insn == INSN_REX) > > > > > + regs->ip += LEN_REX; > > > > > + regs->ip += LEN_UD1; > > > > > + handled = true; > > > > > + } > > > > > } > > > > > if (regs->flags & X86_EFLAGS_IF) > > > > > raw_local_irq_disable(); > > > > > diff --git a/arch/x86/kernel/ubsan.c b/arch/x86/kernel/ubsan.c > > > > > new file mode 100644 > > > > > index ..6cae11f4fe23 > > > > > --- /dev/null > > > > > +++ b/arch/x86/kernel/ubsan.c > > > > > @@ -0,0 +1,32 @@ > > > > > +// SPDX-License-Identifier: GPL-2.0 > > > > > +/* > > > > > + * Clang Undefined Behavior Sanitizer trap mode support. > > > > > + */ > > > > > +#include > > > > > +#include > > > > > +#include > > > > > +#include > > > > > +#include > > > > > +#include > > > > > + > > > > > +/* > > > > > + * Checks for the information embedded in the UD1 trap instruction > > > > > + * for the UB Sanitizer in order to pass along debugging output. 
> > > > > + */ > > > > > +enum bug_trap_type handle_ubsan_failure(struct pt_regs *regs, int > > > > > insn) > > > > > +{ > > > > > + u32 type = 0; > > > > > + > > > > > + if (insn == INSN_REX) { > > > > > + type = (*(u16 *)(regs->ip + LEN_REX + LEN_UD1)); > > > > > + if ((type & 0xFF) == 0x40) > > > > > + type = (type >> 8) & 0xFF; > > > > > + } else { > > > > > + type = (*(u16 *)(regs->ip + LEN_UD1)); > > > > > + if ((type & 0xFF) == 0x40) > > > > > + type = (type >> 8) & 0xFF; > > > > > + } > > > > > + pr_crit("%s at %pS\n", report_ubsan_failure(regs, type), > > > > > (void *)regs->ip); > > > > > + > > > > > + return BUG_TRAP_TYPE_NONE; > > > > > +} > > > > > > > > Shouldn't this return BUG_TRAP_TYPE_WARN? > > > > > > So as far as I understand, UBSAN trap mode never warns. Perhaps it does on > > > arm64, although it calls die() so I am unsure. Maybe the condition in > > > handle_bug() should be rewritten in the case of UBSAN ud1s? Do you have > > > any > > > suggestions? > > > > AFAIK on arm64 it's basically a kernel OOPS. > > > > The main thing I just wanted to point out though is that your newly added > > branch > > > > > if (handle_ubsan_failure(regs, insn) == BUG_TRAP_TYPE_WARN) { > > > > will never be taken, because I don't see where handle_ubsan_failure() > > returns BUG_TRAP_TYPE_WARN. > > > > Initially I wrote this with some symmetry to the KCFI checks nearby, but I > was unsure if this would be considered handled or not. Yeah, that seemed like the right "style" to me too. Perhaps, since it can never warn, we could just rewrite it so it's a void function and avoid the checking, etc. -- Kees Cook
Re: __fortify_panic() question
On Wed, May 29, 2024 at 10:09:45AM -0700, Jeff Johnson wrote: > On 5/29/2024 9:55 AM, Kees Cook wrote: > > On Wed, May 29, 2024 at 07:36:25AM -0700, Jeff Johnson wrote: > >> 'make W=1 C=1' on x86 gives the warning: > >> arch/x86/boot/compressed/misc.c:535:6: warning: symbol '__fortify_panic' > >> was not declared. Should it be static? > > > > Hm, I can't reproduce this currently (but yes, it looks like arm vs x86 > > is mismatched). What tree is this? > > e0cce98fe279 (linus/master, linux-master) Merge tag 'tpmdd-next-6.10-rc2' of > git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd > > > > >> Looking at this I see for ARM there is a prototype for __fortify_panic() in > >> arch/arm/boot/compressed/misc.h > >> And there is a matching implementation in arch/arm/boot/compressed/misc.c > >> > >> But for x86 there is only the implementation in > >> arch/x86/boot/compressed/misc.c > >> There is not a prototype in arch/x86/boot/compressed/misc.h. > >> > >> The easy fix for this would be to add a prototype to > >> arch/x86/boot/compressed/misc.h. > > > > Yeah, I think this is the right solution. > > You want to do this, or should I? Please feel free! I'd appreciate it. :) -- Kees Cook
Re: __fortify_panic() question
On Wed, May 29, 2024 at 07:36:25AM -0700, Jeff Johnson wrote: > 'make W=1 C=1' on x86 gives the warning: > arch/x86/boot/compressed/misc.c:535:6: warning: symbol '__fortify_panic' was > not declared. Should it be static? Hm, I can't reproduce this currently (but yes, it looks like arm vs x86 is mismatched). What tree is this? > Looking at this I see for ARM there is a prototype for __fortify_panic() in > arch/arm/boot/compressed/misc.h > And there is a matching implementation in arch/arm/boot/compressed/misc.c > > But for x86 there is only the implementation in > arch/x86/boot/compressed/misc.c > There is not a prototype in arch/x86/boot/compressed/misc.h. > > The easy fix for this would be to add a prototype to > arch/x86/boot/compressed/misc.h. Yeah, I think this is the right solution. > But it seems strange to me to add a prototype to a header file that is only > for the benefit of the callee and is not the prototype/header used by the > caller, in this case the one in include/linux/fortify-string.h The stuff in boot/ doesn't tend to include fortify-string.h (since it's sort of "outside" the kernel), hence the need for additional prototypes. -- Kees Cook
Re: [PATCH] drm/i915: 2 GiB of relocations ought to be enough for anybody*
On Tue, May 21, 2024 at 11:12:01AM +0100, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > Kernel test robot reports i915 can hit a warn in kvmalloc_node which has > a purpose of dissalowing crazy size kernel allocations. This was added in > 7661809d493b ("mm: don't allow oversized kvmalloc() calls"): > >/* Don't even allow crazy sizes */ >if (WARN_ON_ONCE(size > INT_MAX)) >return NULL; > > This would be kind of okay since i915 at one point dropped the need for > making a shadow copy of the relocation list, but then it got re-added in > fd1500fcd442 ("Revert "drm/i915/gem: Drop relocation slowpath".") a year > after Linus added the above warning. > > It is plausible that the issue was not seen until now because to trigger > gem_exec_reloc test requires a combination of an relatively older > generation hardware but with at least 8GiB of RAM installed. Probably even > more depending on runtime checks. > > Lets cap what we allow userspace to pass in using the matching limit. > There should be no issue for real userspace since we are talking about > "crazy" number of relocations which have no practical purpose. > > *) Well IGT tests might get upset but they can be easily adjusted. > > Signed-off-by: Tvrtko Ursulin Thanks for fixing this! Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v10 0/5] Introduce mseal
On Tue, May 14, 2024 at 12:52:13PM -0700, Kees Cook wrote: > On Tue, May 14, 2024 at 10:46:46AM -0700, Andrew Morton wrote: > > On Mon, 15 Apr 2024 16:35:19 + jef...@chromium.org wrote: > > > > > This patchset proposes a new mseal() syscall for the Linux kernel. > > > > I have not moved this into mm-stable for a 6.10 merge. Mainly because > > of the total lack of Reviewed-by:s and Acked-by:s. > > Oh, I thought I had already reviewed it. FWIW, please consider it: > > Reviewed-by: Kees Cook > > > The code appears to be stable enough for a merge. > > Agreed. > > > It's awkward that we're in conference this week, but I ask people to > > give consideration to the desirability of moving mseal() into mainline > > sometime over the next week, please. > > Yes please. :) Is the plan still to land this for 6.10? With the testing it's had in -next and Liam's review, I think we're good to go? Thanks! -Kees -- Kees Cook
[PATCH] ext4: Use memtostr_pad() for s_volume_name
As with the other strings in struct ext4_super_block, s_volume_name is not NUL terminated. The other strings were marked in commit 072ebb3bffe6 ("ext4: add nonstring annotations to ext4.h"). Using strscpy() isn't the right replacement for strncpy(); it should use memtostr_pad() instead. Reported-by: syzbot+50835f73143cc2905...@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/19f4c00619192...@google.com/ Fixes: 744a56389f73 ("ext4: replace deprecated strncpy with alternatives") Signed-off-by: Kees Cook --- Cc: "Theodore Ts'o" Cc: Justin Stitt Cc: Andreas Dilger Cc: linux-e...@vger.kernel.org --- fs/ext4/ext4.h | 2 +- fs/ext4/ioctl.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 983dad8c07ec..efed7f09876d 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1347,7 +1347,7 @@ struct ext4_super_block { /*60*/ __le32 s_feature_incompat; /* incompatible feature set */ __le32 s_feature_ro_compat;/* readonly-compatible feature set */ /*68*/ __u8s_uuid[16]; /* 128-bit uuid for volume */ -/*78*/ chars_volume_name[EXT4_LABEL_MAX]; /* volume name */ +/*78*/ chars_volume_name[EXT4_LABEL_MAX] __nonstring; /* volume name */ /*88*/ chars_last_mounted[64] __nonstring; /* directory where last mounted */ /*C8*/ __le32 s_algorithm_usage_bitmap; /* For compression */ /* diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index dab7acd49709..e8bf5972dd47 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -1151,7 +1151,7 @@ static int ext4_ioctl_getlabel(struct ext4_sb_info *sbi, char __user *user_label BUILD_BUG_ON(EXT4_LABEL_MAX >= FSLABEL_MAX); lock_buffer(sbi->s_sbh); - strscpy_pad(label, sbi->s_es->s_volume_name); + memtostr_pad(label, sbi->s_es->s_volume_name); unlock_buffer(sbi->s_sbh); if (copy_to_user(user_label, label, sizeof(label))) -- 2.34.1
Re: [linux-next:master] [mm/slab] 7bd230a266: WARNING:at_mm/util.c:#kvmalloc_node_noprof
On Sun, May 19, 2024 at 07:06:45PM -0400, Kent Overstreet wrote: > this looks like an i915 bug Yeah, agreed. > On Wed, May 15, 2024 at 10:41:19AM +0800, kernel test robot wrote: [...] > > [test failed on linux-next/master 6ba6c795dc73c22ce2c86006f17c4aa802db2a60] [...] > > > > If you fix the issue in a separate patch/commit (i.e. not just a new > > version of > > the same patch/commit), kindly add following tags > > | Reported-by: kernel test robot > > | Closes: > > https://lore.kernel.org/oe-lkp/202405151008.6ddd1aaf-oliver.s...@intel.com > > > > > > [ 940.101700][ T5353] [ cut here ] > > [ 940.107107][ T5353] WARNING: CPU: 1 PID: 5353 at mm/util.c:649 > > kvmalloc_node_noprof (mm/util.c:649 (discriminator 1)) This is: /* Don't even allow crazy sizes */ if (unlikely(size > INT_MAX)) { WARN_ON_ONCE(!(flags & __GFP_NOWARN)); > > [ 940.307791][ T5353] Call Trace: [...] > > [ 940.351795][ T5353] eb_copy_relocations > > (drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1685) i915 And this is: const unsigned int nreloc = eb->exec[i].relocation_count; ... size = nreloc * sizeof(*relocs); relocs = kvmalloc_array(1, size, GFP_KERNEL); So something isn't checking the "relocation_count" size that I assume is coming in from the ioctl? -Kees -- Kees Cook
Re: [GIT PULL] bcachefs updates fro 6.10-rc1
On Sun, May 19, 2024 at 12:14:34PM -0400, Kent Overstreet wrote: > [...] > bcachefs changes for 6.10-rc1 > [...] > bcachefs: bch2_btree_path_to_text() Hi Kent, I've asked after this before[1], but there continues to be a lot of bcachefs development going on that is only visible when it appears in -next or during the merge window. I cannot find the above commit on any mailing list on lore.kernel.org[2]. The rules for -next are clear: patches _must_ appear on a list _somewhere_ before they land in -next (much less Linus's tree). The point is to get additional reviews, and to serve as a focal point for any discussions that pop up over a given change. Please adjust the bcachefs development workflow to address this. Anyway, in reference to the above commit, please scrub bcachefs of its %px format string uses. Neither it nor %p should be used[3][4] in new code. (Which is, again, something checkpatch.pl will warn about.) Thanks! -Kees [1] https://lore.kernel.org/all/202401101525.112E8234@keescook/ [2] https://lore.kernel.org/linux-bcachefs/?q=bch2_btree_path_to_text [3] https://docs.kernel.org/process/deprecated.html#p-format-specifier [4] https://lore.kernel.org/lkml/ca+55afwqed_d40g4mucssvrzzrfpujt74vc6pppb675hynx...@mail.gmail.com/ -- Kees Cook
[PATCH 0/2] exec: Add KUnit test for bprm_stack_limits()
Hi, This adds a first KUnit test to the core exec code. With the ability to manipulate userspace memory from KUnit coming[1], I wanted to at least get the KUnit framework in place in exec.c. Most of the coming tests will likely be to binfmt_elf.c, but still, this serves as a reasonable first step. -Kees [1] https://lore.kernel.org/linux-hardening/20240519190422.work.715-k...@kernel.org/ Kees Cook (2): exec: Add KUnit test for bprm_stack_limits() exec: Avoid pathological argc, envc, and bprm->p values MAINTAINERS | 2 + fs/Kconfig.binfmt | 8 +++ fs/exec.c | 24 +++- fs/exec_test.c| 137 ++ 4 files changed, 170 insertions(+), 1 deletion(-) create mode 100644 fs/exec_test.c -- 2.34.1
[PATCH 2/2] exec: Avoid pathological argc, envc, and bprm->p values
Make sure nothing goes wrong with the string counters or the bprm's belief about the stack pointer. Add checks and matching self-tests. For 32-bit validation, this was run under 32-bit UML: $ tools/testing/kunit/kunit.py run --make_options SUBARCH=i386 exec Signed-off-by: Kees Cook --- Cc: Eric Biederman Cc: Justin Stitt Cc: Alexander Viro Cc: Christian Brauner Cc: Jan Kara Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org --- fs/exec.c | 11 ++- fs/exec_test.c | 26 +- 2 files changed, 35 insertions(+), 2 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 1d45e1a2d620..5dcdd115739e 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -503,6 +503,9 @@ static int bprm_stack_limits(struct linux_binprm *bprm) * of argument strings even with small stacks */ limit = max_t(unsigned long, limit, ARG_MAX); + /* Reject totally pathological counts. */ + if (bprm->argc < 0 || bprm->envc < 0) + return -E2BIG; /* * We must account for the size of all the argv and envp pointers to * the argv and envp strings, since they will also take up space in @@ -516,11 +519,17 @@ static int bprm_stack_limits(struct linux_binprm *bprm) * argc can never be 0, to keep them from walking envp by accident. * See do_execveat_common(). */ - ptr_size = (max(bprm->argc, 1) + bprm->envc) * sizeof(void *); + if (check_add_overflow(max(bprm->argc, 1), bprm->envc, _size) || + check_mul_overflow(ptr_size, sizeof(void *), _size)) + return -E2BIG; if (limit <= ptr_size) return -E2BIG; limit -= ptr_size; + /* Avoid a pathological bprm->p. 
*/ + if (bprm->p < limit) + return -E2BIG; + bprm->argmin = bprm->p - limit; return 0; } diff --git a/fs/exec_test.c b/fs/exec_test.c index 32a90c6f47e7..f2d4a80c861d 100644 --- a/fs/exec_test.c +++ b/fs/exec_test.c @@ -8,9 +8,32 @@ struct bprm_stack_limits_result { }; static const struct bprm_stack_limits_result bprm_stack_limits_results[] = { - /* Giant values produce -E2BIG */ + /* Negative argc/envc counts produce -E2BIG */ + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, + .argc = INT_MIN, .envc = INT_MIN }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, + .argc = 5, .envc = -1 }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, + .argc = -1, .envc = 10 }, .expected_rc = -E2BIG }, + /* The max value of argc or envc is MAX_ARG_STRINGS. */ { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, .argc = INT_MAX, .envc = INT_MAX }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, + .argc = MAX_ARG_STRINGS, .envc = MAX_ARG_STRINGS }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, + .argc = 0, .envc = MAX_ARG_STRINGS }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, + .argc = MAX_ARG_STRINGS, .envc = 0 }, .expected_rc = -E2BIG }, + /* +* On 32-bit system these argc and envc counts, while likely impossible +* to represent within the associated TASK_SIZE, could overflow the +* limit calculation, and bypass the ptr_size <= limit check. +*/ + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, + .argc = 0x2001, .envc = 0x2001 }, .expected_rc = -E2BIG }, + /* Make sure a pathological bprm->p doesn't cause an overflow. */ + { { .p = sizeof(void *), .rlim_stack.rlim_cur = ULONG_MAX, + .argc = 10, .envc = 10 }, .expected_rc = -E2BIG }, /* * 0 rlim_stack will get raised to ARG_MAX. With 1 string pointer, * we should see p - ARG_MAX + sizeof(void *). 
@@ -88,6 +111,7 @@ static void exec_test_bprm_stack_limits(struct kunit *test) /* Double-check the constants. */ KUNIT_EXPECT_EQ(test, _STK_LIM, SZ_8M); KUNIT_EXPECT_EQ(test, ARG_MAX, 32 * SZ_4K); + KUNIT_EXPECT_EQ(test, MAX_ARG_STRINGS, 0x7FFF); for (int i = 0; i < ARRAY_SIZE(bprm_stack_limits_results); i++) { const struct bprm_stack_limits_result *result = _stack_limits_results[i]; -- 2.34.1
[PATCH 1/2] exec: Add KUnit test for bprm_stack_limits()
Since bprm_stack_limits() operates with very limited side-effects, add it as the first exec.c KUnit test. Add to Kconfig and adjust MAINTAINERS file to include it. Tested on 64-bit UML: $ tools/testing/kunit/kunit.py run exec Signed-off-by: Kees Cook --- Cc: Eric Biederman Cc: Justin Stitt Cc: Alexander Viro Cc: Christian Brauner Cc: Jan Kara Cc: linux-fsde...@vger.kernel.org Cc: linux...@kvack.org --- MAINTAINERS | 2 + fs/Kconfig.binfmt | 8 fs/exec.c | 13 ++ fs/exec_test.c| 113 ++ 4 files changed, 136 insertions(+) create mode 100644 fs/exec_test.c diff --git a/MAINTAINERS b/MAINTAINERS index 7c121493f43d..845165dbb756 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8039,7 +8039,9 @@ S:Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/execve F: Documentation/userspace-api/ELF.rst F: fs/*binfmt_*.c +F: fs/Kconfig.binfmt F: fs/exec.c +F: fs/exec_test.c F: include/linux/binfmts.h F: include/linux/elf.h F: include/uapi/linux/binfmts.h diff --git a/fs/Kconfig.binfmt b/fs/Kconfig.binfmt index f5693164ca9a..58657f2d9719 100644 --- a/fs/Kconfig.binfmt +++ b/fs/Kconfig.binfmt @@ -176,4 +176,12 @@ config COREDUMP certainly want to say Y here. Not necessary on systems that never need debugging or only ever run flawless code. +config EXEC_KUNIT_TEST + bool "Build execve tests" if !KUNIT_ALL_TESTS + depends on KUNIT + default KUNIT_ALL_TESTS + help + This builds the exec KUnit tests, which tests boundary conditions + of various aspects of the exec internals. 
+ endmenu diff --git a/fs/exec.c b/fs/exec.c index b3c40fbb325f..1d45e1a2d620 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -475,6 +475,15 @@ static int count_strings_kernel(const char *const *argv) return i; } +/* + * Calculate bprm->argmin from: + * - _STK_LIM + * - ARG_MAX + * - bprm->rlim_stack.rlim_cur + * - bprm->argc + * - bprm->envc + * - bprm->p + */ static int bprm_stack_limits(struct linux_binprm *bprm) { unsigned long limit, ptr_size; @@ -2200,3 +2209,7 @@ static int __init init_fs_exec_sysctls(void) fs_initcall(init_fs_exec_sysctls); #endif /* CONFIG_SYSCTL */ + +#ifdef CONFIG_EXEC_KUNIT_TEST +#include "exec_test.c" +#endif diff --git a/fs/exec_test.c b/fs/exec_test.c new file mode 100644 index ..32a90c6f47e7 --- /dev/null +++ b/fs/exec_test.c @@ -0,0 +1,113 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include + +struct bprm_stack_limits_result { + struct linux_binprm bprm; + int expected_rc; + unsigned long expected_argmin; +}; + +static const struct bprm_stack_limits_result bprm_stack_limits_results[] = { + /* Giant values produce -E2BIG */ + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ULONG_MAX, + .argc = INT_MAX, .envc = INT_MAX }, .expected_rc = -E2BIG }, + /* +* 0 rlim_stack will get raised to ARG_MAX. With 1 string pointer, +* we should see p - ARG_MAX + sizeof(void *). +*/ + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0, + .argc = 1, .envc = 0 }, .expected_argmin = ULONG_MAX - ARG_MAX + sizeof(void *)}, + /* Validate that argc is always raised to a minimum of 1. */ + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0, + .argc = 0, .envc = 0 }, .expected_argmin = ULONG_MAX - ARG_MAX + sizeof(void *)}, + /* +* 0 rlim_stack will get raised to ARG_MAX. With pointers filling ARG_MAX, +* we should see -E2BIG. (Note argc is always raised to at least 1.) 
+*/ + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0, + .argc = ARG_MAX / sizeof(void *), .envc = 0 }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0, + .argc = 0, .envc = ARG_MAX / sizeof(void *) - 1 }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0, + .argc = ARG_MAX / sizeof(void *) + 1, .envc = 0 }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0, + .argc = 0, .envc = ARG_MAX / sizeof(void *) }, .expected_rc = -E2BIG }, + /* And with one less, we see space for exactly 1 pointer. */ + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0, + .argc = (ARG_MAX / sizeof(void *)) - 1, .envc = 0 }, + .expected_argmin = ULONG_MAX - sizeof(void *) }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = 0, + .argc = 0, .envc = (ARG_MAX / sizeof(void *)) - 2, }, + .expected_argmin = ULONG_MAX - sizeof(void *) }, + /* If we raise rlim_stack / 4 to exactly ARG_MAX, nothing changes. */ + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ARG_MAX * 4, + .argc = ARG_MAX / sizeof(void *), .envc = 0 }, .expected_rc = -E2BIG }, + { { .p = ULONG_MAX, .rlim_stack.rlim_cur = ARG_MA
[PATCH 2/2] usercopy: Convert test_user_copy to KUnit test
Convert the runtime tests of hardened usercopy to standard KUnit tests. Co-developed-by: Vitor Massaru Iha Signed-off-by: Vitor Massaru Iha Link: https://lore.kernel.org/r/20200721174654.72132-1-vi...@massaru.org Signed-off-by: Kees Cook --- MAINTAINERS| 1 + lib/Kconfig.debug | 21 +- lib/Makefile | 2 +- lib/{test_user_copy.c => usercopy_kunit.c} | 252 ++--- 4 files changed, 133 insertions(+), 143 deletions(-) rename lib/{test_user_copy.c => usercopy_kunit.c} (52%) diff --git a/MAINTAINERS b/MAINTAINERS index 7c121493f43d..73995b807e5a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11761,6 +11761,7 @@ F: arch/*/configs/hardening.config F: include/linux/overflow.h F: include/linux/randomize_kstack.h F: kernel/configs/hardening.config +F: lib/usercopy_kunit.c F: mm/usercopy.c K: \b(add|choose)_random_kstack_offset\b K: \b__check_(object_size|heap_object)\b diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index c63a5fbf1f1c..fd974480aa45 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -2460,18 +2460,6 @@ config TEST_VMALLOC If unsure, say N. -config TEST_USER_COPY - tristate "Test user/kernel boundary protections" - depends on m - help - This builds the "test_user_copy" module that runs sanity checks - on the copy_to/from_user infrastructure, making sure basic - user/kernel boundary testing is working. If it fails to load, - a regression has been detected in the user/kernel memory boundary - protections. - - If unsure, say N. - config TEST_BPF tristate "Test BPF filter functionality" depends on m && NET @@ -2779,6 +2767,15 @@ config SIPHASH_KUNIT_TEST This is intended to help people writing architecture-specific optimized versions. If unsure, say N. +config USERCOPY_KUNIT_TEST + tristate "KUnit Test for user/kernel boundary protections" + depends on KUNIT + default KUNIT_ALL_TESTS + help + This builds the "usercopy_kunit" module that runs sanity checks + on the copy_to/from_user infrastructure, making sure basic + user/kernel boundary testing is working. 
+ config TEST_UDELAY tristate "udelay test driver" help diff --git a/lib/Makefile b/lib/Makefile index ffc6b2341b45..6287bd6be5d7 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -78,7 +78,6 @@ obj-$(CONFIG_TEST_LKM) += test_module.o obj-$(CONFIG_TEST_VMALLOC) += test_vmalloc.o obj-$(CONFIG_TEST_RHASHTABLE) += test_rhashtable.o obj-$(CONFIG_TEST_SORT) += test_sort.o -obj-$(CONFIG_TEST_USER_COPY) += test_user_copy.o obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_keys.o obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o obj-$(CONFIG_TEST_DYNAMIC_DEBUG) += test_dynamic_debug.o @@ -406,6 +405,7 @@ obj-$(CONFIG_FORTIFY_KUNIT_TEST) += fortify_kunit.o obj-$(CONFIG_STRCAT_KUNIT_TEST) += strcat_kunit.o obj-$(CONFIG_STRSCPY_KUNIT_TEST) += strscpy_kunit.o obj-$(CONFIG_SIPHASH_KUNIT_TEST) += siphash_kunit.o +obj-$(CONFIG_USERCOPY_KUNIT_TEST) += usercopy_kunit.o obj-$(CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED) += devmem_is_allowed.o diff --git a/lib/test_user_copy.c b/lib/usercopy_kunit.c similarity index 52% rename from lib/test_user_copy.c rename to lib/usercopy_kunit.c index 5ff04d8fe971..515df08b3190 100644 --- a/lib/test_user_copy.c +++ b/lib/usercopy_kunit.c @@ -15,7 +15,7 @@ #include #include #include -#include +#include /* * Several 32-bit architectures support 64-bit {get,put}_user() calls. @@ -31,11 +31,17 @@ # define TEST_U64 #endif +struct usercopy_test_priv { + char *kmem; + char __user *umem; + size_t size; +}; + #define test(condition, msg, ...) \ ({ \ int cond = (condition); \ if (cond) \ - pr_warn("[%d] " msg "\n", __LINE__, ##__VA_ARGS__); \ + KUNIT_EXPECT_FALSE_MSG(test, cond, msg, ##__VA_ARGS__); \ cond; \ }) @@ -44,13 +50,16 @@ static bool is_zeroed(void *from, size_t size) return memchr_inv(from, 0x0, size) == NULL; } -static int test_check_nonzero_user(char *kmem, char __user *umem, size_t size) +/* Test usage of check_nonzero_user(). 
*/ +static void usercopy_test_check_nonzero_user(struct kunit *test) { - int ret = 0; size_t start, end, i, zero_start, zero_end; + struct usercopy_test_priv *priv = test->priv; + char __user *umem = priv->umem; + char *kmem = priv->kmem; + size_t size = p
[PATCH 0/2] usercopy: Convert test_user_copy to KUnit test
Hi, This builds on the proposal[1] from Mark and lets me convert the existing usercopy selftest to KUnit. Besides adding this basic test to the KUnit collection, it also opens the door for execve testing (which depends on having a functional current->mm), and should provide the basic infrastructure for adding Mark's much more complete usercopy tests. -Kees [1] https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ Kees Cook (2): kunit: test: Add vm_mmap() allocation resource manager usercopy: Convert test_user_copy to KUnit test MAINTAINERS| 1 + include/kunit/test.h | 17 ++ lib/Kconfig.debug | 21 +- lib/Makefile | 2 +- lib/kunit/test.c | 139 +++- lib/{test_user_copy.c => usercopy_kunit.c} | 252 ++--- 6 files changed, 288 insertions(+), 144 deletions(-) rename lib/{test_user_copy.c => usercopy_kunit.c} (52%) -- 2.34.1
[PATCH 1/2] kunit: test: Add vm_mmap() allocation resource manager
For tests that need to allocate using vm_mmap() (e.g. usercopy and execve), provide the interface to have the allocation tracked by KUnit itself. This requires bringing up a placeholder userspace mm. This combines my earlier attempt at this with Mark Rutland's version[1]. Link: https://lore.kernel.org/lkml/20230321122514.1743889-2-mark.rutl...@arm.com/ [1] Co-developed-by: Mark Rutland Signed-off-by: Mark Rutland Signed-off-by: Kees Cook --- include/kunit/test.h | 17 ++ lib/kunit/test.c | 139 ++- 2 files changed, 155 insertions(+), 1 deletion(-) diff --git a/include/kunit/test.h b/include/kunit/test.h index 61637ef32302..8c3835a6f282 100644 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@ -478,6 +478,23 @@ static inline void *kunit_kcalloc(struct kunit *test, size_t n, size_t size, gfp return kunit_kmalloc_array(test, n, size, gfp | __GFP_ZERO); } +/** + * kunit_vm_mmap() - Allocate KUnit-tracked vm_mmap() area + * @test: The test context object. + * @file: struct file pointer to map from, if any + * @addr: desired address, if any + * @len: how many bytes to allocate + * @prot: mmap PROT_* bits + * @flag: mmap flags + * @offset: offset into @file to start mapping from. + * + * See vm_mmap() for more information. 
+ */ +unsigned long kunit_vm_mmap(struct kunit *test, struct file *file, + unsigned long addr, unsigned long len, + unsigned long prot, unsigned long flag, + unsigned long offset); + void kunit_cleanup(struct kunit *test); void __printf(2, 3) kunit_log_append(struct string_stream *log, const char *fmt, ...); diff --git a/lib/kunit/test.c b/lib/kunit/test.c index 1d1475578515..09194dbffb63 100644 --- a/lib/kunit/test.c +++ b/lib/kunit/test.c @@ -11,13 +11,14 @@ #include #include #include +#include +#include #include #include #include #include #include #include -#include #include "debugfs.h" #include "device-impl.h" @@ -871,6 +872,142 @@ void kunit_kfree(struct kunit *test, const void *ptr) } EXPORT_SYMBOL_GPL(kunit_kfree); +struct kunit_vm_mmap_resource { + unsigned long addr; + size_t size; +}; + +/* vm_mmap() arguments */ +struct kunit_vm_mmap_params { + struct file *file; + unsigned long addr; + unsigned long len; + unsigned long prot; + unsigned long flag; + unsigned long offset; +}; + +/* + * Arbitrarily chosen user address for the base allocation. + */ +#define UBUF_ADDR_BASE SZ_2M + +/* Create and attach a new mm if it doesn't already exist. */ +static int kunit_attach_mm(void) +{ + struct vm_area_struct *vma; + struct mm_struct *mm; + + if (current->mm) + return 0; + + mm = mm_alloc(); + if (!mm) + return -ENOMEM; + + if (mmap_write_lock_killable(mm)) + goto out_free; + + /* Define the task size. */ + mm->task_size = TASK_SIZE; + + /* Prepare the base VMA. */ + vma = vm_area_alloc(mm); + if (!vma) + goto out_unlock; + + vma_set_anonymous(vma); + vma->vm_start = UBUF_ADDR_BASE; + vma->vm_end = UBUF_ADDR_BASE + PAGE_SIZE; + vm_flags_init(vma, VM_READ | VM_MAYREAD | VM_WRITE | VM_MAYWRITE); + vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); + + if (insert_vm_struct(mm, vma)) + goto out_free_vma; + + mmap_write_unlock(mm); + + /* Make sure we can allocate new VMAs. */ + arch_pick_mmap_layout(mm, >signal->rlim[RLIMIT_STACK]); + + /* Attach the mm. 
It will be cleaned up when the process dies. */ + kthread_use_mm(mm); + + return 0; + +out_free_vma: + vm_area_free(vma); +out_unlock: + mmap_write_unlock(mm); +out_free: + mmput(mm); + return -ENOMEM; +} + +static int kunit_vm_mmap_init(struct kunit_resource *res, void *context) +{ + struct kunit_vm_mmap_params *p = context; + struct kunit_vm_mmap_resource vres; + int ret; + + ret = kunit_attach_mm(); + if (ret) + return ret; + + vres.size = p->len; + vres.addr = vm_mmap(p->file, p->addr, p->len, p->prot, p->flag, p->offset); + if (!vres.addr) + return -ENOMEM; + res->data = kmemdup(, sizeof(vres), GFP_KERNEL); + if (!res->data) { + vm_munmap(vres.addr, vres.size); + return -ENOMEM; + } + + return 0; +} + +static void kunit_vm_mmap_free(struct kunit_resource *res) +{ + struct kunit_vm_mmap_resource *vres = res->data; + + /* +* Since this is executed from the test monitoring process, +* the test's mm has already been torn down. We don't need +* to run vm_munmap(vres->addr, vres->size), only clean up +* the vres. +*/ + +
Re: [PATCH] efi: pstore: Return proper errors on UEFI failures
On Sun, May 19, 2024 at 01:33:53PM -0300, Guilherme G. Piccoli wrote: > Right now efi-pstore either returns 0 (success) or -EIO; but we > do have a function to convert UEFI errors into different standard > error codes, helping to narrow down potential issues more accurately. > > So, let's use this helper here. > > Signed-off-by: Guilherme G. Piccoli Ah yeah, good idea! Reviewed-by: Kees Cook -- Kees Cook
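The helper being discussed translates firmware status codes into standard errno values instead of collapsing everything to -EIO. As a rough userspace illustration of that idea (the enum constants and function name here are invented for the sketch; the real kernel helper is efi_status_to_err()):

```c
#include <errno.h>

/* Hypothetical status values standing in for the real EFI_* codes. */
enum demo_efi_status {
	DEMO_EFI_SUCCESS,
	DEMO_EFI_INVALID_PARAMETER,
	DEMO_EFI_OUT_OF_RESOURCES,
	DEMO_EFI_NOT_FOUND,
	DEMO_EFI_DEVICE_ERROR,
};

/* Map a firmware status to a specific errno, falling back to -EIO. */
static int demo_status_to_err(enum demo_efi_status status)
{
	switch (status) {
	case DEMO_EFI_SUCCESS:
		return 0;
	case DEMO_EFI_INVALID_PARAMETER:
		return -EINVAL;
	case DEMO_EFI_OUT_OF_RESOURCES:
		return -ENOMEM;
	case DEMO_EFI_NOT_FOUND:
		return -ENOENT;
	default:
		return -EIO;
	}
}
```

The payoff is exactly what the commit message describes: a caller seeing -ENOENT versus -EINVAL can narrow down the failure without firmware-level debugging.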
Re: [PATCH] selftests: rtc: rtctest: Do not open-code TEST_HARNESS_MAIN
On Sat, May 18, 2024 at 10:23:54PM +0200, Alexandre Belloni wrote: > On 17/05/2024 17:16:58-0700, Kees Cook wrote: > > Argument processing is specific to the test harness code. Any optional > > information needs to be passed via environment variables. Move alternate > > path to the RTC_DEV environment variable. Also do not open-code > > TEST_HARNESS_MAIN because its definition may change. > > The main issue doing that is that this breaks the main use case of > rtctest as /dev/rtc1 is usually the main target for those tests. Having > the RTC_DEV environment variable only documented in this commit message > is definitely not enough; I'm going to have to handle a zillion > complaints that this is not working anymore. Hm, maybe switch the default to /dev/rtc1? Maybe there's a better way to integrate arguments into a test runner. Right now the core harness code is doing the argument parsing... -- Kees Cook
Re: [PATCH v2] drm/nouveau/nvif: Avoid build error due to potential integer overflows
On Sat, May 18, 2024 at 11:29:23AM -0700, Guenter Roeck wrote: > Trying to build parisc:allmodconfig with gcc 12.x or later results > in the following build error. > > drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd': > drivers/gpu/drm/nouveau/nvif/object.c:161:9: error: > 'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 > overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict] > 161 | memcpy(data, args->mthd.data, size); > | ^~~ > drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor': > drivers/gpu/drm/nouveau/nvif/object.c:298:17: error: > 'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 > overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict] > 298 | memcpy(data, args->new.data, size); > > gcc assumes that 'sizeof(*args) + size' can overflow, which would result > in the problem. > > The problem is not new, only it is now no longer a warning but an error > since W=1 has been enabled for the drm subsystem and since Werror is > enabled for test builds. > > Rearrange arithmetic and use check_add_overflow() for validating the > allocation size to avoid the overflow. > > Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the > subsystem") > Cc: Javier Martinez Canillas > Cc: Jani Nikula > Cc: Thomas Zimmermann > Cc: Danilo Krummrich > Cc: Maxime Ripard > Cc: Kees Cook > Cc: Christophe JAILLET > Signed-off-by: Guenter Roeck Yeah, looks good to me. Thanks! Reviewed-by: Kees Cook -- Kees Cook
[PATCH] kunit/fortify: Fix memcmp() test to be amplitude agnostic
When memcmp() returns a non-zero value, only the sign of the result has any meaning. The actual value may differ between implementations. Reported-by: Nathan Chancellor Closes: https://github.com/ClangBuiltLinux/linux/issues/2025 Tested-by: Nathan Chancellor Signed-off-by: Kees Cook --- Cc: linux-hardening@vger.kernel.org --- lib/fortify_kunit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/fortify_kunit.c b/lib/fortify_kunit.c index d2377e00caab..39da5b3bc649 100644 --- a/lib/fortify_kunit.c +++ b/lib/fortify_kunit.c @@ -990,7 +990,7 @@ static void fortify_test_memcmp(struct kunit *test) KUNIT_ASSERT_EQ(test, memcmp(one, two, one_len), 0); KUNIT_EXPECT_EQ(test, fortify_read_overflows, 0); /* Still in bounds, but no longer matching. */ - KUNIT_ASSERT_EQ(test, memcmp(one, two, one_len + 1), -32); + KUNIT_ASSERT_LT(test, memcmp(one, two, one_len + 1), 0); KUNIT_EXPECT_EQ(test, fortify_read_overflows, 0); /* Catch too-large ranges. */ -- 2.34.1
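The portability issue the patch fixes can be demonstrated in plain C: the standard only guarantees the sign of a nonzero memcmp() result, so a test must never assert a specific magnitude like -32. A minimal sign-only comparison sketch (the helper name is ours, not from the patch):

```c
#include <string.h>

/* Collapse memcmp()'s implementation-defined result to -1, 0, or 1;
 * only the sign of a nonzero return is specified by the C standard. */
static int memcmp_sign(const void *a, const void *b, size_t n)
{
	int ret = memcmp(a, b, n);

	return (ret > 0) - (ret < 0);
}
```

This is the same reasoning behind switching the assertion from KUNIT_ASSERT_EQ against a magic value to KUNIT_ASSERT_LT against zero.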
Re: [PATCH] dma-buf/fence-array: Add flex array to struct dma_fence_array
On Sat, May 18, 2024 at 07:47:02PM +0200, Christophe JAILLET wrote: > This is an effort to get rid of all multiplications from allocation > functions in order to prevent integer overflows [1][2]. > > The "struct dma_fence_array" can be refactored to add a flex array in order > to have the "callback structures allocated behind the array" be more > explicit. > > Do so: >- makes the code more readable and safer. >- allows using __counted_by() for additional checks >- avoids some pointer arithmetic in dma_fence_array_enable_signaling() > > Link: > https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments > [1] > Link: https://github.com/KSPP/linux/issues/160 [2] > Signed-off-by: Christophe JAILLET Yes please! :) Reviewed-by: Kees Cook -- Kees Cook
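For readers unfamiliar with the pattern being reviewed: the refactor replaces a "callbacks allocated behind the array" scheme, reached via pointer arithmetic, with an explicit flexible array member. A userspace sketch under invented names (the kernel version would use struct_size() for the allocation and __counted_by() on the member):

```c
#include <stdlib.h>
#include <stddef.h>

struct demo_cb {
	void (*func)(void *data);
	void *data;
};

/* Before: callbacks lived "behind" the header via (struct demo_cb *)(array + 1).
 * After: the flexible array member makes that layout explicit in the type. */
struct demo_fence_array {
	unsigned int num_entries;
	struct demo_cb callbacks[];	/* kernel: __counted_by(num_entries) */
};

static struct demo_fence_array *demo_array_alloc(unsigned int n)
{
	struct demo_fence_array *array;

	/* The size is computed in one place; the kernel's struct_size()
	 * additionally saturates on overflow instead of wrapping. */
	array = calloc(1, offsetof(struct demo_fence_array, callbacks) +
			  (size_t)n * sizeof(struct demo_cb));
	if (array)
		array->num_entries = n;
	return array;
}
```

Besides readability, this is what enables the compiler-checked bounds the commit message mentions: the array's extent is now part of the struct definition rather than implied by allocation-site arithmetic.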
Re: [WARNING] memcpy: detected field-spanning write (size 1005) of single field "&out_cmd->cmd.payload" at drivers/net/wireless/intel/iwlegacy/common.c:3173 (size 320)
On Sat, May 18, 2024 at 11:29:39AM +0200, Stanislaw Gruszka wrote: > Hi > > On Fri, Apr 12, 2024 at 07:48:39PM +0200, Xose Vazquez Perez wrote: > > Hi, > > > > In Fedora kernel 6.8.5-301.fc40.x86_64, dmesg shows: > > > > [ device: 03:00.0 Network controller [0280]: Intel Corporation PRO/Wireless > > 4965 AG or AGN [Kedron] Network Connection [8086:4230] (rev 61) ] > > > > Thanks. > > > > [ 53.407607] [ cut here ] > > [ 53.407622] memcpy: detected field-spanning write (size 1005) of single > > field "&out_cmd->cmd.payload" at > > drivers/net/wireless/intel/iwlegacy/common.c:3173 (size 320) > > [ 53.407721] WARNING: CPU: 1 PID: 1632 at > > drivers/net/wireless/intel/iwlegacy/common.c:3173 > > il_enqueue_hcmd+0x477/0x560 [iwlegacy] > > For CMD_SIZE_HUGE we have allocated 4k, so we do not do anything wrong. > Except maybe the code is convoluted, since we use the same structure for > huge and small il_device_cmd allocations. > > But I'm thinking how to fix this fortify warning without refactoring and > some extra runtime cost ... > > Xose, could you test the patch below? I have not tested it, but I think > it should make this particular warning go away and not break > anything. But maybe it will trigger some other fortify warnings. tl;dr: the proposed patch should work. Refactoring will still be needed in the future. :) Long version: struct il_device_cmd { struct il_cmd_header hdr; /* uCode API */ union { u32 flags; u8 val8; u16 val16; u32 val32; struct il_tx_cmd tx; u8 payload[DEF_CMD_PAYLOAD_SIZE]; } __packed cmd; } __packed; struct il_cmd_header { ... /* command or response/notification data follows immediately */ u8 data[]; } __packed; Yes, the proposed fix will silence the warning, but this struct is certainly on Gustavo's list to fix for "flexible arrays not-at-end" warnings[1]. This memcpy() is the perfect example of why we need to refactor these kinds of structs: the object size is ambiguous for the compiler. 
It could be as little as sizeof(struct il_device_cmd), but it could be larger, because of the "hdr" member. Right now, it depends on how we happen to address it (as your change is doing). Regardless, thanks for tracking this down and fixing it! -Kees [1] https://lore.kernel.org/lkml/ZgsAFhl90kecIR00@neat/ > > Regards > Stanislaw > > diff --git a/drivers/net/wireless/intel/iwlegacy/common.c > b/drivers/net/wireless/intel/iwlegacy/common.c > index 9d33a66a49b5..c4ccc5df6419 100644 > --- a/drivers/net/wireless/intel/iwlegacy/common.c > +++ b/drivers/net/wireless/intel/iwlegacy/common.c > @@ -3170,7 +3170,7 @@ il_enqueue_hcmd(struct il_priv *il, struct il_host_cmd > *cmd) > out_meta->callback = cmd->callback; > > out_cmd->hdr.cmd = cmd->id; > - memcpy(&out_cmd->cmd.payload, cmd->data, cmd->len); > + memcpy(&out_cmd->hdr.data, cmd->data, cmd->len); > > /* At this point, the out_cmd now has all of the incoming cmd >* information */ -- Kees Cook
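The ambiguity described above can be reproduced in a few lines. In this simplified shape (member names mirror the il_* structs quoted in the thread, but the types and sizes are invented), hdr ends in a flexible array member even though it sits at the front of the containing struct, so writes through hdr.data and cmd.payload land on the same bytes while implying different object bounds to the compiler:

```c
#include <stddef.h>

struct demo_hdr {
	unsigned char cmd;
	unsigned char data[];	/* flexible array member, yet NOT at the end
				 * of the containing struct below */
} __attribute__((packed));

struct demo_device_cmd {
	struct demo_hdr hdr;
	union {
		unsigned int flags;
		unsigned char payload[320];
	} __attribute__((packed)) cmd;
} __attribute__((packed));
```

Copying through &cmd.payload tells FORTIFY_SOURCE the destination is exactly 320 bytes; copying through &hdr.data implies "whatever follows the header", which is why the proposed change silences the warning even though the bytes written are identical.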
Re: [PATCH] drm/nouveau/nvif: Avoid build error due to potential integer overflows
On Sat, May 18, 2024 at 06:54:36PM +0200, Christophe JAILLET wrote: > (adding linux-harden...@vger.kernel.org) > > > Le 18/05/2024 à 16:37, Guenter Roeck a écrit : > > Trying to build parisc:allmodconfig with gcc 12.x or later results > > in the following build error. > > > > drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd': > > drivers/gpu/drm/nouveau/nvif/object.c:161:9: error: > > 'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 > > overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict] > >161 | memcpy(data, args->mthd.data, size); > >| ^~~ > > drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor': > > drivers/gpu/drm/nouveau/nvif/object.c:298:17: error: > > 'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 > > overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict] > >298 | memcpy(data, args->new.data, size); > > > > gcc assumes that 'sizeof(*args) + size' can overflow, which would result > > in the problem. > > > > The problem is not new, only it is now no longer a warning but an error > > since W=1 > > has been enabled for the drm subsystem and since Werror is enabled for test > > builds. > > > > Rearrange arithmetic and add extra size checks to avoid the overflow. > > > > Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the > > subsystem") > > Cc: Javier Martinez Canillas > > > > Cc: Jani Nikula > > Cc: Thomas Zimmermann > > Cc: Danilo Krummrich > > Cc: Maxime Ripard > > Signed-off-by: Guenter Roeck > > --- > > checkpatch complains about the line length in the description and the > > (pre-existing) > > assignlemts in if conditions, but I did not want to split lines in the > > description > > or rearrange the code further. > > > > I don't know why I only see the problem with parisc builds (at least so > > far). 
> > > > drivers/gpu/drm/nouveau/nvif/object.c | 8 +--- > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/nouveau/nvif/object.c > > b/drivers/gpu/drm/nouveau/nvif/object.c > > index 4d1aaee8fe15..baf623a48874 100644 > > --- a/drivers/gpu/drm/nouveau/nvif/object.c > > +++ b/drivers/gpu/drm/nouveau/nvif/object.c > > @@ -145,8 +145,9 @@ nvif_object_mthd(struct nvif_object *object, u32 mthd, > > void *data, u32 size) > > u8 stack[128]; > > int ret; > > - if (sizeof(*args) + size > sizeof(stack)) { > > - if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL))) > > + if (size > sizeof(stack) - sizeof(*args)) { > > + if (size > INT_MAX || > > + !(args = kmalloc(sizeof(*args) + size, GFP_KERNEL))) > > Hi, > > Would it be cleaner or better to use size_add(sizeof(*args), size)? I think the INT_MAX test is actually better in this case because nvif_object_ioctl()'s size argument is u32: ret = nvif_object_ioctl(object, args, sizeof(*args) + size, NULL); So that could wrap around, even though the allocation may not. Better yet, since "sizeof(*args) + size" is repeated 3 times in the function, I'd recommend: ... u32 args_size; if (check_add_overflow(sizeof(*args), size, &args_size)) return -ENOMEM; if (args_size > sizeof(stack)) { if (!(args = kmalloc(args_size, GFP_KERNEL))) return -ENOMEM; } else { args = (void *)stack; } ... ret = nvif_object_ioctl(object, args, args_size, NULL); This will catch the u32 overflow to nvif_object_ioctl(), catch an allocation underflow on 32-bit systems, and make the code more readable. :) -Kees -- Kees Cook
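The recommendation above hinges on computing the combined size exactly once, with an overflow check, and reusing it for both the allocation and the ioctl length. The kernel's check_add_overflow() wraps the compiler's __builtin_add_overflow(), which can be demonstrated in userspace (the function name here is invented for the sketch):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compute hdr_size + size into a u32 result, reporting overflow instead
 * of silently wrapping -- the same contract as check_add_overflow(). */
static bool demo_args_size(size_t hdr_size, uint32_t size, uint32_t *args_size)
{
	return !__builtin_add_overflow(hdr_size, size, args_size);
}
```

Because the result variable is u32, the check also catches the wrap-around that would otherwise reach nvif_object_ioctl()'s u32 length argument, even on 64-bit systems where the kmalloc() size computation itself could not overflow.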
Re: [PATCH 1/2] selftests: harness: remove unneeded __constructor_order_last()
On Sat, May 18, 2024 at 12:29:00PM +0900, Masahiro Yamada wrote: > It will be set to "true" eventually, > but __LIST_APPEND() still sees "false" > on backward-order systems. Ah, yes -- you are right. I looked through the commit history (I had to go back to when the seccomp test, and the harness, was out of tree). There was a time when the logic happened during the list walking, rather than during list _creation_. I was remembering the former. So, yes, let's make this change. As you say, it also solves for defining TEST_HARNESS_MAIN before the tests. Thank you! I'd still like to replace all the open-coded TEST_HARNESS_MAIN calls, though. -- Kees Cook
[PATCH] selftests: drivers/s390x: Use SKIP() during FIXTURE_SETUP
Instead of mixing selftest harness and ksft helpers, perform SKIP testing from the FIXTURE_SETUPs. This also means TEST_HARNESS_MAIN does not need to be open-coded. Signed-off-by: Kees Cook --- Cc: Christian Borntraeger Cc: Janosch Frank Cc: Claudio Imbrenda Cc: David Hildenbrand Cc: Shuah Khan Cc: Masahiro Yamada Cc: k...@vger.kernel.org Cc: linux-kselftest@vger.kernel.org --- I wasn't able to build or run test this, since it's very s390 specific, it seems? --- .../drivers/s390x/uvdevice/test_uvdevice.c| 32 --- 1 file changed, 13 insertions(+), 19 deletions(-) diff --git a/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c b/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c index ea0cdc37b44f..9622fac42597 100644 --- a/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c +++ b/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c @@ -18,6 +18,16 @@ #define UV_PATH "/dev/uv" #define BUFFER_SIZE 0x200 + +#define open_uv() \ +do { \ + self->uv_fd = open(UV_PATH, O_ACCMODE); \ + if (self->uv_fd < 0) \ + SKIP(return, "No uv-device or cannot access " UV_PATH "\n" \ +"Enable CONFIG_S390_UV_UAPI and check the access rights on " \ +UV_PATH ".\n"); \ +} while (0) + FIXTURE(uvio_fixture) { int uv_fd; struct uvio_ioctl_cb uvio_ioctl; @@ -37,7 +47,7 @@ FIXTURE_VARIANT_ADD(uvio_fixture, att) { FIXTURE_SETUP(uvio_fixture) { - self->uv_fd = open(UV_PATH, O_ACCMODE); + open_uv(); self->uvio_ioctl.argument_addr = (__u64)self->buffer; self->uvio_ioctl.argument_len = variant->arg_size; @@ -159,7 +169,7 @@ FIXTURE(attest_fixture) { FIXTURE_SETUP(attest_fixture) { - self->uv_fd = open(UV_PATH, O_ACCMODE); + open_uv(); self->uvio_ioctl.argument_addr = (__u64)&self->uvio_attest; self->uvio_ioctl.argument_len = sizeof(self->uvio_attest); @@ -257,20 +267,4 @@ TEST_F(attest_fixture, att_inval_addr) att_inval_addr_test(&self->uvio_attest.meas_addr, _metadata, self); } -static void __attribute__((constructor)) __constructor_order_last(void) -{ - if 
(!__constructor_order) - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD; -} - -int main(int argc, char **argv) -{ - int fd = open(UV_PATH, O_ACCMODE); - - if (fd < 0) - ksft_exit_skip("No uv-device or cannot access " UV_PATH "\n" - "Enable CONFIG_S390_UV_UAPI and check the access rights on " - UV_PATH ".\n"); - close(fd); - return test_harness_run(argc, argv); -} +TEST_HARNESS_MAIN -- 2.34.1
[PATCH] selftests: hid: Do not open-code TEST_HARNESS_MAIN
Avoid open-coding TEST_HARNESS_MAIN. (It might change, for example.) Signed-off-by: Kees Cook --- Cc: Jiri Kosina Cc: Benjamin Tissoires Cc: Shuah Khan Cc: Masahiro Yamada Cc: linux-in...@vger.kernel.org Cc: linux-kselftest@vger.kernel.org --- tools/testing/selftests/hid/hid_bpf.c | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/tools/testing/selftests/hid/hid_bpf.c b/tools/testing/selftests/hid/hid_bpf.c index f825623e3edc..943fa62a4f78 100644 --- a/tools/testing/selftests/hid/hid_bpf.c +++ b/tools/testing/selftests/hid/hid_bpf.c @@ -961,17 +961,11 @@ static int libbpf_print_fn(enum libbpf_print_level level, return 0; } -static void __attribute__((constructor)) __constructor_order_last(void) -{ - if (!__constructor_order) - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD; -} - -int main(int argc, char **argv) +static void __attribute__((constructor)) __one_time_init(void) { /* Use libbpf 1.0 API mode */ libbpf_set_strict_mode(LIBBPF_STRICT_ALL); libbpf_set_print(libbpf_print_fn); - - return test_harness_run(argc, argv); } + +TEST_HARNESS_MAIN -- 2.34.1
[PATCH] selftests: rtc: rtctest: Do not open-code TEST_HARNESS_MAIN
Argument processing is specific to the test harness code. Any optional information needs to be passed via environment variables. Move alternate path to the RTC_DEV environment variable. Also do not open-code TEST_HARNESS_MAIN because its definition may change. Additionally, setup checking can be done in the FIXTURE_SETUP(). With this adjustment, also improve the error reporting when the device cannot be opened. Signed-off-by: Kees Cook --- Cc: Alexandre Belloni Cc: Shuah Khan Cc: Masahiro Yamada Cc: linux-...@vger.kernel.org Cc: linux-kselftest@vger.kernel.org --- tools/testing/selftests/rtc/Makefile | 2 +- tools/testing/selftests/rtc/rtctest.c | 66 +-- 2 files changed, 13 insertions(+), 55 deletions(-) diff --git a/tools/testing/selftests/rtc/Makefile b/tools/testing/selftests/rtc/Makefile index 55198ecc04db..654f9d58da3c 100644 --- a/tools/testing/selftests/rtc/Makefile +++ b/tools/testing/selftests/rtc/Makefile @@ -1,5 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 -CFLAGS += -O3 -Wl,-no-as-needed -Wall +CFLAGS += -O3 -Wl,-no-as-needed -Wall $(KHDR_INCLUDES) LDLIBS += -lrt -lpthread -lm TEST_GEN_PROGS = rtctest diff --git a/tools/testing/selftests/rtc/rtctest.c b/tools/testing/selftests/rtc/rtctest.c index 63ce02d1d5cc..41cfefcc20e1 100644 --- a/tools/testing/selftests/rtc/rtctest.c +++ b/tools/testing/selftests/rtc/rtctest.c @@ -30,7 +30,18 @@ FIXTURE(rtc) { }; FIXTURE_SETUP(rtc) { + char *alternate = getenv("RTC_DEV"); + + if (alternate) + rtc_file = alternate; + self->fd = open(rtc_file, O_RDONLY); + + if (self->fd == -1 && errno == ENOENT) + SKIP(return, "Skipping test since %s does not exist", rtc_file); + EXPECT_NE(-1, self->fd) { + TH_LOG("%s: %s\n", rtc_file, strerror(errno)); + } } FIXTURE_TEARDOWN(rtc) { @@ -41,10 +52,6 @@ TEST_F(rtc, date_read) { int rc; struct rtc_time rtc_tm; - if (self->fd == -1 && errno == ENOENT) - SKIP(return, "Skipping test since %s does not exist", rtc_file); - ASSERT_NE(-1, self->fd); - /* Read the RTC time/date */ rc = 
ioctl(self->fd, RTC_RD_TIME, &rtc_tm); ASSERT_NE(-1, rc); @@ -88,10 +95,6 @@ TEST_F_TIMEOUT(rtc, date_read_loop, READ_LOOP_DURATION_SEC + 2) { struct rtc_time rtc_tm; time_t start_rtc_read, prev_rtc_read; - if (self->fd == -1 && errno == ENOENT) - SKIP(return, "Skipping test since %s does not exist", rtc_file); - ASSERT_NE(-1, self->fd); - TH_LOG("Continuously reading RTC time for %ds (with %dms breaks after every read).", READ_LOOP_DURATION_SEC, READ_LOOP_SLEEP_MS); @@ -126,10 +129,6 @@ TEST_F_TIMEOUT(rtc, uie_read, NUM_UIE + 2) { int i, rc, irq = 0; unsigned long data; - if (self->fd == -1 && errno == ENOENT) - SKIP(return, "Skipping test since %s does not exist", rtc_file); - ASSERT_NE(-1, self->fd); - /* Turn on update interrupts */ rc = ioctl(self->fd, RTC_UIE_ON, 0); if (rc == -1) { @@ -155,10 +154,6 @@ TEST_F(rtc, uie_select) { int i, rc, irq = 0; unsigned long data; - if (self->fd == -1 && errno == ENOENT) - SKIP(return, "Skipping test since %s does not exist", rtc_file); - ASSERT_NE(-1, self->fd); - /* Turn on update interrupts */ rc = ioctl(self->fd, RTC_UIE_ON, 0); if (rc == -1) { @@ -198,10 +193,6 @@ TEST_F(rtc, alarm_alm_set) { time_t secs, new; int rc; - if (self->fd == -1 && errno == ENOENT) - SKIP(return, "Skipping test since %s does not exist", rtc_file); - ASSERT_NE(-1, self->fd); - rc = ioctl(self->fd, RTC_RD_TIME, &tm); ASSERT_NE(-1, rc); @@ -256,10 +247,6 @@ TEST_F(rtc, alarm_wkalm_set) { time_t secs, new; int rc; - if (self->fd == -1 && errno == ENOENT) - SKIP(return, "Skipping test since %s does not exist", rtc_file); - ASSERT_NE(-1, self->fd); - rc = ioctl(self->fd, RTC_RD_TIME, &tm); ASSERT_NE(-1, rc); @@ -308,10 +295,6 @@ TEST_F_TIMEOUT(rtc, alarm_alm_set_minute, 65) { time_t secs, new; int rc; - if (self->fd == -1 && errno == ENOENT) - SKIP(return, "Skipping test since %s does not exist", rtc_file); - ASSERT_NE(-1, self->fd); - rc = ioctl(self->fd, RTC_RD_TIME, &tm); ASSERT_NE(-1, rc); @@ -366,10 +349,6 @@ TEST_F_TIMEOUT(rtc, alarm_wkalm_set_minute, 
65) { time_t secs, new; int rc; - if (self->fd == -1 && errno == ENOENT) - SKIP(return, "Skipping test since
Re: [PATCH 1/2] selftests: harness: remove unneeded __constructor_order_last()
On Fri, May 17, 2024 at 08:45:05PM +0900, Masahiro Yamada wrote: > __constructor_order_last() is unneeded. > > If __constructor_order_last() is not called on reverse-order systems, > __constructor_order will remain 0 instead of being set to > _CONSTRUCTOR_ORDER_BACKWARD (= -1). > > __LIST_APPEND() will still take the 'else' branch, so there is no > difference in the behavior. > > Signed-off-by: Masahiro Yamada > --- > > .../selftests/drivers/s390x/uvdevice/test_uvdevice.c | 6 -- > tools/testing/selftests/hid/hid_bpf.c | 6 -- > tools/testing/selftests/kselftest_harness.h| 10 +- > tools/testing/selftests/rtc/rtctest.c | 7 --- > 4 files changed, 1 insertion(+), 28 deletions(-) > > diff --git a/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c > b/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c > index ea0cdc37b44f..7ee7492138c6 100644 > --- a/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c > +++ b/tools/testing/selftests/drivers/s390x/uvdevice/test_uvdevice.c > @@ -257,12 +257,6 @@ TEST_F(attest_fixture, att_inval_addr) > att_inval_addr_test(&self->uvio_attest.meas_addr, _metadata, self); > } > > -static void __attribute__((constructor)) __constructor_order_last(void) > -{ > - if (!__constructor_order) > - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD; > -} > - > int main(int argc, char **argv) > { > int fd = open(UV_PATH, O_ACCMODE); > diff --git a/tools/testing/selftests/hid/hid_bpf.c > b/tools/testing/selftests/hid/hid_bpf.c > index 2cf96f818f25..f47feef2aced 100644 > --- a/tools/testing/selftests/hid/hid_bpf.c > +++ b/tools/testing/selftests/hid/hid_bpf.c > @@ -853,12 +853,6 @@ static int libbpf_print_fn(enum libbpf_print_level level, > return 0; > } > > -static void __attribute__((constructor)) __constructor_order_last(void) > -{ > - if (!__constructor_order) > - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD; > -} > - > int main(int argc, char **argv) > { > /* Use libbpf 1.0 API mode */ > diff --git
a/tools/testing/selftests/kselftest_harness.h > b/tools/testing/selftests/kselftest_harness.h > index ba3ddeda24bf..60c1cf5b0f0d 100644 > --- a/tools/testing/selftests/kselftest_harness.h > +++ b/tools/testing/selftests/kselftest_harness.h > @@ -444,12 +444,6 @@ > * Use once to append a main() to the test file. > */ > #define TEST_HARNESS_MAIN \ > - static void __attribute__((constructor)) \ > - __constructor_order_last(void) \ > - { \ > - if (!__constructor_order) \ > - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD; \ > - } \ > int main(int argc, char **argv) { \ > return test_harness_run(argc, argv); \ > } This won't work. All constructors are executed, so we have to figure out which is run _first_. Switching this to a boolean means we gain no information about ordering: it'll always be set to "true". We need to detect which constructor sets it first so that we can walk the lists (that are built via all the constructors in between) in the correct order. > @@ -846,7 +840,6 @@ static struct __fixture_metadata *__fixture_list = > &_fixture_global; > static int __constructor_order; > > #define _CONSTRUCTOR_ORDER_FORWARD 1 > -#define _CONSTRUCTOR_ORDER_BACKWARD -1 > > static inline void __register_fixture(struct __fixture_metadata *f) > { > @@ -1272,8 +1265,7 @@ static int test_harness_run(int argc, char **argv) > > static void __attribute__((constructor)) __constructor_order_first(void) > { > - if (!__constructor_order) > - __constructor_order = _CONSTRUCTOR_ORDER_FORWARD; > + __constructor_order = _CONSTRUCTOR_ORDER_FORWARD; > } > > #endif /* __KSELFTEST_HARNESS_H */ > diff --git a/tools/testing/selftests/rtc/rtctest.c > b/tools/testing/selftests/rtc/rtctest.c > index 63ce02d1d5cc..9647b14b47c5 100644 > --- a/tools/testing/selftests/rtc/rtctest.c > +++ b/tools/testing/selftests/rtc/rtctest.c > @@ -410,13 +410,6 @@ TEST_F_TIMEOUT(rtc, alarm_wkalm_set_minute, 65) { > ASSERT_EQ(new, secs); > } > > -static void __attribute__((constructor)) > 
-__constructor_order_last(void) > -{ > - if (!__constructor_order) > - __constructor_order = _CONSTRUCTOR_ORDER_BACKWARD; > -} > - > int main(int argc, char **argv) > { > switch (argc) { A better question is why these tests are open-coding the execution of "main"... -- Kees Cook
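The ordering question argued over in this thread is easy to demonstrate standalone: every __attribute__((constructor)) runs before main(), and the harness's original scheme recorded which one fired first to learn whether the platform runs constructors in source (forward) or reverse order. A minimal sketch of that detection idea (names invented, not the harness's actual symbols):

```c
#include <stdio.h>

static int first_ctor;	/* 1 if ctor_a fired first, 2 if ctor_b did */

static void __attribute__((constructor)) ctor_a(void)
{
	if (!first_ctor)
		first_ctor = 1;
}

static void __attribute__((constructor)) ctor_b(void)
{
	if (!first_ctor)
		first_ctor = 2;
}
```

Both constructors always run; only their relative order can differ. Within a single translation unit GCC and Clang typically fire them top to bottom, which is the property the harness's __constructor_order variable captures when walking the test lists built by the other constructors.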