Re: [PATCH v6 13/41] mm: Make pte_mkwrite() take a VMA
On Sat, Feb 18, 2023 at 01:14:05PM -0800, Rick Edgecombe wrote: > The x86 Control-flow Enforcement Technology (CET) feature includes a new > type of memory called shadow stack. This shadow stack memory has some > unusual properties, which requires some core mm changes to function > properly. > > One of these unusual properties is that shadow stack memory is writable, > but only in limited ways. These limits are applied via a specific PTE > bit combination. Nevertheless, the memory is writable, and core mm code > will need to apply the writable permissions in the typical paths that > call pte_mkwrite(). > > In addition to VM_WRITE, the shadow stack VMA's will have a flag denoting > that they are special shadow stack flavor of writable memory. So make > pte_mkwrite() take a VMA, so that the x86 implementation of it can know to > create regular writable memory or shadow stack memory. > > Apply the same changes for pmd_mkwrite() and huge_pte_mkwrite(). > > No functional change. > > Cc: linux-...@vger.kernel.org > Cc: linux-ker...@vger.kernel.org > Cc: linux-al...@vger.kernel.org > Cc: linux-snps-...@lists.infradead.org > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-c...@vger.kernel.org > Cc: linux-hexa...@vger.kernel.org > Cc: linux-i...@vger.kernel.org > Cc: loonga...@lists.linux.dev > Cc: linux-m...@lists.linux-m68k.org > Cc: Michal Simek > Cc: Dinh Nguyen > Cc: linux-m...@vger.kernel.org > Cc: openr...@lists.librecores.org > Cc: linux-par...@vger.kernel.org > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux-ri...@lists.infradead.org > Cc: linux-s...@vger.kernel.org > Cc: linux...@vger.kernel.org > Cc: sparcli...@vger.kernel.org > Cc: linux...@lists.infradead.org > Cc: xen-de...@lists.xenproject.org > Cc: linux-a...@vger.kernel.org > Cc: linux...@kvack.org > Tested-by: Pengfei Xu > Suggested-by: David Hildenbrand > Signed-off-by: Rick Edgecombe I'm not an arch maintainer, but it looks like a correct tree-wide refactor. Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH] powerpc/rtas: Replace one-element arrays with flexible arrays
On Fri, Jan 27, 2023 at 07:10:28AM -0600, Nathan Lynch wrote: > Andrew Donnellan writes: > > Using a one-element array as a fake flexible array is deprecated. > > > > Replace the one-element flexible arrays in rtas-types.h with C99 standard > > flexible array members instead. > > > > This helps us move towards enabling -fstrict-flex-arrays=3 in future. > > > > Found using scripts/coccinelle/misc/flexible_array.cocci. > > > > Cc: Nathan Lynch > > Cc: Leonardo Bras > > Cc: linux-harden...@vger.kernel.org > > Link: https://github.com/KSPP/linux/issues/21 > > Link: https://github.com/KSPP/linux/issues/79 > > Signed-off-by: Andrew Donnellan > > --- > > arch/powerpc/include/asm/rtas-types.h | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/rtas-types.h > > b/arch/powerpc/include/asm/rtas-types.h > > index 8df6235d64d1..40ec03a05c0b 100644 > > --- a/arch/powerpc/include/asm/rtas-types.h > > +++ b/arch/powerpc/include/asm/rtas-types.h > > @@ -44,7 +44,7 @@ struct rtas_error_log { > > */ > > u8 byte3; /* General event or error*/ > > __be32 extended_log_length;/* length in bytes */ > > - unsigned char buffer[1]; /* Start of extended log */ > > + unsigned char buffer[]; /* Start of extended log */ > > /* Variable length. */ > > }; > > > > @@ -85,7 +85,7 @@ struct rtas_ext_event_log_v6 { > > /* that defines the format for */ > > /* the vendor specific log type */ > > /* Byte 16-end of log */ > > - u8 vendor_log[1]; /* Start of vendor specific log */ > > + u8 vendor_log[];/* Start of vendor specific log */ > > /* Variable length. */ > > }; > > I see at least one place that consults the size of one of these structs, > in get_pseries_errorlog(): > > /* Check that we understand the format */ > if (ext_log_length < sizeof(struct rtas_ext_event_log_v6) || ... > > Don't all such sites need to be audited/adjusted for changes like this? Yeah, I'd expect a binary comparison[1] before/after to catch things like this. E.g. the following C files mention those structs: arch/powerpc/platforms/pseries/io_event_irq.c arch/powerpc/platforms/pseries/ras.c arch/powerpc/kernel/rtasd.c arch/powerpc/kernel/rtas.c -Kees [1] https://outflux.net/blog/archives/2022/06/24/finding-binary-differences/ -- Kees Cook
Re: [PATCH mm-unstable v1 20/20] mm: rename FOLL_FORCE to FOLL_PTRACE
On Fri, Nov 18, 2022 at 12:09:02PM +0100, Peter Zijlstra wrote: > On Wed, Nov 16, 2022 at 10:16:34AM -0800, Linus Torvalds wrote: > > Following the history of it is a big of a mess, because there's a > > number of renamings and re-organizations, but it seems to go back to > > 2007 and commit b6a2fea39318 ("mm: variable length argument support"). > > I went back and read parts of the discussions with Ollie, and the > .force=1 thing just magically appeared one day when we were sending > work-in-progress patches back and forth without mention of where it came > from :-/ > > And I certainly can't remember now.. > > Looking at it now, I have the same reaction as both you and Kees had, it > seems entirely superflous. So I'm all for trying to remove it. Thanks for digging through the history! I've pushed the change to -next: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=for-next/execve=cd57e443831d8eeb083c7165bce195d886e216d4 -- Kees Cook
Re: [PATCH mm-unstable v1 20/20] mm: rename FOLL_FORCE to FOLL_PTRACE
On Thu, Nov 17, 2022 at 03:20:01PM -0800, Linus Torvalds wrote: > On Thu, Nov 17, 2022 at 2:58 PM Kees Cook wrote: > > > > Oh, er, why does get_arg_page() even need FOLL_FORCE? This is writing the > > new stack contents to the nascent brpm->vma, which was newly allocated > > with VM_STACK_FLAGS, which an arch can override, but they all appear to > > include > > VM_WRITE | VM_MAYWRITE. > > Yeah, it does seem entirely superfluous. > > It's been there since the very beginning (although in that original > commit b6a2fea39318 it was there as a '1' to the 'force' argument to > get_user_pages()). > > I *think* it can be just removed. But as long as it exists, it should > most definitely not be renamed to FOLL_PTRACE. > > There's a slight worry that it currently hides some other setup issue > that makes it matter, since it's been that way so long, but I can't > see what it is. My test system boots happily with it removed. I'll throw it into -next and see if anything melts... -- Kees Cook
Re: [PATCH mm-unstable v1 20/20] mm: rename FOLL_FORCE to FOLL_PTRACE
On Wed, Nov 16, 2022 at 10:16:34AM -0800, Linus Torvalds wrote: > There _are_ also small random cases too, like get_cmdline(). Maybe > that counts as ptrace, but the execve() case most definitely does not. Oh, er, why does get_arg_page() even need FOLL_FORCE? This is writing the new stack contents to the nascent brpm->vma, which was newly allocated with VM_STACK_FLAGS, which an arch can override, but they all appear to include VM_WRITE | VM_MAYWRITE. -- Kees Cook
Re: [PATCH v3 1/3] treewide: use get_random_u32_below() instead of deprecated function
On Thu, Nov 17, 2022 at 09:29:04PM +0100, Jason A. Donenfeld wrote: > This is a simple mechanical transformation done by: > > @@ > expression E; > @@ > - prandom_u32_max > + get_random_u32_below > (E) > > Reviewed-by: Greg Kroah-Hartman > Acked-by: Darrick J. Wong # for xfs > Reviewed-by: SeongJae Park # for damon > Reviewed-by: Jason Gunthorpe # for infiniband > Reviewed-by: Russell King (Oracle) # for arm > Acked-by: Ulf Hansson # for mmc > Signed-off-by: Jason A. Donenfeld Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v3 2/3] treewide: use get_random_u32_{above,below}() instead of manual loop
On Thu, Nov 17, 2022 at 09:29:05PM +0100, Jason A. Donenfeld wrote: > These cases were done with this Coccinelle: > > @@ > expression E; > identifier I; > @@ > - do { > ... when != I > - I = get_random_u32(); > ... when != I > - } while (I > E); > + I = get_random_u32_below(E + 1); > > @@ > expression E; > identifier I; > @@ > - do { > ... when != I > - I = get_random_u32(); > ... when != I > - } while (I >= E); > + I = get_random_u32_below(E); > > @@ > expression E; > identifier I; > @@ > - do { > ... when != I > - I = get_random_u32(); > ... when != I > - } while (I < E); > + I = get_random_u32_above(E - 1); > > @@ > expression E; > identifier I; > @@ > - do { > ... when != I > - I = get_random_u32(); > ... when != I > - } while (I <= E); > + I = get_random_u32_above(E); > > @@ > identifier I; > @@ > - do { > ... when != I > - I = get_random_u32(); > ... when != I > - } while (!I); > + I = get_random_u32_above(0); > > @@ > identifier I; > @@ > - do { > ... when != I > - I = get_random_u32(); > ... when != I > - } while (I == 0); > + I = get_random_u32_above(0); > > @@ > expression E; > @@ > - E + 1 + get_random_u32_below(U32_MAX - E) > + get_random_u32_above(E) > > Reviewed-by: Greg Kroah-Hartman > Signed-off-by: Jason A. Donenfeld Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v3 3/3] treewide: use get_random_u32_inclusive() when possible
On Thu, Nov 17, 2022 at 09:29:06PM +0100, Jason A. Donenfeld wrote: > These cases were done with this Coccinelle: > > @@ > expression H; > expression L; > @@ > - (get_random_u32_below(H) + L) > + get_random_u32_inclusive(L, H + L - 1) > > @@ > expression H; > expression L; > expression E; > @@ > get_random_u32_inclusive(L, > H > - + E > - - E > ) > > @@ > expression H; > expression L; > expression E; > @@ > get_random_u32_inclusive(L, > H > - - E > - + E > ) > > @@ > expression H; > expression L; > expression E; > expression F; > @@ > get_random_u32_inclusive(L, > H > - - E > + F > - + E > ) > > @@ > expression H; > expression L; > expression E; > expression F; > @@ > get_random_u32_inclusive(L, > H > - + E > + F > - - E > ) > > And then subsequently cleaned up by hand, with several automatic cases > rejected if it didn't make sense contextually. > > Reviewed-by: Greg Kroah-Hartman > Reviewed-by: Jason Gunthorpe # for infiniband > Signed-off-by: Jason A. Donenfeld > --- > arch/x86/kernel/module.c | 2 +- > crypto/rsa-pkcs1pad.c | 2 +- > crypto/testmgr.c | 10 > drivers/bus/mhi/host/internal.h | 2 +- > drivers/dma-buf/st-dma-fence-chain.c | 2 +- > drivers/infiniband/core/cma.c | 2 +- > drivers/infiniband/hw/hns/hns_roce_ah.c | 5 ++-- > drivers/mtd/nand/raw/nandsim.c| 2 +- > drivers/net/wireguard/selftest/allowedips.c | 8 +++--- > .../broadcom/brcm80211/brcmfmac/p2p.c | 2 +- > .../net/wireless/intel/iwlwifi/mvm/mac-ctxt.c | 2 +- > fs/f2fs/segment.c | 6 ++--- > kernel/kcsan/selftest.c | 2 +- > lib/test_hexdump.c| 10 > lib/test_printf.c | 2 +- > lib/test_vmalloc.c| 6 ++--- > mm/kasan/kasan_test.c | 6 ++--- > mm/kfence/kfence_test.c | 2 +- > mm/swapfile.c | 5 ++-- > net/bluetooth/mgmt.c | 5 ++-- > net/core/pktgen.c | 25 --- > net/ipv4/tcp_input.c | 2 +- > net/ipv6/addrconf.c | 6 ++--- > net/netfilter/nf_nat_helper.c | 2 +- > net/xfrm/xfrm_state.c | 2 +- > 25 files changed, 56 insertions(+), 64 deletions(-) Even the diffstat agrees this is a nice clean-up. :) Reviewed-by: Kees Cook The only comment I have is that maybe these cases can just be left as-is with _below()? > - size_t len = get_random_u32_below(rs) + gs; > + size_t len = get_random_u32_inclusive(gs, rs + gs - 1); It seems like writing it in the form of base plus [0, limit) is clearer? size_t len = gs + get_random_u32_below(rs); But there is only a handful, so *shrug* All the others are much cleaner rewritten as _inclusive(). -- Kees Cook
Re: [PATCH v2 3/3] treewide: use get_random_u32_between() when possible
On November 16, 2022 4:43:54 PM PST, "Jason A. Donenfeld" wrote: >On Wed, Nov 16, 2022 at 04:31:18PM -0800, Kees Cook wrote: >> On Thu, Nov 17, 2022 at 01:03:14AM +0100, Jason A. Donenfeld wrote: >> > On Thu, Nov 17, 2022 at 12:55:47AM +0100, Jason A. Donenfeld wrote: >> > > 2) What to call it: >> > >- between I still like, because it mirrors "I'm thinking of a number >> > > between 1 and 10 and..." that everybody knows, >> > >- inclusive I guess works, but it's not a preposition, >> > >- bikeshed color #3? >> > >> > - between >> > - ranged >> > - spanning >> > >> > https://www.thefreedictionary.com/List-of-prepositions.htm >> > - amid >> > >> > Sigh, names. >> >> I think "inclusive" is best. > >I find it not very descriptive of what the function does. Is there one >you like second best? Or are you convinced they're all much much much >worse than "inclusive" that they shouldn't be considered? Right, I don't think any are sufficiently descriptive. "Incluisve" with two arguments seems unambiguous and complete to me. :) -- Kees Cook
Re: [PATCH v2 3/3] treewide: use get_random_u32_between() when possible
On Thu, Nov 17, 2022 at 01:03:14AM +0100, Jason A. Donenfeld wrote: > On Thu, Nov 17, 2022 at 12:55:47AM +0100, Jason A. Donenfeld wrote: > > 2) What to call it: > >- between I still like, because it mirrors "I'm thinking of a number > > between 1 and 10 and..." that everybody knows, > >- inclusive I guess works, but it's not a preposition, > >- bikeshed color #3? > > - between > - ranged > - spanning > > https://www.thefreedictionary.com/List-of-prepositions.htm > - amid > > Sigh, names. I think "inclusive" is best. The other words still don't provide unambiguous language. It's the language used in formal math, e.g. sigma-notation, etc. It's an adjective for "get random" (verb, noun). -- Kees Cook
Re: [PATCH v2 3/3] treewide: use get_random_u32_between() when possible
On Mon, Nov 14, 2022 at 05:45:58PM +0100, Jason A. Donenfeld wrote: > - (get_random_u32_below(1024) + 1) * PAGE_SIZE; > + get_random_u32_between(1, 1024 + 1) * PAGE_SIZE; I really don't like "between". Can't this be named "inclusive" (and avoid adding 1 everywhere, which seems ugly), or at least named something less ambiguous? > - n = get_random_u32_below(100) + 1; > + n = get_random_u32_between(1, 101); Because I find this much less readable. "Below 100" is clear: 0-99 inclusive, plus 1, so 1-100 inclusive. "Between 1 and 101" is not obvious to me to mean: 1-100 inclusive. These seem so much nicer: get_random_u32_inclusive(1, 1024) get_random_u32_inclusive(1, 100) -- Kees Cook
Re: [PATCH v5 0/7] treewide cleanup of random integer usage
On Fri, Oct 07, 2022 at 11:53:52PM -0600, Jason A. Donenfeld wrote: > This is a five part treewide cleanup of random integer handling. The > rules for random integers are: Reviewing the delta between of my .cocci rules and your v5, everything matches, except for get_random_int() conversions for files not in your tree: diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c index 7a2b2d6bc3fe..62f69589a72d 100644 --- a/drivers/gpu/drm/tests/drm_buddy_test.c +++ b/drivers/gpu/drm/tests/drm_buddy_test.c @@ -729,7 +729,7 @@ static void drm_test_buddy_alloc_limit(struct kunit *test) static int drm_buddy_init_test(struct kunit *test) { while (!random_seed) - random_seed = get_random_int(); + random_seed = get_random_u32(); return 0; } diff --git a/drivers/gpu/drm/tests/drm_mm_test.c b/drivers/gpu/drm/tests/drm_mm_test.c index 659d1af4dca7..c4b66eeae203 100644 --- a/drivers/gpu/drm/tests/drm_mm_test.c +++ b/drivers/gpu/drm/tests/drm_mm_test.c @@ -2212,7 +2212,7 @@ static void drm_test_mm_color_evict_range(struct kunit *test) static int drm_mm_init_test(struct kunit *test) { while (!random_seed) - random_seed = get_random_int(); + random_seed = get_random_u32(); return 0; } So, I guess I mean to say that "prandom: remove unused functions" is going to cause some pain. :) Perhaps don't push that to -next, and do a final pass next merge window to catch any new stuff, and then send those updates and the removal before -rc1 closes? -- Kees Cook
Re: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible
[resending because I failed to CC] On October 7, 2022 7:21:28 PM PDT, "Jason A. Donenfeld" wrote: >On Fri, Oct 07, 2022 at 03:47:44PM -0700, Kees Cook wrote: >> On Fri, Oct 07, 2022 at 12:01:03PM -0600, Jason A. Donenfeld wrote: >> > Rather than incurring a division or requesting too many random bytes for >> > the given range, use the prandom_u32_max() function, which only takes >> > the minimum required bytes from the RNG and avoids divisions. >> >> I actually meant splitting the by-hand stuff by subsystem, but nearly >> all of these can be done mechanically too, so it shouldn't be bad. Notes >> below... > >Oh, cool, more coccinelle. You're basically giving me a class on these >recipes. Much appreciated. You're welcome! This was a fun exercise. :) > >> > [...] >> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c >> > index 92bcc1768f0b..87203429f802 100644 >> > --- a/arch/arm64/kernel/process.c >> > +++ b/arch/arm64/kernel/process.c >> > @@ -595,7 +595,7 @@ unsigned long __get_wchan(struct task_struct *p) >> > unsigned long arch_align_stack(unsigned long sp) >> > { >> >if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) >> > - sp -= get_random_int() & ~PAGE_MASK; >> > + sp -= prandom_u32_max(PAGE_SIZE); >> >return sp & ~0xf; >> > } >> > >> >> @mask@ >> expression MASK; >> @@ >> >> - (get_random_int() & ~(MASK)) >> + prandom_u32_max(MASK) > >Not quite! PAGE_MASK != PAGE_SIZE. In this case, things get a litle >more complicated where you can do: > >get_random_int() & MASK == prandom_u32_max(MASK + 1) >*only if all the top bits of MASK are set* That is, if MASK one less Oh whoops! Yes, right, I totally misread SIZE as MASK. >than a power of two. Or if MASK & (MASK + 1) == 0. > >(If those top bits aren't set, you can technically do >prandom_u32_max(MASK >> n + 1) << n. That'd be a nice thing to work out. >But yeesh, maybe a bit much for the time being and probably a bit beyond >coccinelle.) > >This case here, though, is a bit more special, where we can just rely on >an obvious given kernel identity. Namely, PAGE_MASK == ~(PAGE_SIZE - 1). >So ~PAGE_MASK == PAGE_SIZE - 1. >So get_random_int() & ~PAGE_MASK == prandom_u32_max(PAGE_SIZE - 1 + 1). >So get_random_int() & ~PAGE_MASK == prandom_u32_max(PAGE_SIZE). > >And most importantly, this makes the code more readable, since everybody >knows what bounding by PAGE_SIZE means, where as what on earth is >happening with the &~PAGE_MASK thing. So it's a good change. I'll try to >teach coccinelle about that special case. Yeah, it should be possible to just check for the literal. > > > >> > diff --git a/arch/loongarch/kernel/vdso.c b/arch/loongarch/kernel/vdso.c >> > index f32c38abd791..8c9826062652 100644 >> > --- a/arch/loongarch/kernel/vdso.c >> > +++ b/arch/loongarch/kernel/vdso.c >> > @@ -78,7 +78,7 @@ static unsigned long vdso_base(void) >> >unsigned long base = STACK_TOP; >> > >> >if (current->flags & PF_RANDOMIZE) { >> > - base += get_random_int() & (VDSO_RANDOMIZE_SIZE - 1); >> > + base += prandom_u32_max(VDSO_RANDOMIZE_SIZE); >> >base = PAGE_ALIGN(base); >> >} >> > >> >> @minus_one@ >> expression FULL; >> @@ >> >> - (get_random_int() & ((FULL) - 1) >> + prandom_u32_max(FULL) > >Ahh, well, okay, this is the example I mentioned above. Only works if >FULL is saturated. Any clever way to get coccinelle to prove that? Can >it look at the value of constants? I'm not sure if Cocci will do that without a lot of work. The literals trick I used below would need a lot of fanciness. :) > >> >> > diff --git a/arch/parisc/kernel/vdso.c b/arch/parisc/kernel/vdso.c >> > index 63dc44c4c246..47e5960a2f96 100644 >> > --- a/arch/parisc/kernel/vdso.c >> > +++ b/arch/parisc/kernel/vdso.c >> > @@ -75,7 +75,7 @@ int arch_setup_additional_pages(struct linux_binprm >> > *bprm, >> > >> >map_base = mm->mmap_base; >> >if (current->flags & PF_RANDOMIZE) >> > - map_base -= (get_random_int() & 0x1f) * PAGE_SIZE; >> > + map_base -= prandom_u32_max(0x20) * PAGE_SIZE; >> > >> >vdso_text_start = get_unmapped_area(NULL, map_base, vdso_text_len, 0, >> > 0); >> > >> >> These are more
Re: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible
(ngroups); > for (i = 0; i < ngroups; i++) { > group = (parent_group + i) % ngroups; > desc = ext2_get_group_desc (sb, group, NULL); Okay, that one is too much for me -- checking that group is never used after the assignment removal is likely possible, but beyond my cocci know-how. :) > diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c > index f73e5eb43eae..36d5bc595cc2 100644 > --- a/fs/ext4/ialloc.c > +++ b/fs/ext4/ialloc.c > @@ -463,10 +463,9 @@ static int find_group_orlov(struct super_block *sb, > struct inode *parent, > hinfo.hash_version = DX_HASH_HALF_MD4; > hinfo.seed = sbi->s_hash_seed; > ext4fs_dirhash(parent, qstr->name, qstr->len, ); > - grp = hinfo.hash; > + parent_group = hinfo.hash % ngroups; > } else > - grp = prandom_u32(); > - parent_group = (unsigned)grp % ngroups; > + parent_group = prandom_u32_max(ngroups); > for (i = 0; i < ngroups; i++) { > g = (parent_group + i) % ngroups; > get_orlov_stats(sb, g, flex_size, ); Much less easy mechanically. :) > diff --git a/lib/test_hexdump.c b/lib/test_hexdump.c > index 0927f44cd478..41a0321f641a 100644 > --- a/lib/test_hexdump.c > +++ b/lib/test_hexdump.c > @@ -208,7 +208,7 @@ static void __init test_hexdump_overflow(size_t buflen, > size_t len, > static void __init test_hexdump_overflow_set(size_t buflen, bool ascii) > { > unsigned int i = 0; > - int rs = (prandom_u32_max(2) + 1) * 16; > + int rs = prandom_u32_max(2) + 1 * 16; > > do { > int gs = 1 << i; This looks wrong. Cocci says: - int rs = (get_random_int() % 2 + 1) * 16; + int rs = (prandom_u32_max(2) + 1) * 16; > diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c > index 4f2f2d1bac56..56ffaa8dd3f6 100644 > --- a/lib/test_vmalloc.c > +++ b/lib/test_vmalloc.c > @@ -151,9 +151,7 @@ static int random_size_alloc_test(void) > int i; > > for (i = 0; i < test_loop_count; i++) { > - n = prandom_u32(); > - n = (n % 100) + 1; > - > + n = prandom_u32_max(n % 100) + 1; > p = vmalloc(n * PAGE_SIZE); > > if (!p) This looks wrong. Cocci says: - n = prandom_u32(); - n = (n % 100) + 1; + n = prandom_u32_max(100) + 1; > @@ -293,16 +291,12 @@ pcpu_alloc_test(void) > return -1; > > for (i = 0; i < 35000; i++) { > - unsigned int r; > - > - r = prandom_u32(); > - size = (r % (PAGE_SIZE / 4)) + 1; > + size = prandom_u32_max(PAGE_SIZE / 4) + 1; > > /* >* Maximum PAGE_SIZE >*/ > - r = prandom_u32(); > - align = 1 << ((r % 11) + 1); > + align = 1 << (prandom_u32_max(11) + 1); > > pcpu[i] = __alloc_percpu(size, align); > if (!pcpu[i]) > @@ -393,14 +387,11 @@ static struct test_driver { > > static void shuffle_array(int *arr, int n) > { > - unsigned int rnd; > int i, j; > > for (i = n - 1; i > 0; i--) { > - rnd = prandom_u32(); > - > /* Cut the range. */ > - j = rnd % i; > + j = prandom_u32_max(i); > > /* Swap indexes. */ > swap(arr[i], arr[j]); Yup, agrees with Cocci on these. -- Kees Cook
Re: [PATCH v3 3/5] treewide: use get_random_u32() when possible
On Fri, Oct 07, 2022 at 08:07:58AM -0600, Jason A. Donenfeld wrote: > On Fri, Oct 07, 2022 at 04:57:24AM +, Christophe Leroy wrote: > > > > > > Le 07/10/2022 à 01:36, Jason A. Donenfeld a écrit : > > > On 10/6/22, Christophe Leroy wrote: > > >> > > >> > > >> Le 06/10/2022 à 19:31, Christophe Leroy a écrit : > > >>> > > >>> > > >>> Le 06/10/2022 à 19:24, Jason A. Donenfeld a écrit : > > >>>> Hi Christophe, > > >>>> > > >>>> On Thu, Oct 6, 2022 at 11:21 AM Christophe Leroy > > >>>> wrote: > > >>>>> Le 06/10/2022 à 18:53, Jason A. Donenfeld a écrit : > > >>>>>> The prandom_u32() function has been a deprecated inline wrapper > > >>>>>> around > > >>>>>> get_random_u32() for several releases now, and compiles down to the > > >>>>>> exact same code. Replace the deprecated wrapper with a direct call to > > >>>>>> the real function. The same also applies to get_random_int(), which > > >>>>>> is > > >>>>>> just a wrapper around get_random_u32(). > > >>>>>> > > >>>>>> Reviewed-by: Kees Cook > > >>>>>> Acked-by: Toke Høiland-Jørgensen # for sch_cake > > >>>>>> Acked-by: Chuck Lever # for nfsd > > >>>>>> Reviewed-by: Jan Kara # for ext4 > > >>>>>> Signed-off-by: Jason A. Donenfeld > > >>>>>> --- > > >>>>> > > >>>>>> diff --git a/arch/powerpc/kernel/process.c > > >>>>>> b/arch/powerpc/kernel/process.c > > >>>>>> index 0fbda89cd1bb..9c4c15afbbe8 100644 > > >>>>>> --- a/arch/powerpc/kernel/process.c > > >>>>>> +++ b/arch/powerpc/kernel/process.c > > >>>>>> @@ -2308,6 +2308,6 @@ void notrace __ppc64_runlatch_off(void) > > >>>>>> unsigned long arch_align_stack(unsigned long sp) > > >>>>>> { > > >>>>>> if (!(current->personality & ADDR_NO_RANDOMIZE) && > > >>>>>> randomize_va_space) > > >>>>>> - sp -= get_random_int() & ~PAGE_MASK; > > >>>>>> + sp -= get_random_u32() & ~PAGE_MASK; > > >>>>>> return sp & ~0xf; > > >>>>> > > >>>>> Isn't that a candidate for prandom_u32_max() ? > > >>>>> > > >>>>> Note that sp is deemed to be 16 bytes aligned at all time. > > >>>> > > >>>> Yes, probably. It seemed non-trivial to think about, so I didn't. But > > >>>> let's see here... maybe it's not too bad: > > >>>> > > >>>> If PAGE_MASK is always ~(PAGE_SIZE-1), then ~PAGE_MASK is > > >>>> (PAGE_SIZE-1), so prandom_u32_max(PAGE_SIZE) should yield the same > > >>>> thing? Is that accurate? And holds across platforms (this comes up a > > >>>> few places)? If so, I'll do that for a v4. > > >>>> > > >>> > > >>> On powerpc it is always (from arch/powerpc/include/asm/page.h) : > > >>> > > >>> /* > > >>>* Subtle: (1 << PAGE_SHIFT) is an int, not an unsigned long. So if we > > >>>* assign PAGE_MASK to a larger type it gets extended the way we want > > >>>* (i.e. with 1s in the high bits) > > >>>*/ > > >>> #define PAGE_MASK (~((1 << PAGE_SHIFT) - 1)) > > >>> > > >>> #define PAGE_SIZE(1UL << PAGE_SHIFT) > > >>> > > >>> > > >>> So it would work I guess. > > >> > > >> But taking into account that sp must remain 16 bytes aligned, would it > > >> be better to do something like ? > > >> > > >> sp -= prandom_u32_max(PAGE_SIZE >> 4) << 4; > > >> > > >> return sp; > > > > > > Does this assume that sp is already aligned at the beginning of the > > > function? I'd assume from the function's name that this isn't the > > > case? > > > > Ah you are right, I overlooked it. > > So I think to stay on the safe side, I'm going to go with > `prandom_u32_max(PAGE_SIZE)`. Sound good? Given these kinds of less mechanical changes, it may make sense to split these from the "trivial" conversions in a treewide patch. The chance of needing a revert from the simple 1:1 conversions is much lower than the need to revert by-hand changes. The Cocci script I suggested in my v1 review gets 80% of the first patch, for example. -- Kees Cook
Re: [PATCH v1 0/5] treewide cleanup of random integer usage
On Wed, Oct 05, 2022 at 09:55:43PM -0700, Kees Cook wrote: > If any of the subsystems ask you to break this up (I hope not), I've got > this[1], which does a reasonable job of splitting a commit up into > separate commits for each matching subsystem. [1] https://github.com/kees/kernel-tools/blob/trunk/split-on-maintainer -- Kees Cook
Re: [PATCH v1 0/5] treewide cleanup of random integer usage
On Wed, Oct 05, 2022 at 11:48:39PM +0200, Jason A. Donenfeld wrote: > Hi folks, > > This is a five part treewide cleanup of random integer handling. The > rules for random integers are: > > - If you want a secure or an insecure random u64, use get_random_u64(). > - If you want a secure or an insecure random u32, use get_random_u32(). > * The old function prandom_u32() has been deprecated for a while now > and is just a wrapper around get_random_u32(). > - If you want a secure or an insecure random u16, use get_random_u16(). > - If you want a secure or an insecure random u8, use get_random_u8(). > - If you want secure or insecure random bytes, use get_random_bytes(). > * The old function prandom_bytes() has been deprecated for a while now > and has long been a wrapper around get_random_bytes(). > - If you want a non-uniform random u32, u16, or u8 bounded by a certain > open interval maximum, use prandom_u32_max(). > * I say "non-uniform", because it doesn't do any rejection sampling or > divisions. Hence, it stays within the prandom_* namespace. > > These rules ought to be applied uniformly, so that we can clean up the > deprecated functions, and earn the benefits of using the modern > functions. In particular, in addition to the boring substitutions, this > patchset accomplishes a few nice effects: > > - By using prandom_u32_max() with an upper-bound that the compiler can > prove at compile-time is ≤65536 or ≤256, internally get_random_u16() > or get_random_u8() is used, which wastes fewer batched random bytes, > and hence has higher throughput. > > - By using prandom_u32_max() instead of %, when the upper-bound is not a > constant, division is still avoided, because prandom_u32_max() uses > a faster multiplication-based trick instead. > > - By using get_random_u16() or get_random_u8() in cases where the return > value is intended to indeed be a u16 or a u8, we waste fewer batched > random bytes, and hence have higher throughput. > > So, based on those rules and benefits from following them, this patchset > breaks down into the following five steps: > > 1) Replace `prandom_u32() % max` and variants thereof with >prandom_u32_max(max). > > 2) Replace `(type)get_random_u32()` and variants thereof with >get_random_u16() or get_random_u8(). I took the pains to actually >look and see what every lvalue type was across the entire tree. > > 3) Replace remaining deprecated uses of prandom_u32() with >get_random_u32(). > > 4) Replace remaining deprecated uses of prandom_bytes() with >get_random_bytes(). > > 5) Remove the deprecated and now-unused prandom_u32() and >prandom_bytes() inline wrapper functions. > > I was thinking of taking this through my random.git tree (on which this > series is currently based) and submitting it near the end of the merge > window, or waiting for the very end of the 6.1 cycle when there will be > the fewest new patches brewing. If somebody with some treewide-cleanup > experience might share some wisdom about what the best timing usually > winds up being, I'm all ears. It'd be nice to capture some (all?) of the above somewhere. Perhaps just a massive comment in the header? > I've CC'd get_maintainers.pl, which is a pretty big list. Probably some > portion of those are going to bounce, too, and everytime you reply to > this thread, you'll have to deal with a bunch of bounces coming > immediately after. And a recipient list this big will probably dock my > email domain's spam reputation, at least temporarily. Sigh. I think > that's just how it goes with treewide cleanups though. Again, let me > know if I'm doing it wrong. I usually stick to just mailing lists and subsystem maintainers. If any of the subsystems ask you to break this up (I hope not), I've got this[1], which does a reasonable job of splitting a commit up into separate commits for each matching subsystem. Showing that a treewide change can be reproduced mechanically helps with keeping it together as one bit treewide patch, too, I've found. :) Thank you for the cleanup! The "u8 rnd = get_random_u32()" in the tree has bothered me for a lng time. -Kees -- Kees Cook
Re: [PATCH v1 4/5] treewide: use get_random_bytes when possible
On Wed, Oct 05, 2022 at 11:48:43PM +0200, Jason A. Donenfeld wrote: > The prandom_bytes() function has been a deprecated inline wrapper around > get_random_bytes() for several releases now, and compiles down to the > exact same code. Replace the deprecated wrapper with a direct call to > the real function. > > Signed-off-by: Jason A. Donenfeld Global search/replace matches. :) Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v1 4/5] treewide: use get_random_bytes when possible
On Wed, Oct 05, 2022 at 11:48:43PM +0200, Jason A. Donenfeld wrote: > The prandom_bytes() function has been a deprecated inline wrapper around > get_random_bytes() for several releases now, and compiles down to the > exact same code. Replace the deprecated wrapper with a direct call to > the real function. Global search/replace matches. :) Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v1 5/5] prandom: remove unused functions
On Wed, Oct 05, 2022 at 11:48:44PM +0200, Jason A. Donenfeld wrote: > With no callers left of prandom_u32() and prandom_bytes(), remove these > deprecated wrappers. > > Signed-off-by: Jason A. Donenfeld Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v1 2/5] treewide: use get_random_{u8,u16}() when possible
On Wed, Oct 05, 2022 at 11:48:41PM +0200, Jason A. Donenfeld wrote: > Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value, > simply use the get_random_{u8,u16}() functions, which are faster than > wasting the additional bytes from a 32-bit value. > > Signed-off-by: Jason A. Donenfeld Same question about "mechanism of transformation". > diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c > b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c > index ddfe9208529a..ac452a0111a9 100644 > --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c > +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c > @@ -1467,7 +1467,7 @@ static void make_established(struct sock *sk, u32 > snd_isn, unsigned int opt) > tp->write_seq = snd_isn; > tp->snd_nxt = snd_isn; > tp->snd_una = snd_isn; > - inet_sk(sk)->inet_id = prandom_u32(); > + inet_sk(sk)->inet_id = get_random_u16(); > assign_rxopt(sk, opt); > > if (tp->rcv_wnd > (RCV_BUFSIZ_M << 10)) This one I had to go look at -- inet_id is u16, so yeah. :) > diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c > index 56ffaa8dd3f6..0131ed2cd1bd 100644 > --- a/lib/test_vmalloc.c > +++ b/lib/test_vmalloc.c > @@ -80,7 +80,7 @@ static int random_size_align_alloc_test(void) > int i; > > for (i = 0; i < test_loop_count; i++) { > - rnd = prandom_u32(); > + rnd = get_random_u8(); > > /* >* Maximum 1024 pages, if PAGE_SIZE is 4096. This wasn't obvious either, but it looks like it's because it never consumes more than u8? > diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c > index 7981be526f26..57c7686ac485 100644 > --- a/net/netfilter/nf_nat_core.c > +++ b/net/netfilter/nf_nat_core.c > @@ -468,7 +468,7 @@ static void nf_nat_l4proto_unique_tuple(struct > nf_conntrack_tuple *tuple, > if (range->flags & NF_NAT_RANGE_PROTO_OFFSET) > off = (ntohs(*keyptr) - ntohs(range->base_proto.all)); > else > - off = prandom_u32(); > + off = get_random_u16(); > > attempts = range_size; Yup, u16 off; > diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c > index 2829455211f8..7eb70acb4d58 100644 > --- a/net/sched/sch_sfb.c > +++ b/net/sched/sch_sfb.c > @@ -379,7 +379,7 @@ static int sfb_enqueue(struct sk_buff *skb, struct Qdisc > *sch, > goto enqueue; > } > > - r = prandom_u32() & SFB_MAX_PROB; > + r = get_random_u16() & SFB_MAX_PROB; > > if (unlikely(r < p_min)) { > if (unlikely(p_min > SFB_MAX_PROB / 2)) { include/uapi/linux/pkt_sched.h:#define SFB_MAX_PROB 0x Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v1 1/5] treewide: use prandom_u32_max() when possible
On Wed, Oct 05, 2022 at 11:48:40PM +0200, Jason A. Donenfeld wrote: > Rather than incurring a division or requesting too many random bytes for > the given range, use the prandom_u32_max() function, which only takes > the minimum required bytes from the RNG and avoids divisions. Yes please! Since this is a treewide patch, it's helpful for (me at least) doing reviews to detail the mechanism of the transformation. e.g. I imagine this could be done with something like Coccinelle and @no_modulo@ expression E; @@ - (prandom_u32() % (E)) + prandom_u32_max(E) > diff --git a/drivers/mtd/ubi/debug.h b/drivers/mtd/ubi/debug.h > index 118248a5d7d4..4236c799a47c 100644 > --- a/drivers/mtd/ubi/debug.h > +++ b/drivers/mtd/ubi/debug.h > @@ -73,7 +73,7 @@ static inline int ubi_dbg_is_bgt_disabled(const struct > ubi_device *ubi) > static inline int ubi_dbg_is_bitflip(const struct ubi_device *ubi) > { > if (ubi->dbg.emulate_bitflips) > - return !(prandom_u32() % 200); > + return !(prandom_u32_max(200)); > return 0; > } > Because some looks automated (why the parens?) > @@ -393,14 +387,11 @@ static struct test_driver { > > static void shuffle_array(int *arr, int n) > { > - unsigned int rnd; > int i, j; > > for (i = n - 1; i > 0; i--) { > - rnd = prandom_u32(); > - > /* Cut the range. */ > - j = rnd % i; > + j = prandom_u32_max(i); > > /* Swap indexes. */ > swap(arr[i], arr[j]); And some by hand. :) Reviewed-by: Kees Cook -- Kees Cook
Re: [powerpc] Build failure include/linux/compiler_types.h __alloc_size__ (next-20220928)
On Thu, Sep 29, 2022 at 11:49:28AM +0530, Sachin Sant wrote: > Linux-next 6.0.0-rc7-next-20220928 fails to build on powerpc with > following error: > > make -j 17 -s && make modules_install && make install > In file included from : > ./include/linux/percpu.h: In function '__alloc_reserved_percpu': > ././include/linux/compiler_types.h:279:30: error: expected declaration > specifiers before '__alloc_size__' > #define __alloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__) __malloc > ^~ Apologies for the breakage! This should be fixed by: https://lore.kernel.org/lkml/20220929081642.1932200-1-keesc...@chromium.org -- Kees Cook
Re: [powerpc] memcpy warning drivers/scsi/scsi_transport_fc.c:581 (next-20220921)
memcpy(>event_data, data_buf, data_len); + memcpy(event->event_data_flex, data_buf, data_len); + padding = len - offsetof(typeof(*event), event_data_flex) - data_len; + memset(event->event_data_flex + data_len, 0, padding); nlmsg_multicast(scsi_nl_sock, skb, 0, SCSI_NL_GRP_FC_EVENTS, GFP_KERNEL); diff --git a/include/uapi/scsi/scsi_netlink_fc.h b/include/uapi/scsi/scsi_netlink_fc.h index 7535253f1a96..b46e9cbeb001 100644 --- a/include/uapi/scsi/scsi_netlink_fc.h +++ b/include/uapi/scsi/scsi_netlink_fc.h @@ -35,7 +35,7 @@ * FC Transport Broadcast Event Message : * FC_NL_ASYNC_EVENT * - * Note: if Vendor Unique message, _data will be start of + * Note: if Vendor Unique message, event_data_flex will be start of * vendor unique payload, and the length of the payload is * per event_datalen * @@ -50,7 +50,10 @@ struct fc_nl_event { __u16 event_datalen; __u32 event_num; __u32 event_code; - __u32 event_data; + union { + __u32 event_data; + __DECLARE_FLEX_ARRAY(__u8, event_data_flex); + }; } __attribute__((aligned(sizeof(__u64; -- Kees Cook
Re: [PATCH][next] powerpc/xmon: Fix -Wswitch-unreachable warning in bpt_cmds
On Fri, Sep 16, 2022 at 03:15:04PM +0100, Gustavo A. R. Silva wrote: > When building with automatic stack variable initialization, GCC 12 > complains about variables defined outside of switch case statements. > Move the variable into the case that uses it, which silences the warning: > > arch/powerpc/xmon/xmon.c: In function ‘bpt_cmds’: > arch/powerpc/xmon/xmon.c:1529:13: warning: statement will never be executed > [-Wswitch-unreachable] > 1529 | int mode; > | ^~~~ > > Fixes: 09b6c1129f89 ("powerpc/xmon: Fix compile error with PPC_8xx=y") > Signed-off-by: Gustavo A. R. Silva Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH][next] powerpc: Fix fall-through warning for Clang
On Tue, Sep 06, 2022 at 10:32:13PM +0100, Gustavo A. R. Silva wrote: > Fix the following fallthrough warning: > > arch/powerpc/platforms/85xx/mpc85xx_cds.c:161:3: warning: unannotated > fall-through between switch labels [-Wimplicit-fallthrough] > > Link: https://github.com/KSPP/linux/issues/198 > Reported-by: kernel test robot > Link: https://lore.kernel.org/lkml/202209061224.kxorrgvg-...@intel.com/ > Signed-off-by: Gustavo A. R. Silva Thanks! Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH] gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file
On Tue, 16 Aug 2022 15:17:20 +1000, Andrew Donnellan wrote: > Commit 36d4b36b6959 ("lib/nodemask: inline next_node_in() and > node_random()") refactored some code by moving node_random() from > lib/nodemask.c to include/linux/nodemask.h, thus requiring nodemask.h to > include random.h, which conditionally defines add_latent_entropy() > depending on whether the macro LATENT_ENTROPY_PLUGIN is defined. > > This broke the build on powerpc, where nodemask.h is indirectly included > in arch/powerpc/kernel/prom_init.c, part of the early boot machinery that > is excluded from the latent entropy plugin using > DISABLE_LATENT_ENTROPY_PLUGIN. It turns out that while we add a gcc flag > to disable the actual plugin, we don't undefine LATENT_ENTROPY_PLUGIN. > > [...] Applied to for-next/hardening, thanks! [1/1] gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file https://git.kernel.org/kees/c/2d08c71d2c79 -- Kees Cook
Re: [PATCH -next v3 2/2] powerpc: add support for syscall stack randomization
On Fri, Jul 01, 2022 at 04:24:35PM +0800, Xiu Jianfeng wrote: > Add support for adding a random offset to the stack while handling > syscalls. This patch uses mftb() instead of get_random_int() for better > performance. > > In order to avoid unconditional stack canaries on syscall entry (due to > the use of alloca()), also disable stack protector to avoid triggering > needless checks and slowing down the entry path. As there is no general > way to control stack protector coverage with a function attribute, this > must be disabled at the compilation unit level. > > Signed-off-by: Xiu Jianfeng Reviewed-by: Kees Cook -Kees -- Kees Cook
Re: [PATCH] macintosh:fix oob read in do_adb_query function
On Wed, Jul 13, 2022 at 11:37:34PM +0800, Ning Qiang wrote: > In do_adb_query function of drivers/macintosh/adb.c, req->data is copy > form userland. the parameter "req->data[2]" is Missing check, the > array size of adb_handler[] is 16, so "adb_handler[ > req->data[2]].original_address" and "adb_handler[ > req->data[2]].handler_id" will lead to oob read. > > Signed-off-by: Ning Qiang Thanks for catching this! Do you have a reproducer for this? I'd expect CONFIG_UBSAN_BOUNDS=y to notice this at runtime, at least. > --- > drivers/macintosh/adb.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/macintosh/adb.c b/drivers/macintosh/adb.c > index 439fab4eaa85..1bbb9ca08d40 100644 > --- a/drivers/macintosh/adb.c > +++ b/drivers/macintosh/adb.c > @@ -647,7 +647,7 @@ do_adb_query(struct adb_request *req) > > switch(req->data[1]) { > case ADB_QUERY_GETDEVINFO: > - if (req->nbytes < 3) > + if (req->nbytes < 3 || req->data[2] >= 16) I'd prefer this was: + if (req->nbytes < 3 || req->data[2] >= ARRAY_SIZE(adb_handler)) so it's tied to the actual variable (if its size ever changes). With that: Reviewed-by: Kees Cook -Kees > break; > mutex_lock(_handler_mutex); > req->reply[0] = adb_handler[req->data[2]].original_address; > -- > 2.25.1 > -- Kees Cook
Re: [PATCH v2] stack: Declare {randomize_,}kstack_offset to fix Sparse warnings
On Wed, 29 Jun 2022 14:04:23 +0800, GONG, Ruiqi wrote: > Fix the following Sparse warnings that got noticed when the PPC-dev > patchwork was checking another patch (see the link below): > > init/main.c:862:1: warning: symbol 'randomize_kstack_offset' was not > declared. Should it be static? > init/main.c:864:1: warning: symbol 'kstack_offset' was not declared. Should > it be static? > > Which in fact are triggered on all architectures that have > HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET support (for instances x86, arm64 > etc). > > [...] Applied to for-next/hardening, thanks! [1/1] stack: Declare {randomize_,}kstack_offset to fix Sparse warnings https://git.kernel.org/kees/c/375561bd6195 -- Kees Cook
Re: [PATCH] powerpc: Restore CONFIG_DEBUG_INFO in defconfigs
On Sat, Jun 11, 2022 at 08:51:57AM +0200, Christophe Leroy wrote: > Commit f9b3cd245784 ("Kconfig.debug: make DEBUG_INFO selectable from a > choice") broke the selection of CONFIG_DEBUG_INFO by powerpc defconfigs. > > It is now necessary to select one of the three DEBUG_INFO_DWARF* > options to get DEBUG_INFO enabled. > > Replace DEBUG_INFO=y by DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y in all > defconfigs using the following command: > > sed -i s/DEBUG_INFO=y/DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y/g `git grep -l > DEBUG_INFO arch/powerpc/configs/` > > Fixes: f9b3cd245784 ("Kconfig.debug: make DEBUG_INFO selectable from a > choice") > Cc: sta...@vger.kernel.org > Cc: Kees Cook > Signed-off-by: Christophe Leroy Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH 0/5] kallsyms: make kallsym APIs more safe with scnprintf
On Sat, May 21, 2022 at 11:07:52PM -0700, Christoph Hellwig wrote: > On Fri, May 20, 2022 at 02:06:56PM +0530, Maninder Singh wrote: > > kallsyms functionality depends on KSYM_NAME_LEN directly. > > but if user passed array length lesser than it, sprintf > > can cause issues of buffer overflow attack. > > > > So changing *sprint* and *lookup* APIs in this patch set > > to have buffer size as an argument and replacing sprintf with > > scnprintf. > > This is still a pretty horrible API. Passing something like > a struct seq_buf seems like the much better API here. Also with > the amount of arguments and by reference passing it might be worth > to pass them as a structure while you're at it. Yeah, I agree. It really seems like seq_buf would be nicer. -- Kees Cook
Re: [PATCH -next] powerpc: add support for syscall stack randomization
On Tue, May 10, 2022 at 07:23:46PM +1000, Nicholas Piggin wrote: > Excerpts from Xiu Jianfeng's message of May 5, 2022 9:19 pm: > > Add support for adding a random offset to the stack while handling > > syscalls. This patch uses mftb() instead of get_random_int() for better > > performance. > > Hey, very nice. Agreed! :) > > [...] > > @@ -82,6 +83,7 @@ notrace long system_call_exception(long r3, long r4, long > > r5, > > > > kuap_lock(); > > > > + add_random_kstack_offset(); > > regs->orig_gpr3 = r3; > > > > if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) > > This looks like the right place. I wonder why other interrupts don't > get the same treatment. Userspace can induce the kernel to take a > synchronous interrupt, or wait for async ones. Smaller surface area > maybe but certain instruction emulation for example could result in > significant logic that depends on user state. Anyway that's for > hardening gurus to ponder. I welcome it being used for any userspace controllable entry to the kernel! :) Also, related, have you validated the result using the LKDTM test? See tools/testing/selftests/lkdtm/stack-entropy.sh > > > @@ -405,6 +407,7 @@ interrupt_exit_user_prepare_main(unsigned long ret, > > struct pt_regs *regs) > > > > /* Restore user access locks last */ > > kuap_user_restore(regs); > > + choose_random_kstack_offset(mftb() & 0xFF); > > > > return ret; > > } > > So this seems to be what x86 and s390 do, but why are we choosing a > new offset for every interrupt when it's only used on a syscall? > I would rather you do what arm64 does and just choose the offset > at the end of system_call_exception. > > I wonder why the choose is separated from the add? I guess it's to > avoid a data dependency for stack access on an expensive random > function, so that makes sense (a comment would be nice in the > generic code). How does this read? I can send a "real" patch if it looks good: diff --git a/include/linux/randomize_kstack.h b/include/linux/randomize_kstack.h index 1468caf001c0..ad3e80275c74 100644 --- a/include/linux/randomize_kstack.h +++ b/include/linux/randomize_kstack.h @@ -40,8 +40,11 @@ DECLARE_PER_CPU(u32, kstack_offset); */ #define KSTACK_OFFSET_MAX(x) ((x) & 0x3FF) -/* - * These macros must be used during syscall entry when interrupts and +/** + * add_random_kstack_offset - Increase stack utilization by previously + * chosen random offset + * + * This should be used in the syscall entry path when interrupts and * preempt are disabled, and after user registers have been stored to * the stack. */ @@ -55,6 +58,24 @@ DECLARE_PER_CPU(u32, kstack_offset); } \ } while (0) +/** + * choose_random_kstack_offset - Choose the random offsset for the next + * add_random_kstack_offset() + * + * This should only be used during syscall exit when interrupts and + * preempt are disabled, and before user registers have been restored + * from the stack. This is done to frustrate attack attempts from + * userspace to learn the offset: + * - Maximize the timing uncertainty visible from userspace: if the + * the offset is chosen at syscall entry, userspace has much more + * control over the timing between chosen offsets. "How long will we + * be in kernel mode?" tends to be more difficult to know than "how + * long will be be in user mode?" + * - Reduce the lifetime of the new offset sitting in memory during + * kernel mode execution. Exposures of "thread-local" (e.g. current, + * percpu, etc) memory contents tends to be easier than arbitrary + * location memory exposures. + */ #define choose_random_kstack_offset(rand) do { \ if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \ _kstack_offset)) {\ -- Kees Cook
Re: [PATCH v6 00/23] Rust support
On Sat, May 07, 2022 at 07:23:58AM +0200, Miguel Ojeda wrote: > ## Patch series status > > The Rust support is still to be considered experimental. However, > support is good enough that kernel developers can start working on the > Rust abstractions for subsystems and write drivers and other modules. I'd really like to see this landed for a few reasons: - It's under active development, and I'd rather review the changes "normally", incrementally, etc. Right now it can be hard to re-review some of the "mostly the same each version" patches in the series. - I'd like to break the catch-22 of "ask for a new driver to be written in rust but the rust support isn't landed" vs "the rust support isn't landed because there aren't enough drivers". It really feels like "release early, release often" is needed here; it's hard to develop against -next. :) Should we give it a try for this coming merge window? -- Kees Cook
Re: [PATCH RFC 2/8] arm64: stacktrace: Add arch_within_stack_frames
On Mon, Apr 18, 2022 at 09:22:11PM +0800, He Zhe wrote: > This function checks if the given address range crosses frame boundary. > It is based on the existing x86 algorithm, but implemented via stacktrace. > This can be tested by USERCOPY_STACK_FRAME_FROM and > USERCOPY_STACK_FRAME_TO in lkdtm. Hi, Thanks for doing this implementation! One reason usercopy hardening didn't persue doing a "full" stacktrace was because it seemed relatively expensive. Did you do any usercopy-heavily workload testing to see if there was a noticeable performance impact? It would be nice to block the exposure of canaries and PAC bits, though, so I'm not opposed, but I'd like to get a better sense of how "heavy" this might be. Thanks! -Kees -- Kees Cook
Re: [RFC PATCH] lkdtm: Replace lkdtm_rodata_do_nothing() by do_nothing()
On Sun, Oct 17, 2021 at 07:19:47PM +0200, Christophe Leroy wrote: > But for EXEC_RODATA test, execute_location() uses > lkdtm_rodata_do_nothing() which is already in rodata section > at build time instead of using a copy of do_nothing(). However > it still uses the function descriptor of do_nothing(). There > is a risk that running lkdtm_rodata_do_nothing() with the > function descriptor of do_thing() is wrong. Wrong how? (Could there be two descriptors?) > To remove the above risk, change the approach and do the same > as for other EXEC tests: use a copy of do_nothing(). The copy > cannot be done during the test because RODATA area is write > protected. Do the copy during init, before RODATA becomes > write protected. Hmm, hmm. This is a nice way to handle it, but I'm not sure which "weird" way is better. I kind of prefer the code going through all the "regular" linking goo to end up in .rodata, but is it really any different from doing this via the ro_after_init section? It makes me nervous because they can technically be handled differently. For example, .rodata is mapped differently on some architectures compared to ro_after_init. Honestly, I actually this this patch should be modified to _add_ a new test for EXEC_RO_AFTER_INIT, and leave the existing .rodata one alone... -Kees -- Kees Cook
Re: [PATCH v5 3/8] powerpc/mm/ptdump: debugfs handler for W+X checks at runtime
*thread necromancy* Is this patch still something folks are working on? It'd be nice to be able to trigger this check at runtime. -Kees On Wed, Feb 26, 2020 at 05:35:46PM +1100, Russell Currey wrote: > Very rudimentary, just > > echo 1 > [debugfs]/check_wx_pages > > and check the kernel log. Useful for testing strict module RWX. > > Updated the Kconfig entry to reflect this. > > Also fixed a typo. > > Signed-off-by: Russell Currey > --- > arch/powerpc/Kconfig.debug | 6 -- > arch/powerpc/mm/ptdump/ptdump.c | 21 - > 2 files changed, 24 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug > index 0b063830eea8..e37960ef68c6 100644 > --- a/arch/powerpc/Kconfig.debug > +++ b/arch/powerpc/Kconfig.debug > @@ -370,7 +370,7 @@ config PPC_PTDUMP > If you are unsure, say N. > > config PPC_DEBUG_WX > - bool "Warn on W+X mappings at boot" > + bool "Warn on W+X mappings at boot & enable manual checks at runtime" > depends on PPC_PTDUMP && STRICT_KERNEL_RWX > help > Generate a warning if any W+X mappings are found at boot. > @@ -384,7 +384,9 @@ config PPC_DEBUG_WX > of other unfixed kernel bugs easier. > > There is no runtime or memory usage effect of this option > - once the kernel has booted up - it's a one time check. > + once the kernel has booted up, it only automatically checks once. > + > + Enables the "check_wx_pages" debugfs entry for checking at runtime. > > If in doubt, say "Y". > > diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c > index 206156255247..a15e19a3b14e 100644 > --- a/arch/powerpc/mm/ptdump/ptdump.c > +++ b/arch/powerpc/mm/ptdump/ptdump.c > @@ -4,7 +4,7 @@ > * > * This traverses the kernel pagetables and dumps the > * information about the used sections of memory to > - * /sys/kernel/debug/kernel_pagetables. > + * /sys/kernel/debug/kernel_page_tables. > * > * Derived from the arm64 implementation: > * Copyright (c) 2014, The Linux Foundation, Laura Abbott. > @@ -413,6 +413,25 @@ void ptdump_check_wx(void) > else > pr_info("Checked W+X mappings: passed, no W+X pages found\n"); > } > + > +static int check_wx_debugfs_set(void *data, u64 val) > +{ > + if (val != 1ULL) > + return -EINVAL; > + > + ptdump_check_wx(); > + > + return 0; > +} > + > +DEFINE_SIMPLE_ATTRIBUTE(check_wx_fops, NULL, check_wx_debugfs_set, "%llu\n"); > + > +static int ptdump_check_wx_init(void) > +{ > + return debugfs_create_file("check_wx_pages", 0200, NULL, > +NULL, _wx_fops) ? 0 : -ENOMEM; > +} > +device_initcall(ptdump_check_wx_init); > #endif > > static int ptdump_init(void) > -- > 2.25.1 > -- Kees Cook
Re: [PATCH v4 00/13] Fix LKDTM for PPC64/IA64/PARISC v4
On Wed, Mar 09, 2022 at 12:37:14PM +1100, Michael Ellerman wrote: > Michael Ellerman writes: > > On Tue, 15 Feb 2022 13:40:55 +0100, Christophe Leroy wrote: > >> PPC64/IA64/PARISC have function descriptors. LKDTM doesn't work > >> on those three architectures because LKDTM messes up function > >> descriptors with functions. > >> > >> This series does some cleanup in the three architectures and > >> refactors function descriptors so that it can then easily use it > >> in a generic way in LKDTM. > >> > >> [...] > > > > Applied to powerpc/next. > > I also have it in an rc2-based topic branch if there are any merge > conflicts that people want to resolve, I don't see any in linux-next at > the moment though. > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/log/?h=topic/func-desc-lkdtm Thanks! I've got some core changes coming for lkdtm, but I'm waiting until after the merge window to rebase them and get them into -next. -- Kees Cook
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Wed, Mar 02, 2022 at 12:18:45PM -0800, Linus Torvalds wrote: > On Wed, Mar 2, 2022 at 12:07 PM Kees Cook wrote: > > > > I've long wanted to change kfree() to explicitly set pointers to NULL on > > free. https://github.com/KSPP/linux/issues/87 > > We've had this discussion with the gcc people in the past, and gcc > actually has some support for it, but it's sadly tied to the actual > function name (ie gcc has some special-casing for "free()") > > See > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94527 > > for some of that discussion. > > Oh, and I see some patch actually got merged since I looked there last > so that you can mark "deallocator" functions, but I think it's only > for the context matching, not for actually killing accesses to the > pointer afterwards. Ah! I missed that getting added in GCC 11. But yes, there it is: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute Hah, now we may need to split __malloc from __alloc_size. ;) I'd still like the NULL assignment behavior, though, since some things can easily avoid static analysis. -- Kees Cook
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Wed, Mar 02, 2022 at 10:29:31AM +0100, Rasmus Villemoes wrote: > This won't help the current issue (because it doesn't exist and might > never), but just in case some compiler people are listening, I'd like to > have some sort of way to tell the compiler "treat this variable as > uninitialized from here on". So one could do > > #define kfree(p) do { __kfree(p); __magic_uninit(p); } while (0) > > with __magic_uninit being a magic no-op that doesn't affect the > semantics of the code, but could be used by the compiler's "[is/may be] > used uninitialized" machinery to flag e.g. double frees on some odd > error path etc. It would probably only work for local automatic > variables, but it should be possible to just ignore the hint if p is > some expression like foo->bar or has side effects. If we had that, the > end-of-loop test could include that to "uninitialize" the iterator. I've long wanted to change kfree() to explicitly set pointers to NULL on free. https://github.com/KSPP/linux/issues/87 The thing stopping a trivial transformation of kfree() is: kfree(get_some_pointer()); I would argue, though, that the above is poor form: the thing holding the pointer should be the thing freeing it, so these cases should be refactored and kfree() could do the NULLing by default. Quoting myself in the above issue: Without doing massive tree-wide changes, I think we need compiler support. If we had something like __builtin_is_lvalue(), we could distinguish function returns from lvalues. For example, right now a common case are things like: kfree(get_some_ptr()); But if we could at least gain coverage of the lvalue cases, and detect them statically at compile-time, we could do: #define __kfree_and_null(x) do { __kfree(*x); *x = NULL; } while (0) #define kfree(x) __builtin_choose_expr(__builtin_is_lvalue(x), __kfree_and_null(&(x)), __kfree(x)) Alternatively, we could do a tree-wide change of the former case (findable with Coccinelle) and change them into something like kfree_no_null() and redefine kfree() itself: #define kfree_no_null(x) do { void *__ptr = (x); __kfree(__ptr); } while (0) #define kfree(x) do { __kfree(x); x = NULL; } while (0) -- Kees Cook
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Tue, Mar 01, 2022 at 12:28:15PM +0100, Jakob Koschel wrote: > Based on the coccinelle script there are ~480 cases that need fixing > in total. I'll now finish all of them and then split them by > submodules as Greg suggested and repost a patch set per submodule. > Sounds good? To help with this splitting, see: https://github.com/kees/kernel-tools/blob/trunk/split-on-maintainer It's not perfect, but it'll get you really close. For example, if you had a single big tree-wide patch applied to your tree: $ rm 0*.patch $ git format-patch -1 HEAD $ mv 0*.patch treewide.patch $ split-on-maintainer treewide.patch $ ls 0*.patch If you have a build log before the patch that spits out warnings, the --build-log argument can extract those warnings on a per-file basis, too (though this can be fragile). -- Kees Cook
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
On Mon, Feb 28, 2022 at 04:45:11PM -0800, Linus Torvalds wrote: > Really. The "-Wshadow doesn't work on the kernel" is not some new > issue, because you have to do completely insane things to the source > code to enable it. The first big glitch with -Wshadow was with shadowed global variables. GCC 4.8 fixed that, but it still yells about shadowed functions. What _almost_ works is -Wshadow=local. At first glace, all the warnings look solvable, but then one will eventually discover __wait_event() and associated macros that mix when and how deeply it intentionally shadows variables. :) Another way to try to catch misused shadow variables is -Wunused-but-set-varible, but it, too, has tons of false positives. I tried to capture some of the rationale and research here: https://github.com/KSPP/linux/issues/152 -- Kees Cook
Re: [PATCH v2] usercopy: Check valid lifetime via stack depth
On Thu, Feb 24, 2022 at 08:58:20AM +, David Laight wrote: > From: Kees Cook > > Sent: 24 February 2022 06:04 > > > > Under CONFIG_HARDENED_USERCOPY=y, when exact stack frame boundary checking > > is not available (i.e. everything except x86 with FRAME_POINTER), check > > a stack object as being at least "current depth valid", in the sense > > that any object within the stack region but not between start-of-stack > > and current_stack_pointer should be considered unavailable (i.e. its > > lifetime is from a call no longer present on the stack). > > > ... > > diff --git a/mm/usercopy.c b/mm/usercopy.c > > index d0d268135d96..5d28725af95f 100644 > > --- a/mm/usercopy.c > > +++ b/mm/usercopy.c > > @@ -22,6 +22,30 @@ > > #include > > #include "slab.h" > > > > +/* > > + * Only called if obj is within stack/stackend bounds. Determine if within > > + * current stack depth. > > + */ > > +static inline int check_stack_object_depth(const void *obj, > > + unsigned long len) > > +{ > > +#ifdef CONFIG_ARCH_HAS_CURRENT_STACK_POINTER > > +#ifndef CONFIG_STACK_GROWSUP > > Pointless negation > > > + const void * const high = stackend; > > + const void * const low = (void *)current_stack_pointer; > > +#else > > + const void * const high = (void *)current_stack_pointer; > > + const void * const low = stack; > > +#endif > > + > > + /* Reject: object not within current stack depth. */ > > + if (obj < low || high < obj + len) > > + return BAD_STACK; > > + > > +#endif > > + return GOOD_STACK; > > +} > > If the comment at the top of the function is correct then > only a single test for the correct end of the buffer against > the current stack pointer is needed. > Something like: > #ifdef CONFIG_STACK_GROWSUP > if ((void *)current_stack_pointer < obj + len) > return BAD_STACK; > #else > if (obj < (void *)current_stack_pointer) > return BAD_STACK; > #endif > return GOOD_STACK; Oh, yeah, excellent point. I suspect the compiler would probably optimize it all away, but yes, this is, in fact, easier to read, and short enough I should probably just not bother with a separate function. Thanks! -Kees > > Although it may depend on exactly where the stack pointer > points to - especially for GROWSUP. > > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 > 1PT, UK > Registration No: 1397386 (Wales) > -- Kees Cook
[PATCH v2] usercopy: Check valid lifetime via stack depth
Under CONFIG_HARDENED_USERCOPY=y, when exact stack frame boundary checking is not available (i.e. everything except x86 with FRAME_POINTER), check a stack object as being at least "current depth valid", in the sense that any object within the stack region but not between start-of-stack and current_stack_pointer should be considered unavailable (i.e. its lifetime is from a call no longer present on the stack). Introduce ARCH_HAS_CURRENT_STACK_POINTER to track which architectures have actually implemented the common global register alias. Additionally report usercopy bounds checking failures with an offset from current_stack_pointer, which may assist with diagnosing failures. The LKDTM USERCOPY_STACK_FRAME_TO and USERCOPY_STACK_FRAME_FROM tests (once slightly adjusted in a separate patch) will pass again with this fixed. Cc: Matthew Wilcox (Oracle) Cc: Josh Poimboeuf Cc: Andrew Morton Cc: linux...@kvack.org Reported-by: Muhammad Usama Anjum Signed-off-by: Kees Cook --- v1: https://lore.kernel.org/all/20220216201449.2087956-1-keesc...@chromium.org/ v2: adjust for only some archs having current_stack_pointer --- arch/arm/Kconfig | 1 + arch/arm64/Kconfig | 1 + arch/powerpc/Kconfig | 1 + arch/s390/Kconfig| 1 + arch/sh/Kconfig | 1 + arch/x86/Kconfig | 1 + mm/usercopy.c| 41 ++--- 7 files changed, 44 insertions(+), 3 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 4c97cb40eebb..a7a09eef1852 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -5,6 +5,7 @@ config ARM select ARCH_32BIT_OFF_T select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE if HAVE_KRETPROBES && FRAME_POINTER && !ARM_UNWIND select ARCH_HAS_BINFMT_FLAT + select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_DEBUG_VIRTUAL if MMU select ARCH_HAS_DMA_WRITE_COMBINE if !ARM_DMA_MEM_BUFFERABLE select ARCH_HAS_ELF_RANDOMIZE diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f2b5a4abef21..b8ab790555c8 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -18,6 +18,7 @@ config ARM64 select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2 select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE select ARCH_HAS_CACHE_LINE_SIZE + select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_DEBUG_VIRTUAL select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DMA_PREP_COHERENT diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index b779603978e1..7e7387bd7d53 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -108,6 +108,7 @@ config PPC select ARCH_ENABLE_MEMORY_HOTPLUG select ARCH_ENABLE_MEMORY_HOTREMOVE select ARCH_HAS_COPY_MC if PPC64 + select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_DEBUG_VIRTUAL select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DEBUG_WXif STRICT_KERNEL_RWX diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index be9f39fd06df..4845ab549dd1 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -60,6 +60,7 @@ config S390 select ARCH_ENABLE_MEMORY_HOTPLUG if SPARSEMEM select ARCH_ENABLE_MEMORY_HOTREMOVE select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2 + select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DEBUG_WX select ARCH_HAS_DEVMEM_IS_ALLOWED diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig index 2474a04ceac4..1c2b53bf3093 100644 --- a/arch/sh/Kconfig +++ b/arch/sh/Kconfig @@ -7,6 +7,7 @@ config SUPERH select ARCH_HAVE_CUSTOM_GPIO_H select ARCH_HAVE_NMI_SAFE_CMPXCHG if (GUSA_RB || CPU_SH4A) select ARCH_HAS_BINFMT_FLAT if !MMU + select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_GIGANTIC_PAGE select ARCH_HAS_GCOV_PROFILE_ALL select ARCH_HAS_PTE_SPECIAL diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 9f5bd41bf660..90494fba3620 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -69,6 +69,7 @@ config X86 select ARCH_ENABLE_THP_MIGRATION if X86_64 && TRANSPARENT_HUGEPAGE select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI select ARCH_HAS_CACHE_LINE_SIZE + select ARCH_HAS_CURRENT_STACK_POINTER select ARCH_HAS_DEBUG_VIRTUAL select ARCH_HAS_DEBUG_VM_PGTABLEif !X86_PAE select ARCH_HAS_DEVMEM_IS_ALLOWED diff --git a/mm/usercopy.c b/mm/usercopy.c index d0d268135d96..5d28725af95f 100644 --- a/mm/usercopy.c +++ b/mm/usercopy.c @@ -22,6 +22,30 @@ #include #include "slab.h" +/* + * Only called if obj is within stack/stackend bounds. Determine if within + * current stack depth. + */ +static inline int check_stack_object_depth(const void *obj, + unsigned long len) +{ +#ifdef CONFIG_ARCH_HAS_CURRENT_STACK_POINTER +#ifnd
Re: [PATCH] powerpc/32: Clear volatile regs on syscall exit
On Wed, Feb 23, 2022 at 06:11:36PM +0100, Christophe Leroy wrote: > Commit a82adfd5c7cb ("hardening: Introduce CONFIG_ZERO_CALL_USED_REGS") > added zeroing of used registers at function exit. > > At the time being, PPC64 clears volatile registers on syscall exit but > PPC32 doesn't do it for performance reason. > > Add that clearing in PPC32 syscall exit as well, but only when > CONFIG_ZERO_CALL_USED_REGS is selected. > > On an 8xx, the null_syscall selftest gives: > - Without CONFIG_ZERO_CALL_USED_REGS : 288 cycles > - With CONFIG_ZERO_CALL_USED_REGS : 305 cycles > - With CONFIG_ZERO_CALL_USED_REGS + this patch: 319 cycles > > Note that (independent of this patch), with pmac32_defconfig, > vmlinux size is as follows with/without CONFIG_ZERO_CALL_USED_REGS: > >text databss dec hex filename > 9578869 2525210 194400 12298479bba8ef vmlinux.without > 10318045 2525210 194400 13037655c6f057 vmlinux.with > > That is a 7.7% increase on text size, 6.0% on overall size. > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/kernel/entry_32.S | 15 +++ > 1 file changed, 15 insertions(+) > > diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S > index 7748c278d13c..199f23092c02 100644 > --- a/arch/powerpc/kernel/entry_32.S > +++ b/arch/powerpc/kernel/entry_32.S > @@ -151,6 +151,21 @@ syscall_exit_finish: > bne 3f > mtcrr5 > > +#ifdef CONFIG_ZERO_CALL_USED_REGS > + /* Zero volatile regs that may contain sensitive kernel data */ > + li r0,0 > + li r4,0 > + li r5,0 > + li r6,0 > + li r7,0 > + li r8,0 > + li r9,0 > + li r10,0 > + li r11,0 > + li r12,0 > + mtctr r0 > + mtxer r0 > +#endif I think this should probably be unconditional -- if this is actually leaking kernel pointers (or data) that's pretty bad. :| If you really want to leave it build-time selectable, maybe add a new config that gets "select"ed by CONFIG_ZERO_CALL_USED_REGS? (And you may want to consider wiping all "unused" registers at syscall entry as well.) -Kees > 1: lwz r2,GPR2(r1) > lwz r1,GPR1(r1) > rfi > -- > 2.34.1 > -- Kees Cook
Re: [PATCH v4 00/13] Fix LKDTM for PPC64/IA64/PARISC v4
On Wed, Feb 16, 2022 at 11:22:33PM +1100, Michael Ellerman wrote: > Kees Cook writes: > > On Tue, Feb 15, 2022 at 01:40:55PM +0100, Christophe Leroy wrote: > >> PPC64/IA64/PARISC have function descriptors. LKDTM doesn't work > >> on those three architectures because LKDTM messes up function > >> descriptors with functions. > >> > >> This series does some cleanup in the three architectures and > >> refactors function descriptors so that it can then easily use it > >> in a generic way in LKDTM. > > > > Thanks for doing this! It looks good to me. :) > > How should we merge this series, it's a bit all over the map. > > I could put it in a topic branch? That's fine by me -- I had assumed it'd go via the ppc tree. But if you'd rather I take it as a topic branch I can do that too. -- Kees Cook
Re: [PATCH v4 00/13] Fix LKDTM for PPC64/IA64/PARISC v4
On Tue, Feb 15, 2022 at 01:40:55PM +0100, Christophe Leroy wrote: > PPC64/IA64/PARISC have function descriptors. LKDTM doesn't work > on those three architectures because LKDTM messes up function > descriptors with functions. > > This series does some cleanup in the three architectures and > refactors function descriptors so that it can then easily use it > in a generic way in LKDTM. Thanks for doing this! It looks good to me. :) -Kees > > Changes in v4: > - Added patch 1 which Fixes 'sparse' for powerpc64le after wrong report on > previous series, refer > https://github.com/ruscur/linux-ci/actions/runs/1351427671 > - Exported dereference_function_descriptor() to modules > - Addressed other received comments > - Rebased on latest powerpc/next (5a72345e6a78120368fcc841b570331b6c5a50da) > > Changes in v3: > - Addressed received comments > - Swapped some of the powerpc patches to keep func_descr_t renamed as struct > func_desc and remove 'struct ppc64_opd_entry' > - Changed HAVE_FUNCTION_DESCRIPTORS macro to a config item > CONFIG_HAVE_FUNCTION_DESCRIPTORS > - Dropped patch 11 ("Fix lkdtm_EXEC_RODATA()") > > Changes in v2: > - Addressed received comments > - Moved dereference_[kernel]_function_descriptor() out of line > - Added patches to remove func_descr_t and func_desc_t in powerpc > - Using func_desc_t instead of funct_descr_t > - Renamed HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR to HAVE_FUNCTION_DESCRIPTORS > - Added a new lkdtm test to check protection of function descriptors > > Christophe Leroy (13): > powerpc: Fix 'sparse' checking on PPC64le > powerpc: Move and rename func_descr_t > powerpc: Use 'struct func_desc' instead of 'struct ppc64_opd_entry' > powerpc: Remove 'struct ppc64_opd_entry' > powerpc: Prepare func_desc_t for refactorisation > ia64: Rename 'ip' to 'addr' in 'struct fdesc' > asm-generic: Define CONFIG_HAVE_FUNCTION_DESCRIPTORS > asm-generic: Define 'func_desc_t' to commonly describe function > descriptors > asm-generic: Refactor dereference_[kernel]_function_descriptor() > lkdtm: Force do_nothing() out of line > lkdtm: Really write into kernel text in WRITE_KERN > lkdtm: Fix execute_[user]_location() > lkdtm: Add a test for function descriptors protection > > arch/Kconfig | 3 + > arch/ia64/Kconfig| 1 + > arch/ia64/include/asm/elf.h | 2 +- > arch/ia64/include/asm/sections.h | 24 +--- > arch/ia64/kernel/module.c| 6 +- > arch/parisc/Kconfig | 1 + > arch/parisc/include/asm/sections.h | 16 ++ > arch/parisc/kernel/process.c | 21 --- > arch/powerpc/Kconfig | 1 + > arch/powerpc/Makefile| 2 +- > arch/powerpc/include/asm/code-patching.h | 2 +- > arch/powerpc/include/asm/elf.h | 6 ++ > arch/powerpc/include/asm/sections.h | 29 ++ > arch/powerpc/include/asm/types.h | 6 -- > arch/powerpc/include/uapi/asm/elf.h | 8 --- > arch/powerpc/kernel/module_64.c | 42 ++ > arch/powerpc/kernel/ptrace/ptrace.c | 6 ++ > arch/powerpc/kernel/signal_64.c | 8 +-- > drivers/misc/lkdtm/core.c| 1 + > drivers/misc/lkdtm/lkdtm.h | 1 + > drivers/misc/lkdtm/perms.c | 71 +++- > include/asm-generic/sections.h | 15 - > include/linux/kallsyms.h | 2 +- > kernel/extable.c | 24 +++- > tools/testing/selftests/lkdtm/tests.txt | 1 + > 25 files changed, 155 insertions(+), 144 deletions(-) > > -- > 2.34.1 > -- Kees Cook
Re: [PATCH v4 01/13] powerpc: Fix 'sparse' checking on PPC64le
On Tue, Feb 15, 2022 at 01:40:56PM +0100, Christophe Leroy wrote: > 'sparse' is architecture agnostic and knows nothing about ELF ABI > version. > > Just like it gets arch and powerpc type and endian from Makefile, > it also need to get _CALL_ELF from there, otherwise it won't set > PPC64_ELF_ABI_v2 macro for PPC64le and won't check the correct code. > > Signed-off-by: Christophe Leroy Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v3 12/12] lkdtm: Add a test for function descriptors protection
On Sun, Oct 17, 2021 at 02:38:25PM +0200, Christophe Leroy wrote: > Add WRITE_OPD to check that you can't modify function > descriptors. > > Gives the following result when function descriptors are > not protected: > > lkdtm: Performing direct entry WRITE_OPD > lkdtm: attempting bad 16 bytes write at c269b358 > lkdtm: FAIL: survived bad write > lkdtm: do_nothing was hijacked! > > Looks like a standard compiler barrier() is not enough to force > GCC to use the modified function descriptor. Had to add a fake empty > inline assembly to force GCC to reload the function descriptor. > > Signed-off-by: Christophe Leroy > --- > drivers/misc/lkdtm/core.c | 1 + > drivers/misc/lkdtm/lkdtm.h | 1 + > drivers/misc/lkdtm/perms.c | 22 ++ > 3 files changed, 24 insertions(+) > > diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c > index fe6fd34b8caf..de092aa03b5d 100644 > --- a/drivers/misc/lkdtm/core.c > +++ b/drivers/misc/lkdtm/core.c > @@ -148,6 +148,7 @@ static const struct crashtype crashtypes[] = { > CRASHTYPE(WRITE_RO), > CRASHTYPE(WRITE_RO_AFTER_INIT), > CRASHTYPE(WRITE_KERN), > + CRASHTYPE(WRITE_OPD), > CRASHTYPE(REFCOUNT_INC_OVERFLOW), > CRASHTYPE(REFCOUNT_ADD_OVERFLOW), > CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW), > diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h > index c212a253edde..188bd0fd6575 100644 > --- a/drivers/misc/lkdtm/lkdtm.h > +++ b/drivers/misc/lkdtm/lkdtm.h > @@ -105,6 +105,7 @@ void __init lkdtm_perms_init(void); > void lkdtm_WRITE_RO(void); > void lkdtm_WRITE_RO_AFTER_INIT(void); > void lkdtm_WRITE_KERN(void); > +void lkdtm_WRITE_OPD(void); > void lkdtm_EXEC_DATA(void); > void lkdtm_EXEC_STACK(void); > void lkdtm_EXEC_KMALLOC(void); > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index 1cf24c4a79e9..2c6aba3ff32b 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -44,6 +44,11 @@ static noinline void do_overwritten(void) > return; > } > > +static noinline void do_almost_nothing(void) > +{ > + pr_info("do_nothing was hijacked!\n"); > +} > + > static void *setup_function_descriptor(func_desc_t *fdesc, void *dst) > { > if (!have_function_descriptors()) > @@ -144,6 +149,23 @@ void lkdtm_WRITE_KERN(void) > do_overwritten(); > } > > +void lkdtm_WRITE_OPD(void) > +{ > + size_t size = sizeof(func_desc_t); > + void (*func)(void) = do_nothing; > + > + if (!have_function_descriptors()) { > + pr_info("XFAIL: Platform doesn't use function descriptors.\n"); > + return; > + } > + pr_info("attempting bad %zu bytes write at %px\n", size, do_nothing); > + memcpy(do_nothing, do_almost_nothing, size); > + pr_err("FAIL: survived bad write\n"); Non-function-descriptor architectures would successfully crash at the memcpy too, right? (i.e. for them this is just repeating WRITE_KERN) I'm pondering the utility of the XFAIL vs just letting is succeed, but I think it more accurate to say "hey, no OPD" as you have it. > + > + asm("" : "=m"(func)); > + func(); > +} > + > void lkdtm_EXEC_DATA(void) > { > execute_location(data_area, CODE_WRITE); > -- > 2.31.1 > One tiny suggestion, since I think you need to respin for the EXPORT_SYMBOL_GPL() anyway. Please update the selftests too: diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt index 6b36b7f5dcf9..243c781f0780 100644 --- a/tools/testing/selftests/lkdtm/tests.txt +++ b/tools/testing/selftests/lkdtm/tests.txt @@ -44,6 +44,7 @@ ACCESS_NULL WRITE_RO WRITE_RO_AFTER_INIT WRITE_KERN +WRITE_OPD REFCOUNT_INC_OVERFLOW REFCOUNT_ADD_OVERFLOW REFCOUNT_INC_NOT_ZERO_OVERFLOW (Though for the future I've been considering making the selftests an opt-out list so the "normal" stuff doesn't need to keep getting added there.) Thanks! Acked-by: Kees Cook -Kees -- Kees Cook
Re: [PATCH v3 11/12] lkdtm: Fix execute_[user]_location()
On Sun, Oct 17, 2021 at 02:38:24PM +0200, Christophe Leroy wrote: > execute_location() and execute_user_location() intent > to copy do_nothing() text and execute it at a new location. > However, at the time being it doesn't copy do_nothing() function > but do_nothing() function descriptor which still points to the > original text. So at the end it still executes do_nothing() at > its original location allthough using a copied function descriptor. > > So, fix that by really copying do_nothing() text and build a new > function descriptor by copying do_nothing() function descriptor and > updating the target address with the new location. > > Also fix the displayed addresses by dereferencing do_nothing() > function descriptor. > > Signed-off-by: Christophe Leroy This looks good. I might rename variables in the future (e.g. to avoid the churn from adding _text) but also, that does help keep it clear. :) Acked-by: Kees Cook -Kees > --- > drivers/misc/lkdtm/perms.c | 37 - > 1 file changed, 28 insertions(+), 9 deletions(-) > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index 035fcca441f0..1cf24c4a79e9 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -44,19 +44,34 @@ static noinline void do_overwritten(void) > return; > } > > +static void *setup_function_descriptor(func_desc_t *fdesc, void *dst) > +{ > + if (!have_function_descriptors()) > + return dst; > + > + memcpy(fdesc, do_nothing, sizeof(*fdesc)); > + fdesc->addr = (unsigned long)dst; > + barrier(); > + > + return fdesc; > +} > + > static noinline void execute_location(void *dst, bool write) > { > - void (*func)(void) = dst; > + void (*func)(void); > + func_desc_t fdesc; > + void *do_nothing_text = dereference_function_descriptor(do_nothing); > > - pr_info("attempting ok execution at %px\n", do_nothing); > + pr_info("attempting ok execution at %px\n", do_nothing_text); > do_nothing(); > > if (write == CODE_WRITE) { > - memcpy(dst, do_nothing, EXEC_SIZE); > + memcpy(dst, do_nothing_text, EXEC_SIZE); > flush_icache_range((unsigned long)dst, > (unsigned long)dst + EXEC_SIZE); > } > - pr_info("attempting bad execution at %px\n", func); > + pr_info("attempting bad execution at %px\n", dst); > + func = setup_function_descriptor(, dst); > func(); > pr_err("FAIL: func returned\n"); > } > @@ -66,16 +81,19 @@ static void execute_user_location(void *dst) > int copied; > > /* Intentionally crossing kernel/user memory boundary. */ > - void (*func)(void) = dst; > + void (*func)(void); > + func_desc_t fdesc; > + void *do_nothing_text = dereference_function_descriptor(do_nothing); > > - pr_info("attempting ok execution at %px\n", do_nothing); > + pr_info("attempting ok execution at %px\n", do_nothing_text); > do_nothing(); > > - copied = access_process_vm(current, (unsigned long)dst, do_nothing, > + copied = access_process_vm(current, (unsigned long)dst, do_nothing_text, > EXEC_SIZE, FOLL_WRITE); > if (copied < EXEC_SIZE) > return; > - pr_info("attempting bad execution at %px\n", func); > + pr_info("attempting bad execution at %px\n", dst); > + func = setup_function_descriptor(, dst); > func(); > pr_err("FAIL: func returned\n"); > } > @@ -153,7 +171,8 @@ void lkdtm_EXEC_VMALLOC(void) > > void lkdtm_EXEC_RODATA(void) > { > - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS); > + > execute_location(dereference_function_descriptor(lkdtm_rodata_do_nothing), > + CODE_AS_IS); > } > > void lkdtm_EXEC_USERSPACE(void) > -- > 2.31.1 > -- Kees Cook
Re: [PATCH v3 08/12] asm-generic: Refactor dereference_[kernel]_function_descriptor()
On Thu, Feb 10, 2022 at 09:30:43PM +1100, Michael Ellerman wrote: > Christophe Leroy writes: > > diff --git a/kernel/extable.c b/kernel/extable.c > > index b0ea5eb0c3b4..1ef13789bea9 100644 > > --- a/kernel/extable.c > > +++ b/kernel/extable.c > > @@ -159,12 +160,32 @@ int kernel_text_address(unsigned long addr) > > } > > > > /* > > - * On some architectures (PPC64, IA64) function pointers > > + * On some architectures (PPC64, IA64, PARISC) function pointers > > * are actually only tokens to some data that then holds the > > * real function address. As a result, to find if a function > > * pointer is part of the kernel text, we need to do some > > * special dereferencing first. > > */ > > +#ifdef CONFIG_HAVE_FUNCTION_DESCRIPTORS > > +void *dereference_function_descriptor(void *ptr) > > +{ > > + func_desc_t *desc = ptr; > > + void *p; > > + > > + if (!get_kernel_nofault(p, (void *)>addr)) > > + ptr = p; > > + return ptr; > > +} > > This needs an EXPORT_SYMBOL_GPL(), otherwise the build breaks after > patch 10 with CONFIG_LKDTM=m. Oh good catch! (There have been a few cases of LKDTM=m being the only thing needed a symbol, so I've pondered giving it a namespace or constructing a little ifdef wrapper... but this seems ok to export...) -- Kees Cook
Re: [PATCH v3 04/12] powerpc: Prepare func_desc_t for refactorisation
On Sun, Oct 17, 2021 at 02:38:17PM +0200, Christophe Leroy wrote: > In preparation of making func_desc_t generic, change the ELFv2 > version to a struct containing 'addr' element. > > This allows using single helpers common to ELFv1 and ELFv2. > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/kernel/module_64.c | 32 ++-- > 1 file changed, 14 insertions(+), 18 deletions(-) > > diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c > index a89da0ee25e2..b687ef88c4c4 100644 > --- a/arch/powerpc/kernel/module_64.c > +++ b/arch/powerpc/kernel/module_64.c > @@ -33,19 +33,13 @@ > #ifdef PPC64_ELF_ABI_v2 > > /* An address is simply the address of the function. */ > -typedef unsigned long func_desc_t; > +typedef struct { > + unsigned long addr; > +} func_desc_t; > > static func_desc_t func_desc(unsigned long addr) > { > - return addr; > -} > -static unsigned long func_addr(unsigned long addr) > -{ > - return addr; > -} > -static unsigned long stub_func_addr(func_desc_t func) > -{ > - return func; > + return (func_desc_t){addr}; There's only 1 element in the struct, so okay, but it hurt my eyes a little. I would have been happier with: return (func_desc_t){ .addr = addr; }; But of course that also looks bonkers because it starts with "return". So no matter what I do my eyes bug out. ;) So it's fine either way. :) Reviewed-by: Kees Cook > } > > /* PowerPC64 specific values for the Elf64_Sym st_other field. */ > @@ -70,14 +64,6 @@ static func_desc_t func_desc(unsigned long addr) > { > return *(struct func_desc *)addr; > } > -static unsigned long func_addr(unsigned long addr) > -{ > - return func_desc(addr).addr; > -} > -static unsigned long stub_func_addr(func_desc_t func) > -{ > - return func.addr; > -} > static unsigned int local_entry_offset(const Elf64_Sym *sym) > { > return 0; > @@ -93,6 +79,16 @@ void *dereference_module_function_descriptor(struct module > *mod, void *ptr) > } > #endif > > +static unsigned long func_addr(unsigned long addr) > +{ > + return func_desc(addr).addr; > +} > + > +static unsigned long stub_func_addr(func_desc_t func) > +{ > + return func.addr; > +} > + > #define STUB_MAGIC 0x73747562 /* stub */ > > /* Like PPC32, we need little trampolines to do > 24-bit jumps (into > -- > 2.31.1 > -- Kees Cook
Re: [PATCH v3 01/12] powerpc: Move and rename func_descr_t
On Sun, Oct 17, 2021 at 02:38:14PM +0200, Christophe Leroy wrote: > There are three architectures with function descriptors, try to > have common names for the address they contain in order to > refactor some functions into generic functions later. > > powerpc has 'entry' > ia64 has 'ip' > parisc has 'addr' > > Vote for 'addr' and update 'func_descr_t' accordingly. > > Move it in asm/elf.h to have it at the same place on all > three architectures, remove the typedef which hides its real > type, and change it to a smoother name 'struct func_desc'. > > Signed-off-by: Christophe Leroy I like the name. :) Reviewed-by: Kees Cook -- Kees Cook
Re: [powerpc] ftrace warning kernel/trace/ftrace.c:2068 with code-patching selftests
On Thu, Jan 27, 2022 at 11:46:49AM +, Mark Rutland wrote: > I'm not sure how x86 works here; AFAICT the relocations are performed during > decompression, but it looks like there's some special build-time processing > associated with that, and the vmlinux doesn't contain standard ELF > relocations. > > Kees, IIUC you added the x86_64 support there, can you shed any light on > if/how > this works on x86? I think Sven beat me to it, and this was answered in https://lore.kernel.org/lkml/yt9dy231gzae@linux.ibm.com but let me know if anything needs further info. An additional note is that x86 is built with "-2G addressing" (-mcmodel=kernel). There was some work done to make it actually PIE, which would allow the KASLR base to move further: https://github.com/KSPP/linux/issues/38 -Kees -- Kees Cook
Re: [PATCH v3 11/12] lkdtm: Fix execute_[user]_location()
On Wed, Jan 19, 2022 at 08:28:54PM +0100, Christophe Leroy wrote: > Hi Kees, > > > Le 17/12/2021 à 12:49, Christophe Leroy a écrit : > > Hi Kees, > > > > Le 17/10/2021 à 14:38, Christophe Leroy a écrit : > > > execute_location() and execute_user_location() intent > > > to copy do_nothing() text and execute it at a new location. > > > However, at the time being it doesn't copy do_nothing() function > > > but do_nothing() function descriptor which still points to the > > > original text. So at the end it still executes do_nothing() at > > > its original location allthough using a copied function descriptor. > > > > > > So, fix that by really copying do_nothing() text and build a new > > > function descriptor by copying do_nothing() function descriptor and > > > updating the target address with the new location. > > > > > > Also fix the displayed addresses by dereferencing do_nothing() > > > function descriptor. > > > > > > Signed-off-by: Christophe Leroy > > > > Do you have any comment to this patch and to patch 12 ? > > > > If not, is it ok to get your acked-by ? > > Any feedback please, even if it's to say no feedback ? Hi! Thanks for the ping; I haven't had time yet to look at this, but with -rc1 coming, I should be able to task-switch back to LKDTM for the dev cycle and I can give some feedback. -Kees > > Many thanks, > Christophe > > > > > Thanks > > Christophe > > > > > --- > > > drivers/misc/lkdtm/perms.c | 37 - > > > 1 file changed, 28 insertions(+), 9 deletions(-) > > > > > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > > > index 035fcca441f0..1cf24c4a79e9 100644 > > > --- a/drivers/misc/lkdtm/perms.c > > > +++ b/drivers/misc/lkdtm/perms.c > > > @@ -44,19 +44,34 @@ static noinline void do_overwritten(void) > > > return; > > > } > > > +static void *setup_function_descriptor(func_desc_t *fdesc, void *dst) > > > +{ > > > + if (!have_function_descriptors()) > > > + return dst; > > > + > > > + memcpy(fdesc, do_nothing, sizeof(*fdesc)); > > > + fdesc->addr = (unsigned long)dst; > > > + barrier(); > > > + > > > + return fdesc; > > > +} > > > + > > > static noinline void execute_location(void *dst, bool write) > > > { > > > - void (*func)(void) = dst; > > > + void (*func)(void); > > > + func_desc_t fdesc; > > > + void *do_nothing_text = dereference_function_descriptor(do_nothing); > > > - pr_info("attempting ok execution at %px\n", do_nothing); > > > + pr_info("attempting ok execution at %px\n", do_nothing_text); > > > do_nothing(); > > > if (write == CODE_WRITE) { > > > - memcpy(dst, do_nothing, EXEC_SIZE); > > > + memcpy(dst, do_nothing_text, EXEC_SIZE); > > > flush_icache_range((unsigned long)dst, > > > (unsigned long)dst + EXEC_SIZE); > > > } > > > - pr_info("attempting bad execution at %px\n", func); > > > + pr_info("attempting bad execution at %px\n", dst); > > > + func = setup_function_descriptor(, dst); > > > func(); > > > pr_err("FAIL: func returned\n"); > > > } > > > @@ -66,16 +81,19 @@ static void execute_user_location(void *dst) > > > int copied; > > > /* Intentionally crossing kernel/user memory boundary. */ > > > - void (*func)(void) = dst; > > > + void (*func)(void); > > > + func_desc_t fdesc; > > > + void *do_nothing_text = dereference_function_descriptor(do_nothing); > > > - pr_info("attempting ok execution at %px\n", do_nothing); > > > + pr_info("attempting ok execution at %px\n", do_nothing_text); > > > do_nothing(); > > > - copied = access_process_vm(current, (unsigned long)dst, do_nothing, > > > + copied = access_process_vm(current, (unsigned long)dst, > > > do_nothing_text, > > > EXEC_SIZE, FOLL_WRITE); > > > if (copied < EXEC_SIZE) > > > return; > > > - pr_info("attempting bad execution at %px\n", func); > > > + pr_info("attempting bad execution at %px\n", dst); > > > + func = setup_function_descriptor(, dst); > > > func(); > > > pr_err("FAIL: func returned\n"); > > > } > > > @@ -153,7 +171,8 @@ void lkdtm_EXEC_VMALLOC(void) > > > void lkdtm_EXEC_RODATA(void) > > > { > > > - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS); > > > + > > > execute_location(dereference_function_descriptor(lkdtm_rodata_do_nothing), > > > > > > + CODE_AS_IS); > > > } > > > void lkdtm_EXEC_USERSPACE(void) > > > -- Kees Cook
Re: [PATCH] powerpc/process, kasan: Silence KASAN warnings in __get_wchan()
On Tue, Jan 18, 2022 at 08:50:25PM -0500, He Ying wrote: > The following KASAN warning was reported in our kernel. > > BUG: KASAN: stack-out-of-bounds in get_wchan+0x188/0x250 > Read of size 4 at addr d216f958 by task ps/14437 > > CPU: 3 PID: 14437 Comm: ps Tainted: G O 5.10.0 #1 > Call Trace: > [daa63858] [c0654348] dump_stack+0x9c/0xe4 (unreliable) > [daa63888] [c035cf0c] print_address_description.constprop.3+0x8c/0x570 > [daa63908] [c035d6bc] kasan_report+0x1ac/0x218 > [daa63948] [c00496e8] get_wchan+0x188/0x250 > [daa63978] [c0461ec8] do_task_stat+0xce8/0xe60 > [daa63b98] [c0455ac8] proc_single_show+0x98/0x170 > [daa63bc8] [c03cab8c] seq_read_iter+0x1ec/0x900 > [daa63c38] [c03cb47c] seq_read+0x1dc/0x290 > [daa63d68] [c037fc94] vfs_read+0x164/0x510 > [daa63ea8] [c03808e4] ksys_read+0x144/0x1d0 > [daa63f38] [c005b1dc] ret_from_syscall+0x0/0x38 > --- interrupt: c00 at 0x8fa8f4 > LR = 0x8fa8cc > > The buggy address belongs to the page: > page:98ebcdd2 refcount:0 mapcount:0 mapping: index:0x2 pfn:0x1216f > flags: 0x0() > raw: 01010122 0002 > raw: > page dumped because: kasan: bad access detected > > Memory state around the buggy address: >d216f800: 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 >d216f880: f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >d216f900: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 > ^ >d216f980: f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 >d216fa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > After looking into this issue, I find the buggy address belongs > to the task stack region. It seems KASAN has something wrong. > I look into the code of __get_wchan in x86 architecture and > find the same issue has been resolved by the commit > f7d27c35ddff ("x86/mm, kasan: Silence KASAN warnings in get_wchan()"). > The solution could be applied to powerpc architecture too. > > As Andrey Ryabinin said, get_wchan() is racy by design, it may > access volatile stack of running task, thus it may access > redzone in a stack frame and cause KASAN to warn about this. > > Use READ_ONCE_NOCHECK() to silence these warnings. > > Signed-off-by: He Ying Looks reasonable to me; thanks! Reviewed-by: Kees Cook -Kees > --- > arch/powerpc/kernel/process.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c > index 984813a4d5dc..a75d20f23dac 100644 > --- a/arch/powerpc/kernel/process.c > +++ b/arch/powerpc/kernel/process.c > @@ -2160,12 +2160,12 @@ static unsigned long ___get_wchan(struct task_struct > *p) > return 0; > > do { > - sp = *(unsigned long *)sp; > + sp = READ_ONCE_NOCHECK(*(unsigned long *)sp); > if (!validate_sp(sp, p, STACK_FRAME_OVERHEAD) || > task_is_running(p)) > return 0; > if (count > 0) { > - ip = ((unsigned long *)sp)[STACK_FRAME_LR_SAVE]; > + ip = READ_ONCE_NOCHECK(((unsigned long > *)sp)[STACK_FRAME_LR_SAVE]); > if (!in_sched_functions(ip)) > return ip; > } > -- > 2.17.1 > -- Kees Cook
Re: [PATCH] powerpc/signal32: Use struct_group() to zero spe regs
On Wed, Nov 24, 2021 at 11:08:25AM +1100, Michael Ellerman wrote: > Kees Cook writes: > > On Mon, Nov 22, 2021 at 04:43:36PM +1100, Michael Ellerman wrote: > >> LEROY Christophe writes: > >> > Le 18/11/2021 à 21:36, Kees Cook a écrit : > >> >> In preparation for FORTIFY_SOURCE performing compile-time and run-time > >> >> field bounds checking for memset(), avoid intentionally writing across > >> >> neighboring fields. > >> >> > >> >> Add a struct_group() for the spe registers so that memset() can > >> >> correctly reason > >> >> about the size: > >> >> > >> >> In function 'fortify_memset_chk', > >> >> inlined from 'restore_user_regs.part.0' at > >> >> arch/powerpc/kernel/signal_32.c:539:3: > >> >> >> include/linux/fortify-string.h:195:4: error: call to > >> >> '__write_overflow_field' declared with attribute warning: detected > >> >> write beyond size of field (1st parameter); maybe use struct_group()? > >> >> [-Werror=attribute-warning] > >> >> 195 |__write_overflow_field(); > >> >> |^~~~ > >> >> > >> >> Reported-by: kernel test robot > >> >> Signed-off-by: Kees Cook > >> > > >> > Reviewed-by: Christophe Leroy > >> > >> Acked-by: Michael Ellerman > > > > Thanks! Should I take this via my tree, or do you want to take it via > > ppc? > > I don't mind. If it's easier for you to take it as part of an existing > series then do that, otherwise I can pick it up. Most of the other patches are going via other maintainers, so I'd love it if ppc could snag this one too. :) Thanks! -Kees -- Kees Cook
Re: [PATCH] powerpc/signal32: Use struct_group() to zero spe regs
On Mon, Nov 22, 2021 at 04:43:36PM +1100, Michael Ellerman wrote: > LEROY Christophe writes: > > Le 18/11/2021 à 21:36, Kees Cook a écrit : > >> In preparation for FORTIFY_SOURCE performing compile-time and run-time > >> field bounds checking for memset(), avoid intentionally writing across > >> neighboring fields. > >> > >> Add a struct_group() for the spe registers so that memset() can correctly > >> reason > >> about the size: > >> > >> In function 'fortify_memset_chk', > >> inlined from 'restore_user_regs.part.0' at > >> arch/powerpc/kernel/signal_32.c:539:3: > >> >> include/linux/fortify-string.h:195:4: error: call to > >> '__write_overflow_field' declared with attribute warning: detected write > >> beyond size of field (1st parameter); maybe use struct_group()? > >> [-Werror=attribute-warning] > >> 195 |__write_overflow_field(); > >> |^~~~ > >> > >> Reported-by: kernel test robot > >> Signed-off-by: Kees Cook > > > > Reviewed-by: Christophe Leroy > > Acked-by: Michael Ellerman Thanks! Should I take this via my tree, or do you want to take it via ppc? -Kees -- Kees Cook
Re: [PATCH] powerpc/signal32: Use struct_group() to zero spe regs
On Fri, Nov 19, 2021 at 05:35:00PM +0100, Christophe Leroy wrote: > Neither do I. I was just scared by what I saw while reviewing your patch. A > cleanup is probably required but it can be another patch. Heh, understood! For my end, my objective with the fortify work is to either split cross-member memcpy() calls (which is usually undesirable) or add a struct group so it can be seen as a "single member" memcpy again (and usually results in 0 differences in binary output). :) -- Kees Cook
Re: [PATCH] powerpc/signal32: Use struct_group() to zero spe regs
On Fri, Nov 19, 2021 at 08:46:27AM +, LEROY Christophe wrote: > > > Le 18/11/2021 à 21:36, Kees Cook a écrit : > > In preparation for FORTIFY_SOURCE performing compile-time and run-time > > field bounds checking for memset(), avoid intentionally writing across > > neighboring fields. > > > > Add a struct_group() for the spe registers so that memset() can correctly > > reason > > about the size: > > > > In function 'fortify_memset_chk', > > inlined from 'restore_user_regs.part.0' at > > arch/powerpc/kernel/signal_32.c:539:3: > > >> include/linux/fortify-string.h:195:4: error: call to > > '__write_overflow_field' declared with attribute warning: detected write > > beyond size of field (1st parameter); maybe use struct_group()? > > [-Werror=attribute-warning] > > 195 |__write_overflow_field(); > > |^~~~ > > > > Reported-by: kernel test robot > > Signed-off-by: Kees Cook > > Reviewed-by: Christophe Leroy > > However, is it really worth adding that grouping ? Wouldn't it be > cleaner to handle evr[] and acc separately ? Now that we are using > unsafe variants of get/put user performance wouldn't be impacted. I'm fine with whatever is desired here. I reworked an earlier version of this patch based on mpe's feedback, so I can certain rework it again. :) > > I have some doubts about things like: > > unsafe_copy_to_user(>mc_vregs, current->thread.evr, > ELF_NEVRREG * sizeof(u32), failed); > > Because as far as I can see, ELF_NEVRREG is 34 but mc_vregs is a table > of 33 u32 and is at the end of the structure: > > struct mcontext { > elf_gregset_t mc_gregs; > elf_fpregset_t mc_fregs; > unsigned long mc_pad[2]; > elf_vrregset_t mc_vregs __attribute__((__aligned__(16))); > }; > > typedef elf_vrreg_t elf_vrregset_t[ELF_NVRREG]; > > # define ELF_NEVRREG34 /* includes acc (as 2) */ > # define ELF_NVRREG 33 /* includes vscr */ I don't know these internals very well -- do you want me to change this specifically somehow? With the BUILD_BUG_ON()s added, there's no binary change here -- I wanted to make sure nothing was different in the output. -Kees > > > > > --- > > arch/powerpc/include/asm/processor.h | 6 -- > > arch/powerpc/kernel/signal_32.c | 14 +- > > 2 files changed, 13 insertions(+), 7 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/processor.h > > b/arch/powerpc/include/asm/processor.h > > index e39bd0ff69f3..978a80308466 100644 > > --- a/arch/powerpc/include/asm/processor.h > > +++ b/arch/powerpc/include/asm/processor.h > > @@ -191,8 +191,10 @@ struct thread_struct { > > int used_vsr; /* set if process has used VSX */ > > #endif /* CONFIG_VSX */ > > #ifdef CONFIG_SPE > > - unsigned long evr[32];/* upper 32-bits of SPE regs */ > > - u64 acc;/* Accumulator */ > > + struct_group(spe, > > + unsigned long evr[32];/* upper 32-bits of SPE regs */ > > + u64 acc;/* Accumulator */ > > + ); > > unsigned long spefscr;/* SPE & eFP status */ > > unsigned long spefscr_last; /* SPEFSCR value on last prctl > >call or trap return */ > > diff --git a/arch/powerpc/kernel/signal_32.c > > b/arch/powerpc/kernel/signal_32.c > > index 00a9c9cd6d42..5e1664b501e4 100644 > > --- a/arch/powerpc/kernel/signal_32.c > > +++ b/arch/powerpc/kernel/signal_32.c > > @@ -527,16 +527,20 @@ static long restore_user_regs(struct pt_regs *regs, > > regs_set_return_msr(regs, regs->msr & ~(MSR_FP | MSR_FE0 | MSR_FE1)); > > > > #ifdef CONFIG_SPE > > - /* force the process to reload the spe registers from > > - current->thread when it next does spe instructions */ > > + /* > > +* Force the process to reload the spe registers from > > +* current->thread when it next does spe instructions. > > +* Since this is user ABI, we must enforce the sizing. > > +*/ > > + BUILD_BUG_ON(sizeof(current->thread.spe) != ELF_NEVRREG * sizeof(u32)); > > regs_set_return_msr(regs, regs->msr & ~MSR_SPE); > > if (msr & MSR_SPE) { > > /* restore spe registers from the stack */ > > - unsafe_copy_from_user(current->thread.evr, >mc_vreg
[PATCH] powerpc/signal32: Use struct_group() to zero spe regs
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Add a struct_group() for the spe registers so that memset() can correctly reason about the size: In function 'fortify_memset_chk', inlined from 'restore_user_regs.part.0' at arch/powerpc/kernel/signal_32.c:539:3: >> include/linux/fortify-string.h:195:4: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 195 |__write_overflow_field(); |^~~~ Reported-by: kernel test robot Signed-off-by: Kees Cook --- arch/powerpc/include/asm/processor.h | 6 -- arch/powerpc/kernel/signal_32.c | 14 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index e39bd0ff69f3..978a80308466 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -191,8 +191,10 @@ struct thread_struct { int used_vsr; /* set if process has used VSX */ #endif /* CONFIG_VSX */ #ifdef CONFIG_SPE - unsigned long evr[32];/* upper 32-bits of SPE regs */ - u64 acc;/* Accumulator */ + struct_group(spe, + unsigned long evr[32];/* upper 32-bits of SPE regs */ + u64 acc;/* Accumulator */ + ); unsigned long spefscr;/* SPE & eFP status */ unsigned long spefscr_last; /* SPEFSCR value on last prctl call or trap return */ diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c index 00a9c9cd6d42..5e1664b501e4 100644 --- a/arch/powerpc/kernel/signal_32.c +++ b/arch/powerpc/kernel/signal_32.c @@ -527,16 +527,20 @@ static long restore_user_regs(struct pt_regs *regs, regs_set_return_msr(regs, regs->msr & ~(MSR_FP | MSR_FE0 | MSR_FE1)); #ifdef CONFIG_SPE - /* force the process to reload the spe registers from - current->thread when it next does spe instructions */ + /* +* Force the process to reload the spe registers from +* current->thread when it next does spe instructions. +* Since this is user ABI, we must enforce the sizing. +*/ + BUILD_BUG_ON(sizeof(current->thread.spe) != ELF_NEVRREG * sizeof(u32)); regs_set_return_msr(regs, regs->msr & ~MSR_SPE); if (msr & MSR_SPE) { /* restore spe registers from the stack */ - unsafe_copy_from_user(current->thread.evr, >mc_vregs, - ELF_NEVRREG * sizeof(u32), failed); + unsafe_copy_from_user(>thread.spe, >mc_vregs, + sizeof(current->thread.spe), failed); current->thread.used_spe = true; } else if (current->thread.used_spe) - memset(current->thread.evr, 0, ELF_NEVRREG * sizeof(u32)); + memset(>thread.spe, 0, sizeof(current->thread.spe)); /* Always get SPEFSCR back */ unsafe_get_user(current->thread.spefscr, (u32 __user *)>mc_vregs + ELF_NEVRREG, failed); -- 2.30.2
Re: Build regressions/improvements in v5.16-rc1
On Mon, Nov 15, 2021 at 05:44:33PM +0100, Marco Elver wrote: > On Mon, Nov 15, 2021 at 05:12PM +0100, Geert Uytterhoeven wrote: > [...] > > > + /kisskb/src/include/linux/fortify-string.h: error: call to > > > '__read_overflow' declared with attribute error: detected read beyond > > > size of object (1st parameter): => 263:25, 277:17 > > > > in lib/test_kasan.c > > > > s390-all{mod,yes}config > > arm64-allmodconfig (gcc11) > > Kees, wasn't that what [1] was meant to fix? > [1] https://lkml.kernel.org/r/20211006181544.1670992-1-keesc...@chromium.org Ah, I found it: http://kisskb.ellerman.id.au/kisskb/buildresult/14660585/log/ it's actually: inlined from 'kasan_memcmp' at /kisskb/src/lib/test_kasan.c:897:2: and inlined from 'kasan_memchr' at /kisskb/src/lib/test_kasan.c:872:2: I can send a patch doing the same as what [1] does for these cases too. -- Kees Cook
Re: Build regressions/improvements in v5.16-rc1
On Mon, Nov 15, 2021 at 05:44:33PM +0100, Marco Elver wrote: > On Mon, Nov 15, 2021 at 05:12PM +0100, Geert Uytterhoeven wrote: > [...] > > > + /kisskb/src/include/linux/fortify-string.h: error: call to > > > '__read_overflow' declared with attribute error: detected read beyond > > > size of object (1st parameter): => 263:25, 277:17 > > > > in lib/test_kasan.c > > > > s390-all{mod,yes}config > > arm64-allmodconfig (gcc11) > > Kees, wasn't that what [1] was meant to fix? > [1] https://lkml.kernel.org/r/20211006181544.1670992-1-keesc...@chromium.org [1] fixed the ones I found when scanning for __write_overflow(). [2] fixed some others, so it's possible there are yet more to fix? Taking a look at Linus's tree, though, the "263" and "277" lines don't line up correctly. I'll go see if I can reproduce this. Is this with W=1? -Kees [2] https://www.ozlabs.org/~akpm/mmotm/broken-out/kasan-test-consolidate-workarounds-for-unwanted-__alloc_size-protection.patch -- Kees Cook
Re: [PATCH 07/20] signal/powerpc: On swapcontext failure force SIGSEGV
On Wed, Oct 20, 2021 at 12:43:53PM -0500, Eric W. Biederman wrote: > If the register state may be partial and corrupted instead of calling > do_exit, call force_sigsegv(SIGSEGV). Which properly kills the > process with SIGSEGV and does not let any more userspace code execute, > instead of just killing one thread of the process and potentially > confusing everything. > > Cc: Michael Ellerman > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: linuxppc-dev@lists.ozlabs.org > History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git > Fixes: 756f1ae8a44e ("PPC32: Rework signal code and add a swapcontext system > call.") > Fixes: 04879b04bf50 ("[PATCH] ppc64: VMX (Altivec) support & signal32 rework, > from Ben Herrenschmidt") > Signed-off-by: "Eric W. Biederman" This looks right to me. Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v2 13/13] lkdtm: Add a test for function descriptors protection
On Thu, Oct 14, 2021 at 07:50:02AM +0200, Christophe Leroy wrote: > Add WRITE_OPD to check that you can't modify function > descriptors. > > Gives the following result when function descriptors are > not protected: > > lkdtm: Performing direct entry WRITE_OPD > lkdtm: attempting bad 16 bytes write at c269b358 > lkdtm: FAIL: survived bad write > lkdtm: do_nothing was hijacked! > > Looks like a standard compiler barrier(); is not enough to force > GCC to use the modified function descriptor. Add to add a fake empty > inline assembly to force GCC to reload the function descriptor. > > Signed-off-by: Christophe Leroy > --- > drivers/misc/lkdtm/core.c | 1 + > drivers/misc/lkdtm/lkdtm.h | 1 + > drivers/misc/lkdtm/perms.c | 22 ++ > 3 files changed, 24 insertions(+) > > diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c > index fe6fd34b8caf..de092aa03b5d 100644 > --- a/drivers/misc/lkdtm/core.c > +++ b/drivers/misc/lkdtm/core.c > @@ -148,6 +148,7 @@ static const struct crashtype crashtypes[] = { > CRASHTYPE(WRITE_RO), > CRASHTYPE(WRITE_RO_AFTER_INIT), > CRASHTYPE(WRITE_KERN), > + CRASHTYPE(WRITE_OPD), > CRASHTYPE(REFCOUNT_INC_OVERFLOW), > CRASHTYPE(REFCOUNT_ADD_OVERFLOW), > CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW), > diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h > index c212a253edde..188bd0fd6575 100644 > --- a/drivers/misc/lkdtm/lkdtm.h > +++ b/drivers/misc/lkdtm/lkdtm.h > @@ -105,6 +105,7 @@ void __init lkdtm_perms_init(void); > void lkdtm_WRITE_RO(void); > void lkdtm_WRITE_RO_AFTER_INIT(void); > void lkdtm_WRITE_KERN(void); > +void lkdtm_WRITE_OPD(void); > void lkdtm_EXEC_DATA(void); > void lkdtm_EXEC_STACK(void); > void lkdtm_EXEC_KMALLOC(void); > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index 96b3ebfcb8ed..3870bc82d40d 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -44,6 +44,11 @@ static noinline void do_overwritten(void) > return; > } > > +static noinline void do_almost_nothing(void) > +{ > + pr_info("do_nothing was hijacked!\n"); > +} > + > static void *setup_function_descriptor(func_desc_t *fdesc, void *dst) > { > memcpy(fdesc, do_nothing, sizeof(*fdesc)); > @@ -143,6 +148,23 @@ void lkdtm_WRITE_KERN(void) > do_overwritten(); > } > > +void lkdtm_WRITE_OPD(void) > +{ > + size_t size = sizeof(func_desc_t); > + void (*func)(void) = do_nothing; > + > + if (!have_function_descriptors()) { > + pr_info("Platform doesn't have function descriptors.\n"); This should be more explicit ('xfail'): pr_info("XFAIL: platform doesn't use function descriptors.\n"); > + return; > + } > + pr_info("attempting bad %zu bytes write at %px\n", size, do_nothing); > + memcpy(do_nothing, do_almost_nothing, size); > + pr_err("FAIL: survived bad write\n"); > + > + asm("" : "=m"(func)); Since this is a descriptor, I assume no icache flush is needed. Are function descriptors strictly dcache? (Is anything besides just a barrier needed?) > + func(); > +} > + > void lkdtm_EXEC_DATA(void) > { > execute_location(data_area, CODE_WRITE); > -- > 2.31.1 > -- Kees Cook
Re: [PATCH v2 11/13] lkdtm: Fix lkdtm_EXEC_RODATA()
On Thu, Oct 14, 2021 at 07:50:00AM +0200, Christophe Leroy wrote: > Behind its location, lkdtm_EXEC_RODATA() executes > lkdtm_rodata_do_nothing() which is a real function, > not a copy of do_nothing(). > > So executes it directly instead of using execute_location(). > > This is necessary because following patch will fix execute_location() > to use a copy of the function descriptor of do_nothing() and > function descriptor of lkdtm_rodata_do_nothing() might be different. > > And fix displayed addresses by dereferencing the function descriptors. > > Signed-off-by: Christophe Leroy I still don't understand this -- it doesn't look needed at all given the changes in patch 12. (i.e. everything is using dereference_function_descriptor() now) Can't this patch be dropped? -Kees > --- > drivers/misc/lkdtm/perms.c | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index 035fcca441f0..5266dc28df6e 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -153,7 +153,14 @@ void lkdtm_EXEC_VMALLOC(void) > > void lkdtm_EXEC_RODATA(void) > { > - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS); > + pr_info("attempting ok execution at %px\n", > + dereference_function_descriptor(do_nothing)); > + do_nothing(); > + > + pr_info("attempting bad execution at %px\n", > + dereference_function_descriptor(lkdtm_rodata_do_nothing)); > + lkdtm_rodata_do_nothing(); > + pr_err("FAIL: func returned\n"); > } > > void lkdtm_EXEC_USERSPACE(void) > -- > 2.31.1 > -- Kees Cook
Re: [PATCH v2 12/13] lkdtm: Fix execute_[user]_location()
On Thu, Oct 14, 2021 at 07:50:01AM +0200, Christophe Leroy wrote: > execute_location() and execute_user_location() intent > to copy do_nothing() text and execute it at a new location. > However, at the time being it doesn't copy do_nothing() function > but do_nothing() function descriptor which still points to the > original text. So at the end it still executes do_nothing() at > its original location allthough using a copied function descriptor. > > So, fix that by really copying do_nothing() text and build a new > function descriptor by copying do_nothing() function descriptor and > updating the target address with the new location. > > Also fix the displayed addresses by dereferencing do_nothing() > function descriptor. > > Signed-off-by: Christophe Leroy > --- > drivers/misc/lkdtm/perms.c | 25 + > include/asm-generic/sections.h | 5 + > 2 files changed, 26 insertions(+), 4 deletions(-) > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index 5266dc28df6e..96b3ebfcb8ed 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -44,19 +44,32 @@ static noinline void do_overwritten(void) > return; > } > > +static void *setup_function_descriptor(func_desc_t *fdesc, void *dst) > +{ > + memcpy(fdesc, do_nothing, sizeof(*fdesc)); > + fdesc->addr = (unsigned long)dst; > + barrier(); > + > + return fdesc; > +} How about collapsing the "have_function_descriptors()" check into setup_function_descriptor()? static void *setup_function_descriptor(func_desc_t *fdesc, void *dst) { if (__is_defined(HAVE_FUNCTION_DESCRIPTORS)) { memcpy(fdesc, do_nothing, sizeof(*fdesc)); fdesc->addr = (unsigned long)dst; barrier(); return fdesc; } else { return dst; } } > + > static noinline void execute_location(void *dst, bool write) > { > void (*func)(void) = dst; > + func_desc_t fdesc; > + void *do_nothing_text = dereference_function_descriptor(do_nothing); > > - pr_info("attempting ok execution at %px\n", do_nothing); > + pr_info("attempting ok execution at %px\n", do_nothing_text); > do_nothing(); > > if (write == CODE_WRITE) { > - memcpy(dst, do_nothing, EXEC_SIZE); > + memcpy(dst, do_nothing_text, EXEC_SIZE); > flush_icache_range((unsigned long)dst, > (unsigned long)dst + EXEC_SIZE); > } > pr_info("attempting bad execution at %px\n", func); > + if (have_function_descriptors()) > + func = setup_function_descriptor(, dst); > func(); > pr_err("FAIL: func returned\n"); > } > @@ -67,15 +80,19 @@ static void execute_user_location(void *dst) > > /* Intentionally crossing kernel/user memory boundary. */ > void (*func)(void) = dst; > + func_desc_t fdesc; > + void *do_nothing_text = dereference_function_descriptor(do_nothing); > > - pr_info("attempting ok execution at %px\n", do_nothing); > + pr_info("attempting ok execution at %px\n", do_nothing_text); > do_nothing(); > > - copied = access_process_vm(current, (unsigned long)dst, do_nothing, > + copied = access_process_vm(current, (unsigned long)dst, do_nothing_text, > EXEC_SIZE, FOLL_WRITE); > if (copied < EXEC_SIZE) > return; > pr_info("attempting bad execution at %px\n", func); > + if (have_function_descriptors()) > + func = setup_function_descriptor(, dst); > func(); > pr_err("FAIL: func returned\n"); > } > diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h > index 76163883c6ff..d225318538bd 100644 > --- a/include/asm-generic/sections.h > +++ b/include/asm-generic/sections.h > @@ -70,6 +70,11 @@ typedef struct { > } func_desc_t; > #endif > > +static inline bool have_function_descriptors(void) > +{ > + return __is_defined(HAVE_FUNCTION_DESCRIPTORS); > +} > + > /* random extra sections (if any). Override > * in asm/sections.h */ > #ifndef arch_is_kernel_text This hunk seems like it should live in a separate patch. -- Kees Cook
Re: [PATCH 2/2] kbuild: use more subdir- for visiting subdirectories while cleaning
On Wed, Oct 13, 2021 at 03:36:22PM +0900, Masahiro Yamada wrote: > Documentation/kbuild/makefiles.rst suggests to use "archclean" for > cleaning arch/$(SRCARCH)/boot/. > > Since commit d92cc4d51643 ("kbuild: require all architectures to have > arch/$(SRCARCH)/Kbuild"), we can use the "subdir- += boot" trick for > all architectures. This can take advantage of the parallel option (-j) > for "make clean". > > I also cleaned up the comments. The "archdep" target does not exist. > > Signed-off-by: Masahiro Yamada I like the clean-up! Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v1 05/10] asm-generic: Define 'funct_descr_t' to commonly describe function descriptors
On Wed, Oct 13, 2021 at 09:23:56AM +0200, Christophe Leroy wrote: > > > Le 13/10/2021 à 09:01, Kees Cook a écrit : > > On Mon, Oct 11, 2021 at 05:25:32PM +0200, Christophe Leroy wrote: > > > We have three architectures using function descriptors, each with its > > > own name. > > > > > > Add a common typedef that can be used in generic code. > > > > > > Also add a stub typedef for architecture without function descriptors, > > > > nit: funct_descr_t reads weird to me. why not func_desc_t ? Either way: > > func_desc_t already exists in powerpc. I have a patch to remove it as it is > redundant with struct ppc64_opd_entry, but I didnt' want to include it in > this series. > > But after all I can add it in this series, I'll add it in v2. Ah-ha! That works for me. :) Thanks! -Kees -- Kees Cook
Re: [PATCH v1 09/10] lkdtm: Fix lkdtm_EXEC_RODATA()
On Mon, Oct 11, 2021 at 05:25:36PM +0200, Christophe Leroy wrote: > Behind a location, lkdtm_EXEC_RODATA() executes a real function, > not a copy of do_nothing(). > > So do it directly instead of using execute_location(). > > And fix displayed addresses by dereferencing the function descriptors. > > Signed-off-by: Christophe Leroy > --- > drivers/misc/lkdtm/perms.c | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index 442d60ed25ef..da16564e1ecd 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -153,7 +153,14 @@ void lkdtm_EXEC_VMALLOC(void) > > void lkdtm_EXEC_RODATA(void) > { > - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS); > + pr_info("attempting ok execution at %px\n", > + dereference_symbol_descriptor(do_nothing)); > + do_nothing(); > + > + pr_info("attempting bad execution at %px\n", > + dereference_symbol_descriptor(lkdtm_rodata_do_nothing)); > + lkdtm_rodata_do_nothing(); > + pr_err("FAIL: func returned\n"); > } (In re-reading this more carefully, I see now why kallsyms.h is used earlier: _function_ vs _symbol_ descriptor.) In the next patch: static noinline void execute_location(void *dst, bool write) { ... func = setup_function_descriptor(, dst); if (IS_ERR(func)) return; pr_info("attempting bad execution at %px\n", dst); func(); pr_err("FAIL: func returned\n"); } What are the conditions for which dereference_symbol_descriptor works but dereference _function_descriptor doesn't? -- Kees Cook
Re: [PATCH v1 10/10] lkdtm: Fix execute_[user]_location()
On Mon, Oct 11, 2021 at 05:25:37PM +0200, Christophe Leroy wrote: > execute_location() and execute_user_location() intent > to copy do_nothing() text and execute it at a new location. > However, at the time being it doesn't copy do_nothing() function > but do_nothing() function descriptor which still points to the > original text. So at the end it still executes do_nothing() at > its original location allthough using a copied function descriptor. > > So, fix that by really copying do_nothing() text and build a new > function descriptor by copying do_nothing() function descriptor and > updating the target address with the new location. > > Also fix the displayed addresses by dereferencing do_nothing() > function descriptor. > > Signed-off-by: Christophe Leroy > --- > drivers/misc/lkdtm/perms.c | 45 +++--- > 1 file changed, 37 insertions(+), 8 deletions(-) > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index da16564e1ecd..1d03cd44c21d 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -44,19 +44,42 @@ static noinline void do_overwritten(void) > return; > } > > +static void *setup_function_descriptor(funct_descr_t *fdesc, void *dst) > +{ > + int err; > + > + if (!__is_defined(HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR)) > + return dst; > + > + err = copy_from_kernel_nofault(fdesc, do_nothing, sizeof(*fdesc)); > + if (err < 0) > + return ERR_PTR(err); > + > + fdesc->addr = (unsigned long)dst; > + barrier(); > + > + return fdesc; > +} > + > static noinline void execute_location(void *dst, bool write) > { > - void (*func)(void) = dst; > + void (*func)(void); > + funct_descr_t fdesc; > + void *do_nothing_text = dereference_symbol_descriptor(do_nothing); > > - pr_info("attempting ok execution at %px\n", do_nothing); > + pr_info("attempting ok execution at %px\n", do_nothing_text); > do_nothing(); > > if (write == CODE_WRITE) { > - memcpy(dst, do_nothing, EXEC_SIZE); > + memcpy(dst, do_nothing_text, EXEC_SIZE); > flush_icache_range((unsigned long)dst, > (unsigned long)dst + EXEC_SIZE); > } > - pr_info("attempting bad execution at %px\n", func); > + func = setup_function_descriptor(, dst); > + if (IS_ERR(func)) > + return; I think this error case should complain with some details. :) Maybe: pr_err("FAIL: could not build function descriptor for %px\n", dst); > + > + pr_info("attempting bad execution at %px\n", dst); And then leave this pr_info() as it was, before the setup_function_descriptor() call. > func(); > pr_err("FAIL: func returned\n"); > } > @@ -66,16 +89,22 @@ static void execute_user_location(void *dst) > int copied; > > /* Intentionally crossing kernel/user memory boundary. */ > - void (*func)(void) = dst; > + void (*func)(void); > + funct_descr_t fdesc; > + void *do_nothing_text = dereference_symbol_descriptor(do_nothing); > > - pr_info("attempting ok execution at %px\n", do_nothing); > + pr_info("attempting ok execution at %px\n", do_nothing_text); > do_nothing(); > > - copied = access_process_vm(current, (unsigned long)dst, do_nothing, > + copied = access_process_vm(current, (unsigned long)dst, do_nothing_text, > EXEC_SIZE, FOLL_WRITE); > if (copied < EXEC_SIZE) > return; > - pr_info("attempting bad execution at %px\n", func); > + func = setup_function_descriptor(, dst); > + if (IS_ERR(func)) > + return; > + > + pr_info("attempting bad execution at %px\n", dst); Same here. > func(); > pr_err("FAIL: func returned\n"); > } > -- > 2.31.1 > -- Kees Cook
Re: [PATCH v1 09/10] lkdtm: Fix lkdtm_EXEC_RODATA()
On Mon, Oct 11, 2021 at 05:25:36PM +0200, Christophe Leroy wrote: > Behind a location, lkdtm_EXEC_RODATA() executes a real function, > not a copy of do_nothing(). > > So do it directly instead of using execute_location(). I don't understand this. Why does the next patch not fix this? -Kees > > And fix displayed addresses by dereferencing the function descriptors. > > Signed-off-by: Christophe Leroy > --- > drivers/misc/lkdtm/perms.c | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index 442d60ed25ef..da16564e1ecd 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -153,7 +153,14 @@ void lkdtm_EXEC_VMALLOC(void) > > void lkdtm_EXEC_RODATA(void) > { > - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS); > + pr_info("attempting ok execution at %px\n", > + dereference_symbol_descriptor(do_nothing)); > + do_nothing(); > + > + pr_info("attempting bad execution at %px\n", > + dereference_symbol_descriptor(lkdtm_rodata_do_nothing)); > + lkdtm_rodata_do_nothing(); > + pr_err("FAIL: func returned\n"); > } > > void lkdtm_EXEC_USERSPACE(void) > -- > 2.31.1 > -- Kees Cook
Re: [PATCH v1 08/10] lkdtm: Really write into kernel text in WRITE_KERN
On Mon, Oct 11, 2021 at 05:25:35PM +0200, Christophe Leroy wrote: > WRITE_KERN is supposed to overwrite some kernel text, namely > do_overwritten() function. > > But at the time being it overwrites do_overwritten() function > descriptor, not function text. > > Fix it by dereferencing the function descriptor to obtain > function text pointer. > > And make do_overwritten() noinline so that it is really > do_overwritten() which is called by lkdtm_WRITE_KERN(). > > Signed-off-by: Christophe Leroy > --- > drivers/misc/lkdtm/perms.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c > index 60b3b2fe929d..442d60ed25ef 100644 > --- a/drivers/misc/lkdtm/perms.c > +++ b/drivers/misc/lkdtm/perms.c > @@ -5,6 +5,7 @@ > * even non-readable regions. > */ > #include "lkdtm.h" > +#include Why not #include instead here? > #include > #include > #include > @@ -37,7 +38,7 @@ static noinline void do_nothing(void) > } > > /* Must immediately follow do_nothing for size calculuations to work out. */ > -static void do_overwritten(void) > +static noinline void do_overwritten(void) > { > pr_info("do_overwritten wasn't overwritten!\n"); > return; > @@ -113,8 +114,9 @@ void lkdtm_WRITE_KERN(void) > size_t size; > volatile unsigned char *ptr; > > - size = (unsigned long)do_overwritten - (unsigned long)do_nothing; > - ptr = (unsigned char *)do_overwritten; > + size = (unsigned long)dereference_symbol_descriptor(do_overwritten) - > +(unsigned long)dereference_symbol_descriptor(do_nothing); > + ptr = dereference_symbol_descriptor(do_overwritten); But otherwise, yup, I expect there will be a bunch of things like this to clean up in LKDTM. :| Sorry about that! Acked-by: Kees Cook > > pr_info("attempting bad %zu byte write at %px\n", size, ptr); > memcpy((void *)ptr, (unsigned char *)do_nothing, size); > -- > 2.31.1 > -- Kees Cook
Re: [PATCH v1 07/10] lkdtm: Force do_nothing() out of line
On Mon, Oct 11, 2021 at 05:25:34PM +0200, Christophe Leroy wrote: > LKDTM tests display that the run do_nothing() at a given > address, but in reality do_nothing() is inlined into the > caller. > > Force it out of line so that it really runs text at the > displayed address. > > Signed-off-by: Christophe Leroy Acked-by: Kees Cook -- Kees Cook
Re: [PATCH v1 06/10] asm-generic: Refactor dereference_[kernel]_function_descriptor()
On Mon, Oct 11, 2021 at 05:25:33PM +0200, Christophe Leroy wrote: > dereference_function_descriptor() and > dereference_kernel_function_descriptor() are identical on the > three architectures implementing them. > > Make it common. > > Signed-off-by: Christophe Leroy > --- > arch/ia64/include/asm/sections.h| 19 --- > arch/parisc/include/asm/sections.h | 9 - > arch/parisc/kernel/process.c| 21 - > arch/powerpc/include/asm/sections.h | 23 --- > include/asm-generic/sections.h | 18 ++ > 5 files changed, 18 insertions(+), 72 deletions(-) A diffstat to love. :) Reviewed-by: Kees Cook > > diff --git a/arch/ia64/include/asm/sections.h > b/arch/ia64/include/asm/sections.h > index 929b5c535620..d9addaea8339 100644 > --- a/arch/ia64/include/asm/sections.h > +++ b/arch/ia64/include/asm/sections.h > @@ -30,23 +30,4 @@ extern char __start_gate_brl_fsys_bubble_down_patchlist[], > __end_gate_brl_fsys_b > extern char __start_unwind[], __end_unwind[]; > extern char __start_ivt_text[], __end_ivt_text[]; > > -#undef dereference_function_descriptor > -static inline void *dereference_function_descriptor(void *ptr) > -{ > - struct fdesc *desc = ptr; > - void *p; > - > - if (!get_kernel_nofault(p, (void *)>addr)) > - ptr = p; > - return ptr; > -} > - > -#undef dereference_kernel_function_descriptor > -static inline void *dereference_kernel_function_descriptor(void *ptr) > -{ > - if (ptr < (void *)__start_opd || ptr >= (void *)__end_opd) > - return ptr; > - return dereference_function_descriptor(ptr); > -} > - > #endif /* _ASM_IA64_SECTIONS_H */ > diff --git a/arch/parisc/include/asm/sections.h > b/arch/parisc/include/asm/sections.h > index 329e80f7af0a..b041fb32a300 100644 > --- a/arch/parisc/include/asm/sections.h > +++ b/arch/parisc/include/asm/sections.h > @@ -12,13 +12,4 @@ typedef Elf64_Fdesc funct_descr_t; > > extern char __alt_instructions[], __alt_instructions_end[]; > > -#ifdef CONFIG_64BIT > - > -#undef dereference_function_descriptor > -void *dereference_function_descriptor(void *); > - > -#undef dereference_kernel_function_descriptor > -void *dereference_kernel_function_descriptor(void *); > -#endif > - > #endif > diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c > index 38ec4ae81239..7382576b52a8 100644 > --- a/arch/parisc/kernel/process.c > +++ b/arch/parisc/kernel/process.c > @@ -266,27 +266,6 @@ get_wchan(struct task_struct *p) > return 0; > } > > -#ifdef CONFIG_64BIT > -void *dereference_function_descriptor(void *ptr) > -{ > - Elf64_Fdesc *desc = ptr; > - void *p; > - > - if (!get_kernel_nofault(p, (void *)>addr)) > - ptr = p; > - return ptr; > -} > - > -void *dereference_kernel_function_descriptor(void *ptr) > -{ > - if (ptr < (void *)__start_opd || > - ptr >= (void *)__end_opd) > - return ptr; > - > - return dereference_function_descriptor(ptr); > -} > -#endif > - > static inline unsigned long brk_rnd(void) > { > return (get_random_int() & BRK_RND_MASK) << PAGE_SHIFT; > diff --git a/arch/powerpc/include/asm/sections.h > b/arch/powerpc/include/asm/sections.h > index d0d5287fa568..8f8e95f234e2 100644 > --- a/arch/powerpc/include/asm/sections.h > +++ b/arch/powerpc/include/asm/sections.h > @@ -72,29 +72,6 @@ static inline int overlaps_kernel_text(unsigned long > start, unsigned long end) > (unsigned long)_stext < end; > } > > -#ifdef PPC64_ELF_ABI_v1 > - > -#undef dereference_function_descriptor > -static inline void *dereference_function_descriptor(void *ptr) > -{ > - struct ppc64_opd_entry *desc = ptr; > - void *p; > - > - if (!get_kernel_nofault(p, (void *)>addr)) > - ptr = p; > - return ptr; > -} > - > -#undef dereference_kernel_function_descriptor > -static inline void *dereference_kernel_function_descriptor(void *ptr) > -{ > - if (ptr < (void *)__start_opd || ptr >= (void *)__end_opd) > - return ptr; > - > - return dereference_function_descriptor(ptr); > -} > -#endif /* PPC64_ELF_ABI_v1 */ > - > #endif > > #endif /* __KERNEL__ */ > diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h > index 436412d94054..5baaf9d7c671 100644 > --- a/include/asm-generic/sections.h > +++ b/include/asm-generic/sections.h > @@ -60,6 +60,24 @@ extern __visible const void __nosave_begin, __nosave_end; > &g
Re: [PATCH v1 05/10] asm-generic: Define 'funct_descr_t' to commonly describe function descriptors
On Mon, Oct 11, 2021 at 05:25:32PM +0200, Christophe Leroy wrote: > We have three architectures using function descriptors, each with its > own name. > > Add a common typedef that can be used in generic code. > > Also add a stub typedef for architecture without function descriptors, nit: funct_descr_t reads weird to me. why not func_desc_t ? Either way: Reviewed-by: Kees Cook > to avoid a forest of #ifdefs. > > Signed-off-by: Christophe Leroy > --- > arch/ia64/include/asm/sections.h| 1 + > arch/parisc/include/asm/sections.h | 1 + > arch/powerpc/include/asm/sections.h | 1 + > include/asm-generic/sections.h | 3 +++ > 4 files changed, 6 insertions(+) > > diff --git a/arch/ia64/include/asm/sections.h > b/arch/ia64/include/asm/sections.h > index 80f5868afb06..929b5c535620 100644 > --- a/arch/ia64/include/asm/sections.h > +++ b/arch/ia64/include/asm/sections.h > @@ -8,6 +8,7 @@ > */ > > #define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1 > +typedef struct fdesc funct_descr_t; > > #include > #include > diff --git a/arch/parisc/include/asm/sections.h > b/arch/parisc/include/asm/sections.h > index 2e781ee19b66..329e80f7af0a 100644 > --- a/arch/parisc/include/asm/sections.h > +++ b/arch/parisc/include/asm/sections.h > @@ -4,6 +4,7 @@ > > #ifdef CONFIG_64BIT > #define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1 > +typedef Elf64_Fdesc funct_descr_t; > #endif > > /* nothing to see, move along */ > diff --git a/arch/powerpc/include/asm/sections.h > b/arch/powerpc/include/asm/sections.h > index b7f1ba04e756..d0d5287fa568 100644 > --- a/arch/powerpc/include/asm/sections.h > +++ b/arch/powerpc/include/asm/sections.h > @@ -10,6 +10,7 @@ > > #ifdef PPC64_ELF_ABI_v1 > #define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1 > +typedef struct ppc64_opd_entry funct_descr_t; > #endif > > #include > diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h > index 1db5cfd69817..436412d94054 100644 > --- a/include/asm-generic/sections.h > +++ b/include/asm-generic/sections.h > @@ -63,6 +63,9 @@ extern __visible const void __nosave_begin, __nosave_end; > #else > #define dereference_function_descriptor(p) ((void *)(p)) > #define dereference_kernel_function_descriptor(p) ((void *)(p)) > +typedef struct { > + unsigned long addr; > +} funct_descr_t; > #endif > > /* random extra sections (if any). Override > -- > 2.31.1 > -- Kees Cook
Re: [PATCH v1 04/10] asm-generic: Use HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR to define associated stubs
On Mon, Oct 11, 2021 at 05:25:31PM +0200, Christophe Leroy wrote: > Use HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR instead of > 'dereference_function_descriptor' > to know whether arch has function descriptors. > > Signed-off-by: Christophe Leroy I'd mention the intentionally-empty #if/#else in the commit log, as I, like Helge, mentally tripped over it in the review. :) Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v1 03/10] ia64: Rename 'ip' to 'addr' in 'struct fdesc'
On Mon, Oct 11, 2021 at 05:25:30PM +0200, Christophe Leroy wrote: > There are three architectures with function descriptors, try to > have common names for the address they contain in order to > refactor some functions into generic functions later. > > powerpc has 'funcaddr' > ia64 has 'ip' > parisc has 'addr' > > Vote for 'addr' and update 'struct fdesc' accordingly. > > Signed-off-by: Christophe Leroy Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v1 02/10] powerpc: Rename 'funcaddr' to 'addr' in 'struct ppc64_opd_entry'
On Mon, Oct 11, 2021 at 05:25:29PM +0200, Christophe Leroy wrote: > There are three architectures with function descriptors, try to > have common names for the address they contain in order to > refactor some functions into generic functions later. > > powerpc has 'funcaddr' > ia64 has 'ip' > parisc has 'addr' > > Vote for 'addr' and update 'struct ppc64_opd_entry' accordingly. > > Signed-off-by: Christophe Leroy Reasonable. :) Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v1 01/10] powerpc: Move 'struct ppc64_opd_entry' back into asm/elf.h
On Mon, Oct 11, 2021 at 05:25:28PM +0200, Christophe Leroy wrote: > 'struct ppc64_opd_entry' doesn't belong to uapi/asm/elf.h > > It was initially in module_64.c and commit 2d291e902791 ("Fix compile > failure with non modular builds") moved it into asm/elf.h > > But it was by mistake added outside of __KERNEL__ section, > therefore commit c3617f72036c ("UAPI: (Scripted) Disintegrate > arch/powerpc/include/asm") moved it to uapi/asm/elf.h > > Move it back into asm/elf.h, this brings it back in line with > IA64 and PARISC architectures. > > Fixes: 2d291e902791 ("Fix compile failure with non modular builds") > Signed-off-by: Christophe Leroy I'd agree with Arnd: this is a reasonable cleanup and nothing should be using it. Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH] lkdtm: Fix content of section containing lkdtm_rodata_do_nothing()
On Fri, 8 Oct 2021 18:58:40 +0200, Christophe Leroy wrote: > On a kernel without CONFIG_STRICT_KERNEL_RWX, running EXEC_RODATA > test leads to "Illegal instruction" failure. > > Looking at the content of rodata_objcopy.o, we see that the > function content zeroes only: > > Disassembly of section .rodata: > > [...] Applied to for-next/lkdtm, thanks! [1/1] lkdtm: Fix content of section containing lkdtm_rodata_do_nothing() https://git.kernel.org/kees/c/19c3069c5f5f Also, can you take a moment and get "patatt" set up[1] for signing your patches? I would appreciate that since b4 yells at me when patches aren't signed. :) -Kees [1] https://github.com/mricon/patatt -- Kees Cook
Re: [PATCH] lkdtm: Fix content of section containing lkdtm_rodata_do_nothing()
On Fri, Oct 08, 2021 at 11:09:47AM -0700, Nick Desaulniers wrote: > On Fri, Oct 8, 2021 at 9:59 AM Christophe Leroy > wrote: > > > > On a kernel without CONFIG_STRICT_KERNEL_RWX, running EXEC_RODATA > > test leads to "Illegal instruction" failure. > > > > Looking at the content of rodata_objcopy.o, we see that the > > function content zeroes only: > > > > Disassembly of section .rodata: > > > > <.lkdtm_rodata_do_nothing>: > >0: 00 00 00 00 .long 0x0 > > > > Add the contents flag in order to keep the content of the section > > while renaming it. > > > > Disassembly of section .rodata: > > > > <.lkdtm_rodata_do_nothing>: > >0: 4e 80 00 20 blr > > > > Fixes: e9e08a07385e ("lkdtm: support llvm-objcopy") > > Thanks for the patch; sorry I broke this. > Reviewed-by: Nick Desaulniers Hah! Whoops; sorry I don't have an inverted version of this test! I should have caught this when it broke. :| -Kees -- Kees Cook
Re: [RFC PATCH 4/8] powerpc: add CPU field to struct thread_info
On Thu, Sep 30, 2021 at 08:46:04AM +1000, Michael Ellerman wrote: > Ard Biesheuvel writes: > > On Tue, 28 Sept 2021 at 02:16, Michael Ellerman wrote: > >> > >> Michael Ellerman writes: > >> > Ard Biesheuvel writes: > >> >> On Tue, 14 Sept 2021 at 14:11, Ard Biesheuvel wrote: > >> >>> > >> >>> The CPU field will be moved back into thread_info even when > >> >>> THREAD_INFO_IN_TASK is enabled, so add it back to powerpc's definition > >> >>> of struct thread_info. > >> >>> > >> >>> Signed-off-by: Ard Biesheuvel > >> >> > >> >> Michael, > >> >> > >> >> Do you have any objections or issues with this patch or the subsequent > >> >> ones cleaning up the task CPU kludge for ppc32? Christophe indicated > >> >> that he was happy with it. > >> > > >> > No objections, it looks good to me, thanks for cleaning up that horror :) > >> > > >> > It didn't apply cleanly to master so I haven't tested it at all, if you > >> > can point me at a > >> > git tree with the dependencies I'd be happy to run some tests over it. > >> > >> Actually I realised I can just drop the last patch. > >> > >> So that looks fine, passes my standard quick build & boot on qemu tests, > >> and builds with/without stack protector enabled. > >> > > > > Thanks. > > > > Do you have any opinion on how this series should be merged? Kees Cook > > is willing to take them via his cross-arch tree, or you could carry > > them if you prefer. Taking it via multiple trees at the same time is > > going to be tricky, or take two cycles, with I'd prefer to avoid. > > I don't really mind. If Kees is happy to take it then that's OK by me. > > If Kees put the series in a topic branch based off rc2 then I could > merge that, and avoid any conflicts. I've created: git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/thread_info/cpu it includes a --no-ff merge commit, which I'm not sure is desirable? Let me know if I should adjust this, or if Linus will yell about this if I send him a PR containing a merge commit? I'm not sure what's right here. Thanks! -- Kees Cook
Re: [RFC PATCH 4/8] powerpc: add CPU field to struct thread_info
On Thu, Sep 30, 2021 at 08:46:04AM +1000, Michael Ellerman wrote: > Ard Biesheuvel writes: > > On Tue, 28 Sept 2021 at 02:16, Michael Ellerman wrote: > >> > >> Michael Ellerman writes: > >> > Ard Biesheuvel writes: > >> >> On Tue, 14 Sept 2021 at 14:11, Ard Biesheuvel wrote: > >> >>> > >> >>> The CPU field will be moved back into thread_info even when > >> >>> THREAD_INFO_IN_TASK is enabled, so add it back to powerpc's definition > >> >>> of struct thread_info. > >> >>> > >> >>> Signed-off-by: Ard Biesheuvel > >> >> > >> >> Michael, > >> >> > >> >> Do you have any objections or issues with this patch or the subsequent > >> >> ones cleaning up the task CPU kludge for ppc32? Christophe indicated > >> >> that he was happy with it. > >> > > >> > No objections, it looks good to me, thanks for cleaning up that horror :) > >> > > >> > It didn't apply cleanly to master so I haven't tested it at all, if you > >> > can point me at a > >> > git tree with the dependencies I'd be happy to run some tests over it. > >> > >> Actually I realised I can just drop the last patch. > >> > >> So that looks fine, passes my standard quick build & boot on qemu tests, > >> and builds with/without stack protector enabled. > >> > > > > Thanks. > > > > Do you have any opinion on how this series should be merged? Kees Cook > > is willing to take them via his cross-arch tree, or you could carry > > them if you prefer. Taking it via multiple trees at the same time is > > going to be tricky, or take two cycles, with I'd prefer to avoid. > > I don't really mind. If Kees is happy to take it then that's OK by me. > > If Kees put the series in a topic branch based off rc2 then I could > merge that, and avoid any conflicts. If that helps, yeah, I can make a separate stable branch. Thanks! -Kees -- Kees Cook
Re: [PATCH] mm: Remove HARDENED_USERCOPY_FALLBACK
On Tue, Sep 21, 2021 at 08:11:49AM +0200, Stephen Kitt wrote: > This has served its purpose and is no longer used. All usercopy > violations appear to have been handled by now, any remaining > instances (or new bugs) will cause copies to be rejected. > > This isn't a direct revert of commit 2d891fbc3bb6 ("usercopy: Allow > strict enforcement of whitelists"); since usercopy_fallback is > effectively 0, the fallback handling is removed too. > > This also removes the usercopy_fallback module parameter on > slab_common. > > Link: https://github.com/KSPP/linux/issues/153 > Signed-off-by: Stephen Kitt > Suggested-by: Kees Cook Thanks for doing this! Acked-by: Kees Cook -- Kees Cook
[PATCH for-next 01/25] scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the upper union (evt_struct->iu.srp) instead, as that's what is being wiped. Cc: Tyrel Datwyler Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: "James E.J. Bottomley" Cc: "Martin K. Petersen" Cc: linux-s...@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook Acked-by: Martin K. Petersen Link: https://lore.kernel.org/lkml/yq135rzp79c@ca-mkp.ca.oracle.com Acked-by: Tyrel Datwyler Link: https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695...@linux.ibm.com --- drivers/scsi/ibmvscsi/ibmvscsi.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index e6a3eaaa57d9..3bd3a0124123 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -1055,8 +1055,9 @@ static int ibmvscsi_queuecommand_lck(struct scsi_cmnd *cmnd, return SCSI_MLQUEUE_HOST_BUSY; /* Set up the actual SRP IU */ + BUILD_BUG_ON(sizeof(evt_struct->iu.srp) != SRP_MAX_IU_LEN); + memset(_struct->iu.srp, 0x00, sizeof(evt_struct->iu.srp)); srp_cmd = _struct->iu.srp.cmd; - memset(srp_cmd, 0x00, SRP_MAX_IU_LEN); srp_cmd->opcode = SRP_CMD; memcpy(srp_cmd->cdb, cmnd->cmnd, sizeof(srp_cmd->cdb)); int_to_scsilun(lun, _cmd->lun); -- 2.30.2
[PATCH for-next 02/25] powerpc: Split memset() to avoid multi-field overflow
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Instead of writing across a field boundary with memset(), move the call to just the array, and an explicit zeroing of the prior field. Cc: Benjamin Herrenschmidt Cc: Qinglang Miao Cc: "Gustavo A. R. Silva" Cc: Hulk Robot Cc: Wang Wensheng Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook Reviewed-by: Michael Ellerman Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au --- drivers/macintosh/smu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c index 94fb63a7b357..3e2b25ea58a3 100644 --- a/drivers/macintosh/smu.c +++ b/drivers/macintosh/smu.c @@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd) cmd->read = cmd->info.devaddr & 0x01; switch(cmd->info.type) { case SMU_I2C_TRANSFER_SIMPLE: - memset(>info.sublen, 0, 4); + cmd->info.sublen = 0; + memset(cmd->info.subaddr, 0, sizeof(cmd->info.subaddr)); break; case SMU_I2C_TRANSFER_COMBINED: cmd->info.devaddr &= 0xfe; -- 2.30.2
Re: [PATCH v2 57/63] powerpc/signal32: Use struct_group() to zero spe regs
On Fri, Aug 20, 2021 at 05:49:35PM +1000, Michael Ellerman wrote: > Kees Cook writes: > > In preparation for FORTIFY_SOURCE performing compile-time and run-time > > field bounds checking for memset(), avoid intentionally writing across > > neighboring fields. > > > > Add a struct_group() for the spe registers so that memset() can correctly > > reason > > about the size: > > > >In function 'fortify_memset_chk', > >inlined from 'restore_user_regs.part.0' at > > arch/powerpc/kernel/signal_32.c:539:3: > >>> include/linux/fortify-string.h:195:4: error: call to > >>> '__write_overflow_field' declared with attribute warning: detected write > >>> beyond size of field (1st parameter); maybe use struct_group()? > >>> [-Werror=attribute-warning] > > 195 |__write_overflow_field(); > > |^~~~ > > > > Cc: Michael Ellerman > > Cc: Benjamin Herrenschmidt > > Cc: Paul Mackerras > > Cc: Christophe Leroy > > Cc: Sudeep Holla > > Cc: linuxppc-dev@lists.ozlabs.org > > Reported-by: kernel test robot > > Signed-off-by: Kees Cook > > --- > > arch/powerpc/include/asm/processor.h | 6 -- > > arch/powerpc/kernel/signal_32.c | 6 +++--- > > 2 files changed, 7 insertions(+), 5 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/processor.h > > b/arch/powerpc/include/asm/processor.h > > index f348e564f7dd..05dc567cb9a8 100644 > > --- a/arch/powerpc/include/asm/processor.h > > +++ b/arch/powerpc/include/asm/processor.h > > @@ -191,8 +191,10 @@ struct thread_struct { > > int used_vsr; /* set if process has used VSX */ > > #endif /* CONFIG_VSX */ > > #ifdef CONFIG_SPE > > - unsigned long evr[32];/* upper 32-bits of SPE regs */ > > - u64 acc;/* Accumulator */ > > + struct_group(spe, > > + unsigned long evr[32];/* upper 32-bits of SPE regs */ > > + u64 acc;/* Accumulator */ > > + ); > > unsigned long spefscr;/* SPE & eFP status */ > > unsigned long spefscr_last; /* SPEFSCR value on last prctl > >call or trap return */ > > diff --git a/arch/powerpc/kernel/signal_32.c > > b/arch/powerpc/kernel/signal_32.c > > index 0608581967f0..77b86caf5c51 100644 > > --- a/arch/powerpc/kernel/signal_32.c > > +++ b/arch/powerpc/kernel/signal_32.c > > @@ -532,11 +532,11 @@ static long restore_user_regs(struct pt_regs *regs, > > regs_set_return_msr(regs, regs->msr & ~MSR_SPE); > > if (msr & MSR_SPE) { > > /* restore spe registers from the stack */ > > - unsafe_copy_from_user(current->thread.evr, >mc_vregs, > > - ELF_NEVRREG * sizeof(u32), failed); > > + unsafe_copy_from_user(>thread.spe, >mc_vregs, > > + sizeof(current->thread.spe), failed); > > This makes me nervous, because the ABI is that we copy ELF_NEVRREG * > sizeof(u32) bytes, not whatever sizeof(current->thread.spe) happens to > be. > > ie. if we use sizeof an inadvertent change to the fields in > thread_struct could change how many bytes we copy out to userspace, > which would be an ABI break. > > And that's not that hard to do, because it's not at all obvious that the > size and layout of fields in thread_struct affects the user ABI. > > At the same time we don't want to copy the right number of bytes but > the wrong content, so from that point of view using sizeof is good :) > > The way we handle it in ptrace is to have BUILD_BUG_ON()s to verify that > things match up, so maybe we should do that here too. > > ie. add: > > BUILD_BUG_ON(sizeof(current->thread.spe) == ELF_NEVRREG * sizeof(u32)); > > Not sure if you are happy doing that as part of this patch. I can always > do it later if not. Sounds good to me; I did that in a few other cases in the series where the relationships between things seemed tenuous. :) I'll add this (as !=) in v3. Thanks! -- Kees Cook
Re: [PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow
On Wed, Aug 18, 2021 at 08:42:18AM +0200, Christophe Leroy wrote: > > > Le 18/08/2021 à 08:05, Kees Cook a écrit : > > In preparation for FORTIFY_SOURCE performing compile-time and run-time > > field bounds checking for memset(), avoid intentionally writing across > > neighboring fields. > > > > Instead of writing across a field boundary with memset(), move the call > > to just the array, and an explicit zeroing of the prior field. > > > > Cc: Benjamin Herrenschmidt > > Cc: Qinglang Miao > > Cc: "Gustavo A. R. Silva" > > Cc: Hulk Robot > > Cc: Wang Wensheng > > Cc: linuxppc-dev@lists.ozlabs.org > > Signed-off-by: Kees Cook > > Reviewed-by: Michael Ellerman > > Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au > > --- > > drivers/macintosh/smu.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c > > index 94fb63a7b357..59ce431da7ef 100644 > > --- a/drivers/macintosh/smu.c > > +++ b/drivers/macintosh/smu.c > > @@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd) > > cmd->read = cmd->info.devaddr & 0x01; > > switch(cmd->info.type) { > > case SMU_I2C_TRANSFER_SIMPLE: > > - memset(>info.sublen, 0, 4); > > + cmd->info.sublen = 0; > > + memset(>info.subaddr, 0, 3); > > subaddr[] is a table, should the & be avoided ? It results in the same thing, but it's better form to not have the &; I will fix this. > And while at it, why not use sizeof(subaddr) instead of 3 ? Agreed. :) Thanks! -- Kees Cook
[PATCH v2 57/63] powerpc/signal32: Use struct_group() to zero spe regs
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Add a struct_group() for the spe registers so that memset() can correctly reason about the size: In function 'fortify_memset_chk', inlined from 'restore_user_regs.part.0' at arch/powerpc/kernel/signal_32.c:539:3: >> include/linux/fortify-string.h:195:4: error: call to >> '__write_overflow_field' declared with attribute warning: detected write >> beyond size of field (1st parameter); maybe use struct_group()? >> [-Werror=attribute-warning] 195 |__write_overflow_field(); |^~~~ Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Christophe Leroy Cc: Sudeep Holla Cc: linuxppc-dev@lists.ozlabs.org Reported-by: kernel test robot Signed-off-by: Kees Cook --- arch/powerpc/include/asm/processor.h | 6 -- arch/powerpc/kernel/signal_32.c | 6 +++--- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index f348e564f7dd..05dc567cb9a8 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -191,8 +191,10 @@ struct thread_struct { int used_vsr; /* set if process has used VSX */ #endif /* CONFIG_VSX */ #ifdef CONFIG_SPE - unsigned long evr[32];/* upper 32-bits of SPE regs */ - u64 acc;/* Accumulator */ + struct_group(spe, + unsigned long evr[32];/* upper 32-bits of SPE regs */ + u64 acc;/* Accumulator */ + ); unsigned long spefscr;/* SPE & eFP status */ unsigned long spefscr_last; /* SPEFSCR value on last prctl call or trap return */ diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c index 0608581967f0..77b86caf5c51 100644 --- a/arch/powerpc/kernel/signal_32.c +++ b/arch/powerpc/kernel/signal_32.c @@ -532,11 +532,11 @@ static long restore_user_regs(struct pt_regs *regs, regs_set_return_msr(regs, regs->msr & ~MSR_SPE); if (msr & MSR_SPE) { /* restore spe registers from the stack */ - unsafe_copy_from_user(current->thread.evr, >mc_vregs, - ELF_NEVRREG * sizeof(u32), failed); + unsafe_copy_from_user(>thread.spe, >mc_vregs, + sizeof(current->thread.spe), failed); current->thread.used_spe = true; } else if (current->thread.used_spe) - memset(current->thread.evr, 0, ELF_NEVRREG * sizeof(u32)); + memset(>thread.spe, 0, sizeof(current->thread.spe)); /* Always get SPEFSCR back */ unsafe_get_user(current->thread.spefscr, (u32 __user *)>mc_vregs + ELF_NEVRREG, failed); -- 2.30.2
[PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Instead of writing across a field boundary with memset(), move the call to just the array, and an explicit zeroing of the prior field. Cc: Benjamin Herrenschmidt Cc: Qinglang Miao Cc: "Gustavo A. R. Silva" Cc: Hulk Robot Cc: Wang Wensheng Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook Reviewed-by: Michael Ellerman Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au --- drivers/macintosh/smu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c index 94fb63a7b357..59ce431da7ef 100644 --- a/drivers/macintosh/smu.c +++ b/drivers/macintosh/smu.c @@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd) cmd->read = cmd->info.devaddr & 0x01; switch(cmd->info.type) { case SMU_I2C_TRANSFER_SIMPLE: - memset(>info.sublen, 0, 4); + cmd->info.sublen = 0; + memset(>info.subaddr, 0, 3); break; case SMU_I2C_TRANSFER_COMBINED: cmd->info.devaddr &= 0xfe; -- 2.30.2
[PATCH v2 36/63] scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memset(), avoid intentionally writing across neighboring fields. Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the upper union (evt_struct->iu.srp) instead, as that's what is being wiped. Cc: Tyrel Datwyler Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: "James E.J. Bottomley" Cc: "Martin K. Petersen" Cc: linux-s...@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook Acked-by: Martin K. Petersen Link: https://lore.kernel.org/lkml/yq135rzp79c@ca-mkp.ca.oracle.com Acked-by: Tyrel Datwyler Link: https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695...@linux.ibm.com --- drivers/scsi/ibmvscsi/ibmvscsi.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c index 50df7dd9cb91..ea8e01f49cba 100644 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c @@ -1055,8 +1055,9 @@ static int ibmvscsi_queuecommand_lck(struct scsi_cmnd *cmnd, return SCSI_MLQUEUE_HOST_BUSY; /* Set up the actual SRP IU */ + BUILD_BUG_ON(sizeof(evt_struct->iu.srp) != SRP_MAX_IU_LEN); + memset(_struct->iu.srp, 0x00, sizeof(evt_struct->iu.srp)); srp_cmd = _struct->iu.srp.cmd; - memset(srp_cmd, 0x00, SRP_MAX_IU_LEN); srp_cmd->opcode = SRP_CMD; memcpy(srp_cmd->cdb, cmnd->cmnd, sizeof(srp_cmd->cdb)); int_to_scsilun(lun, _cmd->lun); -- 2.30.2
Re: [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping
On Wed, Aug 11, 2021 at 12:57:00PM -0500, Christopher M. Riedl wrote: > On Thu Aug 5, 2021 at 4:13 AM CDT, Christophe Leroy wrote: > > > > > > Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit : > > > When live patching with STRICT_KERNEL_RWX the CPU doing the patching > > > must temporarily remap the page(s) containing the patch site with +W > > > permissions. While this temporary mapping is in use, another CPU could > > > write to the same mapping and maliciously alter kernel text. Implement a > > > LKDTM test to attempt to exploit such an opening during code patching. > > > The test is implemented on powerpc and requires LKDTM built into the > > > kernel (building LKDTM as a module is insufficient). > > > > > > The LKDTM "hijack" test works as follows: > > > > > >1. A CPU executes an infinite loop to patch an instruction. This is > > > the "patching" CPU. > > >2. Another CPU attempts to write to the address of the temporary > > > mapping used by the "patching" CPU. This other CPU is the > > > "hijacker" CPU. The hijack either fails with a fault/error or > > > succeeds, in which case some kernel text is now overwritten. > > > [...] > > > +#if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \ > > > + defined(CONFIG_PPC)) > > > > I think this test shouldn't be limited to CONFIG_PPC and shouldn't be > > limited to CONFIG_STRICT_KERNEL_RWX. It should be there all the time. Agreed: if the machinery exists to provide this defense on even one arch/config/whatever combo, I'd like LKDTM to test for it. This lets use compare defenses across different combinations more easily, and means folks must answer questions like "why doesn't $combination provide $defense?" > > Also why limiting it to IS_BUILTIN(CONFIG_LKDTM) ? > > The test needs read_cpu_patching_addr() which definitely cannot be > exposed outside of the kernel (ie. builtin). FWIW, I'm okay with this. There isn't a solution that feels entirely "right", so either a build-time requirement like this, or using an exception for modules like this: arch/x86/kernel/cpu/common.c:#if IS_MODULE(CONFIG_LKDTM) arch/x86/kernel/cpu/common.c-EXPORT_SYMBOL_GPL(native_write_cr4); arch/x86/kernel/cpu/common.c-#endif I think neither is great. Another idea is maybe using a name-spaced export, like: EXPORT_SYMBOL_NS_GPL(native_write_cr4, LKDTM); But that still means it gets exposed to malicious discovery, so probably not. I suspect the best is to just do the BUILTIN check, since building LKDTM as a module on a _production_ kernel is rare if it exists at all. The only downside is needing to completely reboot to perform updated tests, but then, I frequently find myself breaking the kernel badly on bad tests, so I have to reboot anyway. ;) -Kees -- Kees Cook
[PATCH] ibmvnic: Use strscpy() instead of strncpy()
Since these strings are expected to be NUL-terminated and the buffers are exactly sized (in vnic_client_data_len()) with no padding, strncpy() can be safely replaced with strscpy() here, as strncpy() on NUL-terminated string is considered deprecated[1]. This has the side-effect of silencing a -Warray-bounds warning due to the compiler being confused about the vlcd incrementing: In file included from ./include/linux/string.h:253, from ./include/linux/bitmap.h:10, from ./include/linux/cpumask.h:12, from ./include/linux/mm_types_task.h:14, from ./include/linux/mm_types.h:5, from ./include/linux/buildid.h:5, from ./include/linux/module.h:14, from drivers/net/ethernet/ibm/ibmvnic.c:35: In function '__fortify_strncpy', inlined from 'vnic_add_client_data' at drivers/net/ethernet/ibm/ibmvnic.c:3919:2: ./include/linux/fortify-string.h:39:30: warning: '__builtin_strncpy' offset 12 from the object at 'v lcd' is out of the bounds of referenced subobject 'name' with type 'char[]' at offset 12 [-Warray-bo unds] 39 | #define __underlying_strncpy __builtin_strncpy | ^ ./include/linux/fortify-string.h:51:9: note: in expansion of macro '__underlying_strncpy' 51 | return __underlying_strncpy(p, q, size); | ^~~~ drivers/net/ethernet/ibm/ibmvnic.c: In function 'vnic_add_client_data': drivers/net/ethernet/ibm/ibmvnic.c:3883:7: note: subobject 'name' declared here 3883 | char name[]; | ^~~~ [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings Cc: Dany Madden Cc: Sukadev Bhattiprolu Cc: Thomas Falcon Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: "David S. Miller" Cc: Jakub Kicinski Cc: net...@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook --- drivers/net/ethernet/ibm/ibmvnic.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 2d8804ebdf96..adb0d5ca9ff1 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -3909,21 +3909,21 @@ static void vnic_add_client_data(struct ibmvnic_adapter *adapter, vlcd->type = 1; len = strlen(os_name) + 1; vlcd->len = cpu_to_be16(len); - strncpy(vlcd->name, os_name, len); + strscpy(vlcd->name, os_name, len); vlcd = (struct vnic_login_client_data *)(vlcd->name + len); /* Type 2 - LPAR name */ vlcd->type = 2; len = strlen(utsname()->nodename) + 1; vlcd->len = cpu_to_be16(len); - strncpy(vlcd->name, utsname()->nodename, len); + strscpy(vlcd->name, utsname()->nodename, len); vlcd = (struct vnic_login_client_data *)(vlcd->name + len); /* Type 3 - device name */ vlcd->type = 3; len = strlen(adapter->netdev->name) + 1; vlcd->len = cpu_to_be16(len); - strncpy(vlcd->name, adapter->netdev->name, len); + strscpy(vlcd->name, adapter->netdev->name, len); } static int send_login(struct ibmvnic_adapter *adapter) -- 2.30.2
Re: [PATCH] crypto: nx: Fix memcpy() over-reading in nonce
On Thu, Jun 17, 2021 at 04:08:15PM +1000, Michael Ellerman wrote: > Kees Cook writes: > > Fix typo in memcpy() where size should be CTR_RFC3686_NONCE_SIZE. > > > > Fixes: 030f4e968741 ("crypto: nx - Fix reentrancy bugs") > > Cc: sta...@vger.kernel.org > > Signed-off-by: Kees Cook > > Thanks. > > > --- > > drivers/crypto/nx/nx-aes-ctr.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/crypto/nx/nx-aes-ctr.c b/drivers/crypto/nx/nx-aes-ctr.c > > index 13f518802343..6120e350ff71 100644 > > --- a/drivers/crypto/nx/nx-aes-ctr.c > > +++ b/drivers/crypto/nx/nx-aes-ctr.c > > @@ -118,7 +118,7 @@ static int ctr3686_aes_nx_crypt(struct skcipher_request > > *req) > > struct nx_crypto_ctx *nx_ctx = crypto_skcipher_ctx(tfm); > > u8 iv[16]; > > > > - memcpy(iv, nx_ctx->priv.ctr.nonce, CTR_RFC3686_IV_SIZE); > > + memcpy(iv, nx_ctx->priv.ctr.nonce, CTR_RFC3686_NONCE_SIZE); > > memcpy(iv + CTR_RFC3686_NONCE_SIZE, req->iv, CTR_RFC3686_IV_SIZE); > > iv[12] = iv[13] = iv[14] = 0; > > iv[15] = 1; > > Where IV_SIZE is 8 and NONCE_SIZE is 4. > > And iv is 16 bytes, so it's not a buffer overflow. > > But priv.ctr.nonce is 4 bytes, and at the end of the struct, so it reads > 4 bytes past the end of the nx_crypto_ctx, which is not good. > > But then immediately overwrites whatever it read with req->iv. > > So seems pretty harmless in practice? Right -- there's no damage done, but future memcpy() FORTIFY work alerts on this, so I'm going through cleaning all of these up. :) -- Kees Cook
[PATCH] crypto: nx: Fix memcpy() over-reading in nonce
Fix typo in memcpy() where size should be CTR_RFC3686_NONCE_SIZE. Fixes: 030f4e968741 ("crypto: nx - Fix reentrancy bugs") Cc: sta...@vger.kernel.org Signed-off-by: Kees Cook --- drivers/crypto/nx/nx-aes-ctr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/nx/nx-aes-ctr.c b/drivers/crypto/nx/nx-aes-ctr.c index 13f518802343..6120e350ff71 100644 --- a/drivers/crypto/nx/nx-aes-ctr.c +++ b/drivers/crypto/nx/nx-aes-ctr.c @@ -118,7 +118,7 @@ static int ctr3686_aes_nx_crypt(struct skcipher_request *req) struct nx_crypto_ctx *nx_ctx = crypto_skcipher_ctx(tfm); u8 iv[16]; - memcpy(iv, nx_ctx->priv.ctr.nonce, CTR_RFC3686_IV_SIZE); + memcpy(iv, nx_ctx->priv.ctr.nonce, CTR_RFC3686_NONCE_SIZE); memcpy(iv + CTR_RFC3686_NONCE_SIZE, req->iv, CTR_RFC3686_IV_SIZE); iv[12] = iv[13] = iv[14] = 0; iv[15] = 1; -- 2.25.1
Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers
On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote: > kernel.h is being used as a dump for all kinds of stuff for a long time. > Here is the attempt to start cleaning it up by splitting out panic and > oops helpers. > > At the same time convert users in header and lib folder to use new header. > Though for time being include new header back to kernel.h to avoid twisted > indirected includes for existing users. > > Signed-off-by: Andy Shevchenko I like it! Do you have a multi-arch CI to do allmodconfig builds to double-check this? Acked-by: Kees Cook -Kees -- Kees Cook
Re: [PATCH next v3 12/15] printk: introduce a kmsg_dump iterator
On Thu, Feb 25, 2021 at 09:24:35PM +0100, John Ogness wrote: > Rather than storing the iterator information in the registered > kmsg_dumper structure, create a separate iterator structure. The > kmsg_dump_iter structure can reside on the stack of the caller, thus > allowing lockless use of the kmsg_dump functions. > > This change also means that the kmsg_dumper dump() callback no > longer needs to pass in the kmsg_dumper as an argument. If > kmsg_dumpers want to access the kernel logs, they can use the new > iterator. > > Update the kmsg_dumper callback prototype. Update code that accesses > the kernel logs using the kmsg_dumper structure to use the new > kmsg_dump_iter structure. For kmsg_dumpers, this also means adding a > call to kmsg_dump_rewind() to initialize the iterator. > > All this is in preparation for removal of @logbuf_lock. > > Signed-off-by: John Ogness > --- > arch/powerpc/kernel/nvram_64.c | 14 +++--- > arch/powerpc/platforms/powernv/opal-kmsg.c | 3 +- > arch/powerpc/xmon/xmon.c | 6 +-- > arch/um/kernel/kmsg_dump.c | 8 +-- > drivers/hv/vmbus_drv.c | 7 +-- > drivers/mtd/mtdoops.c | 8 +-- > fs/pstore/platform.c | 8 +-- Reviewed-by: Kees Cook # pstore -Kees > include/linux/kmsg_dump.h | 38 --- > kernel/debug/kdb/kdb_main.c| 10 ++-- > kernel/printk/printk.c | 57 ++ > 10 files changed, 81 insertions(+), 78 deletions(-) > > diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c > index 532f22637783..5a64b24a91c2 100644 > --- a/arch/powerpc/kernel/nvram_64.c > +++ b/arch/powerpc/kernel/nvram_64.c > @@ -72,8 +72,7 @@ static const char *nvram_os_partitions[] = { > NULL > }; > > -static void oops_to_nvram(struct kmsg_dumper *dumper, > - enum kmsg_dump_reason reason); > +static void oops_to_nvram(enum kmsg_dump_reason reason); > > static struct kmsg_dumper nvram_kmsg_dumper = { > .dump = oops_to_nvram > @@ -642,11 +641,11 @@ void __init nvram_init_oops_partition(int > rtas_partition_exists) > * that we think will compress sufficiently to fit in the lnx,oops-log > * partition. If that's too much, go back and capture uncompressed text. > */ > -static void oops_to_nvram(struct kmsg_dumper *dumper, > - enum kmsg_dump_reason reason) > +static void oops_to_nvram(enum kmsg_dump_reason reason) > { > struct oops_log_info *oops_hdr = (struct oops_log_info *)oops_buf; > static unsigned int oops_count = 0; > + static struct kmsg_dump_iter iter; > static bool panicking = false; > static DEFINE_SPINLOCK(lock); > unsigned long flags; > @@ -681,13 +680,14 @@ static void oops_to_nvram(struct kmsg_dumper *dumper, > return; > > if (big_oops_buf) { > - kmsg_dump_get_buffer(dumper, false, > + kmsg_dump_rewind(); > + kmsg_dump_get_buffer(, false, >big_oops_buf, big_oops_buf_sz, _len); > rc = zip_oops(text_len); > } > if (rc != 0) { > - kmsg_dump_rewind(dumper); > - kmsg_dump_get_buffer(dumper, false, > + kmsg_dump_rewind(); > + kmsg_dump_get_buffer(, false, >oops_data, oops_data_sz, _len); > err_type = ERR_TYPE_KERNEL_PANIC; > oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION); > diff --git a/arch/powerpc/platforms/powernv/opal-kmsg.c > b/arch/powerpc/platforms/powernv/opal-kmsg.c > index 6c3bc4b4da98..a7bd6ac681f4 100644 > --- a/arch/powerpc/platforms/powernv/opal-kmsg.c > +++ b/arch/powerpc/platforms/powernv/opal-kmsg.c > @@ -19,8 +19,7 @@ > * may not be completely printed. This function does not actually dump the > * message, it just ensures that OPAL completely flushes the console buffer. > */ > -static void kmsg_dump_opal_console_flush(struct kmsg_dumper *dumper, > - enum kmsg_dump_reason reason) > +static void kmsg_dump_opal_console_flush(enum kmsg_dump_reason reason) > { > /* >* Outside of a panic context the pollers will continue to run, > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c > index 80ed3e1becf9..5978b90a885f 100644 > --- a/arch/powerpc/xmon/xmon.c > +++ b/arch/powerpc/xmon/xmon.c > @@ -3001,7 +3001,7 @@ print_address(unsigned long addr) > static void > dump_log_buf(void) > { > - struct kmsg_dumper dumper; > + struct kmsg_dump_iter iter; >
Re: linux-next: build warning after merge of the akpm tree
On Tue, Dec 08, 2020 at 11:01:57PM +1100, Stephen Rothwell wrote: > Hi Stephen, > > On Fri, 4 Dec 2020 21:00:00 +1100 Stephen Rothwell > wrote: > > > > Hi all, > > > > After merging the akpm tree, today's linux-next build (powerpc > > allyesconfig) produced warnings like this: > > > > ld: warning: orphan section `.data..Lubsan_data177' from > > `arch/powerpc/oprofile/op_model_pa6t.o' being placed in section > > `.data..Lubsan_data177' > > > > (lots of these latter ones) > > 781584 of them today! > > > I don't know what produced these, but it is in the akpm-current or > > akpm trees. > > Presumably the result of commit > > 186c3e18dba3 ("ubsan: enable for all*config builds") > > from the akpm-current tree. > > arch/powerpc/kernel/vmlinux.lds.S has: > > #ifdef CONFIG_PPC32 > .data : AT(ADDR(.data) - LOAD_OFFSET) { > DATA_DATA > #ifdef CONFIG_UBSAN > *(.data..Lubsan_data*) > *(.data..Lubsan_type*) > #endif > *(.data.rel*) > *(SDATA_MAIN) > > added by commit > > beba24ac5913 ("powerpc/32: Add .data..Lubsan_data*/.data..Lubsan_type* > sections explicitly") > > in 2018, but no equivalent for 64 bit. > > I will try the following patch tomorrow: > > From: Stephen Rothwell > Date: Tue, 8 Dec 2020 22:58:24 +1100 > Subject: [PATCH] powerpc: Add .data..Lubsan_data*/.data..Lubsan_type* > sections explicitly > > Similarly to commit > > beba24ac5913 ("powerpc/32: Add .data..Lubsan_data*/.data..Lubsan_type* > sections explicitly") > > since CONFIG_UBSAN bits can now be enabled for all*config. > > Signed-off-by: Stephen Rothwell > --- > arch/powerpc/kernel/vmlinux.lds.S | 4 > 1 file changed, 4 insertions(+) > > diff --git a/arch/powerpc/kernel/vmlinux.lds.S > b/arch/powerpc/kernel/vmlinux.lds.S > index 3b4c26e94328..0318ba436f34 100644 > --- a/arch/powerpc/kernel/vmlinux.lds.S > +++ b/arch/powerpc/kernel/vmlinux.lds.S > @@ -296,6 +296,10 @@ SECTIONS > #else > .data : AT(ADDR(.data) - LOAD_OFFSET) { > DATA_DATA > +#ifdef CONFIG_UBSAN > + *(.data..Lubsan_data*) > + *(.data..Lubsan_type*) > +#endif > *(.data.rel*) > *(.toc1) > *(.branch_lt) > -- > 2.29.2 > > -- > Cheers, > Stephen Rothwell Reviewed-by: Kees Cook Thanks for figuring this one out. :) Andrew, can you add this to your ubsan patch stack, or do you want me to resend it to you directly? -- Kees Cook
Re: [PATCH v2 2/2] kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1
On Wed, Dec 02, 2020 at 11:37:38AM +0900, Masahiro Yamada wrote: > On Wed, Dec 2, 2020 at 5:56 AM Kees Cook wrote: > > > > On Tue, Dec 01, 2020 at 10:31:37PM +0900, Masahiro Yamada wrote: > > > On Wed, Nov 25, 2020 at 7:22 AM Kees Cook wrote: > > > > > > > > On Thu, Nov 19, 2020 at 01:13:27PM -0800, Nick Desaulniers wrote: > > > > > On Thu, Nov 19, 2020 at 12:57 PM Nathan Chancellor > > > > > wrote: > > > > > > > > > > > > ld.lld 10.0.1 spews a bunch of various warnings about .rela > > > > > > sections, > > > > > > along with a few others. Newer versions of ld.lld do not have these > > > > > > warnings. As a result, do not add '--orphan-handling=warn' to > > > > > > LDFLAGS_vmlinux if ld.lld's version is not new enough. > > > > > > > > > > > > Link: https://github.com/ClangBuiltLinux/linux/issues/1187 > > > > > > Link: https://github.com/ClangBuiltLinux/linux/issues/1193 > > > > > > Reported-by: Arvind Sankar > > > > > > Reported-by: kernelci.org bot > > > > > > Reported-by: Mark Brown > > > > > > Reviewed-by: Kees Cook > > > > > > Signed-off-by: Nathan Chancellor > > > > > > > > > > Thanks for the additions in v2. > > > > > Reviewed-by: Nick Desaulniers > > > > > > > > I'm going to carry this for a few days in -next, and if no one screams, > > > > ask Linus to pull it for v5.10-rc6. > > > > > > > > Thanks! > > > > > > > > -- > > > > Kees Cook > > > > > > > > > Sorry for the delay. > > > Applied to linux-kbuild. > > > > Great, thanks! > > > > > But, I already see this in linux-next. > > > Please let me know if I should drop it from my tree. > > > > My intention was to get this to Linus this week. Do you want to do that > > yourself, or Ack the patches in my tree and I'll send it? > > I will send a kbuild pull request myself this week. Okay, thanks! I've removed it from my -next tree now. -- Kees Cook