On Sat, May 03, 2025 at 02:07:12PM +0200, Borislav Petkov wrote: > On Mon, Apr 14, 2025 at 07:34:48AM -0000, tip-bot2 for Ingo Molnar wrote: > > The fpu_thread_struct_whitelist() quirk to hardened usercopy can be removed, > > now that the FPU structure is not embedded in the task struct anymore, which > > reduces text footprint a bit. > > Well, hardened usercopy still doesn't like it on 32-bit, see splat below: > > I did some debugging printks and here's what I see: > > That's the loop in copy_uabi_to_xstate(), copying the first FPU state > - XFEATURE_FP - to the kernel buffer: > > [ 1.752756] copy_uabi_to_xstate: i: 0 dst: 0xcab11f40, offset: 0, size: > 160, kbuf: 0x00000000, ubuf: 0xbfcbca80 > [ 1.754600] copy_from_buffer: dst: 0xcab11f40, src: 0xbfcbca80, size: 160 > > hardened wants to check it: > > [ 1.755823] __check_heap_object: ptr: 0xcab11f40, slap_address: > 0xcab10000, size: 2944 > [ 1.757102] __check_heap_object: offset: 2112 > > and figures out it is in some weird offset 2112 from *task_struct* even > though: > > [ 1.750149] copy_uabi_to_xstate: sizeof(task_struct): 1984 > > btw, the buffer is big enough too: > > [ 1.749077] copy_uabi_to_xstate: sizeof(&fpstate->regs.xsave): 576 > > but then it decides to BUG because an overwrite attempt is being done on > task_struct which is bollocks now as struct fpu is not part of it anymore. > > And this is where I'm all out of ideas so lemme CC folks.
I had to dig a bit to find the series -- this reply seems to have broken threading with: https://lore.kernel.org/all/[email protected]/ > > [ 1.757898] __check_heap_object: will abort: offset: 2112, size: 160 > > [ 1.758951] usercopy: Kernel memory overwrite attempt detected to SLUB > object 'task_struct' (offset 2112, size 160)! So the useroffset and usersize arguments are what control the allowed window of copying in/out of the "task_struct" kmem cache: /* create a slab on which task_structs can be allocated */ task_struct_whitelist(&useroffset, &usersize); task_struct_cachep = kmem_cache_create_usercopy("task_struct", arch_task_struct_size, align, SLAB_PANIC|SLAB_ACCOUNT, useroffset, usersize, NULL); task_struct_whitelist() positions this window based on the location of the thread_struct within task_struct, and gets the arch-specific details via arch_thread_struct_whitelist(offset, size): static void __init task_struct_whitelist(unsigned long *offset, unsigned long *size) { /* Fetch thread_struct whitelist for the architecture. */ arch_thread_struct_whitelist(offset, size); /* * Handle zero-sized whitelist or empty thread_struct, otherwise * adjust offset to position of thread_struct in task_struct. */ if (unlikely(*size == 0)) *offset = 0; else *offset += offsetof(struct task_struct, thread); } Commit "x86/fpu: Make task_struct::thread constant size" removed the logic for the window, leaving: static inline void arch_thread_struct_whitelist(unsigned long *offset, unsigned long *size) { *offset = 0; *size = 0; } So now there is no window that usercopy hardening will allow to be copied in/out of task_struct. > [ 1.760651] ------------[ cut here ]------------ > [ 1.761474] kernel BUG at mm/usercopy.c:102! > [ 1.762240] Oops: invalid opcode: 0000 [#1] SMP > [ 1.763063] CPU: 6 UID: 0 PID: 1182 Comm: rc Not tainted 6.15.0-rc2+ #35 > PREEMPT(full) > [ 1.764374] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.16.3-debian-1.16.3-2 04/01/2014 > [ 1.765952] EIP: usercopy_abort+0x79/0x88 > [ 1.768411] Code: c1 89 44 24 0c 0f 45 cb 8b 5d 0c 89 74 24 10 89 4c 24 04 > c7 04 24 98 f0 c5 c1 89 5c 24 20 8b 5d 08 89 5c 24 1c e8 d3 8b e7 ff <0f> 0b > ba 89 8f ce c1 89 55 f0 89 d6 eb 97 90 3e 8d 74 26 00 85 d2 > [ 1.771573] EAX: 00000068 EBX: 00000840 ECX: 00000000 EDX: 00000006 > [ 1.772638] ESI: c1cdb354 EDI: c1ce0c9a EBP: cc751d40 ESP: cc751d0c > [ 1.773707] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010292 > [ 1.774831] CR0: 80050033 CR2: 00511860 CR3: 0cde1000 CR4: 003506d0 > [ 1.775921] Call Trace: > [ 1.776498] __check_heap_object+0x117/0x14c > [ 1.777314] __check_object_size+0x1af/0x250 > [ 1.778129] ? vprintk+0x13/0x1c > [ 1.778778] copy_from_buffer+0xbc/0x114 > [ 1.779498] copy_uabi_to_xstate+0x1b7/0x31c > [ 1.780251] copy_sigframe_from_user_to_xstate+0x27/0x34 > [ 1.781171] __fpu_restore_sig+0x4ae/0x4c4 > [ 1.781954] fpu__restore_sig+0x60/0xb0 > [ 1.784487] ia32_restore_sigcontext+0xe4/0x108 > [ 1.785464] __do_sys_sigreturn+0x66/0xac > [ 1.786191] ia32_sys_call+0x226a/0x30e0 > [ 1.786942] do_int80_syscall_32+0x83/0x158 > [ 1.787735] entry_INT80_32+0x108/0x108 But as reported above, there *is* a copy in copy_uabi_to_xstate(). (It seems there are several, actually.) int copy_sigframe_from_user_to_xstate(struct task_struct *tsk, const void __user *ubuf) { return copy_uabi_to_xstate(x86_task_fpu(tsk)->fpstate, NULL, ubuf, &tsk->thread.pkru); } This appears to be writing into x86_task_fpu(tsk)->fpstate. With or without CONFIG_X86_DEBUG_FPU, this resolves to: ((struct fpu *)((void *)(task) + sizeof(*(task)))) i.e. the memory "after task_struct" is cast to "struct fpu", and the uses the "fpstate" pointer. How that pointer gets set looks to be variable, but I think the one we care about here is: fpu->fpstate = &fpu->__fpstate; And struct fpu::__fpstate says: struct fpstate __fpstate; /* * WARNING: '__fpstate' is dynamically-sized. Do not put * anything after it here. */ So we're still dealing with a dynamically sized thing, even if it's not within the literal struct task_struct -- it's still in the kmem cache, though. Looking at the kmem cache size, it has allocated "arch_task_struct_size" bytes, which is calculated in fpu__init_task_struct_size(): int task_size = sizeof(struct task_struct); task_size += sizeof(struct fpu); /* * Subtract off the static size of the register state. * It potentially has a bunch of padding. */ task_size -= sizeof(union fpregs_state); /* * Add back the dynamically-calculated register state * size. */ task_size += fpu_kernel_cfg.default_size; /* * We dynamically size 'struct fpu', so we require that * 'state' be at the end of 'it: */ CHECK_MEMBER_AT_END_OF(struct fpu, __fpstate); arch_task_struct_size = task_size; So, this is still copying out of the kmem cache for task_struct, and the window seems unchanged (still fpu regs). This is what the window was before: void fpu_thread_struct_whitelist(unsigned long *offset, unsigned long *size) { *offset = offsetof(struct thread_struct, fpu.__fpstate.regs); *size = fpu_kernel_cfg.default_size; } And the same commit I mentioned above removed it. I think the misunderstanding is here: > The fpu_thread_struct_whitelist() quirk to hardened usercopy can be > removed, > now that the FPU structure is not embedded in the task struct anymore, > which > reduces text footprint a bit. Yes, FPU is no longer in task_struct, but it IS in the kmem cache named "task_struct", since the fpstate is still being allocated there. This partial revert of the earlier mentioned commit, along with a recalculation of the fpstate regs location, should fix it, and can get squashed into the earlier mentioned commit: diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index eaa7214d6953..ad33903dc92a 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -522,14 +522,12 @@ extern struct fpu *x86_task_fpu(struct task_struct *task); # define x86_task_fpu(task) ((struct fpu *)((void *)(task) + sizeof(*(task)))) #endif -/* - * X86 doesn't need any embedded-FPU-struct quirks: - */ -static inline void -arch_thread_struct_whitelist(unsigned long *offset, unsigned long *size) +extern void fpu_thread_struct_whitelist(unsigned long *offset, unsigned long *size); + +static inline void arch_thread_struct_whitelist(unsigned long *offset, + unsigned long *size) { - *offset = 0; - *size = 0; + fpu_thread_struct_whitelist(offset, size); } static inline void diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 3a19877a314e..e8188b0527cf 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -680,6 +680,20 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal, return 0; } +/* + * While struct fpu is no longer part of struct thread_struct, it is still + * allocated after struct task_struct in the "task_struct" kmem cache. But + * since FPU is expected to be part of struct thread_struct, we have to + * adjust for it here. + */ +void fpu_thread_struct_whitelist(unsigned long *offset, unsigned long *size) +{ + /* The allocation follows struct task_struct. */ + *offset = sizeof(struct task_struct) - offsetof(struct task_struct, thread); + *offset += offsetof(struct fpu, __fpstate.regs); + *size = fpu_kernel_cfg.default_size; +} + /* * Drops current FPU state: deactivates the fpregs and * the fpstate. NOTE: it still leaves previous contents However, I can't test this patch -- I'm not able to reproduce the problem either, even with your .config. What hardware are you using? I assume that for me x86_task_fpu(tsk)->fpstate isn't pointing to struct fpu::__fpstate. I'll keep digging... -- Kees Cook
