On Mon, Jun 15, 2020 at 05:20PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 05:03:27PM +0200, Peter Zijlstra wrote:
> 
> > Yes, I think so. x86_64 needs lib/memcpy_64.S in .noinstr.text then. For
> > i386 it's an __always_inline inline-asm thing.
> 
> Bah, I tried writing it without memcpy, but clang inserts memcpy anyway
> :/

Hmm, __builtin_memcpy() won't help either.

Turns out, Clang 11 got __builtin_memcpy_inline(): 
https://reviews.llvm.org/D73543

The below works, no more crash on either KASAN or KCSAN with Clang. We
can test if we have it with __has_feature(__builtin_memcpy_inline)
(although that's currently not working as expected, trying to fix :-/).

Would a memcpy_inline() be generally useful? It's not just Clang but
also GCC that isn't entirely upfront about which memcpy is inlined and
which isn't. If the compiler has __builtin_memcpy_inline(), we can use
it, otherwise the arch likely has to provide the implementation.
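
To make the question concrete, here is a rough userspace sketch of what
such a helper could look like. It is only an illustration under stated
assumptions, not a proposed kernel patch: it probes with __has_builtin()
rather than __has_feature() (assuming the compiler reports the builtin
that way), and the byte-loop fallback and memcpy_inline_selfcheck() are
hypothetical stand-ins for whatever an arch would really provide.

```c
#include <assert.h>   /* only used by the self-check at the bottom */
#include <stddef.h>

/* Older compilers lack __has_builtin; treat every builtin as absent there. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_memcpy_inline)
/* Clang: the size must be a compile-time constant; no memcpy() call is emitted. */
#define memcpy_inline(dst, src, n) __builtin_memcpy_inline((dst), (src), (n))
#else
/*
 * Hypothetical stand-in for an arch-provided implementation.
 * The volatile accesses are an assumption-laden trick to keep the
 * compiler from recognizing the loop and turning it back into a
 * memcpy() call; a real arch version would likely be inline asm.
 */
static inline __attribute__((always_inline))
void memcpy_inline(void *dst, const void *src, size_t n)
{
	volatile unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (n--)
		*d++ = *s++;
}
#endif

/* Self-check: copy five 8-byte words, the same size as the 5*8 IRET copy. */
static void memcpy_inline_selfcheck(void)
{
	unsigned long long src[5] = { 1, 2, 3, 4, 5 };
	unsigned long long dst[5] = { 0 };

	memcpy_inline(dst, src, sizeof(src));
	assert(dst[0] == 1 && dst[4] == 5);
}
```

Note that with the builtin the size argument has to be a constant
expression, so callers with a runtime length would still need something
else.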

Thoughts?

Thanks,
-- Marco

------ >8 ------

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index af75109485c2..3e07beae2a75 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -690,13 +690,13 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
                (struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
 
        /* Copy the IRET target to the temporary storage. */
-       memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
+       __builtin_memcpy_inline(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
 
        /* Copy the remainder of the stack from the current stack. */
-       memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
+       __builtin_memcpy_inline(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
 
        /* Update the entry stack */
-       memcpy(new_stack, &tmp, sizeof(tmp));
+       __builtin_memcpy_inline(new_stack, &tmp, sizeof(tmp));
 
        BUG_ON(!user_mode(&new_stack->regs));
        return new_stack;
