On Tue, Jun 11, 2019 at 02:08:34PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 11, 2019 at 10:03:07AM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 07, 2019 at 11:10:19AM -0700, Andy Lutomirski wrote:
> > 
> > > I am surely missing some kprobe context, but is it really safe to use
> > > this mechanism to replace more than one instruction?
> > 
> > I'm not entirely up-to-scratch here, so Masami, please correct me if I'm
> > wrong.
> > 
> > So what happens is that arch_prepare_optimized_kprobe() <-
> > copy_optimized_instructions() copies however much of the instruction
> > stream is required such that we can overwrite the instruction at @addr
> > with a 5 byte jump.
> > 
> > arch_optimize_kprobe() then does the text_poke_bp() that replaces the
> > instruction @addr with int3, copies the rel jump address and overwrites
> > the int3 with jmp.
> > 
> > And I'm thinking the problem is with something like:
> > 
> > @addr: nop nop nop nop nop
> > 
> > We copy out the nops into the trampoline, overwrite the first nop with
> > an INT3, overwrite the remaining nops with the rel addr, but oops,
> > another CPU can still be executing one of those NOPs, right?
> > 
> > I'm thinking we could fix this by first writing INT3 into all relevant
> > instructions, which is going to be messy, given the current code base.
> 
> Maybe not that bad; how's something like this?
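The "INT3 everything first" idea can be modeled outside the kernel. The sketch below is a hypothetical user-space model, not kernel code: insn_len() is a toy stand-in for the kernel's instruction decoder (the real patch uses kernel_insn_init()/insn_get_length()), and poke_int3_all() marks the first byte of every instruction in the patched range, so a CPU still executing the old code can only ever land on an INT3, never on the middle of the new jump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3 0xcc

/*
 * Toy stand-in for the kernel's insn decoder: in this model a 0x90
 * byte is a 1-byte NOP and anything else is a 5-byte instruction.
 * (Hypothetical; the real code decodes actual x86 instructions.)
 */
static size_t insn_len(const uint8_t *p)
{
	return (*p == 0x90) ? 1 : 5;
}

/*
 * Write INT3 at the start of every instruction in [addr, addr + len),
 * walking instruction boundaries rather than raw bytes.  The length
 * is decoded before the first byte is overwritten, mirroring the
 * decode-then-poke order in the quoted patch.
 */
static void poke_int3_all(uint8_t *addr, size_t len)
{
	uint8_t *kaddr = addr;

	do {
		size_t l = insn_len(kaddr);

		*kaddr = INT3;	/* text_poke(kaddr, &int3, 1) in the patch */
		kaddr += l;
	} while (kaddr < addr + len);
}
```

For the five-NOP example from the mail, this turns every one of the five bytes into INT3; for a range whose second instruction is longer than one byte, only the first byte of each instruction is touched.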
> 
> (completely untested)
> 
> ---
>  arch/x86/kernel/alternative.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 0d57015114e7..8f643dabea72 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -24,6 +24,7 @@
>  #include <asm/tlbflush.h>
>  #include <asm/io.h>
>  #include <asm/fixmap.h>
> +#include <asm/insn.h>
>  
>  int __read_mostly alternatives_patched;
>  
> @@ -849,6 +850,7 @@ static void do_sync_core(void *info)
>  
>  static bool bp_patching_in_progress;
>  static void *bp_int3_handler, *bp_int3_addr;
> +static unsigned int bp_int3_length;
>  
>  int poke_int3_handler(struct pt_regs *regs)
>  {
> @@ -867,7 +869,11 @@ int poke_int3_handler(struct pt_regs *regs)
>  	if (likely(!bp_patching_in_progress))
>  		return 0;
>  
> -	if (user_mode(regs) || regs->ip != (unsigned long)bp_int3_addr)
> +	if (user_mode(regs))
> +		return 0;
> +
> +	if (regs->ip < (unsigned long)bp_int3_addr ||
> +	    regs->ip >= (unsigned long)bp_int3_addr + bp_int3_length)
>  		return 0;
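For reference, the widened check in the hunk above amounts to a half-open range test on the trapping IP. The predicate below is a stand-alone restatement with simplified types, assuming the same globals as the patch; it is not the kernel code itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel globals in the quoted patch. */
static uintptr_t bp_int3_addr;
static unsigned int bp_int3_length;

/*
 * The patch replaces the single-address equality test with a range
 * test: any kernel IP inside [bp_int3_addr, bp_int3_addr + bp_int3_length)
 * counts as a hit on one of the poked instructions.
 */
static bool ip_in_poked_range(uintptr_t ip)
{
	return ip >= bp_int3_addr &&
	       ip < bp_int3_addr + bp_int3_length;
}
```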
Bugger, this isn't right. It'll jump to the beginning of the trampoline,
even if it is multiple instructions in, which would lead to executing
instructions twice, which would be BAD.

_maybe_, depending on what the slot looks like, we could do something
like:

	offset = regs->ip - (unsigned long)bp_int3_addr;
	regs->ip = bp_int3_handler + offset;

That is; jump into the slot at the same offset we hit the INT3, but this
is quickly getting yuck.

>  	/* set up the specified breakpoint handler */
> @@ -900,9 +906,12 @@ NOKPROBE_SYMBOL(poke_int3_handler);
>  void text_poke_bp(void *addr, const void *opcode, size_t len, void *handler)
>  {
>  	unsigned char int3 = 0xcc;
> +	void *kaddr = addr;
> +	struct insn insn;
>  
>  	bp_int3_handler = handler;
>  	bp_int3_addr = (u8 *)addr + sizeof(int3);
> +	bp_int3_length = len - sizeof(int3);
>  	bp_patching_in_progress = true;
>  
>  	lockdep_assert_held(&text_mutex);
> @@ -913,7 +922,14 @@ void text_poke_bp(void *addr, const void *opcode, size_t len, void *handler)
>  	 */
>  	smp_wmb();
>  
> -	text_poke(addr, &int3, sizeof(int3));
> +	do {
> +		kernel_insn_init(&insn, kaddr, MAX_INSN_SIZE);
> +		insn_get_length(&insn);
> +
> +		text_poke(kaddr, &int3, sizeof(int3));
> +
> +		kaddr += insn.length;
> +	} while (kaddr < addr + len);
>  
>  	on_each_cpu(do_sync_core, NULL, 1);
> 
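The offset-preserving redirect floated above can be sketched as a pure function. This is a hypothetical model, assuming the out-of-line slot is a byte-for-byte copy of the patched range so that an IP N bytes past bp_int3_addr maps to offset N into the slot; that assumption is exactly the "depending on what the slot looks like" caveat:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Redirect a trap at 'ip' into the out-of-line slot at the same
 * offset.  'addr' plays the role of bp_int3_addr (first poked byte
 * plus one) and 'handler' the role of bp_int3_handler (start of the
 * copied-out instructions).  Only valid when the slot layout mirrors
 * the original bytes one-to-one.
 */
static uintptr_t resteer_ip(uintptr_t ip, uintptr_t addr,
			    uintptr_t handler)
{
	return handler + (ip - addr);	/* same offset into the slot */
}
```

With a slot at 0x9000 shadowing code at 0x1001, a trap at 0x1003 would resume at 0x9002 rather than re-executing the slot from its start.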