On Tue, 11 Jun 2019 10:03:07 +0200 Peter Zijlstra <pet...@infradead.org> wrote:
> So what happens is that arch_prepare_optimized_kprobe() <-
> copy_optimized_instructions() copies however much of the instruction
> stream is required such that we can overwrite the instruction at @addr
> with a 5 byte jump.
>
> arch_optimize_kprobe() then does the text_poke_bp() that replaces the
> instruction @addr with int3, copies the rel jump address and overwrites
> the int3 with jmp.
>
> And I'm thinking the problem is with something like:
>
> @addr: nop nop nop nop nop

What would work would be to:

	add breakpoint to first opcode.

	call synchronize_tasks();

	/* All tasks now hitting breakpoint and jumping over affected code */

	update the rest of the instructions.

	replace breakpoint with jmp.

One caveat is that the replaced instructions must not include a function
call, because if the called function calls schedule() then it will
circumvent the synchronize_tasks(). It would be OK if that call is the
last of the instructions. But I doubt we modify anything more than a
call size anyway, so this should still work for all current instances.

-- Steve

>
> We copy out the nops into the trampoline, overwrite the first nop with
> an INT3, overwrite the remaining nops with the rel addr, but oops,
> another CPU can still be executing one of those NOPs, right?
>
> I'm thinking we could fix this by first writing INT3 into all relevant
> instructions, which is going to be messy, given the current code base.
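
For reference, the ordering Steve proposes above can be sketched in user space on a plain byte buffer. This is only a model of the write order, not kernel code: poke_jmp() and wait_for_all_tasks() are hypothetical names made up for illustration, the wait is a stub that merely counts calls (it stands in for the synchronize_tasks() step), and nothing here deals with real concurrency or with the cross-CPU synchronization that text_poke_bp() performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE  0xCC	/* x86 one-byte breakpoint */
#define JMP32_OPCODE 0xE9	/* x86 jmp rel32 */
#define JMP32_SIZE   5		/* opcode byte + 4-byte displacement */

/* Hypothetical stand-in for the "wait until no task is inside the
 * affected bytes" step. In this sketch it only counts its calls. */
static int sync_calls;

static void wait_for_all_tasks(void)
{
	sync_calls++;
}

/* rel32 is measured from the end of the 5-byte jump instruction. */
static int32_t jmp_rel32(uintptr_t addr, uintptr_t target)
{
	return (int32_t)((int64_t)target - (int64_t)(addr + JMP32_SIZE));
}

/* Model of the proposed order. 'text' is a byte buffer standing in for
 * the instructions at 'addr'; 'target' is where the jump should land. */
static void poke_jmp(uint8_t *text, size_t len, uintptr_t addr,
		     uintptr_t target)
{
	uint32_t rel = (uint32_t)jmp_rel32(addr, target);
	int i;

	assert(len >= JMP32_SIZE);

	/* 1. add breakpoint to first opcode */
	text[0] = INT3_OPCODE;

	/* 2. wait for every task to be out of the bytes we will rewrite */
	wait_for_all_tasks();

	/* 3. update the rest of the instructions (little-endian rel32) */
	for (i = 0; i < 4; i++)
		text[1 + i] = (uint8_t)(rel >> (8 * i));

	/* 4. replace breakpoint with jmp */
	text[0] = JMP32_OPCODE;
}
```

With five nops at a pretend address 0x1000 and a target of 0x2000, the buffer should end up holding e9 followed by the displacement 0x2000 - 0x1005 = 0xffb in little-endian order. The point of the sketch is only that bytes 1..4 are never touched until after the breakpoint is in place and the wait has completed.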