On Thu, Nov 29, 2018 at 10:58 AM Linus Torvalds <torva...@linux-foundation.org> wrote:
>
> In contrast, if the call was wrapped in an inline asm, we'd *know* the
> compiler couldn't turn a "call wrapper(%rip)" into anything else.

Actually, I think I have a better model - if the caller is done with inline asm.

What you can do then is basically add a single-byte prefix to the "call" instruction that does nothing (say, cs override), and then replace *that* with a 'int3' instruction.

Boom. Done.

Now, the "int3" handler can just update the instruction in-place, but leave the "int3" in place, and then return to the next instruction byte (which is just the normal branch instruction without the prefix byte).

The cross-CPU case continues to work, because the 'int3' remains in place until after the IPI.

But that would require that we'd mark those call instruction with

Linus
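
For illustration only, here is a rough userspace sketch of the byte-level sequence described above, run against a plain byte buffer rather than live kernel text. All function names are hypothetical and this is not kernel code. It assumes the call site is encoded as a CS-override prefix (0x2e) followed by "call rel32" (0xe8 plus a 4-byte displacement), and it assumes the prefix byte gets written back once the IPI has completed, which the mail implies but does not spell out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    PREFIX_CS = 0x2e,  /* cs segment override, a no-op prefix on a call */
    OP_INT3   = 0xcc,  /* one-byte breakpoint instruction */
    OP_CALL   = 0xe8,  /* call rel32 */
    SITE_LEN  = 6      /* prefix + opcode + 4-byte displacement */
};

/* Emit "cs call rel32" into site[0..5]. The displacement is copied in
 * host byte order, which matches x86 on a little-endian host. */
static void emit_call_site(uint8_t *site, int32_t rel32)
{
    site[0] = PREFIX_CS;
    site[1] = OP_CALL;
    memcpy(&site[2], &rel32, sizeof rel32);
}

/* Read back the current call displacement. */
static int32_t read_rel32(const uint8_t *site)
{
    int32_t rel32;
    memcpy(&rel32, &site[2], sizeof rel32);
    return rel32;
}

/* Step 1: overwrite only the prefix byte with int3. The call opcode and
 * its displacement are left untouched. */
static void arm_site(uint8_t *site)
{
    assert(site[0] == PREFIX_CS);
    site[0] = OP_INT3;
}

/* Step 2: what a (hypothetical) int3 handler would do. It updates the
 * displacement in place, leaves the int3 where it is, and returns the
 * offset to resume at: the byte after the int3, i.e. the plain call
 * instruction without its prefix. */
static size_t int3_handler(uint8_t *site, int32_t new_rel32)
{
    assert(site[0] == OP_INT3);
    memcpy(&site[2], &new_rel32, sizeof new_rel32);
    return 1;
}

/* Step 3 (assumption): once every CPU has been synchronized via IPI,
 * put the harmless prefix back in place of the int3. */
static void disarm_site(uint8_t *site)
{
    assert(site[0] == OP_INT3);
    site[0] = PREFIX_CS;
}
```

The point of the layout is that the trap byte and the bytes being rewritten never overlap: any CPU that reaches the site while the displacement is changing hits the int3 first, and resuming one byte later lands on a complete, already-updated call.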