From: Andy Lutomirski Sent: November 5, 2018 at 7:03:49 PM GMT > To: Nadav Amit <[email protected]> > Cc: Peter Zijlstra <[email protected]>, Ingo Molnar <[email protected]>, > LKML <[email protected]>, X86 ML <[email protected]>, H. Peter Anvin > <[email protected]>, Thomas Gleixner <[email protected]>, Borislav Petkov > <[email protected]>, Dave Hansen <[email protected]>, Andy Lutomirski > <[email protected]>, Kees Cook <[email protected]>, Dave Hansen > <[email protected]>, Masami Hiramatsu <[email protected]> > Subject: Re: [PATCH v3 2/7] x86/jump_label: Use text_poke_early() during > early_init > > > > >> On Nov 5, 2018, at 9:49 AM, Nadav Amit <[email protected]> wrote: >> >> From: Andy Lutomirski >> Sent: November 5, 2018 at 5:22:32 PM GMT >>> To: Peter Zijlstra <[email protected]> >>> Cc: Nadav Amit <[email protected]>, Ingo Molnar <[email protected]>, >>> [email protected], [email protected], H. Peter Anvin >>> <[email protected]>, Thomas Gleixner <[email protected]>, Borislav Petkov >>> <[email protected]>, Dave Hansen <[email protected]>, Andy >>> Lutomirski <[email protected]>, Kees Cook <[email protected]>, Dave >>> Hansen <[email protected]>, Masami Hiramatsu <[email protected]> >>> Subject: Re: [PATCH v3 2/7] x86/jump_label: Use text_poke_early() during >>> early_init >>> >>> >>> >>>>> On Nov 5, 2018, at 6:09 AM, Peter Zijlstra <[email protected]> wrote: >>>>> >>>>> On Fri, Nov 02, 2018 at 04:29:41PM -0700, Nadav Amit wrote: >>>>> diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c >>>>> index aac0c1f7e354..367c1d0c20a3 100644 >>>>> --- a/arch/x86/kernel/jump_label.c >>>>> +++ b/arch/x86/kernel/jump_label.c >>>>> @@ -52,7 +52,13 @@ static void __ref __jump_label_transform(struct >>>>> jump_entry *entry, >>>>> jmp.offset = jump_entry_target(entry) - >>>>> (jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE); >>>>> >>>>> - if (early_boot_irqs_disabled) >>>>> + /* >>>>> + * As long as we are in early boot, we can use text_poke_early(), >>>>> which >>>>> + * is more efficient: the memory was still not marked as read-only >>>>> (it >>>>> + * is only marked after poking_init()). This also prevents us from >>>>> using >>>>> + * text_poke() before poking_init() is called. >>>>> + */ >>>>> + if (!early_boot_done) >>>>> poker = text_poke_early; >>>>> >>>>> if (type == JUMP_LABEL_JMP) { >>>> >>>> It took me a while to untangle init/maze^H^Hin.c... but I think this >>>> is all we need: >>>> >>>> diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c >>>> index aac0c1f7e354..ed5fe274a7d8 100644 >>>> --- a/arch/x86/kernel/jump_label.c >>>> +++ b/arch/x86/kernel/jump_label.c >>>> @@ -52,7 +52,12 @@ static void __ref __jump_label_transform(struct >>>> jump_entry *entry, >>>> jmp.offset = jump_entry_target(entry) - >>>> (jump_entry_code(entry) + JUMP_LABEL_NOP_SIZE); >>>> >>>> - if (early_boot_irqs_disabled) >>>> + /* >>>> + * As long as we're UP and not yet marked RO, we can use >>>> + * text_poke_early; SYSTEM_BOOTING guarantees both, as we switch to >>>> + * SYSTEM_SCHEDULING before going either. >>>> + */ >>>> + if (system_state == SYSTEM_BOOTING) >>>> poker = text_poke_early; >>>> >>>> if (type == JUMP_LABEL_JMP) { >>> >>> Can we move this logic into text_poke() and get rid of text_poke_early()? >> >> This will negatively affect poking of modules doing module loading, e.g., >> apply_paravirt(). This can be resolved by keeping track when the module is >> write-protected and giving a module parameter to text_poke(). Does it worth >> the complexity? > > Probably not. > > OTOH, why does alternative patching need text_poke() at all? Can’t it just > write to the text?
Good question. According to my understanding, these games of text_poke_early() are not needed, at least for modules (on Intel). Intel SDM 11.6 "SELF-MODIFYING CODE” says: "A write to a memory location in a code segment that is currently cached in the processor causes the associated cache line (or lines) to be invalidated. This check is based on the physical address of the instruction.” Then the manual talks about prefetched instructions, but the modules code is presumably not be “prefetchable” at this point. So I think it should be safe, but I guess that you reviewed Intel/AMD manuals better when you wrote sync_core(). Anyhow, there should be a function that wraps the memcpy() to keep track when someone changes the text (for potential future use). Does it make sense? Do you want me to give it a spin? Thanks, Nadav

