On Fri, 22 Jun 2018, Peter Zijlstra wrote: > On Fri, Jun 22, 2018 at 06:16:05PM +0200, Thomas Gleixner wrote: > > On Fri, 22 Jun 2018, Kirill A. Shutemov wrote: > > > On Fri, Jun 22, 2018 at 03:35:18PM +0000, Thomas Gleixner wrote: > > > > > How is that supposed to work correctly? > > > > > > > > start_kernel() > > > > .... > > > > trap_init() > > > > cpu_init() > > > > > > > > .... > > > > check_bugs() > > > > alternative_instructions() > > > > > > > > So the first invocation of cpu_init() on the boot CPU will then use > > > > static_cpu_has() which is not yet initialized proper. > > > > > > Ouch. > > > > > > Is there a way to catch such improper static_cpu_has() users? > > > Silent misbehaviour is risky. > > > > Yes, it is. I don't think we have something in place right now, but we > > should add it definitely. PeterZ ???? > > So static_cpu_has() _should_ work. That thing is mightily convoluted, > but behold: > > | static __always_inline __pure bool _static_cpu_has(u16 bit) > | { > | asm_volatile_goto("1: jmp 6f\n" > | "2:\n" > | ".skip -(((5f-4f) - (2b-1b)) > 0) * " > | "((5f-4f) - (2b-1b)),0x90\n" > > <snip magic shite> > > | ".section .altinstr_aux,\"ax\"\n" > | "6:\n" > | " testb %[bitnum],%[cap_byte]\n" > | " jnz %l[t_yes]\n" > | " jmp %l[t_no]\n" > | ".previous\n" > | : : [feature] "i" (bit), > | [always] "i" (X86_FEATURE_ALWAYS), > | [bitnum] "i" (1 << (bit & 7)), > | [cap_byte] "m" (((const char > *)boot_cpu_data.x86_capability)[bit >> 3]) > | : : t_yes, t_no); > | t_yes: > | return true; > | t_no: > | return false; > | } > > So by default that emits, before patching: > > jmp 6f > 'however many single byte NOPs are needed' > > .section.altinstr_aux > 6: testb %[bitnum],%[cap_byte] > jnz %l[t_yes] > jmp %l[t_no] > .previous > > Which is a dynamic test for the bit in the bitmask. Which always works, > irrespective of the alternative patching. > > The magic, which I cut out, will rewrite the "jmp 6f, nops" thing to > "jmp %l[y_{yes,no}]" at the alternative patching and we'll loose the > dynamic test, pinning the condition forever more.
Hrm. Memory seems have to tricked me. So yes, it should work then. Though I still prefer the two liners fixup of the cpu_init() section mismatch thingy for now over the whole code move. Especially since Borislav and I have plans to rework that insanity completely once the speculative distractions are subsiding. Thanks, tglx