Re: [PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
On Fri, Jun 22, 2018 at 06:50:56PM +0200, Thomas Gleixner wrote:
> > The magic, which I cut out, will rewrite the "jmp 6f, nops" thing to
> > "jmp %l[y_{yes,no}]" at the alternative patching and we'll lose the
> > dynamic test, pinning the condition forever more.
>
> Hrm. Memory seems to have tricked me. So yes, it should work then.

Yes, but do verify that it still does. Because depending on how early you
call it, boot_cpu_data.x86_capability might not be populated properly yet
and then the dynamic case is wrong too. So check the order pls.

> Though I still prefer the two-liner fixup of the cpu_init() section
> mismatch thingy for now over the whole code move. Especially since Borislav
> and I have plans to rework that insanity completely once the speculative
> distractions are subsiding.

Hell yeah.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
Re: [PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
On Fri, 22 Jun 2018, Peter Zijlstra wrote:
> On Fri, Jun 22, 2018 at 06:16:05PM +0200, Thomas Gleixner wrote:
> > On Fri, 22 Jun 2018, Kirill A. Shutemov wrote:
> > > On Fri, Jun 22, 2018 at 03:35:18PM +, Thomas Gleixner wrote:
> > > > How is that supposed to work correctly?
> > > >
> > > > start_kernel()
> > > >
> > > >   trap_init()
> > > >     cpu_init()
> > > >
> > > >   check_bugs()
> > > >     alternative_instructions()
> > > >
> > > > So the first invocation of cpu_init() on the boot CPU will then use
> > > > static_cpu_has() which is not yet initialized properly.
> > >
> > > Ouch.
> > >
> > > Is there a way to catch such improper static_cpu_has() users?
> > > Silent misbehaviour is risky.
> >
> > Yes, it is. I don't think we have something in place right now, but we
> > should add it definitely. PeterZ?
>
> So static_cpu_has() _should_ work. That thing is mightily convoluted,
> but behold:
>
> | static __always_inline __pure bool _static_cpu_has(u16 bit)
> | {
> |	asm_volatile_goto("1: jmp 6f\n"
> |		 "2:\n"
> |		 ".skip -(((5f-4f) - (2b-1b)) > 0) * "
> |			 "((5f-4f) - (2b-1b)),0x90\n"
> |		 ".section .altinstr_aux,\"ax\"\n"
> |		 "6:\n"
> |		 " testb %[bitnum],%[cap_byte]\n"
> |		 " jnz %l[t_yes]\n"
> |		 " jmp %l[t_no]\n"
> |		 ".previous\n"
> |		 : : [feature]  "i" (bit),
> |		     [always]   "i" (X86_FEATURE_ALWAYS),
> |		     [bitnum]   "i" (1 << (bit & 7)),
> |		     [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
> |		 : : t_yes, t_no);
> | t_yes:
> |	return true;
> | t_no:
> |	return false;
> | }
>
> So by default that emits, before patching:
>
>	jmp 6f
>	'however many single byte NOPs are needed'
>
> .section .altinstr_aux
> 6:	testb %[bitnum],%[cap_byte]
>	jnz %l[t_yes]
>	jmp %l[t_no]
> .previous
>
> Which is a dynamic test for the bit in the bitmask. Which always works,
> irrespective of the alternative patching.
>
> The magic, which I cut out, will rewrite the "jmp 6f, nops" thing to
> "jmp %l[y_{yes,no}]" at the alternative patching and we'll lose the
> dynamic test, pinning the condition forever more.

Hrm. Memory seems to have tricked me. So yes, it should work then.

Though I still prefer the two-liner fixup of the cpu_init() section
mismatch thingy for now over the whole code move. Especially since Borislav
and I have plans to rework that insanity completely once the speculative
distractions are subsiding.

Thanks,

	tglx
Re: [PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
On Fri, Jun 22, 2018 at 06:16:05PM +0200, Thomas Gleixner wrote:
> On Fri, 22 Jun 2018, Kirill A. Shutemov wrote:
> > On Fri, Jun 22, 2018 at 03:35:18PM +, Thomas Gleixner wrote:
> > > How is that supposed to work correctly?
> > >
> > > start_kernel()
> > >
> > >   trap_init()
> > >     cpu_init()
> > >
> > >   check_bugs()
> > >     alternative_instructions()
> > >
> > > So the first invocation of cpu_init() on the boot CPU will then use
> > > static_cpu_has() which is not yet initialized properly.
> >
> > Ouch.
> >
> > Is there a way to catch such improper static_cpu_has() users?
> > Silent misbehaviour is risky.
>
> Yes, it is. I don't think we have something in place right now, but we
> should add it definitely. PeterZ?

So static_cpu_has() _should_ work. That thing is mightily convoluted,
but behold:

| static __always_inline __pure bool _static_cpu_has(u16 bit)
| {
|	asm_volatile_goto("1: jmp 6f\n"
|		 "2:\n"
|		 ".skip -(((5f-4f) - (2b-1b)) > 0) * "
|			 "((5f-4f) - (2b-1b)),0x90\n"
|		 ".section .altinstr_aux,\"ax\"\n"
|		 "6:\n"
|		 " testb %[bitnum],%[cap_byte]\n"
|		 " jnz %l[t_yes]\n"
|		 " jmp %l[t_no]\n"
|		 ".previous\n"
|		 : : [feature]  "i" (bit),
|		     [always]   "i" (X86_FEATURE_ALWAYS),
|		     [bitnum]   "i" (1 << (bit & 7)),
|		     [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
|		 : : t_yes, t_no);
| t_yes:
|	return true;
| t_no:
|	return false;
| }

So by default that emits, before patching:

	jmp 6f
	'however many single byte NOPs are needed'

.section .altinstr_aux
6:	testb %[bitnum],%[cap_byte]
	jnz %l[t_yes]
	jmp %l[t_no]
.previous

Which is a dynamic test for the bit in the bitmask. Which always works,
irrespective of the alternative patching.

The magic, which I cut out, will rewrite the "jmp 6f, nops" thing to
"jmp %l[y_{yes,no}]" at the alternative patching and we'll lose the
dynamic test, pinning the condition forever more.
Re: [PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
On Fri, 22 Jun 2018, Kirill A. Shutemov wrote:
> On Fri, Jun 22, 2018 at 03:35:18PM +, Thomas Gleixner wrote:
> > > > Second thoughts.
> > > >
> > > > The only place where __pgtable_l5_enabled() is used in common.c is in
> > > > early_identify_cpu() which is marked __init. So how is that section
> > > > mismatch triggered?
> > >
> > > Yeah, it's not obvious:
> > >
> > > cpu_init()
> > >   load_mm_ldt()
> > >     ldt_slot_va()
> > >       LDT_BASE_ADDR
> > >         LDT_PGD_ENTRY
> > >           pgtable_l5_enabled()
> >
> > How is that supposed to work correctly?
> >
> > start_kernel()
> >
> >   trap_init()
> >     cpu_init()
> >
> >   check_bugs()
> >     alternative_instructions()
> >
> > So the first invocation of cpu_init() on the boot CPU will then use
> > static_cpu_has() which is not yet initialized properly.
>
> Ouch.
>
> Is there a way to catch such improper static_cpu_has() users?
> Silent misbehaviour is risky.

Yes, it is. I don't think we have something in place right now, but we
should add it definitely. PeterZ?

> > So, no. That does not work and the proper fix is:
> >
> > -unsigned int __pgtable_l5_enabled __initdata;
> > +unsigned int __pgtable_l5_enabled __ro_after_init;
> >
> > and make cpu/common.c use the early variant. The extra 4 bytes of storage
> > are not a problem and cpu_init() is not a fast path at all.
>
> Okay, I'll prepare the patch.
>
> BTW, if we go this path after all, shouldn't we revert these:
>
> 046c0dbec023 ("x86: Mark native_set_p4d() as __always_inline")
> 1ea66554d3b0 ("x86/mm: Mark p4d_offset() __always_inline")

In principle the always inline is fine, but the changelogs are quite
misleading and I really regret that I did not take the time to analyse
that properly when I applied the patches. At least we have caught it now.

So yes, please send the reverts along. Can you please add a proper root
cause analysis for the issues Arnd has observed to the changelogs, so we
have it documented for later.

Thanks,

	tglx
Re: [PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
On Fri, Jun 22, 2018 at 03:35:18PM +, Thomas Gleixner wrote:
> On Fri, 22 Jun 2018, Kirill A. Shutemov wrote:
> > On Fri, Jun 22, 2018 at 02:05:47PM +, Thomas Gleixner wrote:
> > > On Tue, 12 Jun 2018, Kirill A. Shutemov wrote:
> > > >
> > > > __pgtable_l5_enabled shouldn't be needed after the system has booted;
> > > > we can mark it as __initdata, but that requires preparation.
> > > >
> > > > This patch moves early cpu initialization into a separate translation
> > > > unit. This limits the effect of USE_EARLY_PGTABLE_L5 to less code.
> > > >
> > > > Without the change cpu_init() uses __pgtable_l5_enabled. cpu_init() is
> > > > not an __init function, which leads to a section mismatch.
> > > >
> > > > Signed-off-by: Kirill A. Shutemov
> > > > Reviewed-by: Thomas Gleixner
> > >
> > > Second thoughts.
> > >
> > > The only place where __pgtable_l5_enabled() is used in common.c is in
> > > early_identify_cpu() which is marked __init. So how is that section
> > > mismatch triggered?
> >
> > Yeah, it's not obvious:
> >
> > cpu_init()
> >   load_mm_ldt()
> >     ldt_slot_va()
> >       LDT_BASE_ADDR
> >         LDT_PGD_ENTRY
> >           pgtable_l5_enabled()
>
> How is that supposed to work correctly?
>
> start_kernel()
>
>   trap_init()
>     cpu_init()
>
>   check_bugs()
>     alternative_instructions()
>
> So the first invocation of cpu_init() on the boot CPU will then use
> static_cpu_has() which is not yet initialized properly.

Ouch.

Is there a way to catch such improper static_cpu_has() users?
Silent misbehaviour is risky.

> So, no. That does not work and the proper fix is:
>
> -unsigned int __pgtable_l5_enabled __initdata;
> +unsigned int __pgtable_l5_enabled __ro_after_init;
>
> and make cpu/common.c use the early variant. The extra 4 bytes of storage
> are not a problem and cpu_init() is not a fast path at all.

Okay, I'll prepare the patch.

BTW, if we go this path after all, shouldn't we revert these:

046c0dbec023 ("x86: Mark native_set_p4d() as __always_inline")
1ea66554d3b0 ("x86/mm: Mark p4d_offset() __always_inline")

?

I can send it as part of the patchset.

-- 
 Kirill A. Shutemov
Re: [PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
On Fri, 22 Jun 2018, Kirill A. Shutemov wrote:
> On Fri, Jun 22, 2018 at 02:05:47PM +, Thomas Gleixner wrote:
> > On Tue, 12 Jun 2018, Kirill A. Shutemov wrote:
> > >
> > > __pgtable_l5_enabled shouldn't be needed after the system has booted;
> > > we can mark it as __initdata, but that requires preparation.
> > >
> > > This patch moves early cpu initialization into a separate translation
> > > unit. This limits the effect of USE_EARLY_PGTABLE_L5 to less code.
> > >
> > > Without the change cpu_init() uses __pgtable_l5_enabled. cpu_init() is
> > > not an __init function, which leads to a section mismatch.
> > >
> > > Signed-off-by: Kirill A. Shutemov
> > > Reviewed-by: Thomas Gleixner
> >
> > Second thoughts.
> >
> > The only place where __pgtable_l5_enabled() is used in common.c is in
> > early_identify_cpu() which is marked __init. So how is that section
> > mismatch triggered?
>
> Yeah, it's not obvious:
>
> cpu_init()
>   load_mm_ldt()
>     ldt_slot_va()
>       LDT_BASE_ADDR
>         LDT_PGD_ENTRY
>           pgtable_l5_enabled()

How is that supposed to work correctly?

start_kernel()

  trap_init()
    cpu_init()

  check_bugs()
    alternative_instructions()

So the first invocation of cpu_init() on the boot CPU will then use
static_cpu_has() which is not yet initialized properly.

So, no. That does not work and the proper fix is:

-unsigned int __pgtable_l5_enabled __initdata;
+unsigned int __pgtable_l5_enabled __ro_after_init;

and make cpu/common.c use the early variant. The extra 4 bytes of storage
are not a problem and cpu_init() is not a fast path at all.

Thanks,

	tglx
Re: [PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
On Fri, Jun 22, 2018 at 02:05:47PM +, Thomas Gleixner wrote:
> On Tue, 12 Jun 2018, Kirill A. Shutemov wrote:
> >
> > __pgtable_l5_enabled shouldn't be needed after the system has booted;
> > we can mark it as __initdata, but that requires preparation.
> >
> > This patch moves early cpu initialization into a separate translation
> > unit. This limits the effect of USE_EARLY_PGTABLE_L5 to less code.
> >
> > Without the change cpu_init() uses __pgtable_l5_enabled. cpu_init() is
> > not an __init function, which leads to a section mismatch.
> >
> > Signed-off-by: Kirill A. Shutemov
> > Reviewed-by: Thomas Gleixner
>
> Second thoughts.
>
> The only place where __pgtable_l5_enabled() is used in common.c is in
> early_identify_cpu() which is marked __init. So how is that section
> mismatch triggered?

Yeah, it's not obvious:

cpu_init()
  load_mm_ldt()
    ldt_slot_va()
      LDT_BASE_ADDR
        LDT_PGD_ENTRY
          pgtable_l5_enabled()

-- 
 Kirill A. Shutemov
Re: [PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
On Tue, 12 Jun 2018, Kirill A. Shutemov wrote:
> __pgtable_l5_enabled shouldn't be needed after the system has booted;
> we can mark it as __initdata, but that requires preparation.
>
> This patch moves early cpu initialization into a separate translation
> unit. This limits the effect of USE_EARLY_PGTABLE_L5 to less code.
>
> Without the change cpu_init() uses __pgtable_l5_enabled. cpu_init() is
> not an __init function, which leads to a section mismatch.
>
> Signed-off-by: Kirill A. Shutemov
> Reviewed-by: Thomas Gleixner

Second thoughts.

The only place where __pgtable_l5_enabled() is used in common.c is in
early_identify_cpu() which is marked __init. So how is that section
mismatch triggered?

Thanks,

	tglx
[PATCH REBASED RESEND] x86/cpu: Move early cpu initialization into a separate translation unit
__pgtable_l5_enabled shouldn't be needed after the system has booted;
we can mark it as __initdata, but that requires preparation.

This patch moves early cpu initialization into a separate translation
unit. This limits the effect of USE_EARLY_PGTABLE_L5 to less code.

Without the change cpu_init() uses __pgtable_l5_enabled. cpu_init() is
not an __init function, which leads to a section mismatch.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Thomas Gleixner
---

Ingo,

The patch was missed when the rest of the 'no5lvl' patchset was applied.
Here is a rebased version of the patch. Please apply.

---
 arch/x86/kernel/cpu/Makefile |   1 +
 arch/x86/kernel/cpu/common.c | 216 ---
 arch/x86/kernel/cpu/cpu.h    |   7 ++
 arch/x86/kernel/cpu/early.c  | 184 +
 4 files changed, 214 insertions(+), 194 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/early.c

diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 7a40196967cb..b1da5a7c145c 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -19,6 +19,7 @@ CFLAGS_common.o := $(nostackp)
 obj-y := cacheinfo.o scattered.o topology.o
 obj-y += common.o
+obj-y += early.o
 obj-y += rdrand.o
 obj-y += match.o
 obj-y += bugs.o
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 910b47ee8078..dc7f4e283979 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -47,7 +47,6 @@
 #include
 #include
 #include
-#include
 #include

 #ifdef CONFIG_X86_LOCAL_APIC
@@ -105,7 +104,7 @@ static const struct cpu_dev default_cpu = {
 	.c_x86_vendor	= X86_VENDOR_UNKNOWN,
 };

-static const struct cpu_dev *this_cpu = &default_cpu;
+const struct cpu_dev *this_cpu_dev = &default_cpu;

 DEFINE_PER_CPU_PAGE_ALIGNED(struct gdt_page, gdt_page) = { .gdt = {
 #ifdef CONFIG_X86_64
@@ -426,7 +425,7 @@ cpuid_dependent_features[] = {
 	{ 0, 0 }
 };

-static void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
+void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
 {
 	const struct cpuid_dependent_feature *df;

@@ -471,10 +470,10 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
 	if (c->x86_model >= 16)
 		return NULL;	/* Range check */

-	if (!this_cpu)
+	if (!this_cpu_dev)
 		return NULL;

-	info = this_cpu->legacy_models;
+	info = this_cpu_dev->legacy_models;

 	while (info->family) {
 		if (info->family == c->x86)
@@ -551,7 +550,7 @@ void switch_to_new_gdt(int cpu)
 	load_percpu_segment(cpu);
 }

-static const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};
+const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};

 static void get_model_name(struct cpuinfo_x86 *c)
 {
@@ -622,8 +621,8 @@ void cpu_detect_cache_sizes(struct cpuinfo_x86 *c)
 	c->x86_tlbsize += ((ebx >> 16) & 0xfff) + (ebx & 0xfff);
 #else
 	/* do processor-specific cache resizing */
-	if (this_cpu->legacy_cache_size)
-		l2size = this_cpu->legacy_cache_size(c, l2size);
+	if (this_cpu_dev->legacy_cache_size)
+		l2size = this_cpu_dev->legacy_cache_size(c, l2size);

 	/* Allow user to override all this if necessary. */
 	if (cachesize_override != -1)
@@ -646,8 +645,8 @@ u16 __read_mostly tlb_lld_1g[NR_INFO];

 static void cpu_detect_tlb(struct cpuinfo_x86 *c)
 {
-	if (this_cpu->c_detect_tlb)
-		this_cpu->c_detect_tlb(c);
+	if (this_cpu_dev->c_detect_tlb)
+		this_cpu_dev->c_detect_tlb(c);

 	pr_info("Last level iTLB entries: 4KB %d, 2MB %d, 4MB %d\n",
 		tlb_lli_4k[ENTRIES], tlb_lli_2m[ENTRIES],
@@ -709,7 +708,7 @@ void detect_ht(struct cpuinfo_x86 *c)
 #endif
 }

-static void get_cpu_vendor(struct cpuinfo_x86 *c)
+void get_cpu_vendor(struct cpuinfo_x86 *c)
 {
 	char *v = c->x86_vendor_id;
 	int i;
@@ -722,8 +721,8 @@ static void get_cpu_vendor(struct cpuinfo_x86 *c)
 		    (cpu_devs[i]->c_ident[1] &&
 		     !strcmp(v, cpu_devs[i]->c_ident[1]))) {

-			this_cpu = cpu_devs[i];
-			c->x86_vendor = this_cpu->c_x86_vendor;
+			this_cpu_dev = cpu_devs[i];
+			c->x86_vendor = this_cpu_dev->c_x86_vendor;
 			return;
 		}
 	}
@@ -732,7 +731,7 @@ static void get_cpu_vendor(struct cpuinfo_x86 *c)
 		"CPU: Your system may be unstable.\n", v);

 	c->x86_vendor = X86_VENDOR_UNKNOWN;
-	this_cpu = &default_cpu;
+	this_cpu_dev = &default_cpu;
 }

 void cpu_detect(struct cpuinfo_x86 *c)
@@ -908,7 +907,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 	apply_forced_caps(c);
 }

-static void get_cpu_address_sizes(struct cpuinfo_x86 *c)
+void get_cpu_address_sizes(struct