On Thu, Nov 03, 2016 at 03:50:18PM +0100, Sebastian Andrzej Siewior wrote:
> Part of the init (memory allocation and so on) is done
> in mcheck_cpu_init(). While moving the allocation to
> mcheck_init_device() (where the hotplug calls are initialized) it
> becomes necessary to move the callback (mcheck_cpu_init()), too.
>
> The callback is now removed from identify_cpu() and registered as a
> hotplug event which is invoked as the very first one which is shortly
> after the original point of invocation (look at smp_store_cpu_info() and
> notify_cpu_starting() in smp_callin()).
> One "visible" difference is that MCE for the boot CPU is not enabled at
> identify_boot_cpu() time but at device_initcall_sync() time. Either way,
> both times we had no userland around.
Uh, hm, I'm not sure about this: so the issue I see with this is that the
more we're delaying the enabling of MCE reporting - and especially
setting CR4[MCE] - the more we're increasing the window where an MCE
during early boot will cause a shutdown. (This is what happens if
CR4[MCE]=0b).

Perhaps we should split the init into a very early init which doesn't
need to be part of hotplug and the rest, which can do mce_disable_cpu()
and mce_reenable_cpu().

Tony, how do you see this?

> Cc: Tony Luck <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> Signed-off-by: Thomas Gleixner <[email protected]>
> ---

...

> @@ -2584,11 +2580,26 @@ static __init int mcheck_init_device(void)
>  		goto err_out;
>  	}
>
> +	err = __mcheck_cpu_mce_banks_init();
	^^^^^^^^

I guess you can merge this one...

> +	if (err)
> +		goto err_out_mem;
> +
>  	mce_init_banks();
	^^^^^^^^

into this one now.

But let's sort out the bigger issue first.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

