On Fri, Sep 16, 2016 at 08:28:44PM +0000, Luck, Tony wrote:
> > For UE recovery support, currently we need mce=2 on the command line
> > and also disable panic_on_oops with sysctl.
> Please explain. I've never given mce=2 on command line, and have
> had my kernel recover from thousands of (injected) UE memory errors.
So frankly, that panic_on_oops doesn't make a whole lotta sense to me.
It is promoting MCEs with severity MCE_UC_SEVERITY and higher to a
So let's look at those:
  MCE_UC_SEVERITY    - we don't do anything special in the kernel for
                       those so just as well.
  MCE_AR_SEVERITY    - those end up in the memory failure code if
                       they're memory errors
  MCE_PANIC_SEVERITY - causes panic
so if anything, panic_on_oops shouldn't control the panicking behavior
as tolerant does that already:
* Tolerant levels:
* 0: always panic on uncorrected errors, log corrected errors
* 1: panic or SIGBUS on uncorrected errors, log corrected errors
* 2: SIGBUS or log uncorrected errors (if possible), log corr. errors
* 3: never panic or SIGBUS, log all errors (for testing only)
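
To make that concrete, here is a rough userspace sketch that only transcribes the comment above into a table of allowed actions for an uncorrected error. The names (ACT_*, uc_actions(), uc_may_panic()) are made up for illustration and are not kernel identifiers, and how the kernel actually picks between the allowed actions for a given error is not modelled:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical action flags, invented for this sketch (not kernel names). */
enum {
	ACT_LOG    = 1 << 0,
	ACT_SIGBUS = 1 << 1,
	ACT_PANIC  = 1 << 2,
};

/*
 * Actions the "Tolerant levels" comment allows for an *uncorrected* error.
 * This is a plain transcription of the comment, nothing more.
 */
static unsigned int uc_actions(int tolerant)
{
	assert(tolerant >= 0 && tolerant <= 3);

	switch (tolerant) {
	case 0:  return ACT_PANIC;              /* always panic */
	case 1:  return ACT_PANIC | ACT_SIGBUS; /* panic or SIGBUS */
	case 2:  return ACT_SIGBUS | ACT_LOG;   /* SIGBUS or log */
	default: return ACT_LOG;                /* 3: log only */
	}
}

/* True if the given tolerant level still permits a panic on a UC error. */
static bool uc_may_panic(int tolerant)
{
	return (uc_actions(tolerant) & ACT_PANIC) != 0;
}
```

Read that way, uc_may_panic() is false for tolerant levels 2 and 3, so the panic decision is already covered without panic_on_oops being involved.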
IOW, I think that patch makes sense, but please double-check my logic.
ECO tip #101: Trim your mails when you reply.