Re: [PATCH] powerpc/powernv/mce: Don't silently restart the machine

2018-03-08 Thread Balbir Singh
On Thu, Mar 8, 2018 at 12:20 PM, Stewart Smith wrote:
> Balbir Singh writes:
>> On MCE the current code will restart the machine with
>> ppc_md.restart(). This case was extremely unlikely since
>> prior to that a skiboot call is made and that resulted in
>> a checkstop for analysis.
>>
>> With newer skiboots, on P9 we don't checkstop the box by
>> default, instead we return back to the kernel to extract
>> useful information at the time of the MCE. While we still
>> get this information, this patch converts the restart to
>> a panic(), so that if configured a dump can be taken and
>> we can track and probably debug the potential issue causing
>> the MCE.
>
> This will likely change again, but I can send a patch that changes the
> comment (along with the logic of decoding it all and having enough
> information to make sensible decisions). But... I kind of don't want to
> bikeshed a comment to death :)
>
> I reckon the panic() here is the right thing to do no matter
> what.
>
> Reviewed-by: Stewart Smith 

Thanks!

Balbir Singh.


Re: [PATCH] powerpc/powernv/mce: Don't silently restart the machine

2018-03-07 Thread Stewart Smith
Balbir Singh writes:
> On MCE the current code will restart the machine with
> ppc_md.restart(). This case was extremely unlikely since
> prior to that a skiboot call is made and that resulted in
> a checkstop for analysis.
>
> With newer skiboots, on P9 we don't checkstop the box by
> default, instead we return back to the kernel to extract
> useful information at the time of the MCE. While we still
> get this information, this patch converts the restart to
> a panic(), so that if configured a dump can be taken and
> we can track and probably debug the potential issue causing
> the MCE.

This will likely change again, but I can send a patch that changes the
comment (along with the logic of decoding it all and having enough
information to make sensible decisions). But... I kind of don't want to
bikeshed a comment to death :)

I reckon the panic() here is the right thing to do no matter
what.

Reviewed-by: Stewart Smith 

-- 
Stewart Smith
OPAL Architect, IBM.