On Wed, Feb 28, 2018 at 8:49 PM, Michael Ellerman <m...@ellerman.id.au> wrote:
> Balbir Singh <bsinghar...@gmail.com> writes:
>
>> On MCE the current code will restart the machine with
>> ppc_md.restart(). This case was extremely unlikely since
>> prior to that a skiboot call is made and that resulted in
>> a checkstop for analysis.
>>
>> With newer skiboots, on P9 we don't checkstop the box by
>> default, instead we return back to the kernel to extract
>> useful information at the time of the MCE. While we still
>> get this information, this patch converts the restart to
>> a panic(), so that if configured a dump can be taken and
>> we can track and probably debug the potential issue causing
>> the MCE.
>>
>> Signed-off-by: Balbir Singh <bsinghar...@gmail.com>
>> Reviewed-by: Nicholas Piggin <npig...@gmail.com>
>> ---
>>  arch/powerpc/platforms/powernv/opal.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
>> index 69b5263fc9e3..b510a6f41b00 100644
>> --- a/arch/powerpc/platforms/powernv/opal.c
>> +++ b/arch/powerpc/platforms/powernv/opal.c
>> @@ -500,9 +500,12 @@ void pnv_platform_error_reboot(struct pt_regs *regs, const char *msg)
>                                                                            ^^^^^^^^^^^^^^^
> Why don't we use the msg ..
>
>>        *    opal to trigger checkstop explicitly for error analysis.
>>        *    The FSP PRD component would have already got notified
>>        *    about this error through other channels.
>> +      * 4. We are running on a newer skiboot that by default does
>> +      *    not cause a checkstop, drops us back to the kernel to
>> +      *    extract context and state at the time of the error.
>>        */
>>
>> -     ppc_md.restart(NULL);
>> +     panic("PowerNV Unrecovered Machine Check");
>               ^
>               Here.
>
> Because we can get here from a HMI so it's confusing to print "Machine
> Check" in that case, and we have the msg already.
>
> So just:
>
>> +     panic(msg);
>

My bad. We used to have two of these, one in opal and one in opal-hmi,
and the diff from the previous change showed this message. Resending.
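
For illustration only, here is a minimal userspace sketch of the change
suggested above: the caller's msg is forwarded to panic() instead of a
hard-coded "Machine Check" string, so an HMI caller reports its own text.
mock_panic(), mock_platform_error_reboot() and last_panic are hypothetical
stand-ins, not kernel code, and the "%s" format is a defensive assumption
(the kernel's panic() takes a printf-style format), not what the review
proposed, which was plain panic(msg).

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's panic(): it only formats the
 * message into a buffer so the result can be inspected. */
static char last_panic[128];

static void mock_panic(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(last_panic, sizeof(last_panic), fmt, args);
	va_end(args);
}

/* Simplified stand-in for pnv_platform_error_reboot(); the regs argument
 * is omitted. The caller-supplied msg is passed through unchanged. */
static void mock_platform_error_reboot(const char *msg)
{
	/* "%s" keeps any '%' inside msg from being read as a conversion. */
	mock_panic("%s", msg);
}
```

Calling mock_platform_error_reboot() with whatever string the machine
check or HMI path supplies leaves exactly that string in last_panic.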

Thanks,
Balbir
