Thanks, both machines report:

# cat /sys/module/ipmi_msghandler/parameters/panic_op
string
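
For anyone following along, here is a minimal sketch for collecting the two settings discussed in this thread in one go. `show_setting` is a hypothetical helper name, not part of any tool. It assumes the standard sysfs/procfs paths; `panic_op` accepts `none`, `event`, or `string`, and a `kernel.panic` value of 0 means the kernel will not reboot on its own after a panic.

```shell
#!/bin/sh
# Print one setting, or "unavailable" if the file cannot be read
# (for example when the ipmi_msghandler module is not loaded).
show_setting() {
    label=$1
    path=$2
    if [ -r "$path" ]; then
        printf '%s: %s\n' "$label" "$(cat "$path")"
    else
        printf '%s: unavailable\n' "$label"
    fi
}

# How the IPMI driver logs a panic to the SEL: none, event, or string.
show_setting panic_op /sys/module/ipmi_msghandler/parameters/panic_op
# Seconds the kernel waits before rebooting after a panic (0 = no reboot).
show_setting kernel.panic /proc/sys/kernel/panic
```

Comparing this output between the old and the new setup might show whether the reboot-on-panic timeout differs, independent of the IPMI watchdog.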


> On 28. Feb 2023, at 18:04, Corey Minyard <miny...@acm.org> wrote:
> 
> Oh, I forgot.  You can look at panic_op in 
> /sys/module/ipmi_msghandler/parameters/panic_op
> 
> -corey
> 
> On Tue, Feb 28, 2023 at 05:48:07PM +0100, Christian Theune via 
> Openipmi-developer wrote:
>> Hi,
>> 
>>> On 28. Feb 2023, at 17:36, Corey Minyard <miny...@acm.org> wrote:
>>> 
>>> On Tue, Feb 28, 2023 at 02:53:12PM +0100, Christian Theune via 
>>> Openipmi-developer wrote:
>>>> Hi,
>>>> 
>>>> I’ve been trying to debug the PANIC and OEM string handling and am running 
>>>> out of ideas as to whether this is a bug or whether something so subtle has 
>>>> changed in my config that I’m just not seeing it.
>>>> 
>>>> (Note: I’m willing to pay for consulting.)
>>> 
>>> Probably not necessary.
>> 
>> Thanks! The offer always stands. If we should ever meet I’m also able to pay 
>> in beverages. ;)
>> 
>>>> I have machines that we’ve moved from an older setup (Gentoo, (mostly) 
>>>> vanilla kernel 4.19.157) to a newer setup (NixOS, (mostly) vanilla kernel 
>>>> 5.10.159) and I’m now experiencing crashes that seem to be kernel panics 
>>>> but do not get the usual messages in the IPMI SEL.
>>> 
>>> I just tested on stock 5.10.159 and it worked without issue.  Everything
>>> you have below looks ok.
>>> 
>>> Can you test by causing a crash with:
>>> 
>>> echo c >/proc/sysrq-trigger
>>> 
>>> and see if it works?
>> 
>> Yeah, already tried that and unfortunately that _doesn’t_ work.
>> 
>>> It sounds like you are having some type of crash that you would normally
>>> use the IPMI logs to debug.  However, they aren't perfect, the system
>>> has to stay up long enough to get them into the event log.
>> 
>> I think they are staying up long enough, because a panic triggers the 255 
>> second bump in the watchdog and only then do they move on. However, I’ve also 
>> noticed that the kernel _should_ be rebooting after a panic much faster (and 
>> not rely on the watchdog) and that doesn’t happen either. (Sorry, this just 
>> popped up from the back of my head.)
>> 
>>> In this situation, getting a serial console (probably through IPMI
>>> Serial over LAN) and getting the console output on a crash is probably
>>> your best option.  You can use ipmitool for this, or I have a library
>>> that is able to make connections to serial ports, including through IPMI
>>> SoL.
>> 
>> Yup. Been there, too. :)
>> 
>> Unfortunately we’re currently chasing something that pops up very randomly 
>> on somewhat odd machines and I also have the feeling that it’s 
>> systematically broken right now (as the “echo c” doesn’t work).
>> 
>> Thanks a lot,
>> Christian
>> 
>> -- 
>> Christian Theune · c...@flyingcircus.io · +49 345 219401 0
>> Flying Circus Internet Operations GmbH · https://flyingcircus.io
>> Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
>> HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian 
>> Zagrodnick
>> 
>> 
>> 
>> _______________________________________________
>> Openipmi-developer mailing list
>> Openipmi-developer@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/openipmi-developer

Kind regards,
Christian Theune

