On Wed, Mar 01, 2023 at 06:00:07PM +0100, Christian Theune wrote:
> I’m going to actually attach a serial console to watch the “echo c” panic,
> maybe that gives _some_ indication.
>
> Otherwise: I can quickly run patches on the kernel there to try out things.
> (And the funding offer still stands.)

Any news on this? I'm curious what this could be.

-corey

>
> Christian
>
> > On 1. Mar 2023, at 17:58, Corey Minyard <miny...@acm.org> wrote:
> >
> > On Tue, Feb 28, 2023 at 06:36:17PM +0100, Christian Theune wrote:
> >> Thanks, both machines report:
> >>
> >> # cat /sys/module/ipmi_msghandler/parameters/panic_op
> >> string
> >
> > At this point, I have no idea. I'd have to start adding printks into
> > the code and cause crashes to see what is happening.
> >
> > Maybe something is getting in the way of the panic notifiers and doing
> > something to prevent the IPMI driver from working.
> >
> > -corey
> >
> >>
> >>
> >>> On 28. Feb 2023, at 18:04, Corey Minyard <miny...@acm.org> wrote:
> >>>
> >>> Oh, I forgot. You can look at panic_op in
> >>> /sys/module/ipmi_msghandler/parameters/panic_op
> >>>
> >>> -corey
> >>>
> >>> On Tue, Feb 28, 2023 at 05:48:07PM +0100, Christian Theune via
> >>> Openipmi-developer wrote:
> >>>> Hi,
> >>>>
> >>>>> On 28. Feb 2023, at 17:36, Corey Minyard <miny...@acm.org> wrote:
> >>>>>
> >>>>> On Tue, Feb 28, 2023 at 02:53:12PM +0100, Christian Theune via
> >>>>> Openipmi-developer wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I’ve been trying to debug the PANIC and OEM string handling and am
> >>>>>> running out of ideas whether this is a bug or whether something so
> >>>>>> subtle has changed in my config that I’m just not seeing it.
> >>>>>>
> >>>>>> (Note: I’m willing to pay for consulting.)
> >>>>>
> >>>>> Probably not necessary.
> >>>>
> >>>> Thanks! The offer always stands. If we should ever meet I’m also able to
> >>>> pay in beverages. ;)
> >>>>
> >>>>>> I have machines that we’ve moved from an older setup (Gentoo, (mostly)
> >>>>>> vanilla kernel 4.19.157) to a newer setup (NixOS, (mostly) vanilla
> >>>>>> kernel 5.10.159) and I’m now experiencing crashes that seem to be
> >>>>>> kernel panics but do not get the usual messages in the IPMI SEL.
> >>>>>
> >>>>> I just tested on stock 5.10.159 and it worked without issue. Everything
> >>>>> you have below looks ok.
> >>>>>
> >>>>> Can you test by causing a crash with:
> >>>>>
> >>>>> echo c >/proc/sysrq-trigger
> >>>>>
> >>>>> and see if it works?
> >>>>
> >>>> Yeah, already tried that and unfortunately that _doesn’t_ work.
> >>>>
> >>>>> It sounds like you are having some type of crash that you would normally
> >>>>> use the IPMI logs to debug. However, they aren't perfect; the system
> >>>>> has to stay up long enough to get them into the event log.
> >>>>
> >>>> I think they are staying up long enough because a panic triggers the 255
> >>>> second bump in the watchdog and only then do they pass on. However, I’ve
> >>>> also noticed that the kernel _should_ be rebooting after a panic much
> >>>> faster (and not rely on the watchdog) and that doesn’t happen either.
> >>>> (Sorry, this just popped from the back of my head.)
> >>>>
> >>>>> In this situation, getting a serial console (probably through IPMI
> >>>>> Serial over LAN) and getting the console output on a crash is probably
> >>>>> your best option. You can use ipmitool for this, or I have a library
> >>>>> that is able to make connections to serial ports, including through IPMI
> >>>>> SoL.
> >>>>
> >>>> Yup. Been there, too. :)
> >>>>
> >>>> Unfortunately we’re currently chasing something that pops up very
> >>>> randomly on somewhat odd machines and I also have the feeling that it’s
> >>>> systematically broken right now (as the “echo c” doesn’t work).
> >>>>
> >>>> Thanks a lot,
> >>>> Christian
> >>>>
> >>>> --
> >>>> Christian Theune · c...@flyingcircus.io · +49 345 219401 0
> >>>> Flying Circus Internet Operations GmbH · https://flyingcircus.io
> >>>> Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
> >>>> HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian
> >>>> Zagrodnick
> >>>>
> >>>> _______________________________________________
> >>>> Openipmi-developer mailing list
> >>>> Openipmi-developer@lists.sourceforge.net
> >>>> https://lists.sourceforge.net/lists/listinfo/openipmi-developer
> >>
> >> Kind regards,
> >> Christian Theune
> >>
> >
> Kind regards,
> Christian Theune
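
Two of the settings discussed in this thread can be read without crashing anything: the panic_op parameter quoted above and the kernel's panic reboot timeout. The following is a minimal, read-only sketch in Python that collects both so they can be compared between machines. The path /proc/sys/kernel/panic (the kernel.panic sysctl) and the helper names read_setting and summarize are not from the thread; they are assumptions added for illustration. The script does not explain why the SEL entries are missing, it only records the current configuration.

```python
from pathlib import Path

# From the thread: where the IPMI message handler exposes its panic action.
PANIC_OP = "/sys/module/ipmi_msghandler/parameters/panic_op"
# Assumption (not in the thread): the kernel.panic sysctl, i.e. the number
# of seconds the kernel waits before rebooting after a panic.
PANIC_TIMEOUT = "/proc/sys/kernel/panic"


def read_setting(path):
    """Return the stripped contents of path, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def summarize(panic_op, panic_timeout):
    """Describe the two raw values as a list of plain-text findings."""
    findings = []

    if panic_op is None:
        findings.append("panic_op unreadable (is ipmi_msghandler loaded?)")
    elif panic_op == "none":
        findings.append("panic_op=none: the driver sends nothing on panic")
    else:
        findings.append(f"panic_op={panic_op}: panic reporting is enabled")

    try:
        timeout = int(panic_timeout)
    except (TypeError, ValueError):
        findings.append("kernel.panic unreadable")
    else:
        if timeout == 0:
            findings.append("kernel.panic=0: no automatic reboot after a panic")
        elif timeout > 0:
            findings.append(f"kernel.panic={timeout}: reboot {timeout}s after a panic")
        else:
            findings.append(f"kernel.panic={timeout}: immediate reboot on panic")
    return findings


if __name__ == "__main__":
    # Read-only: this never writes to /proc/sysrq-trigger or any other file.
    for line in summarize(read_setting(PANIC_OP), read_setting(PANIC_TIMEOUT)):
        print(line)
```

On the machines in this thread the first finding would start with panic_op=string, matching the reported output. A kernel.panic value of 0 would be consistent with the machines waiting for the watchdog instead of rebooting on their own, but the thread does not report that value, so treat this as something to check, not as a diagnosis.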