Thanks for fixing it!

> On 15. Mar 2023, at 13:20, Corey Minyard <miny...@acm.org> wrote:
>
> On Wed, Mar 15, 2023 at 01:12:05PM +0100, Christian Theune via
> Openipmi-developer wrote:
>> Ah, fantastic! That explains it of course … :)
>>
>> From my side I guess this works and I don’t have to retry with that, but I’d
>> be happy to just wait for 5.10.175 … or would you prefer me explicitly
>> testing your original?
>
> We can just wait. The problem is obvious now, and the backports are in
> progress.
>
> Thanks for helping me with this.
>
> -corey
>
>>
>> Christian
>>
>>> On 15. Mar 2023, at 13:07, Corey Minyard <miny...@acm.org> wrote:
>>>
>>> On Wed, Mar 15, 2023 at 07:32:41AM +0100, Christian Theune via
>>> Openipmi-developer wrote:
>>>> Hi,
>>>>
>>>> that didn’t apply on 5.10. Here’s what I’m currently trying to build after
>>>> manually inspecting the rejected patch:
>>>>
>>>
>>> Well, I guess I should have sent the prerequisite patch, too. Here it
>>> is:
>>>
>>> a01a89b1db ("ipmi/watchdog: replace atomic_add() and atomic_sub()")
>>>
>>> Also attached.
>>>
>>> -corey
>>>
>>>>
>>>>
>>>>> On 14. Mar 2023, at 18:29, Corey Minyard <miny...@acm.org> wrote:
>>>>>
>>>>> Well, dang, I had already fixed this a year and a half ago. I wish I
>>>>> had a better memory.
>>>>>
>>>>> Anyway, the fix is commit db05ddf7f321634c5659a0cf7ea56594e22365f7
>>>>> ("ipmi:watchdog: Set panic count to proper value on a panic") in
>>>>> mainstream 5.16. I'm attaching that patch.
>>>>>
>>>>> -corey
>>>>>
>>>>> On Tue, Mar 14, 2023 at 03:58:26PM +0100, Christian Theune via
>>>>> Openipmi-developer wrote:
>>>>>> Awesome!
>>>>>>
>>>>>>> On 14. Mar 2023, at 15:54, Corey Minyard <miny...@acm.org> wrote:
>>>>>>>
>>>>>>> On Tue, Mar 14, 2023 at 03:22:39PM +0100, Christian Theune via
>>>>>>> Openipmi-developer wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> sorry, I didn’t expect you to make me a branch.
>>>>>>>> I had already taken
>>>>>>>> your diff over to 5.10 as it applied cleanly … sorry for the
>>>>>>>> additional work and thanks anyways.
>>>>>>>
>>>>>>> Ok, that's great. It's something about the IPMI watchdog panic
>>>>>>> routines, and I can reproduce. I should be able to fix this pretty
>>>>>>> quickly. I'll send a patch when I get this fixed.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> -corey
>>>>>>>
>>>>>>>>
>>>>>>>> Here’s the output:
>>>>>>>>
>>>>>>>> [ 6521.905890] sysrq: Trigger a crash
>>>>>>>> [ 6521.909294] Kernel panic - not syncing: sysrq triggered crash
>>>>>>>> [ 6521.915026] CPU: 1 PID: 43785 Comm: bash Tainted: G I 5.10.159 #1-NixOS
>>>>>>>> [ 6521.922925] Hardware name: Dell Inc. PowerEdge R510/00HDP0, BIOS 1.11.0 07/23/2012
>>>>>>>> [ 6521.930475] Call Trace:
>>>>>>>> [ 6521.932923] dump_stack+0x6b/0x83
>>>>>>>> [ 6521.936230] panic+0x101/0x2c8
>>>>>>>> [ 6521.939276] ? printk+0x58/0x73
>>>>>>>> [ 6521.942408] sysrq_handle_crash+0x16/0x20
>>>>>>>> [ 6521.946407] __handle_sysrq.cold+0x43/0x11a
>>>>>>>> [ 6521.950580] write_sysrq_trigger+0x24/0x40
>>>>>>>> [ 6521.954668] proc_reg_write+0x51/0x90
>>>>>>>> [ 6521.958322] vfs_write+0xc3/0x280
>>>>>>>> [ 6521.961627] ksys_write+0x5f/0xe0
>>>>>>>> [ 6521.964935] do_syscall_64+0x33/0x40
>>>>>>>> [ 6521.968502] entry_SYSCALL_64_after_hwframe+0x61/0xc6
>>>>>>>> [ 6521.973540] RIP: 0033:0x7f2c6b91a133
>>>>>>>> [ 6521.977106] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 41 54 49 89 d4 55 48 89 f5
>>>>>>>> [ 6521.995836] RSP: 002b:00007ffc4cf11088 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>>>>>>>> [ 6522.003387] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f2c6b91a133
>>>>>>>> [ 6522.010505] RDX: 0000000000000002 RSI: 0000000001555c08 RDI: 0000000000000001
>>>>>>>> [ 6522.017623] RBP: 0000000001555c08 R08:
>>>>>>>> 000000000000000a R09: 00007f2c6b9aaf40
>>>>>>>> [ 6522.024743] R10: 00000000016e4218 R11: 0000000000000246 R12: 0000000000000002
>>>>>>>> [ 6522.031864] R13: 00007f2c6b9e8520 R14: 00007f2c6b9e8720 R15: 0000000000000002
>>>>>>>> [ 6522.039085] Calling notifier panic_event+0x0/0x410 [ipmi_msghandler] (000000008eb8cb44)
>>>>>>>> [ 6522.047071] IPMI message handler: IPMI: panic event handler
>>>>>>>> [ 6522.052628] IPMI message handler: IPMI: handling panic event for intf 0: 00000000443777b3 0000000067d05ff8
>>>>>>>> …
>>>>>>>> and then it reboots after the 255 seconds from the watchdog timer have
>>>>>>>> passed.
>>>>>>>>
>>>>>>>> Christian
>>>>>>>>
>>>>>>>>> On 13. Mar 2023, at 18:13, Corey Minyard <miny...@acm.org> wrote:
>>>>>>>>>
>>>>>>>>> On Mon, Mar 13, 2023 at 05:42:39PM +0100, Christian Theune wrote:
>>>>>>>>>> Hrghs. I’m applying your patch to 5.10 as my distro build
>>>>>>>>>> infrastructure has some patches that don’t apply to 6.2 and that I
>>>>>>>>>> don’t know how to circumvent quickly enough… :)
>>>>>>>>>
>>>>>>>>> Ok, there's a
>>>>>>>>>
>>>>>>>>> https://github.com/cminyard/linux-ipmi.git:debug-panic-oem-events-5.10
>>>>>>>>>
>>>>>>>>> branch available for you to pull. It's on top of latest 5.10.
>>>>>>>>>
>>>>>>>>> -corey
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 13. Mar 2023, at 16:59, Christian Theune <c...@flyingcircus.io> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I should be easily able to run 6.2, no worries.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On 13. Mar 2023, at 16:33, Corey Minyard <miny...@acm.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 13, 2023 at 02:07:01PM +0100, Christian Theune wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> yeah, the IPMI log is fine.
>>>>>>>>>>>>> This is a 10 minute interval job in
>>>>>>>>>>>>> our system that exports the log and clears it:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The job looks like this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> /nix/store/m7lb36dr93qj27r9vskmjihz8imywy86-ipmitool-1.8.18/bin/ipmitool sel elist
>>>>>>>>>>>>> /nix/store/m7lb36dr93qj27r9vskmjihz8imywy86-ipmitool-1.8.18/bin/ipmitool sel clear
>>>>>>>>>>>>>
>>>>>>>>>>>>> So it’s not atomic but it runs after the boot and the elist
>>>>>>>>>>>>> should output it properly … at least it did in the past. ;)
>>>>>>>>>>>>>
>>>>>>>>>>>>> As I said - I’m happy to run any patches you have. If you point
>>>>>>>>>>>>> me to a git branch somewhere I can switch that system easily.
>>>>>>>>>>>>
>>>>>>>>>>>> Ok, I have a branch at
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/cminyard/linux-ipmi.git:debug-panic-oem-events
>>>>>>>>>>>>
>>>>>>>>>>>> that has debug tracing. It will print the function for all panic event
>>>>>>>>>>>> handlers, their return values, and adds traces in the IPMI panic event
>>>>>>>>>>>> handlers.
>>>>>>>>>>>>
>>>>>>>>>>>> It's a single patch right on top of 6.2; I'm not sure how portable it is
>>>>>>>>>>>> to other kernel versions. I can port if you like.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> -corey
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Christian
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 13.
>>>>>>>>>>>>>>> Mar 2023, at 13:58, Corey Minyard <miny...@acm.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Mar 13, 2023 at 10:27:51AM +0100, Christian Theune wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> alright, so here’s the output from the NixOS machine:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> root@xxx ~ # echo c >/proc/sysrq-trigger
>>>>>>>>>>>>>>> client_loop: send disconnect: Broken pipe
>>>>>>>>>>>>>>> …
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> root@xxx ~ # journalctl -u ipmi-log.service
>>>>>>>>>>>>>>> -- Journal begins at Sun 2023-02-26 14:25:36 CET, ends at Mon 2023-03-13 10:25:27 CET. --
>>>>>>>>>>>>>>> Mar 13 10:12:38 xxx ipmi-log-start[520973]: Clearing SEL. Please allow a few seconds to erase.
>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>> -- Boot fdef496e784e4541abd9ae40df472a0b --
>>>>>>>>>>>>>>> Mar 13 10:25:07 xxx ipmi-log-start[1973]: 1 | 03/13/2023 | 09:12:49 | Event Logging Disabled SEL | Log area reset/cleared | Asserted
>>>>>>>>>>>>>>> Mar 13 10:25:07 xxx ipmi-log-start[1973]: 2 | 03/13/2023 | 09:21:06 | Watchdog2 OS Watchdog | Hard reset | Asserted
>>>>>>>>>>>>>>> Mar 13 10:25:07 xxx ipmi-log-start[1977]: Clearing SEL. Please allow a few seconds to erase.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hmm, the SEL got cleared. That would clear out any of the logs that
>>>>>>>>>>>>>> were issued before that time. I'm not sure when the above happened
>>>>>>>>>>>>>> versus the crash, though. It looks like it occurred as part of the
>>>>>>>>>>>>>> reboot, but I'm not sure what I'm seeing. Maybe you have a startup
>>>>>>>>>>>>>> process that clears the SEL?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Assuming that's not the issue, what you have looks ok. I'd need to add
>>>>>>>>>>>>>> some logs to the kernel to see if the log operation ever happens.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -corey
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The SOL log looks like this:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1107585.917689] sysrq: Trigger a crash
>>>>>>>>>>>>>>> [1107585.921272] Kernel panic - not syncing: sysrq triggered crash
>>>>>>>>>>>>>>> [1107585.927178] CPU: 1 PID: 521033 Comm: bash Tainted: G I 5.10.159 #1-NixOS
>>>>>>>>>>>>>>> [1107585.935335] Hardware name: Dell Inc. PowerEdge R510/00HDP0, BIOS 1.11.0 07/23/2012
>>>>>>>>>>>>>>> [1107585.943058] Call Trace:
>>>>>>>>>>>>>>> [1107585.945680] dump_stack+0x6b/0x83
>>>>>>>>>>>>>>> [1107585.949158] panic+0x101/0x2c8
>>>>>>>>>>>>>>> [1107585.952379] ? printk+0x58/0x73
>>>>>>>>>>>>>>> [1107585.955687] sysrq_handle_crash+0x16/0x20
>>>>>>>>>>>>>>> [1107585.959859] __handle_sysrq.cold+0x43/0x11a
>>>>>>>>>>>>>>> [1107585.964203] write_sysrq_trigger+0x24/0x40
>>>>>>>>>>>>>>> [1107585.968463] proc_reg_write+0x51/0x90
>>>>>>>>>>>>>>> [1107585.972290] vfs_write+0xc3/0x280
>>>>>>>>>>>>>>> [1107585.975768] ksys_write+0x5f/0xe0
>>>>>>>>>>>>>>> [1107585.979248] do_syscall_64+0x33/0x40
>>>>>>>>>>>>>>> [1107585.982987] entry_SYSCALL_64_after_hwframe+0x61/0xc6
>>>>>>>>>>>>>>> [1107585.988199] RIP: 0033:0x7f5873932133
>>>>>>>>>>>>>>> [1107585.991938] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 41 54 49 89 d4 55 48 89 f5
>>>>>>>>>>>>>>> [1107586.010842] RSP: 002b:00007ffcc13808c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>>>>>>>>>>>>>>> [1107586.018566] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f5873932133
>>>>>>>>>>>>>>> [1107586.025923] RDX: 0000000000000002 RSI: 00000000005c1c08 RDI: 0000000000000001
>>>>>>>>>>>>>>> [1107586.033213] RBP: 00000000005c1c08 R08: 000000000000000a
>>>>>>>>>>>>>>> R09: 00007f58739c2f40
>>>>>>>>>>>>>>> [1107586.040504] R10: 00000000005cc348 R11: 0000000000000246 R12: 0000000000000002
>>>>>>>>>>>>>>> [1107586.047794] R13: 00007f5873a00520 R14: 00007f5873a00720 R15: 0000000000000002
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Nothing obvious to me here … if you have any further ideas what
>>>>>>>>>>>>>>> to test, let me know. I should be more responsive again now.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks and kind regards,
>>>>>>>>>>>>>>> Christian
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 5. Mar 2023, at 23:53, Corey Minyard <miny...@acm.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Mar 01, 2023 at 06:00:07PM +0100, Christian Theune wrote:
>>>>>>>>>>>>>>>>> I’m going to actually attach a serial console to watch the
>>>>>>>>>>>>>>>>> “echo c” panic, maybe that gives _some_ indication.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Otherwise: I can quickly run patches on the kernel there to
>>>>>>>>>>>>>>>>> try out things. (And the funding offer still stands.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any news on this? I'm curious what this could be.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -corey
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Christian
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 1. Mar 2023, at 17:58, Corey Minyard <miny...@acm.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Feb 28, 2023 at 06:36:17PM +0100, Christian Theune wrote:
>>>>>>>>>>>>>>>>>>> Thanks, both machines report:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> # cat /sys/module/ipmi_msghandler/parameters/panic_op
>>>>>>>>>>>>>>>>>>> string
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> At this point, I have no idea. I'd have to start adding printks into
>>>>>>>>>>>>>>>>>> the code and cause crashes to see what is happening.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Maybe something is getting in the way of the panic notifiers and doing
>>>>>>>>>>>>>>>>>> something to prevent the IPMI driver from working.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -corey
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 28. Feb 2023, at 18:04, Corey Minyard <miny...@acm.org> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Oh, I forgot. You can look at panic_op in
>>>>>>>>>>>>>>>>>>>> /sys/module/ipmi_msghandler/parameters/panic_op
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -corey
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, Feb 28, 2023 at 05:48:07PM +0100, Christian Theune
>>>>>>>>>>>>>>>>>>>> via Openipmi-developer wrote:
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 28. Feb 2023, at 17:36, Corey Minyard <miny...@acm.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Tue, Feb 28, 2023 at 02:53:12PM +0100, Christian
>>>>>>>>>>>>>>>>>>>>>> Theune via Openipmi-developer wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I’ve been trying to debug the PANIC and OEM string
>>>>>>>>>>>>>>>>>>>>>>> handling and am running out of ideas whether this is a
>>>>>>>>>>>>>>>>>>>>>>> bug or whether something so subtle has changed in my
>>>>>>>>>>>>>>>>>>>>>>> config that I’m just not seeing it.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> (Note: I’m willing to pay for consulting.)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Probably not necessary.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks! The offer always stands. If we should ever meet
>>>>>>>>>>>>>>>>>>>>> I’m also able to pay in beverages.
>>>>>>>>>>>>>>>>>>>>> ;)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I have machines that we’ve moved from an older setup
>>>>>>>>>>>>>>>>>>>>>>> (Gentoo, (mostly) vanilla kernel 4.19.157) to a newer
>>>>>>>>>>>>>>>>>>>>>>> setup (NixOS, (mostly) vanilla kernel 5.10.159) and I’m
>>>>>>>>>>>>>>>>>>>>>>> now experiencing crashes that seem to be kernel panics
>>>>>>>>>>>>>>>>>>>>>>> but do not get the usual messages in the IPMI SEL.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I just tested on stock 5.10.159 and it worked without
>>>>>>>>>>>>>>>>>>>>>> issue. Everything you have below looks ok.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Can you test by causing a crash with:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> echo c >/proc/sysrq-trigger
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> and see if it works?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yeah, already tried that and unfortunately that _doesn’t_
>>>>>>>>>>>>>>>>>>>>> work.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> It sounds like you are having some type of crash that
>>>>>>>>>>>>>>>>>>>>>> you would normally use the IPMI logs to debug. However,
>>>>>>>>>>>>>>>>>>>>>> they aren't perfect; the system has to stay up long
>>>>>>>>>>>>>>>>>>>>>> enough to get them into the event log.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I think they are staying up long enough because a panic
>>>>>>>>>>>>>>>>>>>>> triggers the 255 second bump in the watchdog and only
>>>>>>>>>>>>>>>>>>>>> then pass on. However, I’ve also noticed that the kernel
>>>>>>>>>>>>>>>>>>>>> _should_ be rebooting after a panic much faster (and not
>>>>>>>>>>>>>>>>>>>>> rely on the watchdog) and that doesn’t happen either.
>>>>>>>>>>>>>>>>>>>>> (Sorry this just popped from the back of my head).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> In this situation, getting a serial console (probably
>>>>>>>>>>>>>>>>>>>>>> through IPMI Serial over LAN) and getting the console
>>>>>>>>>>>>>>>>>>>>>> output on a crash is probably your best option. You can
>>>>>>>>>>>>>>>>>>>>>> use ipmitool for this, or I have a library that is able
>>>>>>>>>>>>>>>>>>>>>> to make connections to serial ports, including through
>>>>>>>>>>>>>>>>>>>>>> IPMI SoL.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Yup. Been there, too. :)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Unfortunately we’re currently chasing something that pops
>>>>>>>>>>>>>>>>>>>>> up very randomly on somewhat odd machines and I also have
>>>>>>>>>>>>>>>>>>>>> the feeling that it’s systematically broken right now (as
>>>>>>>>>>>>>>>>>>>>> the “echo c” doesn’t work).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks a lot,
>>>>>>>>>>>>>>>>>>>>> Christian
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> Christian Theune · c...@flyingcircus.io · +49 345 219401 0
>>>>>>>>>>>>>>>>>>>>> Flying Circus Internet Operations GmbH ·
>>>>>>>>>>>>>>>>>>>>> https://flyingcircus.io
>>>>>>>>>>>>>>>>>>>>> Leipziger Str.
>>>>>>>>>>>>>>>>>>>>> 70/71 · 06108 Halle (Saale) · Deutschland
>>>>>>>>>>>>>>>>>>>>> HR Stendal HRB 21169 · Geschäftsführer: Christian Theune,
>>>>>>>>>>>>>>>>>>>>> Christian Zagrodnick
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>> Openipmi-developer mailing list
>>>>>>>>>>>>>>>>>>>>> Openipmi-developer@lists.sourceforge.net
>>>>>>>>>>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/openipmi-developer
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>>>>>>>>> Christian Theune
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>>>>>>> Christian Theune
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>>>>> Christian Theune
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Christian Theune · c...@flyingcircus.io · +49 345 219401 0
>>>>>>>>>>>>>>> Flying Circus Internet Operations GmbH · https://flyingcircus.io
>>>>>>>>>>>>>>> Leipziger Str.
>>>>>>>>>>>>>>> 70/71 · 06108 Halle (Saale) · Deutschland
>>>>>>>>>>>>>>> HR Stendal HRB 21169 · Geschäftsführer: Christian Theune,
>>>>>>>>>>>>>>> Christian Zagrodnick
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>>> Christian Theune
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Kind regards,
>>>>>>>>>> Christian Theune
>>>>>>>>>>
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> Christian Theune
>>>>>>>>
>>>>>>
>>>>>> Kind regards,
>>>>>> Christian Theune
>>>>>>
>>>>>> --
>>>>>> Christian Theune · c...@flyingcircus.io · +49 345 219401 0
>>>>>> Flying Circus Internet Operations GmbH · https://flyingcircus.io
>>>>>> Leipziger Str.
>>>>>> 70/71 · 06108 Halle (Saale) · Deutschland
>>>>>> HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian
>>>>>> Zagrodnick
>>>>> <0001-ipmi-watchdog-Set-panic-count-to-proper-value-on-a-p.patch>
>>>>
>>>> Kind regards,
>>>> Christian Theune
>>>>
>>>
>>> <0001-ipmi-watchdog-replace-atomic_add-and-atomic_sub.patch>
>>
>> Kind regards,
>> Christian Theune
>>

Kind regards,
Christian Theune
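
Regarding the SEL export job quoted in the thread (`ipmitool sel elist` followed by `ipmitool sel clear`), here is a minimal sketch of how the clear step could be made conditional on a successful export. The names `export_then_clear`, `run_command` and the `run` parameter are hypothetical and not from the thread; only the two `ipmitool sel` subcommands are taken from it. This does not make the job atomic: events logged between the two calls would still be lost.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes an argv list and returns (exit_code, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def run_command(argv: List[str]) -> Tuple[int, str]:
    """Run a command and return its exit code and captured stdout."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def export_then_clear(ipmitool: str = "ipmitool",
                      run: Runner = run_command) -> Tuple[bool, str]:
    """Export the SEL and clear it only if the export succeeded.

    Returns (cleared, exported_text). The `run` parameter is an
    assumption added for testability; it defaults to a real subprocess.
    """
    code, listing = run([ipmitool, "sel", "elist"])
    if code != 0:
        # Export failed: leave the SEL untouched so no entries are lost.
        return False, listing
    code, _ = run([ipmitool, "sel", "clear"])
    return code == 0, listing
```

A caller would print or store the returned listing before relying on the SEL being empty, for example `cleared, text = export_then_clear("/path/to/ipmitool")`.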