On Wed, Jun 15, 2022 at 2:10 PM Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:
>
> >>> Klaus Wenninger <kwenn...@redhat.com> wrote on 15.06.2022 at 13:22 in message
> <CALrDAo3w1iZOPFV-5Bq=936hz_ctozsm1djkmpuisy7g-bd...@mail.gmail.com>:
> > On Wed, Jun 15, 2022 at 10:33 AM Ulrich Windl
> > <ulrich.wi...@rz.uni-regensburg.de> wrote:
> >>
> > ...
> >
> >> (As said above it may be some RAM corruption where SMI (system management
> >> interrupts, or so) play a role, but Dell says the hardware is OK, and using
> >> SLES we don't have software support with Dell, so they won't even consider
> >> that fact.)
> >
> > That happens inside of VMs, right? I mean nodes being VMs.
>
> No, it happens on the hypervisor nodes that are part of the cluster.
>
What I described below also froze the whole machine, until it was taken down
by the hardware watchdog.

> > A couple of years back I had an issue running protected mode inside
> > of KVM virtual machines on Lenovo laptops.
> > That was really an SMI issue (apparently problems arose when an SMI
> > was invoked while the CPU was in protected mode) that went away
> > after disabling SMIs.
> > I have no idea if that is still possible with current chipsets. And I'm not
> > telling you to do that in production, but it might still be interesting for
> > narrowing the issue down. One might run into thermal issues and the like,
> > which SMI takes care of on that hardware.
>
> Well, as I have no better idea, I'd probably even give "kick it hard with the
> foot" a chance ;-)

I don't know if it is of much use, but this is what I was using, iirc:
https://github.com/zultron/smictrl
Jan wrote it back then for his laptop. Mine showed the same behavior and,
being close enough chipset-wise, the tool did the trick on mine as well.

Apparently, reading UEFI variables from the OS also triggers some SMI
activity. So booting with a legacy BIOS - if possible - might be an
interesting test case.

>
> Regards,
> Ulrich
>
> >
> > Klaus
> >>
> >> But actually I start believing such a system is a good playground for any
> >> HA solution ;-)
> >> Unfortunately here it's much more production than playground...
> >>
> >> Regards,
> >> Ulrich

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/