On 2022/5/25 8:10, Ulrich Windl wrote:
Hi!
We are still suffering from kernel RAM corruption on the Xen hypervisor when a
VM or the hypervisor is doing I/O (three months since the bug report at SUSE,
and still no fix or workaround, which means the whole Xen cluster project was
canceled after 20 years; but that's a different topic). All VMs will be
migrated to VMware, and the whole SLES15 Xen cluster will be dumped very soon.
My script that detected the RAM corruption tried to shut down pacemaker, hoping
for the best (i.e. that the VMs would be live-migrated away). However, some very
strange decisions were made (pacemaker-2.0.5+20201202.ba59be712-150300.4.21.1.x86_64):
May 24 17:05:07 h16 VirtualDomain(prm_xen_test-jeos7)[24460]: INFO: test-jeos7:
live migration to h19 succeeded.
May 24 17:05:07 h16 VirtualDomain(prm_xen_test-jeos9)[24463]: INFO: test-jeos9:
live migration to h19 succeeded.
May 24 17:05:07 h16 pacemaker-execd[7504]: notice: prm_xen_test-jeos7
migrate_to (call 321, PID 24281) exited with status 0 (execution time 5500ms,
queue time 0ms)
May 24 17:05:07 h16 pacemaker-controld[7509]: notice: Result of migrate_to
operation for prm_xen_test-jeos7 on h16: ok
May 24 17:05:07 h16 pacemaker-execd[7504]: notice: prm_xen_test-jeos9
migrate_to (call 323, PID 24283) exited with status 0 (execution time 5514ms,
queue time 0ms)
May 24 17:05:07 h16 pacemaker-controld[7509]: notice: Result of migrate_to
operation for prm_xen_test-jeos9 on h16: ok
Would you agree that the migration was successful? I'd say YES!
Practically yes, maybe, in terms of what migrate_to has achieved through the
VirtualDomain RA, but technically no from pacemaker's point of view.
Following the migrate_to on the source node, a migrate_from operation on
the target node and a stop operation on the source node are still needed
before the live migration eventually counts as successful.
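To make that concrete, here is roughly what the three phases look like from a
resource agent's point of view. This is only a minimal sketch in shell, not the
actual VirtualDomain agent; the OCF_RESKEY_domain parameter and the qemu+ssh
migration URI are simplified assumptions:

#!/bin/sh
# Minimal sketch of the three OCF actions that together form one
# pacemaker live migration. NOT the real VirtualDomain agent; the
# "domain" parameter and the qemu+ssh URI are illustrative only.
case "$1" in
migrate_to)
    # Phase 1, on the SOURCE node: push the domain to the target.
    # Pacemaker exports the target node name in this meta variable.
    virsh migrate --live "$OCF_RESKEY_domain" \
        "qemu+ssh://${OCF_RESKEY_CRM_meta_migrate_target}/system"
    ;;
migrate_from)
    # Phase 2, on the TARGET node: confirm the domain arrived.
    virsh domstate "$OCF_RESKEY_domain" | grep -q running
    ;;
stop)
    # Phase 3, on the SOURCE node: make sure nothing is left over.
    virsh destroy "$OCF_RESKEY_domain" >/dev/null 2>&1
    exit 0  # stop must be idempotent: "already gone" is success
    ;;
esac

In your log, only the first of those phases had completed when the transition
was aborted.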
However, this is what happened:
May 24 17:05:19 h16 pacemaker-controld[7509]: notice: Transition 2460
(Complete=16, Pending=0, Fired=0, Skipped=7, Incomplete=57,
Source=/var/lib/pacemaker/pengine/pe-input-89.bz2): Stopped
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Unexpected result
(error) was recorded for stop of prm_ping_gw1:1 on h16 at May 24 17:05:02 2022
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Unexpected result
(error) was recorded for stop of prm_ping_gw1:1 on h16 at May 24 17:05:02 2022
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Cluster node h16 will
be fenced: prm_ping_gw1:1 failed there
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Unexpected result
(error) was recorded for stop of prm_iotw-md10:1 on h16 at May 24 17:05:02 2022
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Unexpected result
(error) was recorded for stop of prm_iotw-md10:1 on h16 at May 24 17:05:02 2022
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Forcing cln_ping_gw1
away from h16 after 1000000 failures (max=1000000)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Forcing cln_ping_gw1
away from h16 after 1000000 failures (max=1000000)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Forcing cln_ping_gw1
away from h16 after 1000000 failures (max=1000000)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Forcing cln_iotw-md10
away from h16 after 1000000 failures (max=1000000)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Forcing cln_iotw-md10
away from h16 after 1000000 failures (max=1000000)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Forcing cln_iotw-md10
away from h16 after 1000000 failures (max=1000000)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: notice: Resource
prm_xen_test-jeos7 can no longer migrate from h16 to h19 (will stop on both
nodes)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: notice: Resource
prm_xen_test-jeos9 can no longer migrate from h16 to h19 (will stop on both
nodes)
May 24 17:05:19 h16 pacemaker-schedulerd[7508]: warning: Scheduling Node h16
for STONITH
So the DC considers the migration to have failed, even though it was reported
as a success!
What pacemaker calls a partial live migration could no longer continue here.
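If you want to see exactly why the scheduler decided that, you can replay the
input file named in the "Transition 2460" message above without touching the
live cluster, for example:

# Replay the scheduler input from the aborted transition and show
# the actions it computes (the forced stops and the fencing):
crm_simulate --simulate --xml-file /var/lib/pacemaker/pengine/pe-input-89.bz2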
Regards,
Yan
(The ping resource had dumped core due to the RAM corruption before:)
May 24 17:03:12 h16 kernel: ping[23973]: segfault at 213e6 ip 00000000000213e6
sp 00007ffc249fab78 error 14 in bash[5655262bc000+f1000]
So it stopped the VMs that had been migrated successfully before:
May 24 17:05:19 h16 pacemaker-controld[7509]: notice: Initiating stop
operation prm_xen_test-jeos7_stop_0 on h19
May 24 17:05:19 h16 pacemaker-controld[7509]: notice: Initiating stop
operation prm_xen_test-jeos9_stop_0 on h19
May 24 17:05:19 h16 pacemaker-controld[7509]: notice: Requesting fencing
(reboot) of node h16
Those test VMs were not important; the important part is that, due to the
failure to stop the ping resource, the cluster did not even try to migrate the
other (non-test) VMs away, so those were killed hard by the fencing.
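For what it's worth, that escalation is pacemaker's default behaviour: a failed
stop is unrecoverable from pacemaker's point of view, and with STONITH enabled
the stop operation's on-fail value defaults to "fence". Purely as an
illustration (and usually a bad trade, since a blocked resource then needs
manual cleanup), a resource definition can relax that; in crmsh syntax, with
host_list as a placeholder:

# Make a failed stop block the resource instead of fencing the node.
# host_list=gw1 is a placeholder value, not taken from this cluster.
crm configure primitive prm_ping_gw1 ocf:pacemaker:ping \
    params host_list=gw1 \
    op stop interval=0 timeout=60 on-fail=block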
For completeness I should add that the RAM corruption also affected pacemaker
itself:
May 24 17:05:02 h16 kernel: traps: pacemaker-execd[24272] general protection
fault ip:7fc572327bcf sp:7ffca7cd22d0 error:0 in
libc-2.31.so[7fc572246000+1e6000]
May 24 17:05:02 h16 kernel: pacemaker-execd[24277]: segfault at 0 ip
0000000000000000 sp 00007ffca7cd22f0 error 14 in
pacemaker-execd[56347df4e000+b000]
May 24 17:05:02 h16 kernel: Code: Bad RIP value.
Those crashes affected the stop operations of some (non-essential) ping and
MD-RAID-based resources:
May 24 17:05:02 h16 pacemaker-execd[7504]: warning: prm_ping_gw1_stop_0[24272]
terminated with signal: Segmentation fault
May 24 17:05:02 h16 pacemaker-execd[7504]: warning:
prm_iotw-md10_stop_0[24277] terminated with signal: Segmentation fault
May 24 17:05:03 h16 sbd[7254]: warning: inquisitor_child: pcmk health check:
UNHEALTHY
May 24 17:05:03 h16 sbd[7254]: warning: inquisitor_child: Servant pcmk is
outdated (age: 1844062)
Note: If the "outdated" number is seconds, that's definitely odd: 1844062
seconds would be more than 21 days!
Regards,
Ulrich
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/