>>> Jan Pokorný <jpoko...@redhat.com> wrote on 12.08.2019 at 22:30 in message <20190812203037.gm25...@redhat.com>:
[...]
> Is it OK for lower level components to make autonomous decisions
> without at least informing the higher level wrt. what exactly is
> going on, as we could observe here?
[...]

Excuse me for throwing in another comparison with old HP-UX ServiceGuard:

As far as I understood it, the HP-UX kernel had a hardware-based watchdog that the process corresponding to crmd enabled during start and periodically "fed" to avoid a "TOC" (Transfer Of Control, resulting in a kernel panic, crash dump and reboot). That was all: when crmd died or failed to feed the watchdog, the node rebooted, and the other node took over the resources (if possible). (A rough sketch of such a feeding loop is in the P.S. below.)

Fencing was basically network-based, with a disk as tie-breaker: if there was a network outage, both nodes (in the 2-node cluster case) tried to take control of the cluster, racing for a SCSI lock on the "lock disk" (which required a multi-initiator SCSI setup for shared disks). The winner wrote its node name to a slot on the disk, so that the other node(s) could read who had won the race (also sketched in the P.S.). The losers then committed suicide (an exit of the main cluster process would have been enough to trigger the watchdog, but they did an explicit TOC). Compared with that, the fencing in pacemaker/corosync seems unnecessarily complex to me, at least if you have some shared storage.

The other nice thing about ServiceGuard was its network traffic: the heartbeat interval was configurable (e.g. every 7 seconds), and when nothing "interesting" was happening in the cluster, there was no traffic other than the heartbeat. Only after a configurable number of missed heartbeats was a split brain declared, and the real machinery started (see the last sketch in the P.S.). I think pacemaker creates way too much network traffic.

So I think sbd should not decide by itself whether to reboot a node or not. Maybe sbd should not even use the watchdog, but crmd should...

Regards,
Ulrich
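
P.S.: For illustration, here are some rough sketches of the mechanisms
described above. They are my own reconstructions, not ServiceGuard's
(or sbd's) actual code; device names, ports, offsets and timing values
are invented for the examples.

First, the watchdog "feeding" loop. This sketch uses the Linux
/dev/watchdog interface as a stand-in, since the ServiceGuard
implementation lived inside the HP-UX kernel. The point is the same:
if the feeding process dies or hangs, the countdown expires and the
hardware resets the node.

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* Open the hardware watchdog; from now on it must be fed
     * regularly or the machine is reset. */
    int fd = open("/dev/watchdog", O_WRONLY);
    if (fd < 0)
        return 1;

    for (;;) {
        write(fd, "\0", 1);   /* feed: restart the countdown */
        sleep(5);             /* must stay well below the timeout */
    }
}

(Depending on the driver, closing the device without first writing the
magic 'V' character may leave the watchdog armed, so a crash of the
feeder still reboots the node -- the property this whole design relies
on.)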
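
Second, the lock disk's "who won" slot. The actual arbitration used
multi-initiator SCSI reservations, which I am not reproducing here;
this only shows the winner recording its name and a loser reading it
back. The device name, offset and slot size are made up.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define LOCK_DEV    "/dev/mapper/lockdisk"  /* invented device name */
#define SLOT_OFFSET 4096                    /* invented slot location */
#define SLOT_SIZE   64

/* Winner: record our node name in the slot, synchronously. */
static int write_winner(const char *node)
{
    char buf[SLOT_SIZE] = { 0 };
    int fd = open(LOCK_DEV, O_WRONLY | O_SYNC);
    if (fd < 0)
        return -1;
    strncpy(buf, node, SLOT_SIZE - 1);
    ssize_t n = pwrite(fd, buf, SLOT_SIZE, SLOT_OFFSET);
    close(fd);
    return n == SLOT_SIZE ? 0 : -1;
}

/* Loser: read back who won before giving up. */
static int read_winner(char *node, size_t len)
{
    char buf[SLOT_SIZE];
    int fd = open(LOCK_DEV, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = pread(fd, buf, SLOT_SIZE, SLOT_OFFSET);
    close(fd);
    if (n != SLOT_SIZE)
        return -1;
    buf[SLOT_SIZE - 1] = '\0';
    snprintf(node, len, "%s", buf);
    return 0;
}

int main(int argc, char **argv)
{
    char winner[SLOT_SIZE];
    if (argc > 1 && write_winner(argv[1]) != 0)
        return 1;
    if (read_winner(winner, sizeof(winner)) == 0)
        printf("race winner: %s\n", winner);
    return 0;
}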
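
And finally the heartbeat-miss logic, reduced to its core: the
receiver stays quiet, counts consecutive misses, and only acts after
the configured limit. The port, interval and miss count are invented,
not ServiceGuard's or corosync's real values.

#include <stdio.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <arpa/inet.h>

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5405);              /* invented port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(sock, (struct sockaddr *)&addr, sizeof(addr));

    /* One receive timeout == one heartbeat interval. */
    struct timeval tv = { .tv_sec = 7, .tv_usec = 0 };
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    int missed = 0;
    const int limit = 3;                      /* configurable miss count */
    char buf[64];
    while (missed < limit) {
        if (recv(sock, buf, sizeof(buf), 0) > 0)
            missed = 0;                       /* heartbeat arrived: reset */
        else
            missed++;                         /* timeout: one more miss */
    }
    fprintf(stderr, "peer lost after %d missed heartbeats\n", limit);
    return 0;                                 /* split-brain handling starts */
}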