On 09.08.2019 16:34, Yan Gao wrote:
> Hi,
>
> With disk-less sbd, it's fine to stop cluster service from the cluster
> nodes all at the same time.
>
> But if to stop the nodes one by one, for example with a 3-node cluster,
> after stopping the 2nd node, the only remaining node resets itself with:
>
That is sort of documented in the SBD manual page:

--><--
However, while the cluster is in such a degraded state, it can neither
successfully fence nor be shutdown cleanly (as taking the cluster below
the quorum threshold will immediately cause all remaining nodes to
self-fence).
--><--

SBD in shared-nothing mode is basically always in such a degraded state
and cannot tolerate loss of quorum.

> Aug 09 14:30:20 opensuse150-1 sbd[1079]: pcmk:    debug: notify_parent: Not notifying parent: state transient (2)
> Aug 09 14:30:20 opensuse150-1 sbd[1080]: cluster: debug: notify_parent: Notifying parent: healthy
> Aug 09 14:30:20 opensuse150-1 sbd[1078]:  warning: inquisitor_child: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
>
> I can think of the way to manipulate quorum with last_man_standing and
> potentially also auto_tie_breaker, not to mention
> last_man_standing_window would also be a factor... But is there a better
> solution?

The lack of a cluster-wide shutdown mode has been mentioned more than once on this list. I guess the only workaround is to use higher-level tools which basically just try to stop the cluster on all nodes at once. That is still susceptible to race conditions.

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
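For reference, the quorum manipulation Yan mentions would look roughly like this in the quorum section of corosync.conf (a sketch only, not a recommendation; 10000 ms is just votequorum's documented default for the window, shown here as a placeholder):

```
quorum {
    provider: corosync_votequorum
    # allow the cluster to recalculate quorum as nodes leave one by one
    last_man_standing: 1
    # time (ms) to wait after a node leaves before recalculating quorum
    last_man_standing_window: 10000
    # break 50/50 splits deterministically (needed once you get down to 2 nodes)
    auto_tie_breaker: 1
}
```

As for the higher-level tools: something like `pcs cluster stop --all` attempts to stop the cluster services on every node in one operation rather than node by node, which avoids dropping below the quorum threshold stepwise, though as noted above it is still subject to races.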
