On 14.04.2021 17:50, Digimer wrote:
> Hi all,
>
> As we get close to finish our Anvil! switch to pacemaker, I'm trying
> to tie up loose ends. One that I want feedback on is the pacemaker
> version of cman's old 'post_join_delay' feature.
>
> Use case example;
>
> A common use for the Anvil! is remote deployments where there is no
> (IT) humans available. Think cargo ships, field data collection, etc. So
> it's entirely possible that a node could fail and not be repaired for
> weeks or even months. With this in mind, it's also feasible that a solo
> node later loses power, and then reboots. In such a case, 'pcs cluster
> start' would never go quorate as the peer is dead.
>
> In cman, during startup, if there was no reply from the peer after
> post_join_delay seconds, the peer would get fenced and then the cluster
> would finish coming up. Being two_node, it would also become quorate and
> start hosting services. Of course, this opens the risk of a fence loop,
> but we have other protections in place to prevent that, so a fence loop
> is not a concern.
>
> My question then is two-fold;
>
> 1. Is there a pacemaker equivalent to 'post_join_delay'? (Fence the peer
> and, if successful, become quorate)?
>
Startup fencing is the pacemaker default (the startup-fencing cluster
option, which defaults to true).

> 2. If not, was this a conscious decision not to add it for some reason,
> or was it simply never added? If it was consciously decided to not have
> it, what was the reasoning behind it?
>
> I can replicate this behaviour in our code, but I don't want to do
> that if there is a compelling reason that I am not aware of.
>
> So,
>
> A) is there a pacemaker version of post_join_delay?
> B) is there a compelling argument NOT to use post_join_delay behaviour
> in pacemaker I am not seeing?
>
> Thanks!
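For reference, a minimal sketch of how the startup-fencing option could be
checked and set. It assumes the pacemaker command-line tools and pcs are
installed and that the commands run on a cluster node; it only covers this
one option and is not a full replacement recipe for post_join_delay.

```shell
# Query the current value of the startup-fencing cluster option.
# If the option was never set explicitly, the query may report that the
# attribute was not found; in that case the built-in default applies.
crm_attribute --type crm_config --name startup-fencing --query

# Set the option explicitly (true is already the default).
pcs property set startup-fencing=true
```

Setting the option explicitly changes nothing functionally when the default
is in effect; it only makes the intended behaviour visible in the
configuration.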