On 9/4/20 11:24 PM, Digimer wrote:
> On 2020-09-04 5:15 p.m., Philippe M Stedman wrote:
>> Hi ClusterLabs development,
>>
>> I am in the process of deploying a two-node cluster on AWS and using the
>> fence_aws fence agent for fencing. I was reading through the following
>> article about common pitfalls in configuring two-node Pacemaker clusters:
>> https://www.thegeekdiary.com/most-common-two-node-pacemaker-cluster-issues-and-their-workarounds/
>>
>> and the only concern I have is regarding the fencing device. If I read
>> this correctly, there is no need to configure delayed fencing if the
>> fence device can guarantee serialized access. My question here is: does
>> the fence_aws agent guarantee serialized access? In the event of a loss
>> of communication between the two cluster nodes, can I guarantee that one
>> host will win the race to fence the other and I won't end up in a
>> situation where both hosts get fenced?
>>
>> Do I need to implement delayed fencing with the fence_aws agent or not?
>> I appreciate any feedback.
>>
>> Thanks,
>>
>> *Phil Stedman*
>> Db2 High Availability Development and Support
>> Email: pmste...@us.ibm.com
>
> It would depend on AWS, and I don't believe it's a good idea to design a
> solution that depends on a third party's behaviour.
>
> There's another aspect of fence delays to consider as well: they also
> help ensure that the best node survives, not just that one of them does.
> So say your DB is running on node 1; you want to preferentially fence
> node 2. If, later, your DB moves to node 2, then you want to reconfigure
> your stonith devices to preferentially fence node 1.
>
> The delay parameter tells the agent to wait N seconds before fencing the
> associated node. So if your DB is on node 1, you would set the stonith
> device configuration that terminates node 1 to have, say, 'delay="15"'.
> This way, node 2 looks up how to fence node 1, sees the delay, and
> sleeps. Node 1 looks up how to fence node 2, sees no delay, and fences
> immediately. Node 2 is dead before the sleep exits, ensuring that, in a
> comms break where both nodes are otherwise OK, node 1, the service
> host, lives.
>

Just as a note to the above, I wanted to mention 2 approaches to
automatically give some preference to the 'better' node in these
fencing-races:

- priority-fencing-delay - introduced by Yan Gao earlier this year
   Optionally derive the priority of a node from the
   resource priorities of the resources it is running.
   In a fencing-race the node with the highest priority
   has a certain advantage over the others, as fencing requests
   for that node are executed with an additional delay.

- fence_heuristics_ping
   Not really a fencing agent by itself!
   Put it on the same fencing level as the actual fencing agent for
   your node to make actual fencing depend on the result of (own)
   connectivity determined using ping heuristics.

   Btw. still waiting for feedback on the basic idea and
   contributions picking up the idea taking into account
   other aspects that might make a node the 'better' node ;-)

Klaus
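
P.S. To make the two delay schemes above a bit more concrete, here is a small Python toy model of a two-node fencing race. It is only a sketch under stated assumptions, not how Pacemaker or fence_aws is implemented: the names `Node`, `node_priority` and `fence_race` are made up for illustration, a node's priority is taken to be the sum of the priorities of the resources it runs, and a tie is reported as the worst case where both nodes may get fenced.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical model of a cluster node (not a Pacemaker type)."""
    name: str
    # Priorities of the resources currently running on this node.
    resource_priorities: Tuple[int, ...] = ()
    # Static 'delay' set on the stonith device that fences THIS node.
    static_delay: float = 0.0


def node_priority(node: Node) -> int:
    # Assumption: node priority is the sum of its resources' priorities.
    return sum(node.resource_priorities)


def fence_race(a: Node, b: Node, priority_fencing_delay: float = 0.0) -> List[str]:
    """Return the names of the nodes that survive a two-node fencing race.

    Each node tries to fence its peer. A fencing request is held back by the
    static delay configured for its target, plus priority_fencing_delay if
    the target has the strictly higher priority.
    """
    def time_until_fenced(target: Node, peer: Node) -> float:
        wait = target.static_delay
        if node_priority(target) > node_priority(peer):
            wait += priority_fencing_delay
        return wait

    t_a = time_until_fenced(a, b)  # when b's request against a fires
    t_b = time_until_fenced(b, a)  # when a's request against b fires

    if t_a == t_b:
        # Tie: both requests fire together, so both nodes may be fenced.
        return []
    # The node whose fencing is delayed longer shoots first and survives.
    return [a.name] if t_a > t_b else [b.name]


if __name__ == "__main__":
    # Static delay: DB lives on node1, so fencing node1 is delayed by 15s.
    print(fence_race(Node("node1", static_delay=15), Node("node2")))

    # Priority-based delay: DB (priority 10) has moved to node2,
    # nothing needs to be reconfigured by hand.
    print(fence_race(Node("node1"), Node("node2", (10,)),
                     priority_fencing_delay=15))

    # No delay anywhere: nobody has an advantage.
    print(fence_race(Node("node1"), Node("node2")))
```

The second call is the point of the priority-based scheme: the advantage follows the resource to whichever node runs it, whereas the static delay in the first call has to be moved by hand when the DB moves.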