On 2021-03-03 1:56 a.m., Ulrich Windl wrote:
>>>> Eric Robinson <[email protected]> wrote on 02.03.2021 at 19:26 in
> message
> <sa2pr03mb58847e37845fc6c92bc3007efa...@sa2pr03mb5884.namprd03.prod.outlook.com>
>
>>> -----Original Message-----
>>> From: Users <[email protected]> On Behalf Of Digimer
>>> Sent: Monday, March 1, 2021 11:02 AM
>>> To: Cluster Labs - All topics related to open-source clustering welcomed
>>> <[email protected]>; Ulrich Windl <[email protected]>
>>> Subject: Re: [ClusterLabs] Antw: [EXT] Re: "Error: unable to fence '001db02a'"
> ...
>>>>> Cloud fencing usually requires a higher timeout (20s reported here).
>>>>>
>>>>> Microsoft seems to suggest the following setup:
>>>>>
>>>>> # pcs property set stonith-timeout=900
>>>>
>>>> But doesn't that mean the other node waits 15 minutes after stonith
>>>> until it performs the first post-stonith action?
>>>
>>> No, it means that if there is no reply by then, the fence has failed. If
>>> the fence happens sooner, and the caller is told this, recovery begins
>>> very shortly after.
>
> How would the fencing be confirmed? I don't know.
It's part of the FenceAgentAPI. The cluster invokes the fence agent,
passes in variable=value pairs on STDIN, and waits for the agent to exit.
It then reads the agent's exit code and uses that to determine success or
failure.

So if the fence agent is invoked and exits 5 seconds later with the
"success" RC, the cluster knows the peer is gone and that it can now
safely begin recovery.

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
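
For anyone who wants to see the invoke / STDIN / exit-code flow described
above in concrete terms, here is a rough Python sketch. It is not how
Pacemaker itself is implemented. The fence() helper and the FAKE_AGENT
stand-in are made up for illustration, and the "action" and "port" option
names are assumed here purely as example variable=value pairs. A zero exit
code is treated as the "success" RC, and the timeout plays the role of
stonith-timeout: it only bounds how long the caller waits, so a fast agent
returns right away.

```python
import subprocess
import sys

# Hypothetical stand-in for a real fence agent: it reads variable=value
# pairs from STDIN and exits 0 only when asked to turn a named node off.
FAKE_AGENT = r"""
import sys
args = dict(line.strip().split("=", 1) for line in sys.stdin if "=" in line)
sys.exit(0 if args.get("action") == "off" and args.get("port") else 1)
"""


def fence(agent_cmd, options, timeout):
    """Run the agent, pass options on STDIN, and report success.

    Returns True only if the agent exits with code 0 before the timeout.
    A non-zero exit code or a timeout both count as a failed fence.
    """
    stdin_data = "".join(f"{key}={value}\n" for key, value in options.items())
    try:
        result = subprocess.run(
            agent_cmd, input=stdin_data, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # No answer within the timeout: the fence is considered failed.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    agent = [sys.executable, "-c", FAKE_AGENT]
    # 900 mirrors the stonith-timeout value quoted earlier in the thread.
    ok = fence(agent, {"action": "off", "port": "001db02a"}, timeout=900)
    print("fence succeeded" if ok else "fence failed")
```

The point of the sketch is that the caller does not sit out the full 900
seconds: it returns as soon as the agent exits, and only gives up (and
reports failure) if the agent has not answered by the deadline.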
