>>> Valentin Vidic <[email protected]> wrote on 28.02.2021 at 16:59 in message <[email protected]>:
> On Sun, Feb 28, 2021 at 03:34:20PM +0000, Eric Robinson wrote:
>> 001db02b rebooted. After it came back up, I tried it in the other direction.
>>
>> On node 001db02b, the command...
>>
>> # pcs stonith fence 001db02a
>>
>> ...produced output...
>>
>> Error: unable to fence '001db02a'.
>>
>> However, node 001db02a did get restarted!
>>
>> We also saw this error...
>>
>> Failed Actions:
>> * stonith-001db02ab_start_0 on 001db02a 'unknown error' (1): call=70, status=Timed Out, exitreason='',
>>     last-rc-change='Sun Feb 28 10:11:10 2021', queued=0ms, exec=20014ms
>>
>> When that happens, does Pacemaker take over the other node's resources, or not?
>
> Cloud fencing usually requires a higher timeout (20s reported here).
>
> Microsoft seems to suggest the following setup:
>
> # pcs property set stonith-timeout=900

But doesn't that mean the other node waits 15 minutes after stonith until it performs the first post-stonith action?

> # pcs stonith create rsc_st_azure fence_azure_arm username="login ID"
>     password="password" resourceGroup="resource group" tenantId="tenant ID"
>     subscriptionId="subscription id"
>     pcmk_host_map="prod-cl1-0:prod-cl1-0-vm-name;prod-cl1-1:prod-cl1-1-vm-name"
>     power_timeout=240 pcmk_reboot_timeout=900 pcmk_monitor_timeout=120
>     pcmk_monitor_retries=4 pcmk_action_limit=3
>     op monitor interval=3600
>
> https://docs.microsoft.com/en-us/azure/virtual-machines/workloads/sap/high-availability-guide-rhel-pacemaker
>
> --
> Valentin
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
