I have a question about OCF_TIMEOUT. Sometimes my cluster shows this in the pcs status output:

Failed Resource Actions:
  * fence-server02_monitor_60000 on server01 'OCF_TIMEOUT' (198): call=419, status='Timed Out', exitreason='', last-rc-change='2022-04-26 14:47:32 -03:00', queued=0ms, exec=20004ms

In the same pcs status output I can see that the fence device is started. Does that mean it failed at some moment in the past and is OK now? Or do I have to do something to recover it?

# pcs status
Cluster name: cluster1
Cluster Summary:
  * Stack: corosync
  * Current DC: server02 (version 2.1.0-8.el8-7c3f660707) - partition with quorum
  * Last updated: Tue Apr 26 14:52:56 2022
  * Last change: Tue Apr 26 14:37:22 2022 by hacluster via crmd on server01
  * 2 nodes configured
  * 11 resource instances configured

Node List:
  * Online: [ server01 server02 ]

Full List of Resources:
  * fence-server01 (stonith:fence_vmware_rest): Started server02
  * fence-server02 (stonith:fence_vmware_rest): Started server01
  ...

Is "pcs resource cleanup" the right way to remove those messages?

Kind regards,
Salatiel
