>>> Andrei Borzenkov <[email protected]> wrote on 18.12.2020 at 08:01 in
message <[email protected]>:
> 17.12.2020 21:30, Ken Gaillot wrote:
>>
>> This reminded me that some IPMI implementations return "success" for
>> commands before they've actually been completed. This is why
>> fence_ipmilan has a "power_wait" parameter that defaults to 2 seconds.
>>
>
> But on this case we also do not know whether command has been completed
> successfully or not. I'd say in this case the only safe way is to use
> poweroff and verify in stonith agent that node is actually powered off
> before returning success.

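For illustration, the check Andrei describes (issue the power-off, then confirm the node is really off before reporting success) might look roughly like the Python sketch below. It is not how fence_ipmilan is implemented; the function names, the timeout and interval values, and the lanplus interface are my own assumptions. The only external piece is the "ipmitool ... chassis power status" call, which reports whether the chassis power is on or off.

```python
import subprocess
import time


def ipmi_power_is_off(host, user, password):
    """Ask the BMC for the chassis power state (hypothetical helper).

    Assumes ipmitool is installed and the BMC speaks lanplus.
    """
    out = subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password,
         "chassis", "power", "status"],
        capture_output=True, text=True, check=True,
    ).stdout
    # ipmitool answers with "Chassis Power is on" or "Chassis Power is off".
    return out.strip().lower().endswith("is off")


def wait_until_off(is_off, timeout=30.0, interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll is_off() until it returns True or the timeout expires.

    Returns the number of seconds waited, or None if the node was still
    reported as powered on when the timeout was reached.
    """
    start = clock()
    while True:
        if is_off():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


# Example use (host and credentials are placeholders):
# waited = wait_until_off(lambda: ipmi_power_is_off("bmc1", "admin", "secret"))
# if waited is None: treat the fencing attempt as failed
```

The returned number of seconds is also a rough measurement of how long the BMC takes to act after the command returns, which could inform the choice of a power_wait value.
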
As I wrote in my message, the other node showing that a node has left would
be an indication that fencing was successful IF there was a valid network
connection up to the fencing event. Thus I think a redundant network is
rather important. The user should be able to tell whether fencing actually
does work; maybe not from syslog, but from other indicators.

Also, if the network outage were simulated by using a node-specific blackhole
route (blocking just the other node(s)), the node could be queried (for
example) by a ping from a third node to see whether and when it actually went
down.

Regards,
Ulrich

>
>> The best thing would be to do some manual testing using ipmitool or
>> whatnot to turn off the power, and observe how long it takes between
>> when the command returns and the server actually is powered down. Then
>> set power_wait to a comfortable margin above that. Or just keep raising
>> power_wait until the problem goes away :)
>>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
