On 11/01/16 11:59 -0500, Digimer wrote:
> We hit a strange problem where a RAID controller on a node failed,
> causing DLM (gfs2/clvmd) to hang, but the node was never fenced. I
> assume this was because corosync was still working.
>
> Is there a way in rhel6/cman/rgmanager to have a node suicide or get
> fenced in a condition like this?

Something like this in the crontab (although cron and the other components
involved then become the SPOF, and an I/O spike or DoS will finish the
apocalypse)?

    */1 * * * * timeout 30s touch <file on respective fs> || fence_node <myself>

More sophisticated handling in the components you mentioned would probably
be preferable, though.

-- 
Jan (Poki)
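
The cron one-liner above can also be written as a small script, which makes the probe and the fencing step easier to adjust separately. This is only a sketch under stated assumptions: `PROBE_FILE`, `PROBE_TIMEOUT`, `FENCE_CMD`, `probe_fs` and `check_and_fence` are hypothetical names, the default probe path is a placeholder rather than a real GFS2 mount, and the default fence command only prints a message instead of calling `fence_node`.

```shell
#!/bin/sh
# Hypothetical watchdog sketch; none of these names come from cman/rgmanager.

# Assumption: point this at a file on the filesystem you want to probe.
PROBE_FILE="${PROBE_FILE:-/tmp/.fs_probe}"
# Seconds to wait for the write, matching the 30s in the cron line.
PROBE_TIMEOUT="${PROBE_TIMEOUT:-30}"
# Assumption: harmless placeholder; replace with "fence_node <this node's name>".
FENCE_CMD="${FENCE_CMD:-echo would fence}"

# Return 0 if touching the file completes within the timeout, non-zero otherwise.
probe_fs() {
    timeout "${PROBE_TIMEOUT}s" touch "$1"
}

# Probe once; run the fence command only when the probe fails.
check_and_fence() {
    if probe_fs "$PROBE_FILE"; then
        return 0
    fi
    $FENCE_CMD
    return 1
}
```

Adding a call to `check_and_fence` at the end and running the script from cron every minute would roughly reproduce the one-liner. One caveat: a `touch` stuck in uninterruptible I/O may not react to the signal that `timeout` sends, so the probe itself may hang in exactly the failure it is meant to detect.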
_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
