Hi all,

I'm hoping to have the first release candidate for 2.1.3 ready next week.

Pacemaker has long had a feature to monitor node health (CPU usage, SMART drive errors, etc.) and move resources off degraded nodes:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#tracking-node-health

The 2.1.3 release will add a couple of features to make this more useful.

First, you can now exempt particular resources from health-related bans, using the new "allow-unhealthy-nodes" resource meta-attribute. This is particularly helpful for the health monitoring agents themselves. Without the new option, health agents get moved off degraded nodes, which means the cluster can't detect if the degraded condition goes away. Users had to manually clear the health attributes to allow resources to move back to the node. Now, you can set allow-unhealthy-nodes=true on your health agent resources, so they can continue detecting changes in the health status.

Second, crm_mon will indicate when a node's health is yellow or red, like:

  * Node List:
    * Node node1: online (health is RED)

Previously, you would see that the node is not running any resources, but not know why, unless you thought to check every node health attribute.

-- 
Ken Gaillot <kgail...@redhat.com>
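
For illustration, here is a minimal sketch of how the new meta-attribute might look in the CIB for a cloned health agent. It assumes the ocf:pacemaker:HealthCPU agent is used; all of the id values and the 10s monitor interval are hypothetical and only there to make the fragment complete. Only the "allow-unhealthy-nodes" name and its "true" value come from the announcement above.

```xml
<!-- Hypothetical example: ids and the monitor interval are illustrative only -->
<clone id="health-cpu-clone">
  <primitive id="health-cpu" class="ocf" provider="pacemaker" type="HealthCPU">
    <meta_attributes id="health-cpu-meta">
      <!-- Let this agent keep running on nodes whose health is yellow or red -->
      <nvpair id="health-cpu-meta-allow-unhealthy-nodes"
              name="allow-unhealthy-nodes" value="true"/>
    </meta_attributes>
    <operations>
      <op id="health-cpu-monitor-10s" name="monitor" interval="10s"/>
    </operations>
  </primitive>
</clone>
```

With something like this in place, the agent should stay on a degraded node and be able to update the node's health attributes once the condition clears, while other resources remain subject to the usual health-related bans.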