I have a simple resource defined:
[root@ha-d1 ~]# pcs resource show dmz1
 Resource: dmz1 (class=ocf provider=internal type=ip-address)
  Attributes: address=172.16.10.192 monitor_link=true
  Meta Attrs: migration-threshold=3 failure-timeout=30s
  Operations: monitor interval=7s (dmz1-monitor-interval-7s)
This is a custom resource agent that provides an Ethernet alias on one of the
interfaces on our system.
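For reference, meta attributes like these are set with a pcs one-liner, roughly:

[root@ha-d1 ~]# pcs resource meta dmz1 migration-threshold=3 failure-timeout=30s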
I can unplug the cable on either node and failover occurs as expected; 30s
after re-plugging it (i.e. once the failure-timeout has expired), I can repeat
the exercise on the opposite node and failover again happens as expected.
However, if I unplug the cable from both nodes, the failcount on each node
climbs past the migration-threshold, the 30s failure-timeout never resets the
failcounts, and Pacemaker never tries to start the failed resource again.
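The only way I have found to recover is to clear the failcounts by hand, with
something like:

[root@ha-d1 ~]# pcs resource cleanup dmz1

after which Pacemaker is willing to start dmz1 again.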
Full list of resources:
 Resource Group: network
     inif       (ocf::internal:ip.sh):  Started ha-d1.dev.com
     outif      (ocf::internal:ip.sh):  Started ha-d2.dev.com
     dmz1       (ocf::internal:ip.sh):  Stopped
 Master/Slave Set: DRBDMaster [DRBDSlave]
     Masters: [ ha-d1.dev.com ]
     Slaves: [ ha-d2.dev.com ]
 Resource Group: filesystem
     DRBDFS     (ocf::heartbeat:Filesystem):    Stopped
 Resource Group: application
     service_failover   (ocf::internal:service_failover):   Stopped
Failcounts for dmz1
ha-d1.dev.com: 4
ha-d2.dev.com: 4
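(That output is from pcs resource failcount show dmz1; querying with
crm_failcount -G -r dmz1 -N <node> should report the same values.)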
Is there any way to automatically recover from this scenario, other than
setting an obnoxiously high migration-threshold?
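By "obnoxiously high" I mean something along the lines of:

[root@ha-d1 ~]# pcs resource meta dmz1 migration-threshold=1000000

which would keep Pacemaker retrying indefinitely, but masks a genuinely broken
resource.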
--
Sam Gardner
Software Engineer
Trustwave | SMART SECURITY ON DEMAND