A bit more info:

If, after I restart the failed dirsrv instance, I then perform a "pcs resource cleanup dirsrv-daemon" to clear the FAILED status and Failed actions entries, then the next failover works OK. So it's as if the cleanup is changing the recorded status in some way.
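For reference, the sequence I use to check and clear the recorded failures is roughly the following (a rough sketch; exact options and output may differ with other pcs/pacemaker versions):

    # show the per-node fail counts recorded for the dirsrv clone
    pcs resource failcount show dirsrv-daemon

    # one-shot status view that includes fail counts
    crm_mon -1 --failcounts

    # clear the failed actions and fail counts so the node becomes eligible again
    pcs resource cleanup dirsrv-daemon

I did wonder whether, with migration-threshold=1 and no failure-timeout set, it is the lingering fail count that keeps ga1 ineligible for the IP until the cleanup, but I haven't confirmed that yet.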
From: Bernie Jones [mailto:ber...@securityconsulting.ltd.uk]
Sent: 10 March 2016 08:47
To: 'Cluster Labs - All topics related to open-source clustering welcomed'
Subject: [ClusterLabs] Floating IP failing over but not failing back with active/active LDAP (dirsrv)

Hi all, could you advise please?

I'm trying to configure a floating IP with an active/active deployment of 389 Directory Server. I don't want Pacemaker to manage LDAP, just to monitor it and switch the IP as required to provide resilience. I've seen some other similar threads and based my solution on those.

I've amended the OCF resource agent for slapd to work with 389 DS, and this tests out OK (dirsrv). I've then created my resources as below:

pcs resource create dirsrv-ip ocf:heartbeat:IPaddr2 ip="192.168.26.100" cidr_netmask="32" op monitor timeout="20s" interval="5s" op start interval="0" timeout="20" op stop interval="0" timeout="20"

pcs resource create dirsrv-daemon ocf:heartbeat:dirsrv op monitor interval="10" timeout="5" op start interval="0" timeout="5" op stop interval="0" timeout="5" meta "is-managed=false"

pcs resource clone dirsrv-daemon meta globally-unique="false" interleave="true" target-role="Started" "master-max=2"

pcs constraint colocation add dirsrv-daemon-clone with dirsrv-ip score=INFINITY

pcs property set no-quorum-policy=ignore
pcs resource defaults migration-threshold=1
pcs property set stonith-enabled=false
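To double-check what actually ended up in the configuration, I dump it with the commands below (pcs 0.9.x syntax here; newer pcs releases name some of these subcommands differently):

    # full resource definitions, including operations and meta attributes
    pcs resource show --full

    # all colocation/ordering/location constraints
    pcs constraint show --full

    # cluster properties (no-quorum-policy, stonith-enabled, ...)
    pcs property list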
On startup all looks well:

____________________________________________________________________________

Last updated: Thu Mar 10 08:28:03 2016
Last change: Thu Mar 10 08:26:14 2016
Stack: cman
Current DC: ga2.idam.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured

Online: [ ga1.idam.com ga2.idam.com ]

 dirsrv-ip   (ocf::heartbeat:IPaddr2):   Started ga1.idam.com
 Clone Set: dirsrv-daemon-clone [dirsrv-daemon]
     dirsrv-daemon   (ocf::heartbeat:dirsrv):   Started ga2.idam.com (unmanaged)
     dirsrv-daemon   (ocf::heartbeat:dirsrv):   Started ga1.idam.com (unmanaged)
____________________________________________________________________________

Stop dirsrv on ga1:

Last updated: Thu Mar 10 08:28:43 2016
Last change: Thu Mar 10 08:26:14 2016
Stack: cman
Current DC: ga2.idam.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured

Online: [ ga1.idam.com ga2.idam.com ]

 dirsrv-ip   (ocf::heartbeat:IPaddr2):   Started ga2.idam.com
 Clone Set: dirsrv-daemon-clone [dirsrv-daemon]
     dirsrv-daemon   (ocf::heartbeat:dirsrv):   Started ga2.idam.com (unmanaged)
     dirsrv-daemon   (ocf::heartbeat:dirsrv):   FAILED ga1.idam.com (unmanaged)

Failed actions:
    dirsrv-daemon_monitor_10000 on ga1.idam.com 'not running' (7): call=12, status=complete, last-rc-change='Thu Mar 10 08:28:41 2016', queued=0ms, exec=0ms

The IP fails over to ga2 OK.
____________________________________________________________________________

Restart dirsrv on ga1:

Last updated: Thu Mar 10 08:30:01 2016
Last change: Thu Mar 10 08:26:14 2016
Stack: cman
Current DC: ga2.idam.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured

Online: [ ga1.idam.com ga2.idam.com ]

 dirsrv-ip   (ocf::heartbeat:IPaddr2):   Started ga2.idam.com
 Clone Set: dirsrv-daemon-clone [dirsrv-daemon]
     dirsrv-daemon   (ocf::heartbeat:dirsrv):   Started ga2.idam.com (unmanaged)
     dirsrv-daemon   (ocf::heartbeat:dirsrv):   Started ga1.idam.com (unmanaged)

Failed actions:
    dirsrv-daemon_monitor_10000 on ga1.idam.com 'not running' (7): call=12, status=complete, last-rc-change='Thu Mar 10 08:28:41 2016', queued=0ms, exec=0ms
____________________________________________________________________________

Stop dirsrv on ga2:

Last updated: Thu Mar 10 08:31:14 2016
Last change: Thu Mar 10 08:26:14 2016
Stack: cman
Current DC: ga2.idam.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
3 Resources configured

Online: [ ga1.idam.com ga2.idam.com ]

 dirsrv-ip   (ocf::heartbeat:IPaddr2):   Started ga2.idam.com
 Clone Set: dirsrv-daemon-clone [dirsrv-daemon]
     dirsrv-daemon   (ocf::heartbeat:dirsrv):   FAILED ga2.idam.com (unmanaged)
     dirsrv-daemon   (ocf::heartbeat:dirsrv):   Started ga1.idam.com (unmanaged)

Failed actions:
    dirsrv-daemon_monitor_10000 on ga2.idam.com 'not running' (7): call=11, status=complete, last-rc-change='Thu Mar 10 08:31:12 2016', queued=0ms, exec=0ms
    dirsrv-daemon_monitor_10000 on ga1.idam.com 'not running' (7): call=12, status=complete, last-rc-change='Thu Mar 10 08:28:41 2016', queued=0ms, exec=0ms

But the IP stays on the failed node.

Looking in the logs, it seems the cluster is not aware that ga1 is available, even though the status output shows that it is. If I repeat the tests with ga2 started up first, the behaviour is similar, i.e. the IP fails over to ga1 but does not fail back to ga2.

Many thanks,
Bernie
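P.S. One thing I still mean to try is looking at the allocation scores rather than just the logs; as far as I understand, running crm_simulate against the live CIB should show the score each node is given for dirsrv-ip and whether ga1 is being ruled out:

    # replay the current cluster state through the policy engine and show allocation scores
    crm_simulate --live-check --show-scores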