Hi All,

We have tested a slightly unusual bundle configuration.
There is only one bundle resource, and it is associated with a group resource.
The behavior was confirmed with Pacemaker 1.1.19.

Step1) Configure the cluster.
--------
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:20:21 2018
Last change: Thu Dec 6 13:20:05 2018 by root via cibadmin on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host1 httpd-bundle2-0@cent7-host2 ]

Active resources:

 Resource Group: group1
   dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
 Resource Group: group2
   dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host1
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step2) Put cent7-host1 into standby so that the resources move to cent7-host2.
--------
[root@cent7-host1 ~]# crm_standby -v on
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:21:36 2018
Last change: Thu Dec 6 13:21:17 2018 by root via crm_attribute on cent7-host1

4 nodes configured
10 resources configured

Node cent7-host1 (3232262828): standby
Online: [ cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]

Active resources:

 Resource Group: group1
   dummy1 (ocf::pacemaker:Dummy): Started cent7-host2
 Resource Group: group2
   dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step3) Release cent7-host1 from standby.
--------
[root@cent7-host1 ~]# crm_standby -v off
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:21:59 2018
Last change: Thu Dec 6 13:21:56 2018 by root via crm_attribute on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]

Active resources:

 Resource Group: group1
   dummy1 (ocf::pacemaker:Dummy): Started cent7-host2
 Resource Group: group2
   dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step4) Move group1 to cent7-host1 so that the bundle resource also returns to cent7-host1.
--------
[root@cent7-host1 ~]# crm_resource -M -r group1 -H cent7-host1 -f -Q
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:22:56 2018
Last change: Thu Dec 6 13:22:36 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]

Active resources:

 Resource Group: group1
   dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
 Resource Group: group2
   dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step5) Remove the constraint added in Step4. At this point, the display shows that httpd-bundle1-0 has not moved to cent7-host1.
--------
[root@cent7-host1 ~]# crm_resource -U -r group1
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:23:21 2018
Last change: Thu Dec 6 13:23:17 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host2 httpd-bundle2-0@cent7-host2 ]

Active resources:

 Resource Group: group1
   dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
 Resource Group: group2
   dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0
--------

Step6) Connect to httpd-bundle1-docker-0 and kill pacemaker_remoted to cause a failure.
--------
[root@cent7-host1 ~]# docker exec -it httpd-bundle1-docker-0 /bin/bash
[root@httpd-bundle1-0 /]# ps -ef |grep remote
root 5 1 0 04:22 ? 00:00:00 /usr/sbin/pacemaker_remoted
root 133 120 0 04:23 ? 00:00:00 grep --color=auto remote
[root@httpd-bundle1-0 /]# kill -9 5;exit
--------

Finally, the cluster looks like this.
 - When pacemaker_remoted is killed, the bundle should fail over to cent7-host2, but it does not.
 - In addition, the failure caused on cent7-host1 in Step6 is reported as a failure on cent7-host2.
--------
[root@cent7-host1 ~]# crm_mon -R
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Stack: corosync
Current DC: cent7-host2 (3232262829) (version 1.1.19-c3c624ea3d) - partition with quorum
Last updated: Thu Dec 6 13:24:03 2018
Last change: Thu Dec 6 13:23:17 2018 by root via crm_resource on cent7-host1

4 nodes configured
10 resources configured

Online: [ cent7-host1 (3232262828) cent7-host2 (3232262829) ]
GuestOnline: [ httpd-bundle1-0@cent7-host1 httpd-bundle2-0@cent7-host2 ]

Active resources:

 Resource Group: group1
   dummy1 (ocf::pacemaker:Dummy): Started cent7-host1
 Resource Group: group2
   dummy2 (ocf::pacemaker:Dummy): Started cent7-host2
 Docker container: httpd-bundle1 [pcmktest:http]
   httpd-bundle1-ip-192.168.20.188 (ocf::heartbeat:IPaddr2): Started cent7-host1
   httpd-bundle1-docker-0 (ocf::heartbeat:docker): Started cent7-host1
   httpd-bundle1-0 (ocf::pacemaker:remote): Started cent7-host1
   httpd1 (ocf::heartbeat:apache): Started httpd-bundle1-0
 Docker container: httpd-bundle2 [pcmktest:http]
   httpd-bundle2-ip-192.168.20.190 (ocf::heartbeat:IPaddr2): Started cent7-host2
   httpd-bundle2-docker-0 (ocf::heartbeat:docker): Started cent7-host2
   httpd-bundle2-0 (ocf::pacemaker:remote): Started cent7-host2
   httpd2 (ocf::heartbeat:apache): Started httpd-bundle2-0

Failed Actions:
* httpd-bundle1-0_monitor_60000 on cent7-host2 'unknown error' (1): call=9, status=Error, exitreason='',
    last-rc-change='Thu Dec 6 13:23:49 2018', queued=0ms, exec=0ms
--------

The problem appears to be that when the bundle resource is moved in Step4, the remote resource does not move with it.
 - With the latest master (a3bf7116d2), we could not confirm this because the scheduler process went down.

 * This problem is registered in the following Bugzilla.
  - https://bugs.clusterlabs.org/show_bug.cgi?id=5373

Best Regards,
Hideo Yamauchi.
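
For reference, the inconsistent state seen in Step4 and Step5 (container on cent7-host1, remote connection on cent7-host2) could be detected mechanically from the crm_mon output. The following is only a rough sketch: check_bundle_placement is a hypothetical helper name, not a Pacemaker tool, and it assumes the resource naming and line layout shown in the output above (<bundle>-docker-<N> for the container, <bundle>-<N> for the remote connection, node name as the last field).

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker): reads one-shot `crm_mon -R`
# output on stdin and reports bundle replicas whose container and remote
# connection resource are shown on different nodes.
# Assumes the naming seen in this report: <bundle>-docker-<N> for the
# container and <bundle>-<N> for the remote connection.
check_bundle_placement() {
    awk '
        # Container line: resource id, agent, state, node
        $2 == "(ocf::heartbeat:docker):" {
            key = $1
            sub(/-docker-/, "-", key)   # httpd-bundle1-docker-0 -> httpd-bundle1-0
            container[key] = $NF
        }
        # Remote connection line with the same field layout
        $2 == "(ocf::pacemaker:remote):" {
            remote[$1] = $NF
        }
        END {
            status = 0
            for (k in remote) {
                if ((k in container) && container[k] != remote[k]) {
                    printf "MISMATCH %s: container on %s, remote connection on %s\n", k, container[k], remote[k]
                    status = 1
                }
            }
            exit status
        }
    '
}
```

It could be used as `crm_mon -R -1 | check_bundle_placement`; a non-zero exit status means at least one replica has its container and remote connection on different nodes. With the Step4 and Step5 output above it would flag httpd-bundle1-0.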