Re: [ClusterLabs] Nodes see each other as OFFLINE - fence agent (fence_pcmk) may not be working properly on RHEL 6.5
On 12/16/2016 07:46 AM, avinash shankar wrote:
[...]
> I bring up cluster using pcs cluster start --all
> also done pcs property set stonith-enabled=false

fence_pcmk simply tells CMAN to use Pacemaker's fencing ... it can't
work if Pacemaker's fencing is disabled.
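For this to work, stonith has to be enabled again and Pacemaker needs a
real fence device to execute. For libvirt guests that is usually
fence_xvm. A minimal sketch, assuming fence_virtd is already configured
on the hypervisor and the key has been distributed to the guests (the
resource names and key path below are examples, not taken from your
configuration):

# pcs property set stonith-enabled=true
# pcs stonith create fence_cnode1 fence_xvm port="cnode1" \
      pcmk_host_list="cnode1" key_file="/etc/cluster/fence_xvm.key"
# pcs stonith create fence_cnode2 fence_xvm port="cnode2" \
      pcmk_host_list="cnode2" key_file="/etc/cluster/fence_xvm.key"

You can check that the agent can reach fence_virtd before involving the
cluster by running fence_xvm -o list on each node.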
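For reference, the CMAN side of a Pacemaker cluster on RHEL 6 normally
just redirects every fence request to Pacemaker through fence_pcmk. The
fencing-related part of a typical pcs-generated cluster.conf (a sketch
using your node names, not your actual file) looks like:

  <clusternodes>
    <clusternode name="cnode1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="cnode1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="cnode2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="cnode2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>

With that redirect in place, any fence request CMAN issues ends up at
pacemaker's stonithd, which is exactly why it fails while
stonith-enabled=false.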
[ClusterLabs] Nodes see each other as OFFLINE - fence agent (fence_pcmk) may not be working properly on RHEL 6.5
Hello team,

I am a newbie in pacemaker and corosync cluster. I am facing trouble
with fence_agent on RHEL 6.5. I have installed pcs, pacemaker, corosync
and cman on RHEL 6.5 on a two-node virtual (libvirt) cluster. SELinux
and the firewall are completely disabled.

# yum list installed | egrep 'pacemaker|corosync|cman|fence'
cman.x86_64                    3.0.12.1-78.el6    @rhel-ha-for-rhel-6-server-rpms
corosync.x86_64                1.4.7-5.el6        @rhel-ha-for-rhel-6-server-rpms
corosynclib.x86_64             1.4.7-5.el6        @rhel-ha-for-rhel-6-server-rpms
fence-agents.x86_64            4.0.15-12.el6      @rhel-6-server-rpms
fence-virt.x86_64              0.2.3-19.el6       @rhel-ha-for-rhel-6-server-eus-rpms
pacemaker.x86_64               1.1.14-8.el6_8.2   @rhel-ha-for-rhel-6-server-rpms
pacemaker-cli.x86_64           1.1.14-8.el6_8.2   @rhel-ha-for-rhel-6-server-rpms
pacemaker-cluster-libs.x86_64  1.1.14-8.el6_8.2   @rhel-ha-for-rhel-6-server-rpms
pacemaker-libs.x86_64          1.1.14-8.el6_8.2   @rhel-ha-for-rhel-6-server-rpms

I bring up the cluster using pcs cluster start --all and have also done
pcs property set stonith-enabled=false.

Below is the status:
---
# pcs status
Cluster name: roamclus
Last updated: Fri Dec 16 18:54:40 2016    Last change: Fri Dec 16 17:44:50 2016 by root via cibadmin on cnode1
Stack: cman
Current DC: NONE
2 nodes and 2 resources configured

Online: [ cnode1 ]
OFFLINE: [ cnode2 ]

Full list of resources:

PCSD Status:
  cnode1: Online
  cnode2: Online
---

The same kind of output is observed on the other node (cnode2), so the
nodes see each other as OFFLINE. The expected result is
Online: [ cnode1 cnode2 ].

I did the same package installation on RHEL 6.8, and when I start the
cluster there, both nodes show each other as ONLINE.

I need to resolve this so that on the RHEL 6.5 nodes, when we start the
cluster, both nodes display each other's status as online by default.

--
Below is the /etc/cluster/cluster.conf:

(the cluster.conf XML did not survive the mail archive)

--
# cat /var/lib/pacemaker/cib/cib.xml

(most of the cib.xml was likewise stripped; the attribute fragments that
survive are:)

num_updates="0" admin_epoch="0" cib-last-written="Fri Dec 16 18:57:10 2016" update-origin="cnode1" update-client="cibadmin" update-user="root" have-quorum="1" dc-uuid="cnode1">
name="have-watchdog" value="false"/>
value="1.1.14-8.el6_8.2-70404b0"/>
name="cluster-infrastructure" value="cman"/>
name="stonith-enabled" value="false"/>

/var/log/messages has the below contents:

Dec 15 20:29:43 cnode2 kernel: DLM (built Oct 26 2016 10:26:08) installed
Dec 15 20:29:46 cnode2 corosync[2464]: [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Dec 15 20:29:46 cnode2 corosync[2464]: [MAIN ] Corosync built-in features: nss dbus rdma snmp
Dec 15 20:29:46 cnode2 corosync[2464]: [MAIN ] Successfully read config from /etc/cluster/cluster.conf
Dec 15 20:29:46 cnode2 corosync[2464]: [MAIN ] Successfully parsed cman config
Dec 15 20:29:46 cnode2 corosync[2464]: [TOTEM ] Initializing transport (UDP/IP Multicast).
Dec 15 20:29:46 cnode2 corosync[2464]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Dec 15 20:29:46 cnode2 corosync[2464]: [TOTEM ] The network interface [10.10.18.138] is now up.
Dec 15 20:29:46 cnode2 corosync[2464]: [QUORUM] Using quorum provider quorum_cman
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Dec 15 20:29:46 cnode2 corosync[2464]: [CMAN ] CMAN 3.0.12.1 (built Feb 1 2016 07:06:19) started
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: corosync CMAN membership service 2.90
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: openais checkpoint service B.01.01
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: corosync configuration service
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: corosync profile loading service
Dec 15 20:29:46 cnode2 corosync[2464]: [QUORUM] Using quorum provider quorum_cman
Dec 15 20:29:46 cnode2 corosync[2464]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Dec 15 20:29:46 cnode2 corosync[2464]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Dec 15 20:29:46 cnode2 corosync[2464]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Dec 15 20:29:46 cnode2 corosync[2464]: [CMAN ] quorum regained, resuming activity
Dec 15 20:29:46 cnode2 corosync[2464]: [QUORUM] This node is within the primary component and will provide service.
Dec 15