I found something abnormal while dynamically adding a node to an existing Pacemaker cluster. My cluster is pcs/cman based, and the corosync/pacemaker versions are:

# rpm -qa | grep corosync
corosynclib-1.4.7-1.el6.x86_64
corosync-1.4.7-1.el6.x86_64

# rpm -qa | grep pacemaker
pacemaker-cli-1.1.12-4.el6.x86_64
pacemaker-1.1.12-4.el6.x86_64
pacemaker-cluster-libs-1.1.12-4.el6.x86_64
pacemaker-libs-1.1.12-4.el6.x86_64
The cluster currently has 3 nodes: node101, node103, and node192 (the DC node). Now I want to dynamically add node194 to the cluster, and something went wrong.

Before adding, the status of the cluster was:

# pcs status
Cluster name: kvm_storage
Last updated: Mon Sep 26 16:16:19 2016
Last change: Thu Sep 22 14:39:56 2016
Stack: cman
Current DC: 172.28.217.192 - partition with quorum
Version: 1.1.11-97629de
6 Nodes configured
18 Resources configured

Online: [ 172.28.217.101 172.28.217.103 172.28.217.192 ]
OFFLINE: [ 172.28.217.193 ]
.....
.....

On node101, the conf file /etc/cluster/cluster.conf is:

<?xml version="1.0"?>
<cluster config_version="4" name="kvm_storage">
  <logging debug="off"/>
  <clusternodes>
    <clusternode name="172.28.217.101" nodeid="1"/>
    <clusternode name="172.28.217.103" nodeid="2"/>
    <clusternode name="172.28.217.192" nodeid="3"/>
  </clusternodes>
  <logging>
    <logging_daemon debug="on" logfile="/var/log/cluster/corosync.log" name="corosync"/>
  </logging>
  <dlm enable_fencing="0"/>
  <totem token="32000"/>
  <quorumd interval="2" label="qdiskCluster73" min_score="7" tko="9">
    <heuristic program="ping -c 1 172.28.217.126 -t 2 -w 1" score="3"/>
    <heuristic program="ping -c 1 172.28.217.73 -t 2 -w 1" score="3"/>
    <heuristic program="/usr/local/odpm/checkFCStatus.sh" score="5"/>
  </quorumd>
</cluster>

Then I created the conf file on node194 based on node101's, adding only one line:

    <clusternode name="172.28.217.194" nodeid="4"/>

Then I started the cman and pacemaker services, and they started successfully. I used the command "ccs_sync -f /etc/cluster/cluster.conf" to push the conf file to the other nodes in the cluster.

[root@node194 ~]# service cman status
cluster is running.
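For what it's worth, before syncing I also sanity-checked the merged membership, i.e. that every clusternode in the edited file has a unique name and a unique nodeid. A minimal sketch of that check (the XML here is my node194 copy of cluster.conf, inlined and trimmed to the clusternodes section purely for illustration):

```python
import xml.etree.ElementTree as ET

# Inlined copy of the clusternodes part of /etc/cluster/cluster.conf
# on node194 (trimmed for illustration; the real file has more sections).
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster config_version="4" name="kvm_storage">
  <clusternodes>
    <clusternode name="172.28.217.101" nodeid="1"/>
    <clusternode name="172.28.217.103" nodeid="2"/>
    <clusternode name="172.28.217.192" nodeid="3"/>
    <clusternode name="172.28.217.194" nodeid="4"/>
  </clusternodes>
</cluster>
"""

root = ET.fromstring(CLUSTER_CONF)
nodes = [(n.get("name"), n.get("nodeid"))
         for n in root.iter("clusternode")]

names = [name for name, _ in nodes]
ids = [nodeid for _, nodeid in nodes]

# Every name and every nodeid must be unique; two entries sharing a
# nodeid would mean two different peers claim the same cluster identity.
assert len(set(names)) == len(names), "duplicate clusternode name"
assert len(set(ids)) == len(ids), "duplicate nodeid"
print("clusternodes:", nodes)
```

The file itself passes this check, which is why the duplicate-nodeid warning below surprised me.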
But after running "pcs status", I found something abnormal in the OFFLINE area:

[root@node194 ~]# pcs status
Cluster name: kvm_storage
Last updated: Mon Sep 26 16:19:40 2016
Last change: Mon Sep 26 14:56:26 2016
Stack: cman
Current DC: 172.28.217.192 - partition with quorum
Version: 1.1.11-97629de
6 Nodes configured
18 Resources configured

Online: [ 172.28.217.101 172.28.217.103 172.28.217.192 ]
OFFLINE: [ 172.28.217.193 *172.28.217.194* *Node4* ]

It looks as if node194 was added twice, once in IP form and once in node-name form, so the cluster was not able to bring this node online. I checked the relevant log on the DC node192 and found this in /var/log/cluster/corosync.log:

Sep 26 16:29:24 [4556] node192 crmd: warning: crm_find_peer: Node '*Node4*' and '*172.28.217.194*' share the same cluster nodeid: 4
Sep 26 16:29:24 [4556] node192 crmd: warning: crm_find_peer: Node 'Node4' and '172.28.217.194' share the same cluster nodeid: 4
Sep 26 16:29:24 [4556] node192 crmd: notice: election_count_vote: Election 35241 (current: 35241, owner: 172.28.217.192): Processed no-vote from 172.28.217.194 (*Peer is not part of our cluster*)

I don't know what to do about this. Is there any parameter in pacemaker's conf file to tie the IP address to the node name? If there are any problems, just mail me. Many thanks for your great support!!

--
View this message in context: http://linux-ha.996297.n3.nabble.com/Errors-occurred-when-dynamically-adding-node-to-existing-pacemaker-cluster-tp16247.html
Sent from the Linux-HA mailing list archive at Nabble.com.