I found something abnormal while dynamically adding a node to an existing Pacemaker cluster. My cluster is pcs/cman based, and the corosync/pacemaker versions are:

# rpm -qa | grep corosync
corosynclib-1.4.7-1.el6.x86_64
corosync-1.4.7-1.el6.x86_64

# rpm -qa | grep pacemaker
pacemaker-cli-1.1.12-4.el6.x86_64
pacemaker-1.1.12-4.el6.x86_64
pacemaker-cluster-libs-1.1.12-4.el6.x86_64
pacemaker-libs-1.1.12-4.el6.x86_64
The cluster currently has 3 nodes: node101, node103, and node192 (the DC node). Now I want to dynamically add node194 to the cluster, and something went wrong.

Before adding, the status of the cluster was:

# pcs status
Cluster name: kvm_storage
Last updated: Mon Sep 26 16:16:19 2016
Last change: Thu Sep 22 14:39:56 2016
Stack: cman
Current DC: 172.28.217.192 - partition with quorum
Version: 1.1.11-97629de
6 Nodes configured
18 Resources configured

Online: [ 172.28.217.101 172.28.217.103 172.28.217.192 ]
OFFLINE: [ 172.28.217.193 ]
.....
.....

On node101, the conf file /etc/cluster/cluster.conf is:

<?xml version="1.0"?>
<cluster config_version="4" name="kvm_storage">
  <logging debug="off"/>
  <clusternodes>
    <clusternode name="172.28.217.101" nodeid="1"/>
    <clusternode name="172.28.217.103" nodeid="2"/>
    <clusternode name="172.28.217.192" nodeid="3"/>
  </clusternodes>
  <logging>
    <logging_daemon debug="on" logfile="/var/log/cluster/corosync.log" name="corosync"/>
  </logging>
  <dlm enable_fencing="0"/>
  <totem token="32000"/>
  <quorumd interval="2" label="qdiskCluster73" min_score="7" tko="9">
    <heuristic program="ping -c 1 172.28.217.126 -t 2 -w 1" score="3"/>
    <heuristic program="ping -c 1 172.28.217.73 -t 2 -w 1" score="3"/>
    <heuristic program="/usr/local/odpm/checkFCStatus.sh" score="5"/>
  </quorumd>
</cluster>

Then I created the conf file on node194 based on node101's, adding only one line:

    <clusternode name="172.28.217.194" nodeid="4"/>

Then I started the cman and pacemaker services, and they started successfully. I used the command "ccs_sync -f /etc/cluster/cluster.conf" to push the conf file to the other nodes in the cluster.

[root@node194 ~]# service cman status
cluster is running.
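For what it's worth, before syncing I also sanity-checked the merged membership, i.e. that every clusternode in the edited file has a unique name and a unique nodeid. A minimal sketch of that check (the XML here is my node194 copy of cluster.conf, inlined and trimmed to the clusternodes section purely for illustration):

```python
import xml.etree.ElementTree as ET

# Inlined copy of the clusternodes part of /etc/cluster/cluster.conf
# on node194 (trimmed for illustration; the real file has more sections).
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster config_version="4" name="kvm_storage">
  <clusternodes>
    <clusternode name="172.28.217.101" nodeid="1"/>
    <clusternode name="172.28.217.103" nodeid="2"/>
    <clusternode name="172.28.217.192" nodeid="3"/>
    <clusternode name="172.28.217.194" nodeid="4"/>
  </clusternodes>
</cluster>
"""

root = ET.fromstring(CLUSTER_CONF)
nodes = [(n.get("name"), n.get("nodeid"))
         for n in root.iter("clusternode")]

names = [name for name, _ in nodes]
ids = [nodeid for _, nodeid in nodes]

# Every name and every nodeid must be unique; two entries sharing a
# nodeid would mean two different peers claim the same cluster identity.
assert len(set(names)) == len(names), "duplicate clusternode name"
assert len(set(ids)) == len(ids), "duplicate nodeid"
print("clusternodes:", nodes)
```

The file itself passes this check, which is why the duplicate-nodeid warning below surprised me.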
But after running "pcs status", I found something abnormal in the OFFLINE area:

[root@node194 ~]# pcs status
Cluster name: kvm_storage
Last updated: Mon Sep 26 16:19:40 2016
Last change: Mon Sep 26 14:56:26 2016
Stack: cman
Current DC: 172.28.217.192 - partition with quorum
Version: 1.1.11-97629de
6 Nodes configured
18 Resources configured

Online: [ 172.28.217.101 172.28.217.103 172.28.217.192 ]
OFFLINE: [ 172.28.217.193 *172.28.217.194* *Node4* ]

It looks as if node194 was added twice, once in IP form and once in node-name form, so the cluster was not able to bring this node online. I checked the relevant log on the DC node192 and found this in /var/log/cluster/corosync.log:

Sep 26 16:29:24 [4556] node192 crmd: warning: crm_find_peer: Node '*Node4*' and '*172.28.217.194*' share the same cluster nodeid: 4
Sep 26 16:29:24 [4556] node192 crmd: warning: crm_find_peer: Node 'Node4' and '172.28.217.194' share the same cluster nodeid: 4
Sep 26 16:29:24 [4556] node192 crmd: notice: election_count_vote: Election 35241 (current: 35241, owner: 172.28.217.192): Processed no-vote from 172.28.217.194 (*Peer is not part of our cluster*)

I don't know what to do about this. Is there any parameter in pacemaker's conf file to tie the IP address to the node name? If there are any problems, just mail me. Many thanks for your great support!!

--
View this message in context: http://linux-ha.996297.n3.nabble.com/Errors-occurred-when-dynamically-adding-node-to-existing-pacemaker-cluster-tp16247.html
Sent from the Linux-HA mailing list archive at Nabble.com.