Hi everybody,

We have been using corosync directly to provide clustering for GFS2 on our 
CentOS 7.2 pools with only one network interface, and everything has been 
working well so far.

We now have a new set-up with two network interfaces for every host in the 
cluster:
A -> 1 Gbit (the one we would like corosync to use, 10.220.88.X)
B -> 10 Gbit (used for the iSCSI connection to storage, 10.220.246.X)

When we run corosync on hosts with both interfaces, the logs are continuously 
spammed with messages like these:

[12880] cl15-02 corosyncdebug   [TOTEM ] entering GATHER state from 0(consensus 
timeout).
[12880] cl15-02 corosyncdebug   [TOTEM ] Creating commit token because I am the 
rep.
[12880] cl15-02 corosyncdebug   [TOTEM ] Saving state aru 10 high seq received 
10
[12880] cl15-02 corosyncdebug   [MAIN  ] Storing new sequence id for ring 5750
[12880] cl15-02 corosyncdebug   [TOTEM ] entering COMMIT state.
[12880] cl15-02 corosyncdebug   [TOTEM ] got commit token
[12880] cl15-02 corosyncdebug   [TOTEM ] entering RECOVERY state.
[12880] cl15-02 corosyncdebug   [TOTEM ] TRANS [0] member 10.220.88.41:
[12880] cl15-02 corosyncdebug   [TOTEM ] TRANS [1] member 10.220.88.47:
[12880] cl15-02 corosyncdebug   [TOTEM ] position [0] member 10.220.88.41:
[12880] cl15-02 corosyncdebug   [TOTEM ] previous ring seq 574c rep 10.220.88.41
[12880] cl15-02 corosyncdebug   [TOTEM ] aru 10 high delivered 10 received flag 
1
[12880] cl15-02 corosyncdebug   [TOTEM ] position [1] member 10.220.88.47:
[12880] cl15-02 corosyncdebug   [TOTEM ] previous ring seq 574c rep 10.220.88.41
[12880] cl15-02 corosyncdebug   [TOTEM ] aru 10 high delivered 10 received flag 
1

[12880] cl15-02 corosyncdebug   [TOTEM ] Did not need to originate any messages 
in recovery.
[12880] cl15-02 corosyncdebug   [TOTEM ] got commit token
[12880] cl15-02 corosyncdebug   [TOTEM ] Sending initial ORF token
[12880] cl15-02 corosyncdebug   [TOTEM ] token retrans flag is 0 my set retrans 
flag0 retrans queue empty 1 count 0, aru 0
[12880] cl15-02 corosyncdebug   [TOTEM ] install seq 0 aru 0 high seq received 0
[12880] cl15-02 corosyncdebug   [TOTEM ] token retrans flag is 0 my set retrans 
flag0 retrans queue empty 1 count 1, aru 0
[12880] cl15-02 corosyncdebug   [TOTEM ] install seq 0 aru 0 high seq received 0
[12880] cl15-02 corosyncdebug   [TOTEM ] token retrans flag is 0 my set retrans 
flag0 retrans queue empty 1 count 2, aru 0
[12880] cl15-02 corosyncdebug   [TOTEM ] install seq 0 aru 0 high seq received 0
[12880] cl15-02 corosyncdebug   [TOTEM ] token retrans flag is 0 my set retrans 
flag0 retrans queue empty 1 count 3, aru 0
[12880] cl15-02 corosyncdebug   [TOTEM ] install seq 0 aru 0 high seq received 0
[12880] cl15-02 corosyncdebug   [TOTEM ] retrans flag count 4 token aru 0 
install seq 0 aru 0 0
[12880] cl15-02 corosyncdebug   [TOTEM ] Resetting old ring state
[12880] cl15-02 corosyncdebug   [TOTEM ] recovery to regular 1-0
[12880] cl15-02 corosyncdebug   [TOTEM ] waiting_trans_ack changed to 1
Apr 11 16:19:54 [13372] cl15-02 pacemakerd:     info: pcmk_quorum_notification: 
Membership 22352: quorum retained (2)
Apr 11 16:19:54 [13378] cl15-02       crmd:     info: pcmk_quorum_notification: 
Membership 22352: quorum retained (2)
[12880] cl15-02 corosyncdebug   [TOTEM ] entering OPERATIONAL state.
[12880] cl15-02 corosyncnotice  [TOTEM ] A new membership (10.220.88.41:22352) 
was formed. Members
[12880] cl15-02 corosyncdebug   [SYNC  ] Committing synchronization for 
corosync configuration map access
Apr 11 16:19:54 [13373] cl15-02        cib:     info: cib_process_request:      
Forwarding cib_modify operation for section nodes to master 
(origin=local/crmd/27157)
[12880] cl15-02 corosyncdebug   [CMAP  ] Not first sync -> no action
Apr 11 16:19:54 [13373] cl15-02        cib:     info: cib_process_request:      
Forwarding cib_modify operation for section status to master 
(origin=local/crmd/27158)
[12880] cl15-02 corosyncdebug   [CPG   ] got joinlist message from node 0x2
[12880] cl15-02 corosyncdebug   [CPG   ] comparing: sender r(0) 
ip(10.220.88.41) ; members(old:2 left:0)
[12880] cl15-02 corosyncdebug   [CPG   ] comparing: sender r(0) 
ip(10.220.88.47) ; members(old:2 left:0)
[12880] cl15-02 corosyncdebug   [CPG   ] chosen downlist: sender r(0) 
ip(10.220.88.41) ; members(old:2 left:0)
[12880] cl15-02 corosyncdebug   [CPG   ] got joinlist message from node 0x1
[12880] cl15-02 corosyncdebug   [SYNC  ] Committing synchronization for 
corosync cluster closed process group service v1.01
Apr 11 16:19:54 [13373] cl15-02        cib:     info: cib_process_request:      
Completed cib_modify operation for section nodes: OK (rc=0, 
origin=cl15-02/crmd/27157, version=0.18.22)
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[0] group:clvmd, 
ip:r(0) ip(10.220.88.41) , pid:35677
Apr 11 16:19:54 [13373] cl15-02        cib:     info: cib_process_request:      
Completed cib_modify operation for section status: OK (rc=0, 
origin=cl15-02/crmd/27158, version=0.18.22)
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[1] 
group:dlm:ls:clvmd\x00, ip:r(0) ip(10.220.88.41) , pid:34995
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[2] 
group:dlm:controld\x00, ip:r(0) ip(10.220.88.41) , pid:34995
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[3] group:crmd\x00, 
ip:r(0) ip(10.220.88.41) , pid:13378
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[4] group:attrd\x00, 
ip:r(0) ip(10.220.88.41) , pid:13376
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[5] 
group:stonith-ng\x00, ip:r(0) ip(10.220.88.41) , pid:13374
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[6] group:cib\x00, 
ip:r(0) ip(10.220.88.41) , pid:13373
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[7] 
group:pacemakerd\x00, ip:r(0) ip(10.220.88.41) , pid:13372
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[8] group:crmd\x00, 
ip:r(0) ip(10.220.88.47) , pid:12879
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[9] group:attrd\x00, 
ip:r(0) ip(10.220.88.47) , pid:12877
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[10] 
group:stonith-ng\x00, ip:r(0) ip(10.220.88.47) , pid:12875
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[11] group:cib\x00, 
ip:r(0) ip(10.220.88.47) , pid:12874
[12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[12] 
group:pacemakerd\x00, ip:r(0) ip(10.220.88.47) , pid:12873
[12880] cl15-02 corosyncdebug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA 
Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No 
QdeviceMasterWins: No
[12880] cl15-02 corosyncdebug   [VOTEQ ] got nodeinfo message from cluster node 
1
[12880] cl15-02 corosyncdebug   [VOTEQ ] nodeinfo message[1]: votes: 1, 
expected: 3 flags: 1
[12880] cl15-02 corosyncdebug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA 
Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No 
QdeviceMasterWins: No
[12880] cl15-02 corosyncdebug   [VOTEQ ] total_votes=2, expected_votes=3
[12880] cl15-02 corosyncdebug   [VOTEQ ] node 1 state=1, votes=1, expected=3
[12880] cl15-02 corosyncdebug   [VOTEQ ] node 2 state=1, votes=1, expected=3
[12880] cl15-02 corosyncdebug   [VOTEQ ] node 3 state=2, votes=1, expected=3
[12880] cl15-02 corosyncdebug   [VOTEQ ] lowest node id: 1 us: 1
[12880] cl15-02 corosyncdebug   [VOTEQ ] highest node id: 2 us: 1
[12880] cl15-02 corosyncdebug   [VOTEQ ] got nodeinfo message from cluster node 
1
[12880] cl15-02 corosyncdebug   [VOTEQ ] nodeinfo message[0]: votes: 0, 
expected: 0 flags: 0
[12880] cl15-02 corosyncdebug   [VOTEQ ] got nodeinfo message from cluster node 
2
[12880] cl15-02 corosyncdebug   [VOTEQ ] nodeinfo message[2]: votes: 1, 
expected: 3 flags: 1
[12880] cl15-02 corosyncdebug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA 
Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No 
QdeviceMasterWins: No
[12880] cl15-02 corosyncdebug   [VOTEQ ] got nodeinfo message from cluster node 
2
[12880] cl15-02 corosyncdebug   [VOTEQ ] nodeinfo message[0]: votes: 0, 
expected: 0 flags: 0
[12880] cl15-02 corosyncdebug   [SYNC  ] Committing synchronization for 
corosync vote quorum service v1.0
[12880] cl15-02 corosyncdebug   [VOTEQ ] total_votes=2, expected_votes=3
[12880] cl15-02 corosyncdebug   [VOTEQ ] node 1 state=1, votes=1, expected=3
[12880] cl15-02 corosyncdebug   [VOTEQ ] node 2 state=1, votes=1, expected=3
[12880] cl15-02 corosyncdebug   [VOTEQ ] node 3 state=2, votes=1, expected=3
[12880] cl15-02 corosyncdebug   [VOTEQ ] lowest node id: 1 us: 1
[12880] cl15-02 corosyncdebug   [VOTEQ ] highest node id: 2 us: 1
[12880] cl15-02 corosyncnotice  [QUORUM] Members[2]: 1 2
[12880] cl15-02 corosyncdebug   [QUORUM] sending quorum notification to (nil), 
length = 56
[12880] cl15-02 corosyncnotice  [MAIN  ] Completed service synchronization, 
ready to provide service.
[12880] cl15-02 corosyncdebug   [TOTEM ] waiting_trans_ack changed to 0
[12880] cl15-02 corosyncdebug   [QUORUM] got quorate request on 0x7f5a907749a0
[12880] cl15-02 corosyncdebug   [TOTEM ] entering GATHER state from 11(merge 
during join).


We do not get these messages when the systems have only a single network 
interface.
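
To put a number on how often the ring re-forms, the TOTEM state transitions in 
the log can be counted. The following is only a minimal sketch: it assumes each 
log entry sits on one line in the real log file (the excerpt above is wrapped by 
the mail client), and the function name count_totem_states is hypothetical, not 
part of corosync.

```python
import re
from collections import Counter

# Matches lines such as "... [TOTEM ] entering GATHER state from 0(...)".
STATE_RE = re.compile(r"\[TOTEM \] entering (\w+) state")


def count_totem_states(lines):
    """Count how many times each TOTEM state is entered in the given log lines."""
    counts = Counter()
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the excerpt above, with the mail wrapping undone.
    sample = [
        "[12880] cl15-02 corosyncdebug   [TOTEM ] entering GATHER state from 0(consensus timeout).",
        "[12880] cl15-02 corosyncdebug   [TOTEM ] entering COMMIT state.",
        "[12880] cl15-02 corosyncdebug   [TOTEM ] entering RECOVERY state.",
        "[12880] cl15-02 corosyncdebug   [TOTEM ] entering OPERATIONAL state.",
        "[12880] cl15-02 corosyncdebug   [TOTEM ] entering GATHER state from 11(merge during join).",
    ]
    for state, n in count_totem_states(sample).items():
        print(state, n)
```

Running it over /var/log/cluster/corosync.log (for example with 
count_totem_states(open(path))) would show how many GATHER transitions occur 
over a given period.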

--------------------------------------------------------------------------------------
These are the network configurations on the three hosts:

[root@cl15-02 ~]# ifconfig | grep inet
        inet 10.220.88.41  netmask 255.255.248.0  broadcast 10.220.95.255
        inet 10.220.246.50  netmask 255.255.255.0  broadcast 10.220.246.255
        inet 127.0.0.1  netmask 255.0.0.0

[root@cl15-08 ~]# ifconfig | grep inet
        inet 10.220.88.47  netmask 255.255.248.0  broadcast 10.220.95.255
        inet 10.220.246.51  netmask 255.255.255.0  broadcast 10.220.246.255
        inet 127.0.0.1  netmask 255.0.0.0

[root@cl15-09 ~]# ifconfig | grep inet
        inet 10.220.88.48  netmask 255.255.248.0  broadcast 10.220.95.255
        inet 10.220.246.59  netmask 255.255.255.0  broadcast 10.220.246.255
        inet 127.0.0.1  netmask 255.0.0.0
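
Since the nodelist uses host names rather than addresses, it may be worth 
checking which of the two networks each name resolves to. The sketch below is 
an assumption-laden helper, not a corosync tool: the two networks are derived 
from the ifconfig output above, the function names are hypothetical, and how 
cl15-02, cl15-08 and cl15-09 actually resolve is not shown in this post.

```python
import ipaddress
import socket

# Networks derived from the ifconfig output above (address plus netmask).
CLUSTER_NET = ipaddress.ip_network("10.220.88.41/255.255.248.0", strict=False)
ISCSI_NET = ipaddress.ip_network("10.220.246.50/255.255.255.0", strict=False)


def classify(addr):
    """Return which network an IPv4 address belongs to: cluster, iscsi, or other."""
    ip = ipaddress.ip_address(addr)
    if ip in CLUSTER_NET:
        return "cluster"
    if ip in ISCSI_NET:
        return "iscsi"
    return "other"


def classify_host(hostname):
    """Resolve a host name and classify the address it maps to (needs a working resolver)."""
    addr = socket.gethostbyname(hostname)
    return addr, classify(addr)


if __name__ == "__main__":
    # Addresses taken from the ifconfig output of cl15-02.
    for addr in ("10.220.88.41", "10.220.246.50", "127.0.0.1"):
        print(addr, classify(addr))
```

On each node, classify_host("cl15-02") and so on would show whether a name 
resolves to the 1 Gbit network that corosync is meant to use.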

-----------------------------------------------------------------------------------
corosync-quorumtool output:

[root@cl15-02 ~]# corosync-quorumtool
Quorum information
------------------
Date:             Mon Apr 11 15:46:26 2016
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          1
Ring ID:          18952
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
         1          1 cl15-02 (local)
         2          1 cl15-08
         3          1 cl15-09
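
For reference, the "Quorum: 2" value above follows from the simple majority 
rule applied to the expected votes. The sketch below illustrates only that 
default rule, and assumes no special votequorum options (such as two_node) are 
set, which matches the corosync.conf shown further down; the function names are 
hypothetical.

```python
def quorum_threshold(expected_votes):
    """Simple majority: more than half of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes, expected_votes):
    """True if the votes currently present reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    # Values from the corosync-quorumtool output above: 3 expected votes.
    print(quorum_threshold(3))
    # Values from the log excerpt: total_votes=2, expected_votes=3.
    print(is_quorate(2, 3))
```

This is consistent with the log excerpt, where two of the three nodes are 
members and the partition still reports "quorate: Yes".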

---------------------------------------------------------------------------
/etc/corosync/corosync.conf:

[root@cl15-02 ~]# cat /etc/corosync/corosync.conf
totem {
    version: 2
    secauth: off
    cluster_name: gfs_cluster
    transport: udpu
}

nodelist {
    node {
        ring0_addr: cl15-02
        nodeid: 1
    }

    node {
        ring0_addr: cl15-08
        nodeid: 2
    }

    node {
        ring0_addr: cl15-09
        nodeid: 3
    }
}

quorum {
    provider: corosync_votequorum
}

logging {
    debug: on
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}

-- 
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
