Re: [ClusterLabs] Users Digest, Vol 15, Issue 18
Thank you, Digimer. I will try it on CentOS 7. Thanks again for the help.

On Wed, Apr 13, 2016 at 1:00 PM, wrote:
>
> Today's Topics:
>
>    1. Re: Totem is unable to form a cluster because of an operating
>       system or network fault (Digimer)
>    2. HA meetup at OpenStack Summit (Ken Gaillot)
>    3. Re: HA meetup at OpenStack Summit (Digimer)
>    4. Re: Totem is unable to form a cluster because of an operating
>       system or network fault (Jan Friesse)
>
> --
>
> Message: 1
> Date: Tue, 12 Apr 2016 11:46:21 -0400
> From: Digimer
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: [ClusterLabs] Totem is unable to form a cluster because
>          of an operating system or network fault
> Message-ID: <570d184d.6000...@alteeve.ca>
> Content-Type: text/plain; charset=windows-1252
>
> On 12/04/16 07:44 AM, dinor geler wrote:
> > Hi,
> >
> > I am trying to configure MySQL on Ubuntu according to this article:
> > https://azure.microsoft.com/en-in/documentation/articles/virtual-machines-linux-classic-mysql-cluster/
> >
> > It is a two-node cluster. Looking at the corosync log:
> >
> > Apr 12 11:01:09 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:11 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:13 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:16 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:18 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:20 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:22 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:24 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:27 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:29 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> > Apr 12 11:01:31 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
> >
> > totem {
> >     version: 2
> >     crypto_cipher: none
> >     crypto_hash: none
> >     interface {
> >         ringnumber: 0
> >         bindnetaddr: 10.1.0.0
> >         mcastport: 5405
> >         ttl: 1
> >     }
> >     transport: udpu
> > }
> > logging {
> >     fileline: off
> >     to_logfile: yes
> >     to_syslog: yes
> >     logfile: /var/log/corosync/corosync.log
> >     debug: off
> >     timestamp: on
> >     logger_subsys {
> >         subsys: QUORUM
> >         debug: off
> >     }
> > }
> > nodelist {
> >     node {
> >         ring0_addr: 10.1.0.6
> >         nodeid: 1
> >     }
> >     node {
> >         ring0_addr: 10.1.0.7
> >         nodeid: 2
> >     }
> > }
> > quorum {
> >     provider: corosync_votequorum
> > }
> >
> > If I initiate a tcpdump on node 2 and start either a netcat or nmap, I
> > see the packet arrive at the destination host on port 5405 (UDP traffic).
> >
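Since the TOTEM message above names the local firewall as the most common cause, a quick sanity check on both nodes is to list the active rules and, if a firewall is running, explicitly allow corosync's UDP traffic. This is only a sketch, not something from the thread above -- it assumes a stock Ubuntu host with iptables/ufw, the 10.1.0.0/24 subnet, and port 5405 from the posted config; adjust to your environment:

    # show what the host firewall is currently doing
    sudo iptables -L -n -v
    sudo ufw status verbose

    # allow corosync traffic from the cluster subnet (udpu transport, port 5405)
    sudo ufw allow from 10.1.0.0/24 to any port 5405 proto udp
    # or, with iptables directly:
    sudo iptables -I INPUT -p udp -s 10.1.0.0/24 --dport 5405 -j ACCEPT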
Re: [ClusterLabs] service flap as nodes join and leave
On Wed, Apr 13, 2016, at 12:36 PM, Ken Gaillot wrote:
> On 04/13/2016 11:23 AM, Christopher Harvey wrote:
> > I have a 3 node cluster (see the bottom of this email for 'pcs config'
> > output). The MsgBB-Active and AD-Active services both flap whenever a
> > node joins or leaves the cluster. I trigger the leave and join with a
> > pacemaker service start and stop on any node.
>
> That's the default behavior of clones used in ordering constraints. If
> you set interleave=true on your clones, each dependent clone instance
> will only care about the depended-on instances on its own node, rather
> than all nodes.
>
> See
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_clone_options
>
> While the interleave=true behavior is much more commonly used,
> interleave=false is the default because it's safer -- the cluster
> doesn't know anything about the cloned service, so it can't assume the
> service is OK with it. Since you know what your service does, you can
> set interleave=true for services that can handle it.

Hi Ken,

Thanks for pointing out that attribute to me. I applied it as follows:

 Clone: Router-clone
  Meta Attrs: clone-max=2 clone-node-max=1 interleave=true
  Resource: Router (class=ocf provider=solace type=Router)
   Meta Attrs: migration-threshold=1 failure-timeout=1s
   Operations: start interval=0s timeout=2 (Router-start-interval-0s)
               stop interval=0s timeout=2 (Router-stop-interval-0s)
               monitor interval=1s (Router-monitor-interval-1s)

It doesn't seem to change the behavior. Moreover, I found that I can
start/stop the pacemaker instance on the vmr-132-5 node and produce the
same flap on the MsgBB-Active resource on the vmr-132-3 node. The Router
clones are never shut down or started. I would have thought that if
everything else in the cluster is constant, vmr-132-5 could never affect
resources on the other two.

> > Here is the happy steady state setup:
> >
> > 3 nodes and 4 resources configured
> >
> > Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> >
> > Clone Set: Router-clone [Router]
> >     Started: [ vmr-132-3 vmr-132-4 ]
> > MsgBB-Active    (ocf::solace:MsgBB-Active):     Started vmr-132-3
> > AD-Active       (ocf::solace:AD-Active):        Started vmr-132-3
> >
> > [root@vmr-132-4 ~]# supervisorctl stop pacemaker
> > no change, except vmr-132-4 goes offline
> > [root@vmr-132-4 ~]# supervisorctl start pacemaker
> > vmr-132-4 comes back online
> > MsgBB-Active and AD-Active flap very quickly (<1s)
> > Steady state is resumed.
> >
> > Why should vmr-132-4 coming and going affect the service on any
> > other node?
> >
> > Thanks,
> > Chris
> >
> > Cluster Name:
> > Corosync Nodes:
> >  192.168.132.5 192.168.132.4 192.168.132.3
> > Pacemaker Nodes:
> >  vmr-132-3 vmr-132-4 vmr-132-5
> >
> > Resources:
> >  Clone: Router-clone
> >   Meta Attrs: clone-max=2 clone-node-max=1
> >   Resource: Router (class=ocf provider=solace type=Router)
> >    Meta Attrs: migration-threshold=1 failure-timeout=1s
> >    Operations: start interval=0s timeout=2 (Router-start-timeout-2)
> >                stop interval=0s timeout=2 (Router-stop-timeout-2)
> >                monitor interval=1s (Router-monitor-interval-1s)
> >  Resource: MsgBB-Active (class=ocf provider=solace type=MsgBB-Active)
> >   Meta Attrs: migration-threshold=2 failure-timeout=1s
> >   Operations: start interval=0s timeout=2 (MsgBB-Active-start-timeout-2)
> >               stop interval=0s timeout=2 (MsgBB-Active-stop-timeout-2)
> >               monitor interval=1s (MsgBB-Active-monitor-interval-1s)
> >  Resource: AD-Active (class=ocf provider=solace type=AD-Active)
> >   Meta Attrs: migration-threshold=2 failure-timeout=1s
> >   Operations: start interval=0s timeout=2 (AD-Active-start-timeout-2)
> >               stop interval=0s timeout=2 (AD-Active-stop-timeout-2)
> >               monitor interval=1s (AD-Active-monitor-interval-1s)
> >
> > Stonith Devices:
> > Fencing Levels:
> >
> > Location Constraints:
> >   Resource: AD-Active
> >     Disabled on: vmr-132-5 (score:-INFINITY) (id:ADNotOnMonitor)
> >   Resource: MsgBB-Active
> >     Enabled on: vmr-132-4 (score:100) (id:vmr-132-4Priority)
> >     Enabled on: vmr-132-3 (score:250) (id:vmr-132-3Priority)
> >     Disabled on: vmr-132-5 (score:-INFINITY) (id:MsgBBNotOnMonitor)
> >   Resource: Router-clone
> >     Disabled on: vmr-132-5 (score:-INFINITY) (id:RouterNotOnMonitor)
> > Ordering Constraints:
> >   Resource Sets:
> >     set Router-clone MsgBB-Active sequential=true
> >     (id:pcs_rsc_set_Router-clone_MsgBB-Active) setoptions kind=Mandatory
> >     (id:pcs_rsc_order_Router-clone_MsgBB-Active)
> >     set MsgBB-Active AD-Active sequential=true
> >     (id:pcs_rsc_set_MsgBB-Active_AD-Active) setoptions kind=Mandatory
> >     (id:pcs_rsc_order_MsgBB-Active_AD-Active)
> > Colocation Constraints:
> >   MsgBB-Active with Router-clone (score:INFINITY)
> >   (id:colocation-MsgBB-
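As a follow-up diagnostic for the behavior described above, two checks can help narrow down why the resources still restart. This is only a sketch, assuming the pcs and pacemaker 1.1.13 tooling mentioned in the thread and the resource names from the config:

    # confirm the interleave meta attribute really landed on the clone
    pcs resource show Router-clone

    # replay the live CIB and show the placement scores, to see why the
    # policy engine wants to restart MsgBB-Active / AD-Active
    crm_simulate -sL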
Re: [ClusterLabs] service flap as nodes join and leave
On 04/13/2016 11:23 AM, Christopher Harvey wrote:
> I have a 3 node cluster (see the bottom of this email for 'pcs config'
> output). The MsgBB-Active and AD-Active services both flap whenever a
> node joins or leaves the cluster. I trigger the leave and join with a
> pacemaker service start and stop on any node.

That's the default behavior of clones used in ordering constraints. If
you set interleave=true on your clones, each dependent clone instance
will only care about the depended-on instances on its own node, rather
than all nodes.

See
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_clone_options

While the interleave=true behavior is much more commonly used,
interleave=false is the default because it's safer -- the cluster
doesn't know anything about the cloned service, so it can't assume the
service is OK with it. Since you know what your service does, you can
set interleave=true for services that can handle it.

> Here is the happy steady state setup:
>
> 3 nodes and 4 resources configured
>
> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>
> Clone Set: Router-clone [Router]
>     Started: [ vmr-132-3 vmr-132-4 ]
> MsgBB-Active    (ocf::solace:MsgBB-Active):     Started vmr-132-3
> AD-Active       (ocf::solace:AD-Active):        Started vmr-132-3
>
> [root@vmr-132-4 ~]# supervisorctl stop pacemaker
> no change, except vmr-132-4 goes offline
> [root@vmr-132-4 ~]# supervisorctl start pacemaker
> vmr-132-4 comes back online
> MsgBB-Active and AD-Active flap very quickly (<1s)
> Steady state is resumed.
>
> Why should vmr-132-4 coming and going affect the service on any other
> node?
>
> Thanks,
> Chris
>
> Cluster Name:
> Corosync Nodes:
>  192.168.132.5 192.168.132.4 192.168.132.3
> Pacemaker Nodes:
>  vmr-132-3 vmr-132-4 vmr-132-5
>
> Resources:
>  Clone: Router-clone
>   Meta Attrs: clone-max=2 clone-node-max=1
>   Resource: Router (class=ocf provider=solace type=Router)
>    Meta Attrs: migration-threshold=1 failure-timeout=1s
>    Operations: start interval=0s timeout=2 (Router-start-timeout-2)
>                stop interval=0s timeout=2 (Router-stop-timeout-2)
>                monitor interval=1s (Router-monitor-interval-1s)
>  Resource: MsgBB-Active (class=ocf provider=solace type=MsgBB-Active)
>   Meta Attrs: migration-threshold=2 failure-timeout=1s
>   Operations: start interval=0s timeout=2 (MsgBB-Active-start-timeout-2)
>               stop interval=0s timeout=2 (MsgBB-Active-stop-timeout-2)
>               monitor interval=1s (MsgBB-Active-monitor-interval-1s)
>  Resource: AD-Active (class=ocf provider=solace type=AD-Active)
>   Meta Attrs: migration-threshold=2 failure-timeout=1s
>   Operations: start interval=0s timeout=2 (AD-Active-start-timeout-2)
>               stop interval=0s timeout=2 (AD-Active-stop-timeout-2)
>               monitor interval=1s (AD-Active-monitor-interval-1s)
>
> Stonith Devices:
> Fencing Levels:
>
> Location Constraints:
>   Resource: AD-Active
>     Disabled on: vmr-132-5 (score:-INFINITY) (id:ADNotOnMonitor)
>   Resource: MsgBB-Active
>     Enabled on: vmr-132-4 (score:100) (id:vmr-132-4Priority)
>     Enabled on: vmr-132-3 (score:250) (id:vmr-132-3Priority)
>     Disabled on: vmr-132-5 (score:-INFINITY) (id:MsgBBNotOnMonitor)
>   Resource: Router-clone
>     Disabled on: vmr-132-5 (score:-INFINITY) (id:RouterNotOnMonitor)
> Ordering Constraints:
>   Resource Sets:
>     set Router-clone MsgBB-Active sequential=true
>     (id:pcs_rsc_set_Router-clone_MsgBB-Active) setoptions kind=Mandatory
>     (id:pcs_rsc_order_Router-clone_MsgBB-Active)
>     set MsgBB-Active AD-Active sequential=true
>     (id:pcs_rsc_set_MsgBB-Active_AD-Active) setoptions kind=Mandatory
>     (id:pcs_rsc_order_MsgBB-Active_AD-Active)
> Colocation Constraints:
>   MsgBB-Active with Router-clone (score:INFINITY)
>   (id:colocation-MsgBB-Active-Router-clone-INFINITY)
>   AD-Active with MsgBB-Active (score:1000)
>   (id:colocation-AD-Active-MsgBB-Active-1000)
>
> Resources Defaults:
>  No defaults set
> Operations Defaults:
>  No defaults set
>
> Cluster Properties:
>  cluster-infrastructure: corosync
>  cluster-recheck-interval: 1s
>  dc-version: 1.1.13-10.el7_2.2-44eb2dd
>  have-watchdog: false
>  maintenance-mode: false
>  start-failure-is-fatal: false
>  stonith-enabled: false
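For reference, with pcs the meta attribute suggested above can be set on the existing clone in place. A minimal sketch, assuming the clone id Router-clone from the quoted config:

    pcs resource meta Router-clone interleave=true
    # or equivalently
    pcs resource update Router-clone meta interleave=true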
[ClusterLabs] service flap as nodes join and leave
I have a 3 node cluster (see the bottom of this email for 'pcs config'
output). The MsgBB-Active and AD-Active services both flap whenever a
node joins or leaves the cluster. I trigger the leave and join with a
pacemaker service start and stop on any node.

Here is the happy steady state setup:

3 nodes and 4 resources configured

Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]

Clone Set: Router-clone [Router]
    Started: [ vmr-132-3 vmr-132-4 ]
MsgBB-Active    (ocf::solace:MsgBB-Active):     Started vmr-132-3
AD-Active       (ocf::solace:AD-Active):        Started vmr-132-3

[root@vmr-132-4 ~]# supervisorctl stop pacemaker
no change, except vmr-132-4 goes offline
[root@vmr-132-4 ~]# supervisorctl start pacemaker
vmr-132-4 comes back online
MsgBB-Active and AD-Active flap very quickly (<1s)
Steady state is resumed.

Why should vmr-132-4 coming and going affect the service on any other
node?

Thanks,
Chris

Cluster Name:
Corosync Nodes:
 192.168.132.5 192.168.132.4 192.168.132.3
Pacemaker Nodes:
 vmr-132-3 vmr-132-4 vmr-132-5

Resources:
 Clone: Router-clone
  Meta Attrs: clone-max=2 clone-node-max=1
  Resource: Router (class=ocf provider=solace type=Router)
   Meta Attrs: migration-threshold=1 failure-timeout=1s
   Operations: start interval=0s timeout=2 (Router-start-timeout-2)
               stop interval=0s timeout=2 (Router-stop-timeout-2)
               monitor interval=1s (Router-monitor-interval-1s)
 Resource: MsgBB-Active (class=ocf provider=solace type=MsgBB-Active)
  Meta Attrs: migration-threshold=2 failure-timeout=1s
  Operations: start interval=0s timeout=2 (MsgBB-Active-start-timeout-2)
              stop interval=0s timeout=2 (MsgBB-Active-stop-timeout-2)
              monitor interval=1s (MsgBB-Active-monitor-interval-1s)
 Resource: AD-Active (class=ocf provider=solace type=AD-Active)
  Meta Attrs: migration-threshold=2 failure-timeout=1s
  Operations: start interval=0s timeout=2 (AD-Active-start-timeout-2)
              stop interval=0s timeout=2 (AD-Active-stop-timeout-2)
              monitor interval=1s (AD-Active-monitor-interval-1s)

Stonith Devices:
Fencing Levels:

Location Constraints:
  Resource: AD-Active
    Disabled on: vmr-132-5 (score:-INFINITY) (id:ADNotOnMonitor)
  Resource: MsgBB-Active
    Enabled on: vmr-132-4 (score:100) (id:vmr-132-4Priority)
    Enabled on: vmr-132-3 (score:250) (id:vmr-132-3Priority)
    Disabled on: vmr-132-5 (score:-INFINITY) (id:MsgBBNotOnMonitor)
  Resource: Router-clone
    Disabled on: vmr-132-5 (score:-INFINITY) (id:RouterNotOnMonitor)
Ordering Constraints:
  Resource Sets:
    set Router-clone MsgBB-Active sequential=true
    (id:pcs_rsc_set_Router-clone_MsgBB-Active) setoptions kind=Mandatory
    (id:pcs_rsc_order_Router-clone_MsgBB-Active)
    set MsgBB-Active AD-Active sequential=true
    (id:pcs_rsc_set_MsgBB-Active_AD-Active) setoptions kind=Mandatory
    (id:pcs_rsc_order_MsgBB-Active_AD-Active)
Colocation Constraints:
  MsgBB-Active with Router-clone (score:INFINITY)
  (id:colocation-MsgBB-Active-Router-clone-INFINITY)
  AD-Active with MsgBB-Active (score:1000)
  (id:colocation-AD-Active-MsgBB-Active-1000)

Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-recheck-interval: 1s
 dc-version: 1.1.13-10.el7_2.2-44eb2dd
 have-watchdog: false
 maintenance-mode: false
 start-failure-is-fatal: false
 stonith-enabled: false
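For anyone reproducing a setup like the one above, one way the clone and ordering sets could be declared with interleaving enabled from the start is roughly as follows. This is a sketch only, using the resource names from the config above and pcs set syntax of that era, not the exact commands the poster ran:

    pcs resource clone Router clone-max=2 clone-node-max=1 interleave=true
    pcs constraint order set Router-clone MsgBB-Active setoptions kind=Mandatory
    pcs constraint order set MsgBB-Active AD-Active setoptions kind=Mandatory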
Re: [ClusterLabs] HA meetup at OpenStack Summit
On 13/04/16 10:16 AM, Ken Gaillot wrote:
> On 04/12/2016 06:39 PM, Digimer wrote:
>> On 12/04/16 07:09 PM, Ken Gaillot wrote:
>>> Hi everybody,
>>>
>>> The upcoming OpenStack Summit is April 25-29 in Austin, Texas (US). Some
>>> regular ClusterLabs contributors are going, so I was wondering if anyone
>>> would like to do an informal meetup sometime during the summit.
>>>
>>> It looks like the best time would be that Wednesday, either lunch (at
>>> the venue) or dinner (offsite). It might also be possible to reserve a
>>> small (10-person) meeting room, or just meet informally in the expo hall.
>>>
>>> Anyone interested? Preferences/conflicts?
>>
>> Informal meet-up, or to try and get work done?
>
> Informal, though of course HA will be the likely topic of conversation :)

OK, would be expensive to come down for $drinks, but I think we're still
working on something semi-official for late summer/early spring.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
Re: [ClusterLabs] HA meetup at OpenStack Summit
On 04/12/2016 06:39 PM, Digimer wrote:
> On 12/04/16 07:09 PM, Ken Gaillot wrote:
>> Hi everybody,
>>
>> The upcoming OpenStack Summit is April 25-29 in Austin, Texas (US). Some
>> regular ClusterLabs contributors are going, so I was wondering if anyone
>> would like to do an informal meetup sometime during the summit.
>>
>> It looks like the best time would be that Wednesday, either lunch (at
>> the venue) or dinner (offsite). It might also be possible to reserve a
>> small (10-person) meeting room, or just meet informally in the expo hall.
>>
>> Anyone interested? Preferences/conflicts?
>
> Informal meet-up, or to try and get work done?

Informal, though of course HA will be the likely topic of conversation :)
Re: [ClusterLabs] HA meetup at OpenStack Summit & Vault?
On 2016-04-12T19:39:25, Digimer wrote:

Alas, I won't make it to the Summit, but if anyone else is at Vault (the
week before in Raleigh), I'd be happy to meet!

Regards,
    Lars

-- 
Architect SDS
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
Re: [ClusterLabs] Totem is unable to form a cluster because of an operating system or network fault
Hi,

I am trying to configure MySQL on Ubuntu according to this article:
https://azure.microsoft.com/en-in/documentation/articles/virtual-machines-linux-classic-mysql-cluster/

It is a two-node cluster. Looking at the corosync log:

Apr 12 11:01:09 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:11 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:13 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:16 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:18 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:20 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:22 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:24 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:27 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:29 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
Apr 12 11:01:31 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.1.0.0
        mcastport: 5405
        ttl: 1
    }
    transport: udpu
}
logging {
    fileline: off
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/corosync/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
nodelist {
    node {
        ring0_addr: 10.1.0.6
        nodeid: 1
    }
    node {
        ring0_addr: 10.1.0.7
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
}

If I initiate a tcpdump on node 2 and start either a netcat or nmap, I
see the packet arrive at the destination host on port 5405 (UDP traffic).

I do see corosync listening on the IP/port:

root@node-2:/home/dinor# netstat -an | grep -i 5405
udp        0      0 10.1.0.7:5405           0.0.0.0:*

root@node-1:/home/dinor# netstat -an | grep -i 5405
udp        0      0 10.1.0.6:5405           0.0.0.0:*

On node 1 I start a netcat to port 5405 via UDP:

netcat -D -4 -u 10.1.0.7 5405
(type some text here and hit enter)

On node 1, tcpdump shows the data sent to IP 10.1.0.7:

root@node-1:/var/log/corosync# tcpdump -n udp port 5405
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:08:24.484533 IP 10.1.0.6.44299 > 10.1.0.7.5405: UDP, length 26

On node 2, tcpdump shows the data arrive:

root@node-2:/var/log/corosync# tcpdump -n udp port 5405
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:08:24.484892 IP 10.1.0.6.44299 > 10.1.0.7.5405: UDP, length 26

I also tested sending UDP packets from node 2; all OK. So connectivity
seems to be fine. A port scanner also shows the port as open:

root@node-1:/home/dinor# nmap -sUV 10.1.0.7 -p 5402-5405

Starting Nmap 5.21 ( http://nmap.org ) at 2016-04-12 10:31 UTC
Nmap scan report for node-2 (10.1.0.7)
Host is up (0.00060s latency).
PORT     STATE         SERVICE VERSION
5402/udp closed        unknown
5403/udp closed        unknown
5404/udp closed        unknown
5405/udp open|filtered unknown
MAC Address: 12:34:56:78:9A:BC (Unknown)

Service detection performed. Please report any incorrect results at
http://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 79.07 seconds

There is no FW and
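Beyond netcat/nmap, corosync's own tools report directly whether the ring has formed and whether the nodes see each other. A small sketch, assuming corosync 2.x as on these nodes; run on either node:

    # ring / membership status as corosync itself sees it
    corosync-cfgtool -s

    # quorum status and known members
    corosync-quorumtool -s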