On 4/17/2013 3:57 PM, eXeC001er wrote:
> Hello.
>
> I have tried to create the following demo cluster to check how the
> MasterWins logic works:
>
> NODE1 (VM)
>      |========== tap0 (host)
> NODE2 (VM)
>                      |============= br0 (host)
> NODE3 (VM)
>      |========== tap1 (host)
> NODE4 (VM)
>
>
> To simulate a 50/50 split I just remove "tap1" from "br0".
>
> Before the split I have the following on all nodes:
>
> ----------------------
> Quorate: Yes
> Nodeid  Votes  Qdevice   Name
>      1      1  A,V,MW    172.18.251.41
>      2      1  A,NV,MW   172.18.251.42 (local)
>      3      1  NA,NV,MW  172.18.251.43
>      4      1  A,NV,MW   172.18.251.44
>      0      3            QDEV
>
> ----------------------
>
> After the split
>
> on NODE1 and NODE2 I see:
>
> ----------------------
> Quorate: Yes
> Nodeid  Votes  Qdevice   Name
>      1      1  A,V,MW    172.18.251.41 (local)
>      2      1  A,NV,MW   172.18.251.42
>      0      3            QDEV
> ----------------------
>
> on NODE3 and NODE4 I see:
>
> ----------------------
> Quorate: No
> Nodeid  Votes  Qdevice   Name
>      3      1  A,NV,MW   172.18.251.43
>      4      1  A,NV,MW   172.18.251.44 (local)
>      0      3            QDEV
> ----------------------
>
> So everything is fine and MasterWins works as designed.
>
> But after this check I tried to restore the network connection and added
> "tap1" back to "br0". I see that all nodes can ping each other, but
> corosync still shows me the 50/50 split.
>
> tcpdump:
> .....................
> 17:49:36.387217 IP 172.18.251.43.5404 > 172.18.251.44.5405: UDP, length 74
> 17:49:36.387441 IP 172.18.251.44.5404 > 172.18.251.43.5405: UDP, length 74
> 17:49:36.447590 IP 172.18.251.41.5404 > 172.18.251.42.5405: UDP, length 74
> 17:49:36.447811 IP 172.18.251.42.5404 > 172.18.251.41.5405: UDP, length 74
> 17:49:36.568557 IP 172.18.251.43.5404 > 172.18.251.44.5405: UDP, length 74
> 17:49:36.568804 IP 172.18.251.44.5404 > 172.18.251.43.5405: UDP, length 74
> 17:49:36.587829 IP 172.18.251.43.5404 > 239.255.1.1.5405: UDP, length 87
> 17:49:36.628254 IP 172.18.251.41.5404 > 172.18.251.42.5405: UDP, length 74
> 17:49:36.628442 IP 172.18.251.42.5404 > 172.18.251.41.5405: UDP, length 74
> 17:49:36.648323 IP 172.18.251.41.5404 > 239.255.1.1.5405: UDP, length 87
> ........................
>
>
> Any ideas?
>

Apart from the missing logs, which might show something, I have tested
this scenario plenty of times, but using iptables instead of a bridge.
I wonder if you have found a bug in the bridging code.

I suggest you try the following tests instead:

- 4 nodes, without qdisk: repeat your bridge remove/add test
- 4 nodes, without qdisk: use iptables instead (make sure to block
  mcast traffic too)
- then again with qdisk + iptables

But also collect the logs; otherwise tcpdump doesn't say enough.

Fabio
_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/openais
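
For anyone repeating the iptables variant of the test suggested in the reply, here is a minimal sketch of how the rules for one side of the split could be generated. The helper name `partition_rules` is made up for illustration, and the sketch assumes it is acceptable to drop all traffic to and from the other partition, not just corosync ports. It only builds the command lines and does not run them. Dropping on INPUT by source address is what covers the multicast traffic: packets a peer sends to the cluster group still carry that peer's source IP.

```python
from typing import List


def partition_rules(peers: List[str], undo: bool = False) -> List[List[str]]:
    """Build iptables commands that cut this node off from `peers`.

    Hypothetical helper: it returns argv lists and executes nothing.
    With undo=True the same rules are emitted with -D so the split
    can be healed again.
    """
    op = "-D" if undo else "-A"
    rules: List[List[str]] = []
    for ip in peers:
        # Matching on the source address catches both unicast packets
        # and the multicast packets this peer sends to the group.
        rules.append(["iptables", op, "INPUT", "-s", ip, "-j", "DROP"])
        # Unicast traffic from this node towards the peer.
        rules.append(["iptables", op, "OUTPUT", "-d", ip, "-j", "DROP"])
    return rules


if __name__ == "__main__":
    # As seen from the .41/.42 side; the .43/.44 side would use the
    # mirror-image peer list.
    for cmd in partition_rules(["172.18.251.43", "172.18.251.44"]):
        print(" ".join(cmd))
```

Running the printed commands as root on both nodes of each side (with the peer list swapped on the other side) should give the 50/50 split; calling the helper again with `undo=True` produces the matching `-D` commands to restore connectivity.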