On 5/31/2013 7:12 AM, [email protected] wrote:
> Hi All,
> 
> We have discovered a problem with corosync's network communication.
> 
> We built a cluster of three corosync nodes on KVM guests.
> 
> Step 1) Start the corosync service on all nodes. 
> 
> Step 2) Confirm that the cluster includes all of the nodes and that every 
> node has reached the OPERATIONAL state (see the membership check sketched 
> after these steps).
> 
> Step 3) Cut off the network of node1 (rh64-coro1) and node2 (rh64-coro2) at 
> the KVM host:
> 
>        [root@kvm-host ~]# brctl delif virbr3 vnet5;brctl delif virbr2 vnet1
> 
> Step 4) Because the problem described below occurred, we stopped all nodes.
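> 
> For reference, the membership check in step 2 was done roughly like this 
> (a sketch only; it assumes the corosync 1.x tools, and the exact 
> object-database keys and output differ between versions):
> 
>        [root@rh64-coro1 ~]# corosync-cfgtool -s              # ring status on the local node
>        [root@rh64-coro1 ~]# corosync-objctl | grep members   # dump membership from the object database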
> 
> 
> The problem occurs at step 3.
> 
> One node (rh64-coro1) reaches the OPERATIONAL state and keeps running in 
> that state.
> 
> The other two nodes (rh64-coro2 and rh64-coro3) keep cycling through 
> membership states.
> They never seem to reach the OPERATIONAL state while the first node is 
> running.
> 
> This means that the two nodes (rh64-coro2 and rh64-coro3) can never finish 
> forming the cluster.
> When this network failure happens in a setup where corosync is combined 
> with Pacemaker, corosync cannot notify Pacemaker of the change in cluster 
> membership.
> 
> 
> Question 1) Are there any parameters in corosync.conf that can solve this 
> problem?
>  * We could bond the interfaces and run a single ring by setting 
> "rrp_mode: none", and we think that would settle the problem, but we do not 
> want to set "rrp_mode: none" (a sketch of the redundant-ring configuration 
> we would like to keep follows below).
> 
> Question 2) Is this a bug, or is it the intended behavior of corosync's 
> communication?

We already checked this specific test, and it appears to be a bug in
the kernel bridge code when handling multicast traffic (groups are not
joined correctly and traffic is not forwarded).
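
One thing worth checking on the host (a suggestion only, not a confirmed
fix, and whether it helps depends on the kernel version) is whether
multicast snooping is enabled on the bridges, and whether the test still
fails with it disabled:

  [root@kvm-host ~]# cat /sys/class/net/virbr2/bridge/multicast_snooping
  [root@kvm-host ~]# echo 0 > /sys/class/net/virbr2/bridge/multicast_snooping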

Check this thread for reference:
http://lists.linuxfoundation.org/pipermail/openais/2013-April/016792.html

Thanks
Fabio

