On 12/08/2010 08:36 AM, Ryan Steele wrote:
> Hey,
> 
> I just noticed a problem with Corosync 1.2.0-0ubuntu1 on a two-node cluster
> configured with redundant rings; I found it while testing my STONITH
> devices.  When an interface on either ring fails and is marked faulty, the
> interface for that same ring on the other node is also marked faulty
> immediately.  This means that if any interface fails, the entire associated
> ring fails.  I mentioned it on IRC, and it was believed to be a bug.  Here
> is my corosync.conf:
> 

That is how it is supposed to work.  Any interface that is faulty within
a ring marks the entire ring faulty.  To re-enable the ring, run
corosync-cfgtool -r once the faulty network condition has been repaired.
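
For anyone who wants to script that check-and-reset step, here is a rough
sketch in Python.  It assumes that "corosync-cfgtool -s" prints one
"RING ID n" line per ring followed by a "status = ..." line, and that a
failed ring's status contains the word FAULTY (as in the log message quoted
below); the exact layout may differ between versions, so treat the parsing
as an assumption.  The function names are made up for illustration and are
not part of Corosync.

```python
import re
import subprocess


def faulty_rings(status_text):
    """Return ring numbers whose status line mentions FAULTY.

    Assumes the text looks like the output of `corosync-cfgtool -s`:
    a "RING ID <n>" line, then indented "id" and "status" lines.
    """
    faulty = []
    current = None
    for line in status_text.splitlines():
        match = re.match(r"\s*RING ID (\d+)", line)
        if match:
            current = int(match.group(1))
            continue
        if current is not None and "status" in line and "FAULTY" in line:
            faulty.append(current)
    return faulty


def reset_rings_if_faulty():
    """Run `corosync-cfgtool -s`; if any ring is faulty, run `-r`.

    Only call this after the underlying network problem is repaired,
    otherwise the ring will simply be marked faulty again.
    """
    result = subprocess.run(
        ["corosync-cfgtool", "-s"],
        capture_output=True, text=True, check=True,
    )
    rings = faulty_rings(result.stdout)
    if rings:
        # -r resets the redundant ring state after a fault
        subprocess.run(["corosync-cfgtool", "-r"], check=True)
    return rings
```

Calling reset_rings_if_faulty() on a cluster node (as a user allowed to run
corosync-cfgtool) returns the list of rings that were faulty before the
reset, or an empty list if nothing needed doing.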

Regards
-steve

> ######### begin corosync.conf
> compatibility: whitetank
> 
> totem {
>    version: 2
>    secauth: off
>    threads: 0
>    rrp_mode: passive
>    consensus: 1201
> 
>    interface {
>       ringnumber: 0
>       bindnetaddr: 192.168.192.0
>       mcastaddr: 227.94.1.1
>       mcastport: 5405
>    }
> 
>    interface {
>       ringnumber: 1
>       bindnetaddr: 10.1.0.0
>       mcastaddr: 227.94.1.2
>       mcastport: 5405
>    }
> }
> 
> logging {
>    fileline: off
>    to_stderr: yes
>    to_syslog: yes
>    syslog_facility: daemon
>    debug: off
>    timestamp: on
>    logger_subsys {
>       subsys: AMF
>       debug: off
>    }
> }
> 
> aisexec {
>    user:  root
>    group: root
> }
> 
> service {
>    name: pacemaker
>    ver:  0
> }
> ######### end corosync.conf
> 
> Please let me know if you need anything else to help diagnose this problem.
> Also, I found a typo in the error message that appears in the logs
> ("adminisrtative" instead of "administrative"):
> 
> corosync[3419]:   [TOTEM ] Marking seqid 66284 ringid 1 interface 10.1.1.168 FAULTY - adminisrtative intervention required.
> 
> A "corosync-cfgtool -r" fixes the issue once the link is healthy again, but
> it's definitely not optimal to have one interface failure bring down the
> entire ring.  Again, let me know if there's anything else I can do to
> assist.  Thanks, and keep up the hard work!
> 
> 
> -Ryan
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
