Hey,

Just noticed a problem with Corosync 1.2.0-0ubuntu1 on a two-node cluster 
configured with redundant rings, which I
found while testing my STONITH devices.  When an interface on either of the 
rings fails and is marked faulty, the
interface for that same ring on the other node is immediately marked faulty 
as well.  This means that if any interface
fails, the entire associated ring fails.  I mentioned it in IRC, and it was 
believed to be a bug.  Here is my corosync.conf:

######### begin corosync.conf
compatibility: whitetank

totem {
   version: 2
   secauth: off
   threads: 0
   rrp_mode: passive
   consensus: 1201

   interface {
      ringnumber: 0
      bindnetaddr: 192.168.192.0
      mcastaddr: 227.94.1.1
      mcastport: 5405
   }

   interface {
      ringnumber: 1
      bindnetaddr: 10.1.0.0
      mcastaddr: 227.94.1.2
      mcastport: 5405
   }
}

logging {
   fileline: off
   to_stderr: yes
   to_syslog: yes
   syslog_facility: daemon
   debug: off
   timestamp: on
   logger_subsys {
      subsys: AMF
      debug: off
   }
}

aisexec {
   user:  root
   group: root
}

service {
   name: pacemaker
   ver:  0
}
######### end corosync.conf
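
For reference, here is roughly how the problem can be reproduced and observed 
with this config.  The "nodeA#"/"nodeB#" prompts and the interface name eth1 
are just illustrative (eth1 standing in for whichever interface carries ring 1 
on your hardware):

######### begin example
# On node A, take down the ring 1 interface to simulate a link failure:
nodeA# ifconfig eth1 down

# Check ring status on BOTH nodes -- ring 1 is reported FAULTY on node B as
# well, even though node B's ring 1 interface is still up:
nodeA# corosync-cfgtool -s
nodeB# corosync-cfgtool -s

# Once the link is healthy again, re-enable redundant ring operation:
nodeA# ifconfig eth1 up
nodeA# corosync-cfgtool -r
######### end example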

Please let me know if you need anything else to help diagnose this problem.  
Also, I found a typo in the error message
that appears in the logs ("adminisrtative" instead of "administrative"):

corosync[3419]:   [TOTEM ] Marking seqid 66284 ringid 1 interface 10.1.1.168 
FAULTY - adminisrtative intervention required.

A "corosync-cfgtool -r" fixes the issue once the link is healthy again, but 
it's definitely not optimal to have one
interface failure bring down the entire ring.  Again, let me know if there's 
anything else I can do to assist.  Thanks,
and keep up the hard work!


-Ryan