Hey,
Just noticed a problem with Corosync 1.2.0-0ubuntu1 on a two-node cluster
configured with redundant rings, which I found while testing my STONITH
devices. When an interface on either one of the rings fails and is marked
faulty, the interface for that same ring on the other node is immediately
marked faulty as well. This means that if any single interface fails, the
entire associated ring fails. I mentioned it in IRC, and it was believed
to be a bug. Here is my corosync.conf:
######### begin corosync.conf
compatibility: whitetank
totem {
    version: 2
    secauth: off
    threads: 0
    rrp_mode: passive
    consensus: 1201
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.192.0
        mcastaddr: 227.94.1.1
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 10.1.0.0
        mcastaddr: 227.94.1.2
        mcastport: 5405
    }
}
logging {
    fileline: off
    to_stderr: yes
    to_syslog: yes
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
aisexec {
    user: root
    group: root
}
service {
    name: pacemaker
    ver: 0
}
######### end corosync.conf
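If it helps, I believe the fault can be reproduced without involving STONITH
at all, just by downing one ring's interface on a node (eth1 below is only a
stand-in for whatever NIC carries ring 1 on your hosts):

    # On node A: take the ring-1 interface down (eth1 is a placeholder name)
    ip link set eth1 down

    # On node B: watch syslog; its own ring-1 interface gets marked FAULTY too
    tail -f /var/log/syslog | grep TOTEM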
Please let me know if you need anything else to help diagnose this problem.
Also, I found a typo in the error message
that appears in the logs ("adminisrtative" instead of "administrative"):
    corosync[3419]: [TOTEM ] Marking seqid 66284 ringid 1 interface 10.1.1.168 FAULTY - adminisrtative intervention required.
A "corosync-cfgtool -r" fixes the issue once the link is healthy again, but
it's definitely not optimal to have one
interface failure bring down the entire ring. Again, let me know if there's
anything else I can do to assist. Thanks,
and keep up the hard work!
-Ryan
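P.S. For reference, the commands I'm referring to (this assumes the stock
corosync-cfgtool shipped with 1.2; the -s flag just prints ring status):

    # Show current ring status on the local node
    corosync-cfgtool -s

    # Re-enable any rings marked FAULTY once the link is back up
    corosync-cfgtool -r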