Hi,

I'm using corosync 1.1.2. Everything works fine with a single ring, _but_ when I enable a second one I get one of the following:
- "Could not get the ring status, the error is: 6"
- "[..] FAULTY - adminisrtative intervention required." (note the typo:)
- a second working ring (happened once so far on one host :)

Also, corosync sometimes takes ages to restart, or doesn't restart at all (Waiting for services to unload:...). I reckon this only happens when I get error 6. Could this be linked to the fact that pacemaker is running on top of corosync?

Any ideas or docs to read are welcome.

Software used:
- corosync 1.1.2
- pacemaker 1.0.6
- centos 5.4

See the corosync configuration and the "faulty" logs attached below.

Cheers,
Thomas



# Please read the corosync.conf.5 manual page
compatibility: none

service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver:  0
        use_mgmtd: yes
}


totem {
        version: 2
        secauth: off
        threads: 0
        rrp_mode: active
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.200.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
        interface {
                ringnumber: 1
                bindnetaddr: 123.201.179.0
                mcastaddr: 226.94.1.2
                mcastport: 5405
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /tmp/corosync.log
        debug: on
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}
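
One quick sanity check on the configuration above, sketched in Python: does each ring's bindnetaddr cover the address that corosync-cfgtool -s reports for that ring? The /24 prefix length is an assumption (no netmask is shown anywhere in this post), and ring_matches is just an illustrative helper name.

```python
import ipaddress

# (bindnetaddr from corosync.conf, interface address from corosync-cfgtool -s)
RINGS = {
    0: ("192.168.200.0", "192.168.200.23"),
    1: ("123.201.179.0", "123.245.120.23"),
}

ASSUMED_PREFIX = 24  # assumption: the real netmask is not shown in the post


def ring_matches(bindnetaddr, iface_addr, prefix=ASSUMED_PREFIX):
    """Return True if iface_addr lies inside the network bindnetaddr/prefix."""
    network = ipaddress.ip_network(f"{bindnetaddr}/{prefix}", strict=False)
    return ipaddress.ip_address(iface_addr) in network


if __name__ == "__main__":
    for ring, (bind, addr) in sorted(RINGS.items()):
        verdict = "ok" if ring_matches(bind, addr) else "MISMATCH"
        print(f"ring {ring}: {addr} in {bind}/{ASSUMED_PREFIX}? {verdict}")
```

Under that assumption ring 1 does not match its bindnetaddr, although this may only reflect how the addresses were edited for posting.
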
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = ring 0 active with no faults
RING ID 1
        id      = 123.245.120.23
        status  = Marking seqid 41 ringid 1 interface 123.245.120.23 FAULTY - adminisrtative intervention required.


>>>
[[email protected]:~]# corosync-cfgtool -r
Re-enabling all failed rings.
<<<

...t=0 ...

[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = ring 0 active with no faults
RING ID 1
        id      = 123.245.120.23
        status  = ring 1 active with no faults
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = ring 0 active with no faults
RING ID 1
        id      = 123.245.120.23
        status  = ring 1 active with no faults
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = Incrementing problem counter for seqid 2361 iface 192.168.200.23 to [1 of 10]
RING ID 1
        id      = 123.245.120.23
        status  = ring 1 active with no faults
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = Incrementing problem counter for seqid 2361 iface 192.168.200.23 to [1 of 10]
RING ID 1
        id      = 123.245.120.23
        status  = ring 1 active with no faults
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = Incrementing problem counter for seqid 2361 iface 192.168.200.23 to [1 of 10]
RING ID 1
        id      = 123.245.120.23
        status  = Incrementing problem counter for seqid 2369 iface 123.245.120.23 to [2 of 10]
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = Incrementing problem counter for seqid 2361 iface 192.168.200.23 to [1 of 10]
RING ID 1
        id      = 123.245.120.23
        status  = Incrementing problem counter for seqid 2377 iface 123.245.120.23 to [4 of 10]
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = Incrementing problem counter for seqid 2361 iface 192.168.200.23 to [1 of 10]
RING ID 1
        id      = 123.245.120.23
        status  = Incrementing problem counter for seqid 2381 iface 123.245.120.23 to [5 of 10]
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = ring 0 active with no faults
RING ID 1
        id      = 123.245.120.23
        status  = Incrementing problem counter for seqid 2385 iface 123.245.120.23 to [5 of 10]
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = ring 0 active with no faults
RING ID 1
        id      = 123.245.120.23
        status  = Incrementing problem counter for seqid 2405 iface 123.245.120.23 to [9 of 10]
[[email protected]:~]# corosync-cfgtool -s
Printing ring status.
Local node ID 399026368
RING ID 0
        id      = 192.168.200.23
        status  = ring 0 active with no faults
RING ID 1
        id      = 123.245.120.23
        status  = Marking seqid 2409 ringid 1 interface 123.245.120.23 FAULTY - adminisrtative intervention required.
[[email protected]:~]# 

... t=~5sec ... 

** /var/log/messages
Jan  5 15:47:49 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2361 iface 192.168.200.23 to [1 of 10]
Jan  5 15:47:49 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2365 iface 123.245.120.23 to [1 of 10]
Jan  5 15:47:49 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2369 iface 123.245.120.23 to [2 of 10]
Jan  5 15:47:50 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2373 iface 123.245.120.23 to [3 of 10]
Jan  5 15:47:50 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2377 iface 123.245.120.23 to [4 of 10]
Jan  5 15:47:50 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2381 iface 123.245.120.23 to [5 of 10]
Jan  5 15:47:51 tom-dns1 corosync[6270]:   [TOTEM ] ring 0 active with no faults
Jan  5 15:47:51 tom-dns1 corosync[6270]:   [TOTEM ] Decrementing problem counter for iface 123.245.120.23 to [4 of 10]
Jan  5 15:47:51 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2385 iface 123.245.120.23 to [5 of 10]
Jan  5 15:47:51 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2389 iface 123.245.120.23 to [6 of 10]
Jan  5 15:47:52 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2393 iface 123.245.120.23 to [7 of 10]
Jan  5 15:47:52 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2397 iface 123.245.120.23 to [8 of 10]
Jan  5 15:47:52 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2401 iface 123.245.120.23 to [9 of 10]
Jan  5 15:47:53 tom-dns1 corosync[6270]:   [TOTEM ] Decrementing problem counter for iface 123.245.120.23 to [8 of 10]
Jan  5 15:47:53 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2405 iface 123.245.120.23 to [9 of 10]
Jan  5 15:47:53 tom-dns1 corosync[6270]:   [TOTEM ] Incrementing problem counter for seqid 2409 iface 123.245.120.23 to [10 of 10]
Jan  5 15:47:53 tom-dns1 corosync[6270]:   [TOTEM ] Marking seqid 2409 ringid 1 interface 123.245.120.23 FAULTY - adminisrtative intervention required.
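
To follow how fast an interface climbs towards the FAULTY state, the TOTEM entries can be summarised per interface. This is only a rough Python sketch: it assumes each log entry sits on a single line, the regular expressions are derived solely from the messages shown in this post, and summarize is a made-up helper name.

```python
import re

# Patterns derived from the messages in this post only (an assumption:
# other corosync versions may word these messages differently).
COUNTER_RE = re.compile(
    r"(Incrementing|Decrementing) problem counter for "
    r"(?:seqid \d+ )?iface (\S+) to \[(\d+) of (\d+)\]"
)
FAULTY_RE = re.compile(r"Marking seqid \d+ ringid (\d+) interface (\S+) FAULTY")


def summarize(lines):
    """Return {iface: {"count": last problem counter, "faulty": bool}}."""
    state = {}
    for line in lines:
        m = COUNTER_RE.search(line)
        if m:
            entry = state.setdefault(m.group(2), {"count": 0, "faulty": False})
            entry["count"] = int(m.group(3))
            continue
        m = FAULTY_RE.search(line)
        if m:
            entry = state.setdefault(m.group(2), {"count": 0, "faulty": False})
            entry["faulty"] = True
    return state


# A few entries copied from the log in this post, one entry per line.
SAMPLE = """\
[TOTEM ] Incrementing problem counter for seqid 2361 iface 192.168.200.23 to [1 of 10]
[TOTEM ] Decrementing problem counter for iface 123.245.120.23 to [8 of 10]
[TOTEM ] Incrementing problem counter for seqid 2405 iface 123.245.120.23 to [9 of 10]
[TOTEM ] Incrementing problem counter for seqid 2409 iface 123.245.120.23 to [10 of 10]
[TOTEM ] Marking seqid 2409 ringid 1 interface 123.245.120.23 FAULTY - adminisrtative intervention required.
"""

if __name__ == "__main__":
    for iface, info in sorted(summarize(SAMPLE.splitlines()).items()):
        print(iface, info)
```

Feeding it the full /var/log/messages extract instead of SAMPLE should show the same picture: the ring 1 interface reaches 10 of 10 and is marked FAULTY, while the ring 0 interface only ever reports a single problem.
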


_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
