Re: [Openais] Redundant rings : weird behaviour

Thomas Guthmann Thu, 07 Jan 2010 18:22:34 -0800

Hi Michael

 > - did you try another port in the second ring, not 5405?
Yes. 5406.


 > - Does tcpdump show the packets in the line?
Yes.
On hostB I can see packets from hostA, whose corosync is "dead":
    IP 123.201.179.21.5405 > 226.94.1.2.5406: UDP, length 376

Let me explain everything properly, because my previous post wasn't 
clear.

*Issue*
   # corosync-cfgtool -s
        Printing ring status.
        Local node ID 399026368
        Could not get the ring status, the error is: 6

*Context*
- My configuration works until I decide to use a second ring.
- To add a second ring, I do the following in the corosync configuration:
   + add a new interface (ringnumber: 1, different multicast IP + port)
   + add a new parameter: "rrp_mode: active"
   + restart corosync, or even reboot the instance
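
A minimal sketch of the totem section these steps describe, for reference 
only. The bindnetaddr values and the ring 0 multicast address are 
placeholders, not taken from the real setup; ring 1 uses the 226.94.1.2 
address and port 5406 visible in the tcpdump line above, and ring 0 is 
assumed to be on port 5405.

```
totem {
        version: 2
        rrp_mode: active

        interface {
                ringnumber: 0
                # placeholder network and multicast address
                bindnetaddr: 192.168.1.0
                mcastaddr: 226.94.1.1
                # assumed port for the first ring
                mcastport: 5405
        }

        interface {
                ringnumber: 1
                # placeholder network
                bindnetaddr: 10.0.0.0
                mcastaddr: 226.94.1.2
                mcastport: 5406
        }
}
```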

*Info*
- During boot, I get plenty of lines like the following:
   "[TOTEM ] Process pause detected for xxx ms flushing membership messages."
   + with one working ring, they stop after ~500 ms, followed by pcmk info
   + with 2 rings, they stop at 25000 ms and no pcmk messages follow
- With 2 rings, corosync cannot be stopped or restarted. It blocks on:
   "Waiting for services to unload:... ".
   It's hard to strace because of the multicast flooding.

*Config*
- Corosync 1.1.2
- Pacemaker 1.0.6
- CentOS 5.4

*Questions*
- What does error 6 mean?
- What is corosync waiting for during boot or when shutting down?

Any ideas ? :)

-Thomas

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
