On 08/03/2011 04:06 PM, David wrote:
> I have a 3 node RHCS cluster and prior to a VLAN change (moved the
> cluster communications into its own VLAN) all three nodes were working.
> 
> Post VLAN migration 2 of the 3 nodes joined the cluster but a third is 
> failing when I start cman:
> 
> Starting cluster:
>     Checking Network Manager...                             [  OK  ]
>     Global setup...                                         [  OK  ]
>     Loading kernel modules...                               [  OK  ]
>     Mounting configfs...                                    [  OK  ]
>     Starting cman... Aug 03 22:58:26 corosync [MAIN  ] Corosync Cluster 
> Engine ('1.2.3'): started and ready to provide service.
> Aug 03 22:58:26 corosync [MAIN  ] Corosync built-in features: nss rdma
> Aug 03 22:58:26 corosync [MAIN  ] Successfully read config from 
> /etc/cluster/cluster.conf
> Aug 03 22:58:26 corosync [MAIN  ] Successfully parsed cman config
> Aug 03 22:58:26 corosync [TOTEM ] Token Timeout (10000 ms) retransmit 
> timeout (2380 ms)
> Aug 03 22:58:26 corosync [TOTEM ] token hold (1894 ms) retransmits 
> before loss (4 retrans)
> Aug 03 22:58:26 corosync [TOTEM ] join (60 ms) send_join (0 ms) 
> consensus (12000 ms) merge (200 ms)
> Aug 03 22:58:26 corosync [TOTEM ] downcheck (1000 ms) fail to recv const 
> (2500 msgs)
> Aug 03 22:58:26 corosync [TOTEM ] seqno unchanged const (30 rotations) 
> Maximum network MTU 1402
> Aug 03 22:58:26 corosync [TOTEM ] window size per rotation (50 messages) 
> maximum messages per rotation (17 messages)
> Aug 03 22:58:26 corosync [TOTEM ] missed count const (5 messages)
> Aug 03 22:58:26 corosync [TOTEM ] send threads (0 threads)
> Aug 03 22:58:26 corosync [TOTEM ] RRP token expired timeout (2380 ms)
> Aug 03 22:58:26 corosync [TOTEM ] RRP token problem counter (2000 ms)
> Aug 03 22:58:26 corosync [TOTEM ] RRP threshold (10 problem count)
> Aug 03 22:58:26 corosync [TOTEM ] RRP mode set to none.
> Aug 03 22:58:26 corosync [TOTEM ] heartbeat_failures_allowed (0)
> Aug 03 22:58:26 corosync [TOTEM ] max_network_delay (50 ms)
> Aug 03 22:58:26 corosync [TOTEM ] HeartBeat is Disabled. To enable set 
> heartbeat_failures_allowed > 0
> Aug 03 22:58:26 corosync [TOTEM ] Initializing transport (UDP/IP).
> Aug 03 22:58:26 corosync [TOTEM ] Initializing transmit/receive 
> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Aug 03 22:58:26 corosync [IPC   ] you are using ipc api v2
> Aug 03 22:58:26 corosync [TOTEM ] Receive multicast socket recv buffer 
> size (262142 bytes).
> Aug 03 22:58:26 corosync [TOTEM ] Transmit multicast socket send buffer 
> size (262142 bytes).
> corosync: totemsrp.c:3091: memb_ring_id_create_or_load: Assertion `res 
> == sizeof (unsigned long long)' failed.
> Aug 03 22:58:26 corosync [TOTEM ] The network interface [10.50.3.70] is 
> now up.
> corosync died with signal: 6 Check cluster logs for details
>                                                             [FAILED]
> 
> 
> I haven't been able to find information that identifies the issue or how 
> to correct it.  I am hoping someone from this group may be able to shed 
> some light.
> 

This happens because the ring id file is 0 bytes.  We have fixed this
problem in later versions of corosync.  To rectify this problem on the
failing node, run rm -f /var/lib/corosync/ringid* and then start cman
again.
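
If you would rather check before deleting, something along these lines
should work.  It is only a sketch: the function name clean_empty_ringid
is made up for illustration, and it removes just the zero-byte ring id
files instead of all of them, which is a more conservative variant of
the rm -f above and not something corosync itself provides.

```shell
# Hypothetical helper (not part of corosync): remove zero-byte ring id files.
# The default directory is the one named in this reply; pass another
# directory as the first argument to try it out safely.
clean_empty_ringid() {
    dir="${1:-/var/lib/corosync}"
    for f in "$dir"/ringid*; do
        # Skip the literal pattern when no file matches the glob.
        [ -e "$f" ] || continue
        # -s is true only for files larger than 0 bytes.
        if [ ! -s "$f" ]; then
            echo "removing empty ring id file: $f"
            rm -f -- "$f"
        fi
    done
}
```

Run clean_empty_ringid as root on the affected node while cman is
stopped, then start cman again.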

Regards
-steve

> Thanks!
> David
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
