On 03/05/2013, at 5:17 PM, RaSca <[email protected]> wrote:
> On Fri 05 Apr 2013 15:29:36 CEST, RaSca wrote:
> [...]
>> It seems that when a configuration message has to run over the ring, in
>> some particular cases, everything collapses. Following Florian's article
>> I've tried setting a window_size of 300, but since nothing changed,
>> I think that with the default netmtu of 1500, and following the
>> corosync man page, I must not go over 170 (which is 256000/1500).
>> The point is: what else can I check? Does it make sense to set a
>> window_size LOWER than 50?
>> Thanks for your help,
>
> I'll answer myself; maybe it will be useful for someone else.
> There was no way of making multicast work in this network. It does
> not depend on the window_size or other parameters; sometimes it
> just breaks.
> Even though multicast was tested successfully (with omping and also
> mnc), sometimes the ring does not complete and I get the retransmit
> list that makes the cluster crash.
> The only solution I've found was to use unicast, declaring
> "transport: udpu" in corosync.conf and a member section for each node
> in the cluster.
> Doing this brought everything up again. I still see the "retransmit
> list" messages, but those are on the order of one per hour, so it's
> fine.
>
> Do you have other suggestions?

Not personally, other than to suggest the corosync ML, which may have
more expertise.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
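
For anyone finding this thread later: the unicast setup described above ("transport: udpu" plus a member section for each node) might look roughly like the fragment below. This is only a sketch, assuming the corosync 1.x configuration syntax. The network address, the node addresses, and the port are placeholders that do not come from the original post, and the cluster's real node list would have to be substituted.

```
# Sketch of a corosync.conf totem section using UDP unicast.
# All addresses below are hypothetical examples.
totem {
    version: 2
    transport: udpu            # unicast instead of multicast

    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0   # assumed cluster network
        mcastport: 5405            # port is still set with this key under udpu

        # One member section per cluster node.
        member {
            memberaddr: 192.168.1.1
        }
        member {
            memberaddr: 192.168.1.2
        }
    }
}
```

With udpu every node must be listed explicitly, so the same member list would normally be kept identical on all nodes.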
