On Sun, Aug 10, 2008 at 23:13, Jerome Martin <[EMAIL PROTECTED]> wrote:
> Hi Steven, Andrew,
>
> On Sun, 2008-08-10 at 13:24 -0700, Steven Dake wrote:
>> Our policy regarding your issue is that we require the operating
>> system multicast to operate properly (which it doesn't), we require
>> the multicast hardware switch to operate properly, and we require a
>> basic POSIX API to make all this a reality.
>
> I read you loud and clear.
>
> Call me picky, but I make a distinction between a member node's
> OS/network malfunction (which is not what happens in this situation)
> preventing it from being part of the cluster (which would be only
> fair :-) ) and a misbehaving node preventing OTHER nodes (in this case
> ALL of them) from continuing to function properly. Please read
> further, because I am not jumping on openAIS's back here...
>
> The case at hand, however, is not as simple as that. We are in a
> situation where one misbehaving node triggers what I will call a
> "membership event storm". That storm does not in fact prevent openAIS
> from functioning, and membership for operational cluster nodes is
> preserved. However, when used in conjunction with pacemaker (and I
> would bet that other services might be impacted by this), the storm
> being forwarded to the service level has very bad consequences,
> preventing the WHOLE cluster from functioning properly. It is, IMHO, a
> weakness caused by two factors:
>
> 1) Lack of robustness on the pacemaker side (Andrew, this one's for
> you :-) ).
It's the old rule: garbage in, garbage out.
If you had someone poking you in the arm 10 times a second, would you get
any work done?

Pacemaker won't work if the underlying cluster infrastructure (openais)
isn't functioning. And it seems OpenAIS can't/won't/shouldn't function in
this scenario (whether it should, I'll leave as a question for Steve).

A random thought I just had: enabling authentication _may_ have prevented
this. The rogue machine presumably wouldn't have had the right
credentials, and its messages would have been dropped.

> 2) Useless forwarding of events which DO NOT CHANGE THE CLUSTER STATE
> (this one is on openAIS: the fact that the membership storm is being
> propagated instead of just being "buffered", as it does not change the
> state of the membership world).
>
> The way I see it, as an end-user of a stack of various software aimed
> at providing HA (openAIS + pacemaker + heartbeat lrmd), including
> tolerance for node faults and a high level of resilience/robustness,
> this is typically the kind of node malfunction that should be handled
> gracefully by my cluster. Which is not the case.
>
> Of course, if I were Andrew, I could point at STONITH to kill that
> misbehaving node. But what if the storm prevents pacemaker from
> actually scheduling the STONITH event (as in fact it prevented
> pacemaker from scheduling ANY action until crmd crashed in my test
> case)?

I didn't realize it crashed. Backtrace?

> If I were you, Steven, I could easily point a finger at pacemaker
> being solely responsible for this...
>
> But we all know that in order to achieve best-in-class cluster
> resiliency, we need each and every link of the chain to be as robust
> and forgiving as possible. Forgiving here, for me, means trying to
> avoid putting unnecessary burden on the next one, in this case
> forwarding that membership event storm.
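[For readers following along: the authentication mentioned above is the
`secauth` option in the totem section of the openais configuration. A
minimal illustrative fragment; the exact file path, key location, and
addresses are assumptions that vary between openais versions and
installs:]

```
# /etc/ais/openais.conf (illustrative fragment only)
totem {
        version: 2
        # With secauth: on, totem messages are authenticated (and
        # encrypted) using the shared key in /etc/ais/authkey, so
        # packets from a node without that key are discarded instead
        # of being processed as membership traffic.
        secauth: on
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.1.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}
```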
> Please bear with my limited insight into the philosophy and internals
> of the openAIS stack; maybe I am overlooking a design prerequisite
> that implies forwarding membership events even if they do not actually
> change the members database contents...
>
> [...]
>> I'd suggest reporting a bug to the maintainer of vserver on the
>> multicast not being bound properly. That is a bug in their driver
>> software. Multicast is required by IPv6 networks.
>
> Multicast is working fine in the dev branch of linux-vservers, and I
> was not in fact trying to make it work on my compile-vservers
> machines. This was just a side-effect of installing openAIS packages
> there to compile pacemaker. Still, I think this is only one example of
> how such a one-way broadcast can happen, and vserver is not the point
> of focus in the broader question I am raising now (totally separate
> from understanding WHY the initial issue happened, thanks to your help
> in decoding my logs, Steven).
>
> Please, Steven and Andrew, talk to me and to each other about this,
> because I am in no position to decide at which level(s) it is more
> meaningful to improve your software's behavior. But as an end-user, I
> clearly see a robustness issue which I feel should be addressed if we
> agree on the notion of "fault tolerance" (literally), which is at the
> very heart of any HA cluster.
>
> Sidenote: Andrew, should I crosspost this to the linux-ha ML with a
> summary of the actual scenario?

No. It's not relevant to them since they don't use the openais stack.

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
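[The "buffering" Jerome asks about amounts to suppressing membership
notifications whose member set is identical to the last one delivered
upstream. A minimal, openais-agnostic sketch of that idea in Python;
`MembershipFilter` and the callback names are hypothetical, not part of
any openais API:]

```python
# Illustrative only: swallow membership events that do not change the
# member set actually seen by the service layer (e.g. pacemaker).
class MembershipFilter:
    def __init__(self, deliver):
        self._deliver = deliver  # callback into the service layer
        self._last = None        # last member set forwarded upstream

    def on_config_change(self, members):
        """Called for every low-level membership event, storm included."""
        current = frozenset(members)
        if current == self._last:
            return False         # no state change: do not forward
        self._last = current
        self._deliver(sorted(current))
        return True

# A simulated storm: the rogue node re-announces an unchanged membership.
events = [
    {"a", "b", "c"},
    {"a", "b", "c"},  # noise: same membership, re-announced
    {"a", "b", "c"},  # noise again
    {"a", "b"},       # real change: node c left
    {"a", "b"},       # noise
]

forwarded = []
f = MembershipFilter(forwarded.append)
for ev in events:
    f.on_config_change(ev)

print(forwarded)  # [['a', 'b', 'c'], ['a', 'b']]
```

Five low-level events collapse into two service-level notifications; the
storm never reaches the upper layer.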
