> -----Original Message-----
> From: General Linux-HA mailing list <[email protected]>
> Sent: 15.01.08 10:23:03
> To: General Linux-HA mailing list <[email protected]>
> Subject: Re: [Linux-HA] URGENT: Problem with configuration of STONITH device
>
> > probably you can remember. We had a discussion about that, because it was
> > said that the stonithd on one node would prevent to call a stonith
> > agent to kill its own node. Is this still true?
>
> Yes.

Will that change?

> > running NOW on node 2 should be possible to shoot itself.
> >
> > Dejan, can you remember?
>
> Yes. One serious problem in this case is that the cluster can
> never know if the stonith operation was successful. Which would
> basically render the cluster unusable.

So the answer to my question above is probably NO, by design?

> > > It is an extra package provided by IBM. A tad heavy though: it's
> > > a Java application.
> >
> > Yes, it slurps CPU cycles like a Bavarian... uh, sorry..
> > Bohemian slurps beer. ;-)
> > The worse: I got errors while monitoring regularly. Someone
> > posted here that this is related to timing problems.
>
> I can recall that it worked for me(tm). However, it is very
> demanding in terms of memory/cpu and that's definitely not good
> for stonith.

I had monitoring running once per minute. About once per day I got an
error. The error message was not very helpful, as it was a Java stack
trace.

> Hmm, I probably picked the wrong version then. Buggy in a sense
> that it won't work at all? Can you describe the bug.

I'm sorry, but it's the worst case: it would not stonith, and that's
really the primary goal. :-(
My tests showed that the first parameter to the start/stop/restart
actions (that is, the second command-line parameter overall, after the
action name) is the name of the node to stonith. My first script checked
for exactly ONE parameter, which is obviously not correct in this case.

> > So, Dejan, could you please check in the differences?
> > I have the (more) correct version attached.
>
> Thanks, I'll update the repository.

Thank you. Sorry for the mess.

> > IMPORTANT:
> > 1) The external stonith api allows that an external stonith plugin
> > can be responsible of shooting more than one node. Parameter while calling.
> > My external stonith plugin shoots exactly the one node configured via
> > CIB. It ignores the parameter. Probably I should add a check for that.
>
> What exactly do you refer to? All parameters are defined by the
> plugin itself.
> If that is the case, why should you ignore any of
> them :)

See my answer above. This is something I don't understand, and I couldn't
find anything about it in the documentation of the external stonith
plugin API. When a stonith action is triggered, the plugin is called with
the action name as the first parameter and the name of the node to
stonith as the second. I guess this makes it possible to create a
configuration with a list of hosts, so that one plugin/CIB configuration
can shoot more than one node. But I'm not sure.
In my first attempt I assumed that ALL parameters are passed as
environment variables and ONLY the action is passed as a command-line
parameter. But probably (or better: hopefully) you can enlighten me. :-)

> > 2) The RSA board allows only one telnet session at a time. So if someone logs
> > in to e.g. check something and at the same time a monitor cycle is started by
> > HA, the resource gets a monitor failure and will probably be moved.
>
> Yes, that's one typical problem with this class of devices. But
> there's nothing one can do about that.

Let a node shoot itself (see my answers above). It's not good, but it is
better than no attempt at shooting at all.

Best regards
Andreas Mock

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
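
For reference, the calling convention discussed in this message can be
sketched roughly as follows. This is only a minimal sketch in Python, not
the attached plugin. It assumes that the action arrives as the first
command-line argument, the node name (if any) as the second, and the
configuration as environment variables. The action names on/off/reset,
status, gethosts and getconfignames, and the parameter names nodename,
ipaddr, userid and passwd, are assumptions made for illustration; the
message itself speaks of start/stop/restart. The target check corresponds
to the "add a check for that" idea quoted above.

```python
import sys

# Assumed action names: these act on one node, whose name is the 2nd argument.
NODE_ACTIONS = {"on", "off", "reset"}

# Hypothetical parameter names, chosen for this sketch only.
CONFIG_NAMES = ("nodename", "ipaddr", "userid", "passwd")


def parse_invocation(argv, env):
    """Split a plugin call into (action, target_node, config)."""
    if not argv:
        raise ValueError("missing action")
    action = argv[0]
    # Accept one OR two arguments; insisting on exactly one breaks
    # the node-specific actions.
    target = argv[1] if len(argv) > 1 else None
    if action in NODE_ACTIONS and target is None:
        raise ValueError(f"action {action!r} needs a node name")
    # Configuration is read from the environment, not from argv.
    config = {name: env.get(name, "") for name in CONFIG_NAMES}
    return action, target, config


def check_target(target, config):
    """True only if the requested node is the configured one.

    Comparing case-insensitively is an assumption of this sketch.
    """
    return target is not None and target.lower() == config["nodename"].lower()


def run(argv, env):
    """Return the exit code a plugin would use (0 = success)."""
    try:
        action, target, config = parse_invocation(argv, env)
    except ValueError as err:
        print(err, file=sys.stderr)
        return 1
    if action == "getconfignames":
        print("\n".join(CONFIG_NAMES))
        return 0
    if action == "gethosts":
        print(config["nodename"])
        return 0
    if action == "status":
        # Placeholder: a real plugin would query the management board here.
        return 0
    if action in NODE_ACTIONS:
        if not check_target(target, config):
            print(f"not responsible for {target}", file=sys.stderr)
            return 1
        # Placeholder: a real plugin would power-cycle the node here.
        return 0
    return 1
```

An executable plugin built on this sketch would import os as well and end
with sys.exit(run(sys.argv[1:], os.environ)). Rejecting a node name that
does not match the configured one keeps a single-node plugin from
silently "succeeding" when asked to shoot some other node.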
