Hi,

On Tue, Jan 15, 2008 at 11:30:41AM +0100, Andreas Mock wrote:
> > -----Original Message-----
> > From: General Linux-HA mailing list <[email protected]>
> > Sent: 15.01.08 10:23:03
> > To: General Linux-HA mailing list <[email protected]>
> > Subject: Re: [Linux-HA] URGENT: Problem with configuration of STONITH device
>
> > > probably you can remember. We had a discussion about that, because it was
> > > said that the stonithd on one node would refuse to call a stonith
> > > agent to kill its own node. Is this still true?
> >
> > Yes.
>
> Will that change?

No.

> > > running NOW on node 2 should be possible to shoot itself.
> > >
> > > Dejan, can you remember?
> >
> > Yes. One serious problem in this case is that the cluster can
> > never know if the stonith operation was successful. Which would
> > basically render the cluster unusable.
>
> So, the answer to the question above is probably NO by design?

Yes.

> > > > It is an extra package provided by IBM. A tad heavy though: it's
> > > > a Java application.
> > >
> > > Yes, it slurps CPU cycles like a Bavarian... uh, sorry..
> > > Bohemian slurps beer. ;-)
> > > Worse: I got errors while monitoring regularly. Someone
> > > posted here that this is related to timing problems.
> >
> > I can recall that it worked for me(tm). However, it is very
> > demanding in terms of memory/cpu and that's definitely not good
> > for stonith.
>
> I had monitoring running once per minute. Approx. once per day I got
> an error. The error message was not very helpful as it was a Java
> stack trace.

OK. Though I'd find a once per minute monitor interval a bit
excessive. Perhaps once an hour or so would be more appropriate.

> > Hmm, I probably picked the wrong version then. Buggy in a sense
> > that it won't work at all? Can you describe the bug?
>
> I'm sorry, but it's the worst case. It would not stonith and that's really
> the primary goal. :-(
> I found out with my tests that the first parameter to the start/stop/restart
> actions (therefore the second parameter overall) is the name of the node to
> stonith. My first script checked for exactly ONE parameter. In this case
> obviously not correct.
>
> > >
> > > So, Dejan, could you please check in the differences?
> > > I have the (more) correct version attached.
> >
> > Thanks, I'll update the repository.
>
> Thank you. Sorry for the mess.

No problem. Thank you for the contribution!

> > > IMPORTANT:
> > > 1) The external stonith API allows an external stonith plugin
> > > to be responsible for shooting more than one node. Parameter while
> > > calling.
> > > My external stonith plugin shoots exactly the one node configured via
> > > CIB. It ignores the parameter. Probably I should add a check for that.
> >
> > What exactly do you refer to? All parameters are defined by the
> > plugin itself. If that is the case, why should you ignore any of
> > them :)
>
> See the above answer. That's something I don't understand. I couldn't
> find anything in the documentation of the external stonith plugin API.
> When a stonith action is triggered the plugin is called with the action name
> and with the name of the node to stonith as second parameter.
> I guess this is done to be able to create a configuration with lists of hosts
> so that the plugin-cibconfig-combination can shoot more than one node.
> But I'm not sure. In my first attempt I thought that ALL parameters are
> given as environment variables and ONLY the action is given as parameter.
>
> But probably ...better hopefully...you can enlighten me. :-)

Yes, an action plus node is passed. Some, or actually most devices
until recently, can handle more than one node.

> > > 2) The RSA board allows only one telnet session at a time. So if someone
> > > logs in to e.g. check something and at the same time a monitor cycle is
> > > started by HA, the resource gets a monitor failure and will probably
> > > be moved.
> >
> > Yes, that's one typical problem with this class of devices. But
> > there's nothing one can do about that.
>
> Let a node shoot itself. (See answers above). It's not good, but better than
> no attempt at shooting.

Well, that probably won't change. If you want to express your
sentiment on the matter, here's the bugzilla:

http://developerbugs.linux-foundation.org/show_bug.cgi?id=1752

Cheers,

Dejan

> Best regards
> Andreas Mock
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
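
For readers who hit the same one-parameter bug, here is a minimal
sketch of the calling convention discussed in this thread: the
plugin receives the action as its first argument and, for power
actions, the node name as its second. It is only an illustration in
Python, not the attached plugin. The action names on/off/reset, the
HOSTLIST environment variable and the parse_call helper are
assumptions made for the example; the thread itself speaks of
start/stop/restart and does not name any variables.

```python
# Sketch only: argument handling for an external stonith plugin.
# Action names and the HOSTLIST variable are assumptions, not taken
# from the plugin discussed in the thread.

POWER_ACTIONS = {"on", "off", "reset"}


def parse_call(argv, environ):
    """Return (action, node) for one plugin invocation.

    argv[0] is the action. For power actions argv[1] must be the node
    to shoot. Device configuration comes from the environment mapping,
    not from argv.
    """
    if not argv:
        raise ValueError("missing action")
    action = argv[0]
    if action not in POWER_ACTIONS:
        # Non-power actions (e.g. a status check) take no node name.
        return action, None
    if len(argv) < 2:
        raise ValueError(action + " needs a node name")
    node = argv[1]
    # Refuse nodes this device is not configured for, instead of
    # silently ignoring the parameter.
    allowed = environ.get("HOSTLIST", "").split()
    if node not in allowed:
        raise ValueError(node + " is not handled by this device")
    return action, node
```

A real plugin would call something like parse_call(sys.argv[1:],
os.environ) and then talk to the device. Checking the node name
against the configured list covers the "probably I should add a
check for that" remark above.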
