Hi,

On Fri, Jan 11, 2008 at 08:09:26PM +0100, Andreas Mock wrote:
> Hi Dejan, hi Lukas,
>
> this post is (probably) important to you.
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On behalf of
> > Dejan Muhamedagic
> > Sent: Friday, January 11, 2008 18:02
> > To: General Linux-HA mailing list
> > Subject: Re: [Linux-HA] Problem with configuration of STONITH device
> >
> > Hi,
> >
> > On Fri, Jan 11, 2008 at 04:45:24PM +0100, Lukáš Pecha wrote:
> > > Hello,
> > > does anyone have any experience with the external/ibmrsa
> > > stonith device agent?
>
> Yes, me. :-)
>
> > > I have two servers and both of them have an RSA II module, so
> > > I want to enable STONITH on them. I have read the related topics
> > > at linux-ha.org, but it's still not clear to me how it works:
> > > when I create the stonith clone device, do I have to create a
> > > primitive resource for each server, or do I have to somehow
> > > specify all of the nodes in one primitive resource?
> >
> > Whichever you prefer. It is admittedly a bit confusing. If you
> > have only two nodes, then it may be preferable not to use clones,
> > but just define two stonith resources and make constraints which
> > wouldn't let them run on the node matching the hostname in the
> > resource.
>
> Dejan,
>
> you can probably remember. We had a discussion about that, because it
> was said that the stonithd on one node would refuse to call a stonith
> agent to kill its own node. Is this still true?

Yes.

> You wanted to have a look at this.
> Background was:
> 1) Normal setup:
>    * Stonith-Primitive 1 to kill node 2
>    * Stonith-Primitive 2 to kill node 1
>    * Constraint for Primitive 1 to run on node 1 preferably
>    * Constraint for Primitive 2 to run on node 2 preferably
>    (assumption: when node 1 goes crazy, stonith on node 2 can shoot it)
> 2) Failure with monitoring of stonith primitive 1 on node 1:
>    * Primitive 1 will be moved to node 2. Yes, this should not happen
>      and an administrator has to investigate, but meanwhile
>      Primitive 1, NOW running on node 2, should be able to shoot
>      its own node.
>
> Dejan, can you remember?

Yes. One serious problem in this case is that the cluster can never
know if the stonith operation was successful, which would
basically render the cluster unusable.

> > It is an extra package provided by IBM. A tad heavy though: it's
> > a Java application.
>
> Yes, it slurps CPU cycles like a Bavarian...uh, sorry..
> Bohemian slurps beer. ;-)
> Worse: I regularly got errors while monitoring. Someone
> posted here that this is related to timing problems.

I can recall that it worked for me(tm). However, it is very
demanding in terms of memory/cpu and that's definitely not good
for stonith.

> > There's a relatively new alternative stonith
> > agent ibmrsa-telnet. You may want to try that one. It's available
> > with the 2.1.3 release.
>
> The first release of my script packaged with 2.1.3 is buggy.
> Sorry for that! I found that very quickly, but the updated version of
> my new script that I posted seems to have never reached the maintainers.

Hmm, I probably picked the wrong version then. Buggy in the sense
that it won't work at all? Can you describe the bug?

> So, Dejan, could you please check in the differences?
> I have attached the (more) correct version.

Thanks, I'll update the repository.

> The second version has been running very well for some months now in
> a production environment.

> > > Please, if you have experience with this, could you please
> > > provide me with some info and an example of the cib configuration
> > > for the ibmrsa stonith agent?
> >
> > At the bottom of the said ibmrsa-telnet you'll find a CIB snippet
> > defining one stonith resource.
>
> IMPORTANT:
> 1) The external stonith api allows an external stonith plugin to be
> responsible for shooting more than one node (the node is passed as a
> parameter when the plugin is called). My external stonith plugin
> shoots exactly the one node configured via CIB. It ignores the
> parameter. Probably I should add a check for that.

What exactly do you refer to? All parameters are defined by the
plugin itself. If that is the case, why should you ignore any of
them :)

> 2) The RSA board allows only one telnet session at a time. So if
> someone logs in to e.g. check something and at the same time a monitor
> cycle is started by HA, the resource gets a monitor failure and will
> probably be moved.

Yes, that's one typical problem with this class of devices. But
there's nothing one can do about that.

> Enhancements are welcome.
>
> Dejan, is there a proper place to put these hints for others?
> In the file, man page, wiki?

wiki.linux-ha.org

Probably here:

http://wiki.linux-ha.org/StonithAgents

There is also the idioms page where you could add your CIB
example:

http://wiki.linux-ha.org/CIB/Idioms/

Cheers,

Dejan

> Best regards
> Andreas Mock
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
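
For reference, the two-node layout suggested earlier in this thread
(one stonith resource per node to be shot, plus a constraint that keeps
each resource off the node named in it) might be sketched roughly like
this. It is only an illustration under assumptions: the element and
attribute names follow a reading of the Heartbeat 2.x CIB format, the
parameter name "nodename", the node names and all ids are made up for
the example, and a real resource would also need the RSA address and
credentials. Check the agent's own metadata and the CIB snippet at the
bottom of the ibmrsa-telnet script before using anything like it.

```python
import xml.etree.ElementTree as ET


def stonith_pair(nodes, agent="external/ibmrsa-telnet"):
    """Build one stonith primitive and one location constraint per node.

    Returns (primitives, constraints) as lists of XML elements.
    All ids and the "nodename" parameter are illustrative assumptions.
    """
    primitives, constraints = [], []
    for target in nodes:
        rsc_id = f"stonith-{target}"

        # The stonith resource that is able to shoot `target`.
        prim = ET.Element(
            "primitive", {"id": rsc_id, "class": "stonith", "type": agent}
        )
        inst = ET.SubElement(prim, "instance_attributes", {"id": f"{rsc_id}-ia"})
        attrs = ET.SubElement(inst, "attributes")
        ET.SubElement(
            attrs,
            "nvpair",
            {"id": f"{rsc_id}-nodename", "name": "nodename", "value": target},
        )
        primitives.append(prim)

        # Hard constraint: never run this resource on the node it targets.
        loc_id = f"{rsc_id}-not-on-{target}"
        loc = ET.Element("rsc_location", {"id": loc_id, "rsc": rsc_id})
        rule = ET.SubElement(
            loc, "rule", {"id": f"{loc_id}-rule", "score": "-INFINITY"}
        )
        ET.SubElement(
            rule,
            "expression",
            {
                "id": f"{loc_id}-expr",
                "attribute": "#uname",
                "operation": "eq",
                "value": target,
            },
        )
        constraints.append(loc)
    return primitives, constraints


if __name__ == "__main__":
    prims, locs = stonith_pair(["node1", "node2"])
    for element in prims + locs:
        print(ET.tostring(element, encoding="unicode"))
```

The sketch uses a -INFINITY score rather than a mere preference, so a
monitor failure cannot move a stonith resource onto the node it is
meant to shoot, which is the situation described in point 2) of the
background above.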
