On 10/19/07, Spam Filter <[EMAIL PROTECTED]> wrote:
> OK, I have a better understanding now.
>
> Attached is my small script (missing one class file, but that isn't
> needed to show) which is my stonith script.
> It implements all the required methods, but at the beginning it requires
> the "hostlist" environment variable to be set, otherwise it exits with 1.
> I don't know if this is always set whenever the script is called, or
> only when a stonith action (on|off|reset) is needed.
if you set a parameter called "hostlist" in the CIB, then it should be
available every time the script is called. the alternative is to have the
stonith agent figure out for itself which hosts it can shoot

> Now, from the information you gave below, I think my script does what it
> should, except maybe for the environment handling I mentioned above.
>
> I originally had it so you could call it like this:
> ./powerio castor reset
> ./powerio castor off
> etc.
>
> But after reading about the hostlist, I thought it would be better so it
> can handle multiple hosts and stay compatible with the way HA does the
> environment stuff... *shrugs*
>
> Now, onto loading the stonith into the CIB: do I specifically need to
> put all host names capable of being stonithed into the parameter

yes

> below? I would have thought all members of the HA cluster would/could be
> stonithed... If not, then I assume I need to add names to this list as
> more members join, correct?
>
> I assume the modified example below would suffice for my script. Is
> there an easy way to implement this on a live system and test that HA
> can stonith, but make sure it's a dry run with the plugs unplugged so it
> doesn't actually kill systems, and doesn't touch any other resources?
> Just enough to ensure stonithing works and can be called, rather than
> only testing from the command line, etc.?
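Picking up the hostlist point: a minimal sketch of how a plugin could turn the comma-separated variable into per-host checks, so it can handle multiple hosts and refuse to act on nodes it does not control (the `known_host` helper name is hypothetical, not part of any HA API):

```shell
#!/bin/sh
# Sketch: the comma-separated "hostlist" nvpair from the CIB arrives in
# the plugin's environment; split it on commas to support multiple hosts.

known_host() {
    # Return 0 if node $1 appears in the comma-separated $hostlist.
    echo "${hostlist:-}" | tr ',' '\n' | grep -qx -- "$1"
}
```

With `hostlist=castor,pollux`, `known_host castor` succeeds and `known_host helen` fails.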
there is a command called stonith that you can use to test it

> Sorry for sounding dumb if it's the obvious, getting old ;)
>
> <clone id="DoFencing">
>   <instance_attributes>
>     <attributes>
>       <nvpair name="clone_max" value="2"/>
>       <nvpair name="clone_node_max" value="1"/>
>     </attributes>
>   </instance_attributes>
>   <primitive id="child_DoFencing" class="stonith"
>       type="external/powerio" provider="heartbeat">
>     <operations>
>       <op name="monitor" interval="5s" timeout="20s" prereq="nothing"/>
>       <op name="start" timeout="20s" prereq="nothing"/>
>     </operations>
>     <instance_attributes>
>       <attributes>
>         <nvpair name="hostlist" value="castor,pollux"/>
>       </attributes>
>     </instance_attributes>
>   </primitive>
> </clone>
>
> George
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of matilda matilda
> Sent: Friday, 19 October 2007 5:43 PM
> To: General Linux-HA mailing list
> Subject: Re: RE: [Linux-HA] CRM and STONITH questions
>
> >>> "Spam Filter" <[EMAIL PROTECTED]> 19.10.2007 04:36 >>>
>
> > Hi,
>
> Also hi, hi all,
>
> > Is the nvpair for clone_max and clone_node_max an HA parameter, or is
> > it meant for my script? If HA, how do I know whether I need the
> > example settings or something different for a 2-node failover system?
>
> The stonith plugin for HAv2 has to be configured like a normal resource
> (single resource or clone resource). The configuration example in the
> wiki article uses a clone configuration. In the example you have a
> two-node cluster, therefore the example states
>   <nvpair name="clone_max" value="2"/>
> because a maximum of 2 clones is requested. Without any other
> configuration these 2 clones could run on one node if requested. But
> this doesn't make much sense if exactly this node goes crazy (has to be
> stonithed). Because of this there is the config snippet
>   <nvpair name="clone_node_max" value="1"/>
> saying that on every node only a maximum of 1 clone may run.
> These two config snippets together lead to a situation (under normal
> circumstances) where exactly one stonith clone runs on every node.
> Whether one node shoots the other node or itself is NOT specified by
> this configuration.
> Short answer to your question: clone_max and clone_node_max are, in the
> end, config parameters for stonithd.
>
> > What exactly does the "monitor" do? Is it just a status check? My
> > device is a webpage, so does passing 'status' return success if it
> > can reach the website used to stonith the nodes?
>
> The monitor action does the same as with a normal resource: checking if
> this resource is operational. If you have configured the monitor action,
> stonithd calls the external stonith plugin with the argument 'status'.
> If the external stonith plugin returns with return code 0, everything
> is fine; if it returns with something different, stonithd assumes a
> failure of the plugin (stonith channel) and propagates this failure to
> the deciding instance of HA (lrm -> crm -> pengine).
> In an error case the failcount of this stonith resource is incremented.
> Failover behaviour is the same as for normal resources (gurus out
> there: please correct me if I'm saying something wrong).
>
> > What are the start and timeout meant for as well?
>
> The same as for normal resources.
>
> > For the parm1 and parm2 attributes: if my script uses the "hostlist"
> > environment variable, do I need to pass it in here, or is it
> > automatically set when the stonith is called, etc. etc.?
>
> "etc., etc." is a little bit unspecific, don't you think?
>
> To the first part of your question: if a stonith plugin needs
> parameters, these parameters are transferred as environment variables.
> The snippet in the example:
>   <instance_attributes>
>     <attributes>
>       <nvpair name="parm1-name" value="parm1-value"/>
>       <nvpair name="parm2-name" value="parm2-value"/>
>       <!-- ...
>       -->
>     </attributes>
>   </instance_attributes>
> defines two parameters, 'parm1-name' and 'parm2-name', and their
> associated values. If you configure the stonith plugin that way, the
> stonith plugin is called with these environment variables set.
> (Caution: this is not true for ALL of the calls to the stonith plugin,
> only for those which need this information (on, off, reset).)
>
> Now to the 'hostlist': the stonith plugin can be one that can stonith
> more than one node, like a stonith machine gun ;-)
> In the startup phase of the stonith plugin, the plugin is called with
> the first argument 'gethosts' (see documentation). The stonith plugin
> has to answer with exactly one nodename (aka hostname) per line, and
> it's o.k. to print more than one line to state that the plugin is able
> to shoot more than one node. After that, stonithd (or someone else in
> the machinery) knows whom to ask when a node has to be shot.
>
> When the external stonith plugin is called to shoot a node (1st
> parameter is 'reset'), the second parameter is the node name of the
> node to shoot. (By the way, I have to correct my last published stonith
> plugin, arghhh.)
>
> The other interface calls (getconfignames, getinfo-devid,
> getinfo-devname, getinfo-devdescr, getinfo-devurl, getinfo-xml) are
> calls to the external stonith plugin to present meta-information to
> the controlling instance. They are made at the start time of the
> plugin. Information returned there must be consistent with the
> parameters your stonith plugin needs. E.g. the parameter names
> returned by the call to 'getconfignames' must match the parameters
> described in the XML snippet returned by the call to 'getinfo-xml'.
>
> > I'm totally lost on where the detailed info for this is so I can
> > successfully make this work.
>
> I think this information brings light into the dark. If it lets you
> understand the way stonith plugins work, then you have (!!) to put
> an article on the wiki explaining it.
> That will be the price you have to pay. ;-))
>
> Best regards
> Andreas Mock
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
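To tie the interface description in this thread together, here is a rough skeleton of an external stonith plugin along the lines discussed above. This is a sketch, not a reference implementation: the device access is stubbed out, the `powerio` name and the web-page device come from this thread, and the real external plugins shipped with heartbeat should be consulted for the authoritative interface. A real script would simply end with `powerio "$@"`.

```shell
#!/bin/sh
# Sketch of an external stonith plugin. stonithd passes configured
# nvpairs (e.g. hostlist) as environment variables, the sub-command as
# $1 and, for on/off/reset, the target node as $2.

do_power() {
    # Stub: talk to the real power device here (web page, serial, ...).
    echo "would send '$1' to the plug of '$2'" >&2
}

powerio() {
    case "$1" in
    gethosts)
        # One hostname per line; several lines mean the plugin is able
        # to shoot several nodes.
        echo "${hostlist:-}" | tr ',' '\n' ;;
    on|off|reset)
        [ -n "$2" ] || return 1    # target node name is mandatory
        do_power "$1" "$2" ;;
    status)
        # The CRM 'monitor' op ends up here: return 0 iff the stonith
        # channel itself is usable.
        [ -n "${hostlist:-}" ] ;;
    getconfignames)
        # Must list the same parameter names that getinfo-xml describes.
        echo "hostlist" ;;
    getinfo-devid)    echo "powerio STONITH device" ;;
    getinfo-devname)  echo "powerio" ;;
    getinfo-devdescr) echo "Power switch reached via a web page (example)" ;;
    getinfo-devurl)   echo "http://example.invalid/" ;;
    getinfo-xml)
        cat <<'EOF'
<parameters>
<parameter name="hostlist" unique="1" required="1">
<content type="string"/>
<shortdesc lang="en">Comma-separated hosts this device controls</shortdesc>
</parameter>
</parameters>
EOF
        ;;
    *)  return 1 ;;
    esac
}
```

With `hostlist=castor,pollux` in the environment, `powerio gethosts` prints the two node names one per line, and `powerio status` exits 0, which is what the monitor op checks.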
