Re: RE: [Linux-HA] CRM and STONITH questions

Dejan Muhamedagic Fri, 19 Oct 2007 04:13:21 -0700

Hi,

On Fri, Oct 19, 2007 at 09:42:48AM +0200, matilda matilda wrote:
> >>> "Spam Filter" <[EMAIL PROTECTED]> 19.10.2007 04:36 >>>
> > Hi,
> 
> Also hi, hi all,
> 
> > Are the nvpairs for clone_max and clone_node_max HA parameters or meant
> > for my script? If HA, how do I know whether I need the example settings
> > or changed ones for a 2-node failover system?
> 
> The stonith plugin for HAv2 has to be configured like a normal resource
> (single resource or clone resource). The configuration example in the
> wiki article uses a clone configuration. The example assumes a
> two-node cluster, so it states
> '<nvpair name="clone_max" value="2"/>'
> because a maximum of 2 clones is requested. Without any other configuration
> these 2 clones could both run on one node. But that doesn't make
> much sense if exactly this node goes crazy (has to be stonithed).
> Because of this there is the config snippet
> <nvpair name="clone_node_max" value="1"/>
> saying that at most 1 clone may run on any node.
> These two config snippets together lead to a situation (in normal 
> circumstances) where exactly one stonith clone runs on every node.
> Which node shoots which (the other node or itself) is NOT specified
> by this configuration.
> Short answer to your question: clone_max and clone_node_max are,
> in the end, config parameters for stonithd.
> 
> > What exactly does the "monitor" do? Is it just a status check? My
> > device is a webpage, and passing 'status' returns success if it can
> > reach the website used to stonith the nodes.
> The monitor action does the same as with a normal resource: it checks
> whether the resource is operational. If you have configured the monitor
> action, stonithd calls the external stonith plugin with the argument
> 'status'. If the plugin exits with return code 0, everything is fine;
> if it returns anything else, stonithd assumes a failure of the plugin
> (stonith channel) and propagates this failure to the deciding instance
> of HA (lrm->crm->pengine).
> In an error case the failcount of this stonith resource is incremented.
> Failover behaviour is the same as for normal resources (gurus out there:
> please correct me if I'm saying something wrong).
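
The 'status' contract described above can be sketched in Python roughly as follows. This is only an illustration, not code from the wiki or from any shipped plugin: exit code 0 means the stonith channel works, anything else is treated as a failure. The `probe` callable is a hypothetical placeholder for whatever check a real plugin performs (here, reaching the device's web page).

```python
def status(probe):
    """Map a reachability check to the exit code stonithd looks at.

    `probe` is a hypothetical callable that returns True when the
    fencing device can be reached.
    """
    try:
        return 0 if probe() else 1
    except OSError:
        # Network-level errors count as "device not reachable".
        return 1
```

A plugin built this way would finish its 'status' call with something like `sys.exit(status(my_probe))`, where `my_probe` is the plugin's own check.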
> 
> > What are the start and timeout meant for as well?
> The same as for normal resources.


As far as the CRM is concerned it is the same. However, the
plugins have no start/stop operations implemented, so these
operations would have been better described as enable/disable.
They serve to let the cluster know which stonith resources are
operational and available. Furthermore, the start operation
includes one status operation to ensure that the plugin can be
used to operate nodes.

Thanks for the exhaustive explanation.

Cheers,

Dejan


> > For the parm1 and parm2 attributes, if my script uses the "hostlist"
> > environment variable, do I need to pass this in here or is it
> > automatically set when the stonith is called, etc. etc.?
> "etc. etc." is rather unspecific, don't you think?
> 
> To the first part of your question: if a stonith plugin needs parameters,
> these parameters are passed as environment variables. The snippet in
> the example:
>     <instance_attributes>
>       <attributes>
>         <nvpair name="parm1-name" value="parm1-value"/>
>         <nvpair name="parm2-name" value="parm2-value"/>
>         <!-- ... -->
>       </attributes>
>     </instance_attributes>
> defines two parameters, 'parm1-name' and 'parm2-name', and
> their associated values. If you configure the stonith plugin that way,
> the stonith plugin is called with these environment variables set.
> (Caution: this is not true for ALL calls to the stonith plugin,
> only for those that need this information: on, off, reset.)
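
As a rough sketch of what "parameters arrive as environment variables" means for a plugin written in Python: the helper below collects the configured names from the environment and fails loudly if one is missing. The function name `read_params` and the parameter names used in the usage note are hypothetical, chosen only for illustration.

```python
import os


def read_params(names, environ=None):
    """Collect the configured plugin parameters from the environment.

    `names` lists the nvpair names from the resource configuration;
    `environ` defaults to the real process environment.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in names if name not in env]
    if missing:
        raise KeyError("missing parameters: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

For example, `read_params(["hostlist", "device_url"])` would return both values as a dict when stonithd has set them, and raise `KeyError` on a call where they are not provided.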
> 
> Now to the 'hostlist': the stonith plugin can be one that can stonith
> more than one node, like a stonith machine gun ;-)
> In the startup phase, the plugin is called with
> the first argument 'gethosts' (see documentation). The stonith plugin
> has to answer with exactly one node name (aka hostname) per line. But it's
> o.k. to send more than one line to state that the plugin is able to shoot
> several nodes. After that, stonithd (or someone else in the machinery)
> knows whom to ask when a node has to be shot.
> 
> When the external stonith plugin is called to shoot a node (first argument
> is 'reset'), the second argument is the name of the node
> to shoot. (By the way, I have to correct my last published stonith plugin,
> arghhh)
> 
> The other interface calls (getconfignames, getinfo-devid, getinfo-devname,
> getinfo-devdescr, getinfo-devurl, getinfo-xml) ask the external
> stonith plugin to present meta-information to the controlling instance.
> They are called at the start time of the plugin. The information returned
> there must be consistent with the parameters your stonith plugin needs.
> (E.g. the parameters returned by the call to 'getconfignames' must
> match the parameters returned as an XML snippet by the call to 'getinfo-xml'.)
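
The calling convention described above (action as first argument, node name as second argument for 'reset', one host per output line for the host query) could be sketched in Python like this. It is an illustration under stated assumptions, not the interface definition: the parameter names in `CONFIG_NAMES`, the `fence` callback, and the whitespace-separated `hostlist` value are hypothetical, and the sketch assumes `hostlist` is available when the host list is requested, which the explanation above does not guarantee.

```python
# Hypothetical parameter names; a real plugin lists its own.
CONFIG_NAMES = ["hostlist", "device_url"]


def run(argv, env, fence=lambda node: True):
    """Handle one plugin call and return (exit_code, output_lines).

    argv  -- arguments after the script name, e.g. ["reset", "node2"]
    env   -- mapping holding the configured parameters
    fence -- hypothetical callback that actually resets a node
    """
    action = argv[0] if argv else ""
    hosts = env.get("hostlist", "").split()

    if action == "gethosts":
        # One node name per output line.
        return 0, hosts
    if action == "getconfignames":
        return 0, list(CONFIG_NAMES)
    if action == "reset":
        # The second argument names the node to shoot.
        if len(argv) < 2 or argv[1] not in hosts:
            return 1, []
        return (0 if fence(argv[1]) else 1), []
    # Unknown or unimplemented action.
    return 1, []
```

Keeping the parameter names in a single list such as `CONFIG_NAMES` is one way to keep 'getconfignames' and the 'getinfo-xml' description from drifting apart.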
> 
> 
> > I'm totally lost on where the detailed info for this is so I can
> > successfully make this work.
> I think this information sheds some light. If it helps you understand
> the way stonith plugins work, then you have (!!) to put an article on
> the wiki explaining that. That will be the price you have to
> pay.  ;-))
> 
> 
> Best regards
> Andreas Mock
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems