Hi,

On Wed, Aug 13, 2008 at 11:07:18AM +0200, Andreas Mock wrote:
> Hi Andreas,
> 
> > -----Original Message-----
> > From: "Andreas Kurz" <[EMAIL PROTECTED]>
> > Sent: 12.08.08 22:27:30
> > To: "General Linux-HA mailing list" <[email protected]>
> > Subject: Re: [Linux-HA] STONITH, default fencing time, forced-fencing
> > 
> > Of course ... one instance should be enough but isn't it safer to make
> > sure that every node is able to stonith any other node ... no matter in
> > which state the complete cluster is?
> > 
> 
> I thought this too a long time ago. I also ended up configuring
> stonith as a separate primitive resource for each node, with
> constraints forcing a stonith plugin not to run on the node it
> controls.
> 
> Why this: In my case (and I'm pretty sure that's also valid for
> other scenarios) the stonith device can only serve exactly ONE
> connection.

This is often true.

> That means only one user/plugin can connect to the
> stonith device at a time. As soon as I used clones I couldn't
> guarantee that only one clone connects to the stonith device at
> one time. The monitor action is implemented as a connection
> attempt with gathering some more or less meaningful status
> information.  I got many (wrong) monitoring failures. And
> therefore using clones was not appropriate.

Right. Different stonithd instances access devices at will.
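
For illustration, the per-node primitive setup described above could be
sketched like this in crm shell syntax (resource names are placeholders,
and external/ssh stands in for a real stonith plugin -- it is meant for
testing only):

```
# one stonith primitive per node to be shot
primitive st-node1 stonith:external/ssh \
        params hostlist="node1" \
        op monitor interval="60s"
primitive st-node2 stonith:external/ssh \
        params hostlist="node2" \
        op monitor interval="60s"
# keep each stonith resource off the node it is meant to shoot
location l-st-node1 st-node1 -inf: node1
location l-st-node2 st-node2 -inf: node2
```

With the -inf location constraints, a node can never be asked to run the
device that is supposed to reset it.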

> Stonith plugin contributors should test and document if the
> plugin is "clone-aware".

Yes, unfortunately the documentation about which devices are
supported and how is not complete. More user contributions would
be very welcome.

> Another issue: on a two-node cluster, using clones where the clone
> instance may not run on the node it can't shoot (suicide forbidden) is
> IMHO more or less the same as using one primitive stonith resource and
> an appropriate constraint avoiding a wrong node/stonith assignment.
> 
> By the way: I never really got the feeling that I understand the
> stonith subsystem and its behaviour more than 60%. Probably
> I/we have to take Dejan out for one or more beers and ask him
> everything we don't know about this.  :-))

Here's an excerpt from the stonithd sources, hope that it helps:

/***************************************************
 * How does stonithd reset a node
 *
 * Every stonithd instance has zero or more stonith resources in
 * the started (enabled) state. These resources represent stonith
 * devices configured in such a way as to be able to manage one
 * or more nodes.
 *
 * 1. One of the stonithd instances receives a request to manage
 * (stonith) a node.
 *
 * 2. stonithd invokes each of the local stonith resources in
 * turn to try to stonith the node. stonith resources don't have
 * defined priority/preference.
 *
 * 3. If none of the local stonith resources succeeded, then
 * stonithd broadcasts a message to other stonithd instances (on
 * other nodes) with a request to stonith the node.
 *
 * 4. All other stonithd instances repeat step 2. However, they
 * don't proceed with step 3. They report back to the originating
 * stonithd about the outcome.
 *
 ***************************************************/

Right now, the stonithd which receives a fencing request is the
one running on the DC.

Thanks,

Dejan

P.S. Oh, a beer's OK, though I'd prefer subjects other than work :)

> Best regards
> Andreas Mock
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems