Hi,

On Wed, Aug 13, 2008 at 11:07:18AM +0200, Andreas Mock wrote:
> Hi Andreas,
>
> > -----Original Message-----
> > From: "Andreas Kurz" <[EMAIL PROTECTED]>
> > Sent: 12.08.08 22:27:30
> > To: "General Linux-HA mailing list" <[email protected]>
> > Subject: Re: [Linux-HA] STONITH, default fencing time, forced-fencing
> >
> > Of course ... one instance should be enough, but isn't it safer to make
> > sure that every node is able to stonith any other node ... no matter in
> > which state the complete cluster is?
>
> I thought this too, a long time ago. I also ended up configuring
> stonith as a separate primitive resource for each node, with
> constraints forcing a stonith plugin not to run on the node it
> controls.
>
> Why this: in my case (and I'm pretty sure that's also valid for
> other scenarios) the stonith device can only serve exactly ONE
> connection.
This is often true.

> That means only one user/plugin can connect to the
> stonith device at a time. As soon as I used clones I couldn't
> guarantee that only one clone connects to the stonith device at
> a time. The monitor action is implemented as a connection
> attempt that gathers some more or less meaningful status
> information. I got many (wrong) monitoring failures, and
> therefore using clones was not appropriate.

Right. Different stonithd instances access devices at will.

> Stonith plugin contributors should test and document whether the
> plugin is "clone-aware".

Yes, unfortunately the documentation about which devices are
supported, and how, is not complete. More user contributions would
be very welcome.

> Another issue: on a two-node cluster, using clones where the
> clone instance runs on the node it can't shoot (suicide
> forbidden) is IMHO more or less the same as using one primitive
> stonith resource with an appropriate constraint avoiding a wrong
> node/stonith assignment.
>
> By the way: I never really got the feeling that I understand the
> stonith subsystem and its behaviour more than 60%. Probably
> I/we have to take Dejan out for one or more beers and ask him
> anything we don't know about this. :-))

Here's an excerpt from the stonithd sources; hope that it helps:

/***************************************************
 * How does stonithd reset a node
 *
 * Every stonithd instance has zero or more stonith resources in
 * the started (enabled) state. These resources represent stonith
 * devices configured in such a way as to be able to manage one
 * or more nodes.
 *
 * 1. One of the stonithd instances receives a request to manage
 * (stonith) a node.
 *
 * 2. stonithd invokes each of the local stonith resources in
 * turn to try to stonith the node. stonith resources don't have
 * a defined priority/preference.
 *
 * 3. If none of the local stonith resources succeeded, then
 * stonithd broadcasts a message to other stonithd instances (on
 * other nodes) with a request to stonith the node.
 *
 * 4. All other stonithd instances repeat step 2. However, they
 * don't proceed with step 3. They report back to the originating
 * stonithd about the outcome.
 ***************************************************/

Right now, the stonithd which receives a fencing request is the
one running on the DC.

Thanks,

Dejan

P.S. Oh, a beer's OK, though I'd prefer subjects other than work :)

> Best regards
> Andreas Mock

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
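As an illustration of the one-stonith-primitive-per-node setup discussed
above, here is a sketch in crm shell syntax. The resource and node names,
the choice of the external/ipmi plugin, and its parameter values are
hypothetical; adjust them for the actual stonith device in use:

    primitive st-node1 stonith:external/ipmi \
            params hostname="node1" ipaddr="10.0.0.1" userid="admin"
    location l-st-node1 st-node1 -inf: node1

The -inf location constraint keeps the resource that shoots node1 off
node1 itself, so the device is always operated by a surviving node and
only one primitive (rather than a clone) ever connects to it.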
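The escalation in the source excerpt above can also be sketched in a few
lines of Python. This is only a model of steps 1-4, not the real stonithd
(which is C code inside heartbeat); the Device class and all names here
are hypothetical:

```python
# Hypothetical model of the stonithd fencing escalation (steps 1-4).

class Device:
    """A stonith device managing a fixed set of node names."""
    def __init__(self, can_fence):
        self.can_fence = can_fence  # set of node names this device manages

    def reset(self, node):
        # A reset attempt succeeds only if the device manages the node.
        return node in self.can_fence

def fence(node, local_devices, peer_device_lists):
    # Step 2: try every local stonith resource in turn (no defined priority).
    if any(dev.reset(node) for dev in local_devices):
        return True
    # Step 3: broadcast the request to the other stonithd instances.
    for devices in peer_device_lists:
        # Step 4: each peer repeats step 2 only, and reports the outcome.
        if any(dev.reset(node) for dev in devices):
            return True
    return False

if __name__ == "__main__":
    # node1's stonithd has no device for node2, but a peer node does.
    local = [Device({"node1"})]
    peers = [[Device({"node2", "node3"})]]
    print(fence("node2", local, peers))  # escalates to a peer -> True
    print(fence("node4", local, peers))  # nobody manages node4 -> False
```

Note that, as in the excerpt, the peers never escalate further themselves:
the broadcast happens exactly once, from the originating instance.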
