Thank you for your reply, Dominik. I think a UPS or PDU is a better solution in this case than a lights-out device, since it has its own power supply, separate from the host's.
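As a rough illustration of that point, here is a toy model of when the surviving node can still reach the fence device. It is only a sketch: `FenceDevice`, `can_fence`, and the `device_feed_ok` flag are hypothetical names made up for this example, and nothing here talks to Pacemaker, Heartbeat, or any real STONITH plugin.

```python
from dataclasses import dataclass


@dataclass
class FenceDevice:
    """Hypothetical description of a STONITH device (illustration only)."""
    name: str
    # True for lights-out cards such as iDRAC, which draw power from the host.
    shares_power_with_node: bool


def can_fence(device: FenceDevice, node_has_power: bool,
              device_feed_ok: bool = True) -> bool:
    """Return True if the surviving node can still reach the fence device.

    A device that shares the host's power supply is only reachable while the
    host has power. An external device (PDU/UPS) is reachable as long as its
    own power feed is intact.
    """
    if device.shares_power_with_node:
        return node_has_power
    return device_feed_ok


idrac = FenceDevice("idrac", shares_power_with_node=True)
pdu = FenceDevice("pdu", shares_power_with_node=False)

# Node loses power completely: the lights-out card goes down with it.
print(can_fence(idrac, node_has_power=False))
# Same failure with a PDU on a separate feed: fencing is still possible.
print(can_fence(pdu, node_has_power=False))
# The PDU itself fails: same dead end as with the lights-out card.
print(can_fence(pdu, node_has_power=False, device_feed_ok=False))
```

The third case is the one Dominik describes below: when the power device itself is down, the failed node cannot be shot, so there is no resource takeover.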
And I don't think we need to handle a UPS or PDU failure ourselves; the manufacturer should take responsibility for that. Am I correct? But yes, this will probably need additional budget.

Anyway, thanks again for your advice. I'm going to do some research on them.

On Thu, Apr 1, 2010 at 6:38 AM, Dominik Klein <[email protected]> wrote:

> Tony Gan wrote:
> > Hi,
> > For a two-node cluster, what are the best STONITH devices?
> >
> > Currently I am using Dell's iDrac for STONITH device. It works pretty well.
> > However the biggest problem for iDrac or any other lights-out devices is
> > that they share power supply with hosts machines.
> >
> > Once an active machine lost its power completely, you want to fail-over to
> > the backup-node in your cluster.
> > But with iDrac as your STONITH device you can not, because the STONITH
> > resource on backup node will run into error (fail to connect to STONITH
> > device, it's out of power too), and refuse to start any resources.
> >
> >
> > I was wondering what kind of STONITH devices everybody is using to solve
> > this problem. And how much are they?
> >
> > Actually Pacemaker's page have a link talking about this:
> > http://www.clusterlabs.org/doc/crm_fencing.html
> >
> > It suggests UPS (Uninterruptible Power Supply) as well as PDU (Power
> > Distribution Unit).
> > Anybody used them before? How well are they integrated with Heartbeat? What
> > are the pros and cons?
>
> Hi
>
> I am using APC PDUs for my clusters.
>
> The setup is like:
>
> power supply circuit 1 -> pdu 1 -> node 1
> power supply circuit 2 -> pdu 2 -> node 2
>
> If a node fails, the corresponding pdu usually is accessible and manageable.
>
> However, if a pdu fails (and they probably can fail in ways we cannot
> really imagine (to quote Dejan)) that renders the same problem as yours.
> The node is down, the stonith device is down, so no resource takeover.
>
> But imho, this is not resolvable. At least I do not know of a way how to.
>
> If a PDU or UPS fails (node down and power device down), then the
> resources for the failed node will not be recovered since the failed
> node cannot be shot.
>
> Regards
> Dominik
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
