On Fri, Aug 12, 2011 at 08:58:05AM +1000, Andrew Beekhof wrote:
> On Thu, Aug 11, 2011 at 9:29 PM, Sam Sun <[email protected]> wrote:
> > Hi All,
> > This is Sam from the Ericsson IPWorks product maintenance team. We have an
> > urgent problem with the Linux HA solution.
> > I am not sure if this is the right mailbox, but it would be much
> > appreciated if anyone can help us.
> > Our product uses SLES 10 SP4 x86_64 with HA version 2.1.4-0.24.9.
> 
> I'd contact SUSE - you pay them to give you their full attention  :-)
> 
> > We have a problem with the STONITH implementation. There are only two
> > nodes in the HA cluster.
> >    If there is a split-brain situation, will both HA nodes shut down
> > their peer at the same time?
> 
> Yes
> 
> >    If we then let STONITH run on only one of the HA nodes, is that a
> > correct configuration?
> 
> No.
> 
> > Is there any best practice for implementing STONITH in an HA cluster that
> > has only two nodes?

I assume you are already aware of http://ourobengr.com/ha

Besides that, you may want to add a random (or node-dependent) delay
to the stonith agent action. During a split brain, this increases the
chance that one node shoots the other before being shot itself.

So e.g. you have nodes A and B, and you modify the stonith agent
to always sleep(x) on node A when shooting node B, but to not do any
sleep on node B when shooting node A.
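
To make the idea concrete, here is a minimal sketch in Python. It is not
taken from any real stonith agent: the node names "A" and "B", the 10 second
value, and the helpers fence_delay() and delayed_fence() are all assumptions
made up for illustration. The shoot_peer callable stands in for whatever
your actual agent does to power off the peer.

```python
import random
import time

# Hypothetical per-node delays in seconds. The node names and the value 10
# are illustrative assumptions, not recommendations.
FENCE_DELAY = {"A": 10.0, "B": 0.0}


def fence_delay(local_node, jitter=0.0, rng=random):
    """Return how long local_node should wait before shooting its peer."""
    base = FENCE_DELAY.get(local_node, 0.0)
    # Optional random component, for the "random timeout" variant.
    extra = rng.uniform(0.0, jitter) if jitter > 0 else 0.0
    return base + extra


def delayed_fence(local_node, shoot_peer, sleep=time.sleep, jitter=0.0):
    """Sleep for the node-dependent delay, then run the real stonith action."""
    sleep(fence_delay(local_node, jitter))
    return shoot_peer()
```

On node A you would call delayed_fence("A", ...), on node B
delayed_fence("B", ...). The sleep argument is only a parameter so the logic
can be exercised without actually waiting.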

If it is an actual node crash, the worst case is that the stonith action
takes x seconds longer. If it is a split brain with both nodes still alive,
chances are that only A will be shot.
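
A toy model of that race, again in Python and purely illustrative: it
assumes both nodes notice the split at the same moment and that the stonith
action itself takes the same time (fence_duration, a made-up parameter) on
both sides. Real clusters are messier than this.

```python
def split_brain_outcome(delay_a, delay_b, fence_duration=1.0):
    """Return the set of nodes that get shot when both are still alive.

    Each node starts its stonith action after its own delay, and the shot
    lands fence_duration seconds later. A node that is shot first never
    completes its own shot.
    """
    a_shoots_at = delay_a + fence_duration  # when A's shot at B lands
    b_shoots_at = delay_b + fence_duration  # when B's shot at A lands
    if a_shoots_at < b_shoots_at:
        return {"B"}
    if b_shoots_at < a_shoots_at:
        return {"A"}
    # Exact tie: both shots land, both nodes go down.
    return {"A", "B"}


# With no delay on either side, the model ends in a mutual kill.
print(split_brain_outcome(0.0, 0.0))
# With A sleeping 10 seconds, B shoots first and only A goes down.
print(split_brain_outcome(10.0, 0.0))
```

The point is only that any difference between the two delays that exceeds
the timing noise breaks the tie in favour of the node that does not sleep.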

Typically the node that was DC before the split brain will have a slight
advantage anyway, so both nodes simultaneously "successfully" shooting each
other should not be that common.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
