Hi,

On Wed, Feb 20, 2008 at 03:30:44PM -0500, Blechman, Ronald I, Jr (Ron) wrote:
> I want to solicit the group's advice on the following proposed
> application of heartbeat.
> Hopefully this explanation will be clearer than my last posting on
> this!....
> 
> We are considering using Heartbeat 1.x to manage two geographically
> separated systems, which are nevertheless on a single subnet using some
> sort of tunneling technology.

Heartbeat uses UDP/IP for communication, so it works over a WAN.
You should also tune the communication timing settings to match
the WAN's characteristics (packet loss, latency).
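
To illustrate the kind of tuning meant here, a minimal ha.cf sketch
follows. The directive names are standard heartbeat ones; every
value, the interface name and the peer address are placeholders I
made up, so measure the link and adjust before using any of it.

```
# ha.cf fragment: timing for a lossy, high-latency link
# (all values are illustrative assumptions, not recommendations)

# seconds between heartbeat packets
keepalive 2
# log a "late heartbeat" warning after this many seconds
warntime 10
# declare the peer dead after this many seconds of silence
deadtime 30
# extra allowance while the network comes up at boot
initdead 120

# default heartbeat UDP port
udpport 694
# unicast to the peer; eth0 and 192.0.2.2 are hypothetical
ucast eth0 192.0.2.2
```

A deadtime that is generous relative to the worst latency seen on the
tunnel trades slower failover for fewer false "peer is dead"
decisions, which matters more than usual with a single path.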

> There is no possibility of using multiple redundant communication paths
> between the two instances of heartbeat - there is only one communication
> path.
> We are NOT using drbd or other tightly-coupled replication system, so
> split brain, while undesirable, is not catastrophic.
> In fact, we would rather the system go into split brain than have both
> servers remain in backup mode.
> 
> Now the question:
> We would like to prevent a server from being primary if its ping node is
> unreachable.
> In other words, we want the server to be primary if and only if:
>   1) the partner is dead and the ping node is alive
>   2) or, if the partner is alive but in backup mode.
> What is the best way of accomplishing this?

Have you looked at ipfail?

> It appears, if we don't have a stonith module, that heartbeat makes each
> server go primary as soon as it sees that it cannot communicate with its
> partner.

Every cluster partition which has quorum tries to start resources.
In a two-node cluster, both nodes have quorum in a split brain, so
both will try to run the resources.

> We have considered writing a stonith module that will return success
> only if it can ping its ping-node.
> Since the stonith module would not actually shoot anyone -- it
> technically would be returning a "false-positive", but I'm not so sure
> this matters in our environment.
> 
> Is this possible?  Is this an abuse of stonith?  Is this advisable?  Is
> there another hook that we could use to accomplish the same thing?
> 
> Is this whole approach ill advised?  

A split-site cluster is a difficult proposition and very hard to
get right, in particular because one usually can't rely on stonith.
Since you're not concerned about having more than one instance of
the resources running (I hope that case has been thoroughly
tested), you could go ahead with ipfail.
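
For reference, a minimal sketch of enabling ipfail in ha.cf. The
ping address is a placeholder and the ipfail path is an assumption,
since it differs between distributions (for example a lib64
directory on some systems).

```
# ha.cf fragment: ipfail with one ping node
# (address and path are hypothetical, adjust for your site)

# ping node that both sites should normally be able to reach
ping 192.0.2.254

# have heartbeat start and supervise ipfail as user hacluster
respawn hacluster /usr/lib/heartbeat/ipfail

# as far as I recall, ipfail wants auto_failback set explicitly
# to on or off rather than legacy
auto_failback off
```

ipfail compares how many ping nodes each side can reach and asks
heartbeat to move the resources to the better-connected node.
Whether that gives you exactly the two conditions you listed when
the single link between the sites is down is something to test
rather than assume.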

Thanks,

Dejan

> Ron Blechman | Distinguished Member of Technical Staff | Avaya |
> 307 Middletown Lincroft Road | Room 3K-305 | Lincroft, NJ 07738 |
> Voice 732.852.2310 | Fax 732.852.1375 | [EMAIL PROTECTED]
> 
>  
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems