Hello,

I have installed Heartbeat on two PCs (let's call them node1 and node2)
running openSUSE 10.3. The aim of the cluster is high availability for
an application, let's call it AppliA. If a node fails, everything works fine,
i.e. AppliA starts on the other node and the service is still available. My
problem concerns the case where a LINK failure happens.
Here is what I observed:
(Node1 and node2 communicate with each other through my local network)
- Node1 is running as master and currently has control of AppliA. Node2
is on standby, waiting for a failure.
- I disconnect node1 from the network.
- Node2 detects it and starts AppliA. That's good. But node1, which is
completely isolated, is also still running it.
- I reconnect node1. Since AppliA is running on both nodes, I guess there is
some kind of conflict, so Heartbeat shuts down AppliA on BOTH nodes and then
restarts it on node1.

The last point is my problem. I would like node1 to detect that it has a link
problem and shut down AppliA. Then, when the link is up again, I want node1
not to start AppliA but to wait for a possible failure of node2. In short,
I would like to avoid the service being interrupted unnecessarily.
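
To make the behaviour I am after more concrete, here is a small Python
sketch. It is only a model of the decision I would like each node to take,
not Heartbeat code: the function name, its arguments and the returned
strings are all made up for illustration.

```python
def desired_action(link_up: bool, running_here: bool, peer_alive: bool) -> str:
    """Hypothetical model of the wanted behaviour (not Heartbeat's API).

    link_up      -- this node still gets answers from the ping node
    running_here -- AppliA is currently running on this node
    peer_alive   -- the other node is known to be up (only meaningful
                    while link_up is True)
    """
    if not link_up:
        # Isolated node: give up AppliA so that only the peer runs it.
        return "stop" if running_here else "stay_stopped"
    if running_here:
        # Link is fine and we own the service: nothing to do.
        return "keep_running"
    # Link is fine and AppliA is not running here:
    # take over only if the peer has failed, otherwise stay on standby.
    return "stay_stopped" if peer_alive else "start"
```

With this model, node1 would stop AppliA once it is unplugged
(desired_action(False, True, False)), stay stopped after being reconnected
while node2 is healthy (desired_action(True, False, True)), and start AppliA
again only if node2 fails (desired_action(True, False, False)).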

I have made the following configuration (here is the part I think is the
most important):
##########################################
auto_failback off

respawn hacluster /usr/lib/heartbeat/ipfail

ping 192.168.1.253
#########################################

(When I disconnect node1 from the network, it detects that it no longer
receives a response to the ping (according to the log file), but it seems to
interpret this as "the link to 192.168.1.253 is dead but I am not".)

I had a look at the website and thought the previous configuration would be
enough, but obviously it is not. Could somebody tell me how to force node1 to
shut down AppliA when it gets no answer to the ping, and to wait for a failure
of node2 before starting AppliA again?


Thanks a lot for your help,
Best regards,
Carole
