Dear $ALL

I've just started using heartbeat, which seems like a really nifty
program overall. I'm not sure why I didn't start using it much earlier,
because it's really great.

I have a couple of beginner questions, though. My test setup is two
nodes, test1 and test2, sitting behind a router, router0. The two
machines currently talk to each other through the router, but they
can/will get a dedicated Ethernet link between them (crossover cable).

1) I'm using the following settings:

keepalive       200ms
deadtime        1000ms

No matter what kind of load I put on the machines, this never seems to
break down. Timings this tight allow me to use a 5-second failover time
for an HA-NFS server. My question is this: Is there some (perhaps
non-obvious) reason this might be a bad idea? All the documentation
suggests higher times, so I'm wondering.
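
For reference, this is roughly the timing block I have in mind for
ha.cf. The first two settings are the ones I use now; warntime and
initdead are additions I'm only assuming would help (warntime should
log a "late heartbeat" warning before deadtime is reached, so the logs
would show how close these timings come to a false failover), and the
values I picked for them are guesses:

# heartbeat interval
keepalive       200ms
# peer is declared dead after 5 missed heartbeats (5 x 200ms)
deadtime        1000ms
# assumed value: log a warning once 3 heartbeats are late
warntime        600ms
# assumed value: extra grace period (seconds) right after startup
initdead        30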

2) If I unplug eth0 from test1, the cluster ends up split-brained,
because neither node can decide whether to become primary or to fail.
I've read that "ipfail" can be used to detect missing-link situations
and react differently. Can anyone point to some examples, or help me
set it up? And is it even the right tool?
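
From what I've read so far, an ipfail setup in ha.cf might look like
the sketch below. The router address is a placeholder I made up, and
the path to ipfail probably differs between distributions, so treat
both as assumptions:

# ping node: a host both cluster nodes should always be able to reach
# (placeholder address standing in for router0)
ping            192.168.1.1
# start ipfail together with heartbeat and restart it if it exits
respawn         hacluster /usr/lib/heartbeat/ipfail

As far as I understand, the node that loses contact with the ping node
should then hand its resources to the node that can still reach it. I
assume this only helps once the two nodes also have a second heartbeat
path (the dedicated link), since otherwise they cannot tell each other
what they see.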

3) If I forkbomb test1, it is (of course) completely dead service-wise,
but still sending out heartbeats(!). I've read that a service monitoring
daemon can solve this by checking that (say) NFS still responds within
a reasonable time. Can someone recommend examples or documentation? Or
can someone help set this up? :)

Thanks in advance.

-- 
Kind regards
Christian Iversen
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
