On Thursday 23 August 2007, mingdao lu wrote:
> hi, all
>
> In my system there are 3 nodes in a cluster. Node A and node B are the
> working nodes. Node C is the spare.
> If either node A or node B goes down, node C will take over for it.
> Could you tell me how to configure deadtime, warntime and keepalive
> to keep the "takeover time" within one second?
You need to provide more context. Since A and B hold the resources, 
Heartbeat has to perform quite a few operations during a takeover. 

Heartbeat can do a lot of things, but you need to check whether what you 
tell Heartbeat to do is actually safe.

Assume A is declared as "dead" and C should take over.

Short version:
1.) It shoots down A with STONITH (you really should have STONITH).
2.) After the STONITH has been confirmed as successful, it "starts" the 
resources on C (see the comment below about "starts").

Since you have to wait for a successful STONITH (which usually takes longer 
than 1 second), you have already lost.

You could run step 2 without waiting for the STONITH. Whether this is safe 
depends on the resources you have to relocate. If they share common data you 
are out of luck, because then you have to wait for the STONITH to guarantee 
that A is really down.

About "starts": IMHO there is no way to meet this requirement without help at 
the application level. This means node C runs the resource (application) in 
hot standby, and all it needs is a "go" from the resource agent's start 
operation.

Don't forget that moving a resource usually implies moving a virtual IP 
address. This includes telling the switches that the IP address has moved, 
which also takes some time.
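
To make the one-second budget concrete, here is a rough Python sketch that 
adds up the phases described above. Every duration in it is a made-up 
placeholder, not a measurement, and the phase names are only labels I chose 
for illustration; put in numbers from your own cluster.

```python
# Rough takeover-time budget. Every number below is a hypothetical
# placeholder, not a measurement from a real cluster.
PHASES_SECONDS = {
    "failure detection (deadtime)": 1.0,
    "STONITH confirmed": 5.0,
    "resource start on C": 2.0,
    "virtual IP move / switches relearn": 1.0,
}


def takeover_time(phases, wait_for_stonith=True):
    """Sum the phases; optionally skip waiting for STONITH confirmation."""
    return sum(
        seconds
        for name, seconds in phases.items()
        if wait_for_stonith or not name.startswith("STONITH")
    )


if __name__ == "__main__":
    for wait in (True, False):
        total = takeover_time(PHASES_SECONDS, wait_for_stonith=wait)
        print(f"wait_for_stonith={wait}: {total:.1f}s")
```

Even with the STONITH wait skipped (which is only safe without shared data), 
the remaining phases still have to fit into the budget on their own.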

What you really need is load balancing: A and B provide a resource together, 
and if one fails the other can carry the load for a short time. In the 
meantime you start up the resource on C to take over the load from the 
failed node ----> this means the takeover does NOT need to happen within 
seconds.

If one server cannot handle the load of two servers for a short time, you 
have to change to the following configuration: load balance across A, B and 
C, and put a D on standby.
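
The sizing rule behind this can be sketched in a few lines of Python. The 
function names and the load/capacity figures in the usage note are 
hypothetical and use arbitrary units (for example requests per second); 
this is only an illustration of the reasoning, not something Heartbeat 
provides.

```python
def survives_one_failure(total_load, active_nodes, capacity_per_node):
    """True if the remaining active nodes can carry the load after one fails."""
    if active_nodes < 2:
        return False
    return total_load <= (active_nodes - 1) * capacity_per_node


def active_nodes_needed(total_load, capacity_per_node):
    """Smallest number of active nodes that tolerates one node failure."""
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    nodes = 2
    while not survives_one_failure(total_load, nodes, capacity_per_node):
        nodes += 1
    return nodes
```

For example, with a hypothetical load of 150 and a per-node capacity of 100, 
active_nodes_needed(150, 100) gives 3 active nodes (A, B and C), and the 
spare (D) comes on top of that.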

The result remains the same: the failover takes way longer than 1 second, 
but seen from the outside nothing happened.

kind regards Max
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
