On Thursday 23 August 2007, mingdao lu wrote:
> hi, all
>
> In my system, there are 3 nodes in a cluster. Node A and Node B are
> working nodes. Node C will be the spare one.
> If either node A or node B is down, node C will take over for it.
> Could you tell me how to configure the deadtime, warntime and keepalive
> time to keep the "takeover time" in one second?

You need to provide more context. Since A and B are holding the resources, heartbeat has to do a lot of operations.
Heartbeat can do a lot of things, but you need to take care that what you tell heartbeat to do is safe. Assume A is declared "dead" and C should take over. Short version:

1.) it shoots down A with STONITH (you really should have a STONITH)
2.) after the STONITH has been successfully confirmed, it "starts" the resources on C
    (see the comment below about "start")

Since you have to wait for a successful STONITH (which usually takes longer than 1 second), you have already lost.

You could run 2. without waiting for the STONITH. Whether this is safe depends on the resources you have to relocate. If they share common data you will be out of luck, because then you have to wait for the STONITH to guarantee safe "starts".

---> IMHO there is no way to meet this requirement without help at the application level. This means your node C runs the resource (application) in hot standby, and all it needs is a "go" from the resource agent's start operation.

Don't forget that moving a resource usually implies moving a virtual IP address. This includes telling the switches that the IP address has moved, which also takes some time.

What you really need is load balancing, meaning A and B provide a resource together, and if one fails the second one can take over the load for a short time. In the meantime you start up the resource on C to take over the load from the erroneous node.

----> with this setup the takeover does NOT need to happen within seconds.

If one server cannot handle the load of two servers for a short time, you have to change to the following configuration: load balance on A, B and C and put a D on standby.

The result remains the same: the failover takes way longer than 1 second, but from the outside view nothing happened.

kind regards
Max
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
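To make the timing argument concrete, here is a rough sketch in Python that adds up the serial steps of a cold takeover (detect the dead node, STONITH, start the resources, move the virtual IP). It is not part of heartbeat. The class name, field names and every number in it are assumptions picked for illustration only; measure your own cluster before trusting any of them.

```python
from dataclasses import dataclass


@dataclass
class FailoverBudget:
    """Serial steps of a cold takeover. All values are illustrative assumptions."""
    deadtime_s: float        # time until the peer is declared dead
    stonith_s: float         # time until the STONITH is confirmed
    resource_start_s: float  # time for the resource agent's start operation
    ip_move_s: float         # time until the switches learn the moved IP address

    def serial_total(self) -> float:
        # The steps run one after another, so the delays simply add up.
        return (self.deadtime_s + self.stonith_s
                + self.resource_start_s + self.ip_move_s)


def meets_target(budget: FailoverBudget, target_s: float = 1.0) -> bool:
    """Return True if the whole takeover fits into the target time."""
    return budget.serial_total() <= target_s


# Assumed example values, not measurements from any real cluster.
example = FailoverBudget(deadtime_s=0.5, stonith_s=3.0,
                         resource_start_s=2.0, ip_move_s=0.5)

print(f"takeover needs about {example.serial_total():.1f}s, "
      f"fits in 1s: {meets_target(example)}")
```

Even with an aggressive deadtime, the STONITH and start steps dominate the total under these assumptions, which is why tuning deadtime, warntime and keepalive alone cannot get the takeover down to one second.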
