>>> Lars Marowsky-Bree <[email protected]> wrote on 15.05.2013 at 14:08 in
message <[email protected]>:
> On 2013-05-15T14:00:45, Ulrich Windl <[email protected]> wrote:
>
>> If the node is going to be fenced, try to stop or migrate all local
>> resources until each was successful or timed out. THEN reset the node
>>
>> How could it be done? It sounds like a reasonable default to me...
>
> We did that once. A faulty node then will end up running into cascaded
> timeouts for possibly hours.
>
> All resources are *required* to survive a fence, otherwise it wouldn't
> be reliable. So the fence doesn't actually hurt.

...except killing innocent resources. Remember: High availability is not
about restarting resources frequently, but about keeping them up and
running whenever possible. In HP-UX ServiceGuard there is (AFAIR) a
"node fail fast" parameter that fences the node without first trying to
stop resources properly. We always had it turned off...

Regards,
Ulrich
