On 2011-07-07 13:52, James Smith wrote:
> Hi,
>
> I appreciate that, but it doesn't answer the question.

Then maybe I misunderstood the question. I had interpreted it to mean
"why doesn't my cluster automatically fail over under high load?" --
perhaps you can rephrase to clarify.

> What I'm getting at, is there are multiple scenarios where a system
> can fail but in my test scenario I was forcing high load. My application
> wouldn't, in a working scenario, ever cause this type of load unless there
> was a very serious issue that would warrant failover.

Er, how can you be so sure? What if you just had a ton of users (or
client services) hammering your application? That would cause high
load, but it would clearly _not_ warrant failover -- after failing
over, the other node would be hammered just as much.

> So in this scenario I
> want pacemaker to be able to handle this accordingly without the
> need to configure additional services entirely separate to the working of
> pacemaker.

Now please define how exactly Pacemaker would be handling this
"accordingly."

> For example, it's easy to assume the monitor operations on the RA's can
> handle this already. The slave should be initiating a monitor operation
> against the master to see if it's services are still responding.

I'm afraid you're missing the fact that in Pacemaker "a slave" does not
initiate a monitor operation "against the master". What makes you think
that it does? Monitor operations are always run locally. Only very few
resource agents are configurable as master/slave sets. _Some_ of those
can be configured to have a slave contact a master during monitoring
(like ocf:heartbeat:mysql), some never do (like ocf:linbit:drbd).

> But it seems only the
> master does this,

No. All nodes do.

> but of course the master is foobared so never responds,
> so failover never occurs. Surely I'm not the only one that sees this as
> rather flawed?

So what would your preferred behavior be? Pacemaker failing over in
case load is high?

That's a possibility and could be done via the system health feature
and an appropriate resource agent. But even if that happens, you stand
a pretty good chance -- even though I realize you don't believe this --
that it is your application that causes this high load, and then
failover makes matters worse, not better.

Cheers,
Florian
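
For reference, here is a minimal sketch of what "the system health feature and an appropriate resource agent" might look like in crm shell syntax. It assumes the ocf:pacemaker:HealthCPU agent is installed. The resource names (p_health_cpu, cl_health_cpu) and the threshold values are purely illustrative, and the parameter names are given as I understand them, so check `crm ra info ocf:pacemaker:HealthCPU` before relying on them. Note that this agent judges health by CPU idle percentage, not by load average.

```
# Assumption: move resources away from any node whose health is "red".
property node-health-strategy="migrate-on-red"

# Hypothetical names and thresholds. The limits are *idle* CPU
# percentages: below yellow_limit the node is marked yellow,
# below red_limit it is marked red.
primitive p_health_cpu ocf:pacemaker:HealthCPU \
    params yellow_limit="50" red_limit="20" \
    op monitor interval="10s"

# Run the health check on every node; each instance only measures
# the node it runs on, like any other monitor operation.
clone cl_health_cpu p_health_cpu
```

The caveat from the reply still applies: if the application itself generates the load, the node taking over is likely to turn red as well.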
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems