On 2011-07-07 13:52, James Smith wrote:
> Hi,
> 
> I appreciate that, but it doesn't answer the question.

Then maybe I misunderstood the question. I had interpreted it to mean
"why doesn't my cluster automatically fail over under high load?" --
perhaps you can rephrase to clarify.

> What I'm getting at, is there are multiple scenarios where a system 
> can fail but in my test scenario I was forcing high load.  My application 
> wouldn't, in a working scenario, ever cause this type of load unless there 
> was a very serious issue that would warrant failover.

Er, how can you be so sure? What if you just had a ton of users (or
client services) hammering your application? That would cause high
load, but it would clearly _not_ warrant failover -- since after you
fail over, the other node would be hammered just as much.

> So in this scenario I 
> want pacemaker to be able to handle this accordingly without the 
> need to configure additional services entirely separate to the working of 
> pacemaker.

Now please define how exactly Pacemaker would be handling this
"accordingly."

> For example, it's easy to assume the monitor operations on the RAs can
> handle this already.  The slave should be initiating a monitor operation
> against the master to see if its services are still responding.

I'm afraid you're missing the fact that in Pacemaker "a slave" does not
initiate a monitor operation "against the master." What makes you think
that it does? Monitor operations are always run locally. Only very few
resource agents are configurable as master/slave sets. _Some_ of those
can be configured to have a slave contact a master during monitoring
(like ocf:heartbeat:mysql), some never do (like ocf:linbit:drbd).

> But it seems only the 
> master does this,

No. All nodes do.

> but of course the master is foobared so never responds,
> so failover never occurs.  Surely I'm not the only one that sees this as
> rather flawed?

So what would your preferred behavior be? Pacemaker failing over when
load is high? That's possible, and could be done via the system health
feature and an appropriate resource agent. But even then, there is a
pretty good chance -- even though I realize you don't believe this --
that it is your application that causes the high load, in which case
failover makes matters worse, not better.
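
To make that last point concrete, here is a toy model in Python. It is
only a sketch and not Pacemaker code: the function names and the idle
thresholds are made up for illustration, and the "move resources off a
red node" rule merely imitates the kind of policy a health-based setup
might apply. It assumes the application itself generates the load.

```python
# Toy model only; this is NOT Pacemaker code. All names and thresholds
# below are hypothetical and chosen purely for illustration.

RED_IDLE_LIMIT = 10.0     # assumed: below this % idle CPU a node is "red"
YELLOW_IDLE_LIMIT = 50.0  # assumed: below this % idle CPU a node is "yellow"


def health_color(idle_percent: float) -> str:
    """Map a node's CPU idle percentage to a health color."""
    if idle_percent < RED_IDLE_LIMIT:
        return "red"
    if idle_percent < YELLOW_IDLE_LIMIT:
        return "yellow"
    return "green"


def should_move_resources(idle_percent: float) -> bool:
    """Imitate a policy that moves resources away from a red node."""
    return health_color(idle_percent) == "red"


def idle_after_failover(app_load_percent: float) -> float:
    """Idle CPU on the new node, assuming the application's load moves
    with it and the new node was otherwise idle."""
    return 100.0 - app_load_percent


if __name__ == "__main__":
    idle_now = 5.0  # the overloaded node is 95% busy
    print("before:", health_color(idle_now), should_move_resources(idle_now))
    # The load follows the application to the other node.
    idle_then = idle_after_failover(100.0 - idle_now)
    print("after: ", health_color(idle_then), should_move_resources(idle_then))
```

Under these assumptions the target node turns red the moment it takes
over, so a load-triggered failover just moves the problem around.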

Cheers,
Florian


_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
