Christian Rishøj wrote:
On 11/6/07, Andreas Mock <[EMAIL PROTECTED]> wrote:
Hi all,
after many days without dopey questions from me, I'm back. ;-)
All questions concern HAv2 (2.1.2-x).
Is the resource agent action 'monitor' run with a higher
OS process priority?
I'm asking because we saw the following sequence of events:
1) (Very) High load on server
2) Monitor action initiated
3) Monitor action timed out because of high load
4) Stopping of resource initiated because of resource failover
5) Stop action timed out
I've been seeing the same pattern of events, and my solution has been
to try harder to keep the load levels low, e.g. preventing table locks
in the database, throwing out resource-hungry OCFS2, etc. However, my
precautions merely raise the threshold, and do not prevent the
problem.
What might a general solution be? Letting Heartbeat ignore monitor
timeouts during periods of excessive load?
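
To make the idea concrete, here is a minimal sketch of a load-aware decision about monitor timeouts. It is not a Heartbeat feature or API. The function names and the `max_load_per_cpu` threshold are hypothetical, chosen only for illustration, and the sketch assumes a Unix-like system where `os.getloadavg()` is available.

```python
import os


def current_load_per_cpu():
    """Return the 1-minute load average divided by the CPU count (Unix only)."""
    load1, _, _ = os.getloadavg()
    return load1 / (os.cpu_count() or 1)


def should_count_timeout(load1, cpu_count, max_load_per_cpu=4.0):
    """Decide whether a monitor timeout should count as a resource failure.

    Hypothetical policy: if the per-CPU load is above the threshold, the
    timeout is attributed to overload and ignored; otherwise it counts.
    The default threshold of 4.0 is an arbitrary illustrative value.
    """
    return (load1 / cpu_count) <= max_load_per_cpu
```

The obvious risk with such a policy is that a resource which really has failed under heavy load would go undetected for as long as the load stays high.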
When you look at highly reliable systems like telephone switches and
similar things, they strictly limit load and put strict upper bounds on
resource consumption to ensure that they remain reliable.
They do this by shedding load when the load gets too high.
In the absence of load shedding strategies, I think that long timeouts
are the right answer.
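
As a rough illustration of load shedding with a strict upper bound, the sketch below rejects new work once a fixed number of requests is in flight. The `LoadShedder` class and its `max_in_flight` parameter are hypothetical names, not part of Heartbeat or of any telephone switch software.

```python
import threading


class LoadShedder:
    """Admit work only while the number of in-flight requests is below a cap."""

    def __init__(self, max_in_flight):
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        """Return True and reserve a slot, or return False to shed the request."""
        with self._lock:
            if self._in_flight >= self._max:
                return False
            self._in_flight += 1
            return True

    def release(self):
        """Free a slot once a previously admitted request has finished."""
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1
```

A caller would wrap each unit of work in `try_acquire()` and `release()`, and refuse or defer the work when `try_acquire()` returns False, so that the load seen by monitor actions stays bounded.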
--
Alan Robertson <[EMAIL PROTECTED]>
"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions." - William
Wilberforce