Jens Mayer wrote:
Dear all,
as I set up a 2-node cluster using heartbeat, I encountered an annoying
problem related to the monitoring of drbd.
Every few weeks, the drbd resource monitor times out and forces a failover.
Hi Jens,
without knowing why 10 seconds aren't enough to monitor drbd, I recommend
setting the monitor timeout much higher.
We have had some discussions on this list about proper action timeout values.
The conclusion was: better to set it too long than too short. Why? When a
timeout happens, the cluster knows NOTHING. Can you really estimate the
worst-case duration of a monitor operation under heavy load? How long can you
afford to wait for an answer about the status of the resource? Which do you
prefer: an unnecessary failover, or a failover that happens a few seconds
later than it otherwise could have?
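
To make the "too long rather than too short" rule a bit more concrete, here
is a rough Python sketch of how one might derive a generous timeout from
monitor run times measured on one's own nodes. The function name, the safety
factor of 3 and the sample durations are all made up for illustration; this is
not part of heartbeat or drbd, and the right margin depends on your load.

```python
import math


def suggest_monitor_timeout(observed_seconds, safety_factor=3.0, floor_seconds=10):
    """Suggest a monitor timeout in whole seconds.

    observed_seconds: monitor run times you measured yourself (hypothetical input).
    safety_factor:    assumed headroom multiplier over the worst run seen so far.
    floor_seconds:    never suggest less than this (10 s, the value that proved too short).
    """
    if not observed_seconds:
        raise ValueError("need at least one observed duration")
    worst = max(observed_seconds)
    # Round up so the suggestion is never tighter than worst * safety_factor.
    return int(max(floor_seconds, math.ceil(worst * safety_factor)))


# Hypothetical measurements: mostly fast, one slow run under heavy load.
samples = [0.4, 0.6, 0.5, 7.8, 0.7]
print(suggest_monitor_timeout(samples))
```

The worst run seen so far is only a lower bound on the true worst case, which
is why the sketch errs on the long side instead of trying to be precise.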
Best regards
Andreas Mock
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems