Jens Mayer wrote:
Dear all,

while setting up a two-node cluster using heartbeat, I ran into an annoying
problem related to the monitoring of drbd.
Every few weeks, the drbd resource monitor times out and forces a failover.

Hi Jens,

without knowing why 10 seconds are not enough to monitor drbd, I recommend
setting the monitor timeout much higher.
We have had some discussions on this list about proper action timeout values.
The conclusion was: better to set it too long than too short. Why? When a
timeout occurs, the cluster knows NOTHING about the state of the resource.
Can you really estimate the worst-case duration of a monitor run under heavy
load? How long can you afford to wait for an answer about the status of the
resource? Which do you prefer: a failover that was not necessary, or
a failover that happens a few seconds later than it could have?
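
To make the "too long rather than too short" reasoning concrete, here is a minimal sketch of one way to pick a value from monitor run times you have observed yourself. The function name, the safety factor of 3, and the 30-second floor are all illustrative assumptions, not values from heartbeat or drbd; adjust them to how long you can afford to wait.

```python
import math


def suggest_monitor_timeout(samples_s, safety_factor=3.0, floor_s=30.0):
    """Suggest a monitor timeout in seconds from observed monitor run times.

    samples_s:      durations of past monitor runs in seconds (your own data).
    safety_factor:  headroom multiplier on the worst observed run (assumption).
    floor_s:        minimum timeout to return regardless of samples (assumption).
    """
    if not samples_s:
        raise ValueError("need at least one observed monitor duration")
    # The worst observed run is only a lower bound on the true worst case,
    # so multiply it by a generous factor instead of using it directly.
    worst = max(samples_s)
    return max(floor_s, math.ceil(worst * safety_factor))


# Hypothetical measurements: mostly fast, one slow run under heavy load.
print(suggest_monitor_timeout([0.4, 1.2, 14.0]))
```

The result is only a starting point for the timeout of the monitor operation in your cluster configuration; erring further on the high side costs a few seconds of detection time, while erring on the low side costs an unnecessary failover.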

Best regards
Andreas Mock

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
