Hi,

On Wed, Jun 04, 2008 at 08:37:50PM +0100, Rodrigo Borges Pereira wrote:
> Hello,
>
> I have a two node cluster that occasionally has a weird behavior. The
> cluster runs a number of Xen VM's with virtual disk files on top of a DRBD
> device. Every night backups are done of each of the VM's, via rsync/ssh.
> Sometimes, the load this generates causes hb to try to failover.

Why?

> Then for
> some reason it fails to do so,

Logs should say why it fails.

> and stays on the primary node. So all the
> VM's shutdown and then boot again, on the same node.
>
> I'm pretty sure this has to do with timeout definitions, but what would be
> the best locations to tune that?

Your feeling may be right, but only logs could give us the whole
story. If you're seeing "late heartbeat" messages or "node dead"
or "node returning after partition" then you definitely need to
adjust timing (keepalive, warntime, and deadtime). Note that the
wording of warnings may be different. Otherwise, if the monitor
operations are timing out, you should adjust the timeouts in the
CIB.

Thanks,

Dejan

>
> TIA,
> Rodrigo
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
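
For reference, a minimal sketch of where the three timing directives
named above (keepalive, warntime, deadtime) would go in ha.cf. The file
path and all the values here are illustrative assumptions, not settings
taken from this cluster; the logs should decide the real numbers.

```
# /etc/ha.d/ha.cf (timing excerpt; path and values are assumptions)

# Seconds between heartbeat packets sent to the peer.
keepalive 2

# Seconds without a heartbeat before a late-heartbeat warning is logged.
warntime 10

# Seconds without a heartbeat before the peer is declared dead.
deadtime 30
```

One possible approach, if late heartbeats do show up during the nightly
backups: keep deadtime generous at first, see how late the heartbeats
actually get under backup load, and then choose a deadtime comfortably
above the worst delay observed. Monitor operation timeouts are a
separate setting and live in the CIB, not in ha.cf.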
