Hi there. This is my first post to this list, as I haven't had problems with heartbeat until now :)
We have a dual-server fail-back configuration in place, in which the two servers have identical resources (nfs, drbd...). Last week I upgraded a system, replaced one of the servers with a virtual machine, and installed the latest version of heartbeat available via yum (3.0.4). Since then I'm having a lot of problems with "Late heartbeat" warnings and nodes falsely declared dead. Before, we could run with a "Dead time" of 10 sec, while now 30 is not enough.

Looking into the log files I found the following entry, among other similar ones:

"Gmain_timeout_dispatch: Dispatch function for send local status was delayed 30590 ms (> 1010 ms) before being called (GSource: 0x14209a0)"

I guess it means that for some reason the function call took over 30 seconds?? In my understanding this number is at least three orders of magnitude higher than any acceptable value, even under the worst machine load scenarios.

Is there a known problem with this version of heartbeat? Or does anybody experience this kind of problem when running on a virtual machine (ESXi 5.0)?

Thanks a lot for any help.

Cheers
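
P.S. In case it helps anyone compare numbers, here is a minimal sketch of how the delay and threshold could be pulled out of log lines like the one quoted above. The regular expression is an assumption based only on that single entry, and the names `DELAY_RE` and `parse_delay` are my own, not part of heartbeat; other versions may word the message differently.

```python
import re

# Pattern assumed from the single log entry quoted above; it is not taken
# from heartbeat's source, so other versions may need a different regex.
DELAY_RE = re.compile(
    r"Dispatch function for (?P<name>.+?) was delayed "
    r"(?P<delay>\d+) ms \(> (?P<limit>\d+) ms\)"
)

def parse_delay(line):
    """Return (function name, delay in ms, threshold in ms), or None."""
    m = DELAY_RE.search(line)
    if m is None:
        return None
    return m.group("name"), int(m.group("delay")), int(m.group("limit"))

sample = (
    "Gmain_timeout_dispatch: Dispatch function for send local status "
    "was delayed 30590 ms (> 1010 ms) before being called "
    "(GSource: 0x14209a0)"
)

parsed = parse_delay(sample)
if parsed is not None:
    name, delay_ms, limit_ms = parsed
    # Ratio of the observed delay to the threshold reported in the message.
    print(name, delay_ms, limit_ms, round(delay_ms / limit_ms, 1))
```

Running something like this over the whole log would show whether the 30-second delay is a one-off or a recurring pattern.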
