Hi,

Thanks a lot for the idea. In fact we are using ESXi, which itself uses
very few resources. Moreover, we are allocating only half of the host's
memory and CPU cores to the VM, so the physical machine is pretty relaxed :)
I'm even afraid that it might be entering a power-saving mode, and that
this is what causes these problems. Is there any chance that's it?
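
One way to check whether the guest is really being paused (rather than
heartbeat itself being slow) would be a small probe run inside the VM. The
sketch below sleeps in short steps and records every wake-up that arrives much
later than requested. The function name, the interval and the threshold are my
own assumptions, not anything taken from heartbeat, and it only works if the
guest clock keeps advancing (or catches up) while the VM is descheduled.

```python
import time


def detect_stalls(interval=0.1, threshold=1.0, duration=60.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Sleep in short steps and record wake-ups that arrive late.

    Returns a list of (timestamp, lateness_seconds) tuples, one for each
    wake-up that overshot the requested interval by more than `threshold`.
    `clock` and `sleep` are injectable so the logic can be tested without
    waiting in real time.
    """
    stalls = []
    end = clock() + duration
    last = clock()
    while last < end:
        sleep(interval)
        now = clock()
        # How much longer than requested did we actually sleep?
        lateness = (now - last) - interval
        if lateness > threshold:
            stalls.append((now, lateness))
        last = now
    return stalls
```

Calling detect_stalls(duration=3600.0) and comparing the reported stalls with
the timestamps of the "Late heartbeat" warnings should show whether the two
line up. If they do, the delay is coming from outside heartbeat.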

Regardless of the cause, what is the best way to go about debugging the
problem?
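
As a starting point, the delay warnings could be pulled out of the log so that
their size and frequency are easier to see. This is only a sketch: the regular
expression is modelled on the single log entry quoted below, so other variants
of the message may not match, and the helper name find_delays is hypothetical.

```python
import re

# Pattern modelled on the one quoted entry:
# "... Dispatch function for send local status was delayed 30590 ms (> 1010 ms) ..."
DELAY_RE = re.compile(
    r"Dispatch function for (?P<what>.+?) was delayed "
    r"(?P<delay>\d+) ms \(> (?P<limit>\d+) ms\)"
)


def find_delays(lines):
    """Yield (what, delay_ms, limit_ms) for every line that matches."""
    for line in lines:
        match = DELAY_RE.search(line)
        if match:
            yield (match.group("what"),
                   int(match.group("delay")),
                   int(match.group("limit")))


# The entry from my log, joined back onto a single line.
sample = ("Gmain_timeout_dispatch: Dispatch function for send local status was "
          "delayed 30590 ms (> 1010 ms) before being called (GSource: 0x14209a0)")

for what, delay_ms, limit_ms in find_delays([sample]):
    print(f"{what}: delayed {delay_ms} ms (limit {limit_ms} ms)")
```

Passing an open log file instead of [sample] would process the whole log; the
file location depends on the logfile setting in ha.cf.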

Cheers
Fernando

On 18 September 2012 17:12, David Lang <[email protected]> wrote:

> On Tue, 18 Sep 2012, Dejan Muhamedagic wrote:
>
> > Hi,
> >
> > On Tue, Sep 18, 2012 at 11:18:54AM +0200, Fernando Pereira wrote:
> >> Hi there.
> >> This is my first post to this list, as I haven't had problems with
> >> heartbeat until now :)
> >>
> >> We have a dual server fail-back configuration in place, in which the two
> >> servers have identical resources (nfs, drbd...).
> >> Last week I upgraded a system, replaced one of the servers with a virtual
> >> machine, and installed the latest version of heartbeat available via yum
> >> (3.0.4).
> >>
> >> Since then I've been having a lot of problems with "Late heartbeat"
> >> warnings and nodes falsely declared dead. Before, a "Dead time" of 10 sec
> >> was fine; now even 30 is not enough.
> >>
> >> Looking into the log files I found the following entry, among other
> >> similar ones:
> >> "Gmain_timeout_dispatch: Dispatch function for send local status was
> >> delayed 30590 ms (> 1010 ms) before being called (GSource: 0x14209a0)"
> >>
> >> I guess it means that for some reason the function call took over 30
> >> seconds??
> >> In my understanding this number is, at least, three orders of magnitude
> >> higher than any acceptable value, even under the worst machine load
> >> scenarios.
> >> Is there a known problem with this version of heartbeat? Or has anybody
> >> experienced this kind of problem when running in a virtual machine
> >> (ESXi 5.0)?
> >
> > I'd suspect a scheduler issue. The VM is probably starved, hence
> > those long delays. You should check the VMware docs or forums.
>
> I've seen similar logs with real hardware when the system is overheating
> and the CPU gets paused by the thermal protection circuits.
>
> I would agree that the host system is probably badly oversubscribed and so
> the VM is getting starved.
>
> David Lang
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
