I have done some further testing. To create a high load/IO environment I
ran make -j8 buildworld in an infinite loop (for those familiar with
FreeBSD), and I wrote a very simple script to hammer the sites hosted by
the webserver on my test platform.  Running either one on its own generates
no warnings, but running them simultaneously produces warning messages
similar to what I am getting in production, though with smaller delay
values.  What confuses me is that in my test environment I am generating
much higher load and disk IO via the compilation, yet the delays are
smaller.  It's likely that the network activity is lower, since my script
only hits index pages and I do not have a decent method for testing
realistic traffic.
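
For anyone wanting to reproduce the web side of this test, here is a
minimal sketch of a request loop of the kind described above. It is not
the script actually used; the function names, the worker count and the
request count are all assumptions made for illustration, and it only
fetches whatever URLs it is given (index pages, in this case).

```python
import concurrent.futures
import urllib.request


def fetch(url, timeout=10):
    """Fetch one URL and return its HTTP status, or None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # drain the body so the server does the full work
            return resp.status
    except Exception:
        return None


def hammer(urls, requests_per_url=100, workers=8, fetch=fetch):
    """Request every URL requests_per_url times using a thread pool.

    Returns (ok, failed), where ok counts responses with status 200.
    The fetch parameter is injectable so the loop can be exercised
    without a real webserver.
    """
    jobs = [u for u in urls for _ in range(requests_per_url)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetch, jobs))
    ok = sum(1 for status in results if status == 200)
    return ok, len(results) - ok
```

Calling hammer() inside a while loop, alongside the buildworld loop, would
approximate the combined load. It would still only exercise index pages,
so it says little about production-like network activity.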

Any thoughts?

Jul 25 14:20:31 spare1 heartbeat: [1297]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status took too long to execute: 1304 ms (>
1010 ms) (GSource: 0x5e3a18)
Jul 25 14:20:33 spare1 heartbeat: [1297]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status took too long to execute: 1257 ms (>
1010 ms) (GSource: 0x5e3a18)
Jul 25 14:20:51 spare1 heartbeat: [1297]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status took too long to execute: 3031 ms (>
1010 ms) (GSource: 0x5e3a18)
Jul 25 14:20:51 spare1 heartbeat: [1297]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status was delayed 1031 ms (> 1010 ms)
before being called (GSource: 0x5e3a18)
Jul 25 14:20:51 spare1 heartbeat: [1297]: WARN: Gmain_timeout_dispatch:
Dispatch function for check for signals was delayed 1710 ms (> 1010 ms)
before being called (GSource: 0x5e4218)
Jul 25 14:21:22 spare1 heartbeat: [1297]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status took too long to execute: 1195 ms (>
1010 ms) (GSource: 0x5e3a18)
Jul 25 14:21:25 spare1 heartbeat: [1297]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status took too long to execute: 1960 ms (>
1010 ms) (GSource: 0x5e3a18)
Jul 25 14:57:13 spare1 heartbeat: [1297]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status took too long to execute: 1078 ms (>
1010 ms) (GSource: 0x5e3a18)
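
To compare delay values between the test and production logs, the
warnings can be tallied per dispatch function. The following is a rough
sketch, assuming the only two warning forms of interest are the "took too
long to execute" and "was delayed" messages shown in these excerpts; the
function name summarize and the labels "slow" and "delayed" are made up
for this example.

```python
import re
from collections import defaultdict

# Matches both warning forms seen in the log excerpts:
#   "... for <name> took too long to execute: <N> ms (> <limit> ms)"
#   "... for <name> was delayed <N> ms (> <limit> ms)"
WARN_RE = re.compile(
    r"Dispatch function for (?P<name>.+?) "
    r"(?P<kind>took too long to execute:|was delayed) "
    r"(?P<ms>\d+) ms \(> (?P<limit>\d+) ms\)"
)


def summarize(log_text):
    """Return {(name, kind): (count, max_ms)} for dispatch warnings."""
    # Mail clients wrap long syslog lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    stats = defaultdict(lambda: [0, 0])
    for m in WARN_RE.finditer(flat):
        kind = "slow" if m.group("kind").startswith("took") else "delayed"
        entry = stats[(m.group("name"), kind)]
        entry[0] += 1
        entry[1] = max(entry[1], int(m.group("ms")))
    return {key: tuple(val) for key, val in stats.items()}
```

Feeding it the contents of a saved log file, for example
summarize(open(path).read()) with path pointing at the heartbeat log,
gives one (count, max_ms) pair per dispatch function and warning type.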


On 7/25/07, Matt Wilder <[EMAIL PROTECTED]> wrote:

Sorry for the delayed reply, I had some other things I had to attend to.

I installed STABLE-2.1.0 from the download site and am having the same
problem.  I have not been able to figure out exactly what is causing it, but
I suspect CPU or disk IO, since I have not been able to replicate it in
production.  The delays also seem to get longer the longer the software
runs.  Below is a recent paste of log data.  I will look into inducing high
CPU and disk load on my test platform.

Could anyone point me in the right direction on how to proceed?  If this
could be a FreeBSD bug, how would I confirm that, and what should I ask
the FreeBSD development team?

Thanks.

Jul 25 11:41:51 sparky1 heartbeat: [12771]: WARN: Late heartbeat: Node
sparky1.domainit.com: interval 40187 ms
Jul 25 11:41:51 sparky1 heartbeat: [12771]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status took too long to execute: 35187 ms
(> 2510 ms) (GSource: 0x5e5e18)
Jul 25 11:41:51 sparky1 heartbeat: [12771]: WARN: Gmain_timeout_dispatch:
Dispatch function for send local status was delayed 30187 ms (> 2510 ms)
before being called (GSource: 0x5e5e18)
Jul 25 11:41:51 sparky1 heartbeat: [12771]: WARN: Gmain_timeout_dispatch:
Dispatch function for check for signals was delayed 35187 ms (> 2510 ms)
before being called (GSource: 0x5e7618)
Jul 25 11:41:51 sparky1 heartbeat: [12771]: WARN: Gmain_timeout_dispatch:
Dispatch function for update msgfree count was delayed 34703 ms (> 20000 ms)
before being called (GSource: 0x5e7818)
Jul 25 11:41:51 sparky1 heartbeat: [12771]: WARN: Gmain_timeout_dispatch:
Dispatch function for client audit was delayed 33875 ms (> 5000 ms) before
being called (GSource: 0x5e7418)


On 7/6/07, Lars Marowsky-Bree <[EMAIL PROTECTED]> wrote:
>
> On 2007-07-06T14:02:32, Matt Wilder <[EMAIL PROTECTED]> wrote:
>
> > Thanks Andrew.  I will install 2.1.0 and see if the problem persists.
>
> Best of luck!
>
> Though I'm not sure whether this will really help your particular
> problem. Late heartbeats tend to be issues with running at realtime
> priority, and those tend to be pretty OS specific, and that code hasn't
> changed much.
>
> Can you make the late heartbeat problems worse if you induce a high CPU
> and IO load, or something?
>
>
> Regards,
>     Lars
>
> --
> Teamlead Kernel, SuSE Labs, Research and Development
> SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar
> Wilde
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>

