Hi,
On Wed, Jul 21, 2010 at 08:55:54AM -0300, mike wrote:
In my HA logs I have the entries below, which appear several times a
night. I was told in a previous post that these are indicative of
resource contention. The clusters seeing these messages are on a zVM
LPAR, so they share CPU, memory and so on. Previously, when we saw these
errors, failovers occurred, and sometimes things got so bad that
failovers didn't even work. After adding memory and CPU, things settled
down, but I still see occasional nights with these entries in the log
file. How much should I concern myself with them? Last night the
clusters were fine: no failover, no mention of resources failing. Is
there anything I should do to mitigate these notifications, e.g.
increase warntime or deadtime? What do you recommend?
You can't tune heartbeat to avoid these warnings, but you should
probably set the warntime and deadtime to higher values to avoid
unneeded failovers. Don't know if it's possible to tune zVM to
make the LPARs less lazy.
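For illustration, the relevant ha.cf directives look something like
this. The values are assumptions chosen only to show the shape, not
numbers tuned for your cluster; derive yours from the longest delays
you actually see in the logs, and keep them identical on both nodes:

```
# /etc/ha.d/ha.cf (excerpt), illustrative values only
# Seconds between heartbeats.
keepalive 2
# Log a late-heartbeat warning after this many seconds.
warntime 10
# Declare the peer dead after this many seconds of silence.
deadtime 30
# Startup grace period; conventionally at least twice deadtime.
initdead 120
```

A larger gap between warntime and deadtime gives you warnings in the
log well before a failover is triggered, which helps in judging
whether deadtime is still too tight.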
Thanks,
Dejan
Jul 21 05:33:40 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for cpu limit was delayed 440
ms ( 50 ms) before being called (GSource: 0x8011cec0)
Jul 21 05:33:40 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: info:
Gmain_timeout_dispatch: started at 4307813164 should have started at
4307813120
Jul 21 06:47:56 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for send local status took too
long to execute: 70 ms ( 50 ms) (GSource: 0x801217d0)
Jul 21 07:42:34 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for send local status took too
long to execute: 60 ms ( 50 ms) (GSource: 0x801217d0)
Jul 21 07:45:06 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for send local status was
delayed 1530 ms ( 1510 ms) before being called (GSource: 0x801217d0)
Jul 21 07:45:06 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: info:
Gmain_timeout_dispatch: started at 4308601743 should have started at
4308601590
Jul 21 07:45:06 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for check for signals was
delayed 1760 ms ( 1510 ms) before being called (GSource: 0x80121aa0)
Jul 21 07:45:06 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: info:
Gmain_timeout_dispatch: started at 4308601766 should have started at
4308601590
Jul 21 07:45:33 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for send local status took too
long to execute: 190 ms ( 50 ms) (GSource: 0x801217d0)
Jul 21 08:44:32 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for send local status took too
long to execute: 200 ms ( 50 ms) (GSource: 0x801217d0)
Jul 21 09:20:04 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for send local status took too
long to execute: 130 ms ( 50 ms) (GSource: 0x801217d0)
Jul 21 10:47:41 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for cpu limit was delayed 380
ms ( 50 ms) before being called (GSource: 0x8011cec0)
Jul 21 10:47:41 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: info:
Gmain_timeout_dispatch: started at 4309697215 should have started at
4309697177
Jul 21 11:00:03 APAUAT1A.intranet.mydomain.com heartbeat: [2763]: WARN:
Gmain_timeout_dispatch: Dispatch function for send local status took too
long to execute: 60 ms ( 50 ms) (GSource: 0x801217d0)
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems