On Thu, Jun 19, 2008 at 12:33:39PM -0400, Greg Haase wrote:
> By daily report, I'm referring to the following:
>
> heartbeat[11160]: 2008/06/18_02:01:42 info: Daily informational memory statistics
> heartbeat[11160]: 2008/06/18_02:01:42 info: MSG stats: 6/2005357 ms age 0 [pid11160/MST_CONTROL]
> heartbeat[11160]: 2008/06/18_02:01:42 info: cl_malloc stats: 686/61448637 186952/93866 [pid11160/MST_CONTROL]
> heartbeat[11160]: 2008/06/18_02:01:42 info: RealMalloc stats: 412208 total malloc bytes. pid [11160/MST_CONTROL]
> heartbeat[11160]: 2008/06/18_02:01:42 info: Current arena value: 0
> heartbeat[11160]: 2008/06/18_02:01:42 info: MSG stats: 0/0 ms age 5006357820 [pid11162/HBFIFO]
> heartbeat[11160]: 2008/06/18_02:01:42 info: cl_malloc stats: 360/415 32416/14005 [pid11162/HBFIFO]
> heartbeat[11160]: 2008/06/18_02:01:42 info: RealMalloc stats: 32416 total malloc bytes. pid [11162/HBFIFO]
> heartbeat[11160]: 2008/06/18_02:01:42 info: Current arena value: 0
> heartbeat[11160]: 2008/06/18_02:01:42 info: MSG stats: 0/0 ms age 5006357820 [pid11163/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:42 info: cl_malloc stats: 380/441185 44048/20302 [pid11163/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:42 info: RealMalloc stats: 52856 total malloc bytes. pid [11163/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:44 info: Current arena value: 0
> heartbeat[11160]: 2008/06/18_02:01:45 info: MSG stats: 0/0 ms age 5006360100 [pid11164/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: cl_malloc stats: 381/830154 44140/20366 [pid11164/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: RealMalloc stats: 52620 total malloc bytes. pid [11164/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: Current arena value: 0
> heartbeat[11160]: 2008/06/18_02:01:45 info: MSG stats: 0/0 ms age 5006360800 [pid11165/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:45 info: cl_malloc stats: 392/441209 46176/21694 [pid11165/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:45 info: RealMalloc stats: 63388 total malloc bytes. pid [11165/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:45 info: Current arena value: 0
> heartbeat[11160]: 2008/06/18_02:01:45 info: MSG stats: 0/0 ms age 5006360800 [pid11166/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: cl_malloc stats: 393/1660076 46268/21758 [pid11166/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: RealMalloc stats: 62968 total malloc bytes. pid [11166/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: Current arena value: 0
> heartbeat[11160]: 2008/06/18_02:01:45 info: MSG stats: 0/760538 ms age 2980 [pid11167/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:45 info: cl_malloc stats: 404/20292775 48304/23086 [pid11167/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:45 info: RealMalloc stats: 79032 total malloc bytes. pid [11167/HBWRITE]
> heartbeat[11160]: 2008/06/18_02:01:45 info: Current arena value: 0
> heartbeat[11160]: 2008/06/18_02:01:45 info: MSG stats: 0/345579 ms age 2980 [pid11168/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: cl_malloc stats: 405/7257647 48396/23150 [pid11168/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: RealMalloc stats: 50748 total malloc bytes. pid [11168/HBREAD]
> heartbeat[11160]: 2008/06/18_02:01:45 info: Current arena value: 0
> heartbeat[11160]: 2008/06/18_02:01:45 info: These are nothing to worry about.
> heartbeat[11160]: 2008/06/18_02:01:45 WARN: Gmain_timeout_dispatch: Dispatch function for memory stats took too long to execute: 2990 ms (> 100 ms) (GSource: 0x1176a208)
> heartbeat[11160]: 2008/06/18_02:01:45 WARN: Gmain_timeout_dispatch: Dispatch function for send local status was delayed 2700 ms (> 510 ms) before being called (GSource: 0x11769de8)
> heartbeat[11160]: 2008/06/18_02:01:45 info: Gmain_timeout_dispatch: started at 500636081 should have started at 500635811
> heartbeat[11160]: 2008/06/18_02:01:45 WARN: Gmain_timeout_dispatch: Dispatch function for check for signals was delayed 2950 ms (> 510 ms) before being called (GSource: 0x1176a508)
> heartbeat[11160]: 2008/06/18_02:01:45 info: Gmain_timeout_dispatch: started at 500636081 should have started at 500635786
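
To check whether these dispatch warnings line up with other events (failovers, load spikes), it can help to pull them out of a longer log. Below is a minimal Python sketch, assuming unwrapped log lines in the two WARN shapes quoted above. The regex and the name find_dispatch_warnings are illustrative only and are not part of heartbeat.

```python
import re

# Assumption: each log entry is on one line, in the two WARN shapes quoted above
# ("took too long to execute: N ms (> M ms)" and "was delayed N ms (> M ms)").
DISPATCH_RE = re.compile(
    r"(?P<stamp>\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}) WARN: Gmain_timeout_dispatch: "
    r"Dispatch function for (?P<name>.+?) "
    r"(?:took too long to execute: |was delayed )"
    r"(?P<actual>\d+) ms \(> (?P<limit>\d+) ms\)"
)


def find_dispatch_warnings(lines):
    """Return (timestamp, dispatch name, observed ms, limit ms) for each matching WARN line."""
    hits = []
    for line in lines:
        m = DISPATCH_RE.search(line)
        if m:
            hits.append((m["stamp"], m["name"], int(m["actual"]), int(m["limit"])))
    return hits


# Sample lines copied from the quoted log, with the mail wrapping undone.
sample = [
    "heartbeat[11160]: 2008/06/18_02:01:45 WARN: Gmain_timeout_dispatch: "
    "Dispatch function for memory stats took too long to execute: 2990 ms (> 100 ms) (GSource: 0x1176a208)",
    "heartbeat[11160]: 2008/06/18_02:01:45 WARN: Gmain_timeout_dispatch: "
    "Dispatch function for send local status was delayed 2700 ms (> 510 ms) before being called (GSource: 0x11769de8)",
    "heartbeat[11160]: 2008/06/18_02:01:45 info: Current arena value: 0",
]

for stamp, name, actual, limit in find_dispatch_warnings(sample):
    print(f"{stamp} {name}: {actual} ms (limit {limit} ms)")
```

Sorting or grouping the returned tuples by timestamp makes it easy to see whether the delays cluster around the daily memory statistics run.
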

I misunderstood; I thought you were referring to the actual timeouts.
Well, this looks like further evidence that the host (I suppose this
is the one running mysql) is overwhelmed.

Thanks,

Dejan

> On Thu, 2008-06-19 at 18:13 +0200, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Thu, Jun 19, 2008 at 11:24:09AM -0400, Greg Haase wrote:
> > > Attached, please find an hb_report created for this particular setup for
> > > the timeframe when the issue occurred.
> > >
> > > I realize that we're not supposed to sanitize these because it could
> > > obfuscate important information, but I've had to go through and sed
> > > replace a bunch of stuff for security reasons. I hope I didn't destroy
> > > anything useful to troubleshooting.
> >
> > That's no problem. I'll try to include some more sanitization
> > (currently only passwords are obfuscated), though the problem may
> > be the pengine files which have some kind of protection hash, and
> > changes might make them useless.
> >
> > > Also, I noticed that I almost _always_ get one of these G_SIG_dispatch
> > > delays in the logs at the time when the daily report information is
> > > output.
> >
> > What is "the daily report information"? If it's a CPU/disk
> > intensive operation and if those failovers happened at the same
> > time, you may have found an explanation.
> >
> > Thanks,
> >
> > Dejan
> _______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
