Hi,

On Tue, Jan 26, 2010 at 02:36:55PM +0100, Peter Luciak wrote:
> Dejan Muhamedagic wrote / napísal(a):
> > Hi,
> >
> > On Fri, Jan 22, 2010 at 08:39:35AM +0100, Peter Luciak wrote:
> >> Hi,
> >>
> >> I'm running into weird problems on a Heartbeat v1 cluster: Heartbeat
> >> restarts itself with the message:
> >>
> >> heartbeat[2419]: 2010/01/22_06:30:35 WARN: Exiting HBREAD process 3272
> >> killed by signal 24 [SIGXCPU - CPU limit exceeded].
> >> heartbeat[2419]: 2010/01/22_06:30:35 ERROR: Exiting HBREAD process 3272
> >> dumped core
> >> heartbeat[2419]: 2010/01/22_06:30:35 ERROR: Core heartbeat process died!
> >> Restarting.
> >
> > The read process CPU usage is limited to 10 percent. According to
> > ha.cf below, heartbeats are every 5 seconds which is quite low.
>
> Quite low? So you suggest to increase the interval? I wonder what is the
> recommended interval for heartbeats?

Sorry, I meant to say "low frequency". There's no recommended
interval, but I guess that 5 seconds is on the "high" side. I'd
probably make it 1-2 seconds.

> >> setserial /dev/ttyS0
> >> /dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
> >>
> >> I turned off the serial line in ha.cf (interestingly I stopped seeing
> >> serial in /proc/interrupts afterwards) to see if that will help.
> >
> > So, did it?
>
> Yup, after stopping the serial comms, heartbeat didn't crash at all in
> the past 4 days. So it was definitely something with the serial line...

A kernel problem? Wasn't that some old distribution?

Thanks,

Dejan

> Thanks
> Peter
> --
> Peter LUCIAK ([email protected])
> IBL Software Engineering, http://www.iblsoft.com/
> Mierová 103, 82105 Bratislava, Slovakia
> Phone: +421-2-32662111, Fax: +421-2-32662110
> Direct: +421-2-32662175
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
