On Thu, Jan 13, 2011 at 10:17:40AM -0600, Igor Chudov wrote:
> Again, after about 3-4 days of running, heartbeat master process dies with
> SIGXCPU.
>
> I was fortunate to run strace -p on it, so I captured strace. It looks like
> boring, garden variety regular work, and then heartbeat dies with SIGXCPU.
> The output is a bit lengthy.
>
> Is there some way to turn OFF the timeout on CPU?

In the heartbeat sources, heartbeat/heartbeat.c, look out for
cl_cpu_limit_setpercent, which itself is defined in the glue sources,
glue/lib/clplumbing/cpulimits.c.

There the head comment block explains the intention of it:

 * This allows us to better catch runaway realtime processes that
 * might otherwise hang the whole system (if they're POSIX realtime
 * processes).
 *
 * We do this by getting a "lease" on CPU time, and then ....

You could of course simply remove the invocations of it.

It would be interesting to know what heartbeat spends its CPU time on,
though, so maybe you can try to profile it?
It should usually not consume that much CPU.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
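
For reference, the kernel mechanism behind SIGXCPU can be shown in
isolation. The following is a minimal sketch, not code taken from
heartbeat or glue: it assumes only the standard POSIX calls getrlimit,
setrlimit, fork and waitpid. The function names signal_after_cpu_limit
and raise_cpu_limit_to_hard_max are hypothetical, made up for this
illustration. A process whose soft RLIMIT_CPU is exceeded gets SIGXCPU,
and the default action for that signal is to terminate the process.

```c
#include <assert.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that gives itself a soft CPU limit of
 * soft_seconds and then burns CPU. Returns the number of the signal that
 * terminated the child, 0 if the child exited normally, -1 on error. */
int signal_after_cpu_limit(rlim_t soft_seconds)
{
    assert(soft_seconds > 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit lim;

        /* Make sure SIGXCPU has its default (terminating) action. */
        signal(SIGXCPU, SIG_DFL);

        if (getrlimit(RLIMIT_CPU, &lim) != 0)
            _exit(2);
        lim.rlim_cur = soft_seconds;    /* hard limit stays unchanged */
        if (setrlimit(RLIMIT_CPU, &lim) != 0)
            _exit(2);

        /* Consume CPU time until the kernel delivers SIGXCPU. */
        volatile unsigned long spin = 0;
        for (;;)
            spin++;
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Hypothetical helper: raise the soft CPU limit of the calling process to
 * its hard limit, which an unprivileged process is allowed to do.
 * Returns 0 on success, -1 on error. */
int raise_cpu_limit_to_hard_max(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CPU, &lim) != 0)
        return -1;
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CPU, &lim);
}
```

Calling signal_after_cpu_limit(1) should report SIGXCPU once the child
has used about one second of CPU time. Note that a process which keeps
renewing its own limit, as the "lease" wording in the comment block
suggests, would presumably overwrite a limit raised this way, so this
shows the mechanism rather than a fix.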
