On Thu, Dec 20, 2012 at 11:43:20AM +0000, James Harper wrote:
> > Hi,
> >
> > On Tue, Dec 18, 2012 at 10:58:18AM +0000, James Harper wrote:
> > > For the following failure:
> > >
> > > Failed actions:
> > > p_lvm_iscsi:0_monitor_10000 (node=bitvs6, call=57, rc=-2,
> > > status=Timed Out): unknown exec error
> > >
> > > Is this the ra itself returning a "Timed Out" error, or is it the
> > > cluster software determining that the ra is taking too long and so
> > > killing it and declaring it failed? stonith kicks in
> >
> > The latter.
> >
> > > shortly after this happens so tracking it down is a bit of a pain.
> >
> > Is it expected? Normally, a monitor failing should cause a resource
> > restart. If a resource fails to stop, it may be a resource agent bug.
> >
> > > It happens any time the system gets loaded (eg when making a config
> > > change)
> >
> > What kind of change?
> >
> > > and I can't seem to put my finger on what is causing it.
> >
> > Which resource is that? Which version of resource agents do you run?
> >
>
> Any cib change throws the system load up for 10-20 seconds, and then things
> start timing out, despite having set the timeouts well in excess of the time
> it takes for pacemaker to mark the resource as timed out.

Hmm, unless your CIB (the configuration) is really huge, that shouldn't
be happening. I'd open a bugzilla with debian. Check beforehand which
processes go wild. Increase timeouts to prevent resources failing and
stonith.

> All packages are from debian wheezy.

I don't know which versions are currently in debian wheezy (looks like
1.1.7).

Thanks,

Dejan

> James
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org