Hi,

On Tue, Sep 28, 2010 at 11:37:02AM +0200, Andrew Beekhof wrote:
> On Thu, Sep 23, 2010 at 8:49 PM, Phil Armstrong <p...@sgi.com> wrote:
> > I posted earlier asking for help because I had a primitive whose monitor
> > operation was not getting canceled at the time that a manual relocation was
> > performed. I updated pacemaker (as was suggested) to pacemaker-1.1.2-0.6.1
> > which is the latest I could find for an IA64 platform without having to
> > build from source. If anyone knows of a later IA64 binary version I would
> > appreciate that information.
>
> 1.1.3 came out the other day.
> which distro are you using?
>
> >
> > The monitor problem persisted after the upgrade, though the error messages I
> > was seeing earlier were no longer present. They were apparently unrelated.
> > Painful trial and error led me to the conclusion that it was the
> > primitive's start-op timeout and monitor-op start-delay values. When I had
> > these values set at 480s, the monitor-op did not get canceled for a manual
> > relocation and so would get rescheduled after the relocation only to find
> > the resource not operational (it had been relocated) and thus set the
> > fail-count to non-zero, fencing the resource. If I set the values to 240s,
> > everything went smoothly and the monitor-op was canceled.
> >
> > As a test, I changed a different primitive's values to 480s and that
> > primitive then displayed the failing behavior.
> >
> > If anyone knows why this might be the case (perhaps there are rules I am
> > unaware of that prohibit larger values) I would appreciate the information.
> > If not, I guess I should file a bug.
> >
> > Thanks for any help in advance.
>
> Hmmmm, which version of cluster-glue do you have?
> This sounds like it might be related to
>
> dejan () High: LRM: lrmd: don't allow cancelled operations to get back
> to the repeating op list (lf#2417) CS: fc141b7e1e19 On: 2010-06-10
>
> which first appeared in cluster-glue 1.0.6 IIRC

Yes, it's in 1.0.6. That looks like the most plausible explanation.

Thanks,

Dejan

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker