On 11/5/07, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> On 11/3/07, Christian Rishøj <[EMAIL PROTECTED]> wrote:
> > On 11/3/07, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> > >
> > > On Nov 3, 2007, at 5:01 AM, Christian Rishøj wrote:
> > >
> > > >
> > > > On 2 Nov 2007, at 14:36, Andrew Beekhof wrote:
> > > >
> > > >>
> > > >> On Nov 1, 2007, at 1:32 AM, Christian Rishøj wrote:
> > > >>
> > > >>> Hi,
> > > >>>
> > > >>> Running 2.1.2 on Ubuntu, 2.6.23 x86_64.
> > > >>>
> > > >>> I am seeing a lot of "failed to get the value of field lrm_opstatus
> > > >>> from a ha_msg" in the syslog.
> > > >>>
> > > >>> These seem to give rise to "do_lrm_invoke: Forcing a local LRM
> > > >>> refresh", which in turn seems to restart the services, causing
> > > >>> interruptions.
> > > >>>
> > > >>> What may be the cause of this? Syslog extract, CIB and
> > > >>> configuration attached.
> > > >>
> > > >> Basically, it's this option:
> > > >>
> > > >>   <nvpair id="remove-after-stop" name="remove-after-stop" value="true"/>
> > > >>
> > > >> I'd not enable this option.  Is there a particular reason you
> > > >> enabled it?
> > > >
> > > > Yes. I was seeing resources being restarted whenever I made changes
> > > > to seemingly unrelated resources in the CIB. After a while I
> > > > tracked the problem down to some leftover state in the CIB/LRM from
> > > > previously deleted resources. Hoping to prevent state from being left
> > > > behind when deleting resources in the future, I added
> > > >
> > > >   <nvpair id="remove-after-stop" name="remove-after-stop" value="true"/>
> > > >
> > > > as well as
> > > >
> > > >   <nvpair id="remove-after-stop" name="stop-orphan-resources" value="true"/>
> > > >
> > > > to the cluster options.
> > > >
> > > > I am removing the former now, but would appreciate a hint on "best
> > > > practices" to avoid the problem I was seeing at the time.
> > >
> > > That depends on what the "seemingly unrelated" changes were
> >
> > As I remember the situation at that time, I had an IPaddr2 resource by
> > itself, with no dependencies specified. I believe I changed the
> > cidr_netmask.
>
> That would be enough to restart the IP and anything that depended on it...

Right. However, at the time, independent resources were restarted as
well. It turned out to be some leftover state from previously defined
resources (same parameters, different ids). Heartbeat would report
something like "nothing known about orphan resource XX running on YY"
and "making sure orphan resource XX is stopped". Pruning this leftover
state solved the problem of unexpected restarts.
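
For anyone hitting the same thing, here is a rough sketch of how such
leftover state could be spotted. It assumes the usual CIB layout, with
configured resources as <primitive> elements and per-node LRM history as
<lrm_resource> elements under <status>. The sample CIB, the resource ids
and the find_orphans helper are made up for illustration, and groups and
clones (whose LRM ids may carry suffixes) are not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CIB used only for illustration; a real
# CIB dump contains many more elements and attributes.
SAMPLE_CIB = """
<cib>
  <configuration>
    <resources>
      <primitive id="ip_new" class="ocf" provider="heartbeat" type="IPaddr2"/>
    </resources>
  </configuration>
  <status>
    <node_state id="node1" uname="YY">
      <lrm>
        <lrm_resources>
          <lrm_resource id="ip_new" type="IPaddr2"/>
          <lrm_resource id="ip_old" type="IPaddr2"/>
        </lrm_resources>
      </lrm>
    </node_state>
  </status>
</cib>
"""


def find_orphans(cib_xml):
    """Return (resource id, node name) pairs that appear in the LRM
    status section but have no matching configured primitive."""
    root = ET.fromstring(cib_xml)
    # Ids of every primitive that is still part of the configuration.
    configured = {el.get("id") for el in root.iter("primitive")}
    orphans = []
    for node in root.iter("node_state"):
        for res in node.iter("lrm_resource"):
            if res.get("id") not in configured:
                orphans.append((res.get("id"), node.get("uname")))
    return orphans


print(find_orphans(SAMPLE_CIB))
```

In this made-up sample, ip_old is the kind of entry that was left behind
by a deleted resource: it still has LRM state on node YY but no
definition in the configuration.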

Regards
Christian
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
