On Thu, Apr 12, 2012 at 12:06:54PM +0200, Lars Ellenberg wrote:
> On Sun, Apr 08, 2012 at 03:16:17PM +0200, David Gubler wrote:
> > Hi Lars,
> >
> > On 05.04.2012 18:53, Lars Ellenberg wrote:
> > > Uhm, "invalid test case".
> > >
> > > rather try:
> > > iptables -I INPUT -p tcp --dport 80 -i lo -j REJECT
> > > or even
> > > iptables -I INPUT -p tcp --dport 80 -i lo -j REJECT --reject-with tcp-reset
> > Yes, then it works, but that's not surprising, because in this case the
> > operations return immediately and never time out. But why should a
> > non-responsive apache be an invalid test case? We've reached apache's
> > connection limit more than once, and from the client's point of view
> > this produces a very similar effect to '-j DROP'.
> >
> >
> > > Pacemaker behaviour is just the same,
> > > whether a monitor action "timed out", or "failed".
> >
> > I've come to the conclusion that this just isn't true, please see my
> > other mail, I've listed all the steps I did in detail.
> >
> >
> > >
> > > After the monitor action timed out or failed,
> > > the recovery action by pacemaker would be to stop the service,
> > > and restart it (there or elsewhere).
> > >
> > > Did that not happen?
> > >
> > > The start operation of the apache RA internally does monitor as well,
> > > so it likely times out as well.
> > >
> > > I'd expect the cluster to move the unresponsive apache to some other
> > > node, after monitor and restart timed out. Which I think is the right
> > > thing to do.
> >
> > I'm using unmanaged resources, because for our application there's no
> > point in having Pacemaker shut down apache (apache can be used on all
> > hosts in parallel and without restrictions). So no stop/start for us.
>
> Right. So the resources are not managed.
> Did you mention that before?
Hm. So you did. Guess my auto-correction while reading dropped that line...
primitive apache ocf:heartbeat:apache \
        params testconffile="/etc/ha.d/doodletest.pm" testname="doodle" \
        op monitor interval="30" timeout="20" \
        op monitor interval="31" timeout="20" role=Stopped \
        meta is-managed="false"
I think that "monitor role=Stopped" thing works for primitives.
It may work for clones, I'd have to double check that.
IIRC, it does not work for master/slave (ms) resources,
at least not the last time I checked.
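
For the clone case, an untested sketch, assuming the primitive above is
simply wrapped in a clone (the id cl_apache is made up for illustration).
Whether the role=Stopped monitor is honoured there is exactly the part
that would need double checking:

```
primitive apache ocf:heartbeat:apache \
        params testconffile="/etc/ha.d/doodletest.pm" testname="doodle" \
        op monitor interval="30" timeout="20" \
        op monitor interval="31" timeout="20" role=Stopped \
        meta is-managed="false"
# cl_apache is a hypothetical id; the clone runs one apache instance per node
clone cl_apache apache
```

The two monitor operations keep their different intervals; only the
clone line is new compared to the primitive-only configuration.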
> I won't argue with that, if you think that is how it should be, so be it.
>
> Pacemaker does not monitor resources that are supposed to
> be stopped for "reviving on their own".
> Not by default, at least.
>
> I suggest you add a "monitor" action for "role=Stopped"
> (with a different interval!)
>
> So the better subject would have been
> How to configure Pacemaker to monitor (unmanaged) stopped resources
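
As a rough check for the "different interval" part, a small shell sketch.
The function name is made up here, it is not part of Pacemaker or crmsh,
and it assumes the quoted op monitor interval="..." form used in the
configuration in this mail. It only compares the interval strings
textually, so "30" and "30s" would count as different:

```shell
# Hypothetical helper, not part of Pacemaker or crmsh.
# Reads crm configuration text on stdin and prints every
# 'op monitor interval="..."' string that occurs more than once.
dup_monitor_intervals() {
    grep -o 'op monitor interval="[^"]*"' | sort | uniq -d
}
```

Feeding it the output of "crm configure show apache" should print
nothing as long as the two monitor operations use distinct intervals.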
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems