Re: [Linux-HA] ocf:heartbeat:apache resource agent and timeouts

Lars Ellenberg Thu, 12 Apr 2012 03:23:05 -0700

On Thu, Apr 12, 2012 at 12:06:54PM +0200, Lars Ellenberg wrote:
> On Sun, Apr 08, 2012 at 03:16:17PM +0200, David Gubler wrote:
> > Hi Lars,
> > 
> > On 05.04.2012 18:53, Lars Ellenberg wrote:
> > > Uhm, "invalid test case".
> > >
> > > rather try:
> > > iptables -I INPUT -p tcp --dport 80 -i lo -j REJECT
> > > or even
> > > iptables -I INPUT -p tcp --dport 80 -i lo -j REJECT --reject-with tcp-reset
> > Yes, then it works, but that's not surprising, because in this case the 
> > operations return immediately and never time out. But why should a 
> > non-responsive apache be an invalid test case? We've reached apache's 
> > connection limit more than once, and from the client's point of view 
> > this produces a very similar effect to '-j DROP'.
> > 
> > 
> > > Pacemaker behaviour is just the same,
> > > whether a monitor action "timed out", or "failed".
> > 
> > I've come to the conclusion that this just isn't true, please see my 
> > other mail, I've listed all the steps I did in detail.
> > 
> > 
> > >
> > > After the monitor action timed out or failed,
> > > the recovery action by pacemaker would be to stop the service,
> > > and restart it (there or elsewhere).
> > >
> > > Did that not happen?
> > >
> > > The start operation of the apache RA internally does monitor as well,
> > > so it likely times out as well.
> > >
> > > I'd expect the cluster to move the unresponsive apache to some other
> > > node, after monitor and restart timed out.  Which I think is the right
> > > thing to do.
> > 
> > I'm using unmanaged resources, because for our application there's no 
> > point in having Pacemaker shut down apache (apache can be used on all 
> > hosts in parallel and without restrictions). So no stop/start for us.
> 
> Right. So the resources are not managed.
> Did you mention that before?


Hm. So you did. Guess my auto-correction while reading dropped that line...

primitive apache ocf:heartbeat:apache \
         params testconffile="/etc/ha.d/doodletest.pm" \
         testname="doodle" \
         op monitor interval="30" timeout="20" \
         op monitor interval="31" timeout="20" role=Stopped \
         meta is-managed="false"


I think that "monitor role=Stopped" thing works for primitives.
It may work for clones, I'd have to double check that.
IIRC, it does not work for ms (master/slave) resources,
at least not the last time I checked.

> I won't argue with that, if you think that is how it should be, so be it.
> 
> Pacemaker does not monitor resources that are supposed to
> be stopped for "reviving on their own".
> Not by default, at least.
> 
> I suggest you add a "monitor" action for "role=Stopped"
> (with a different interval!)
> 
> So the better subject would have been
> How to configure Pacemaker to monitor (unmanaged) stopped resources
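
To see the two cases from this thread outside of the cluster (REJECT fails
at once, DROP only ends when the time limit expires), here is a minimal
shell sketch. The function name classify_probe is made up for illustration,
and it assumes the timeout(1) utility from GNU coreutils, which exits with
status 124 when the limit is hit. It is not what the apache RA does
internally.

```shell
#!/bin/sh
# classify_probe LIMIT CMD [ARGS...]
# Hypothetical helper: run CMD under a time limit of LIMIT seconds and
# report whether it succeeded, failed on its own, or hit the limit.
# Assumes GNU coreutils timeout(1), which exits 124 on timeout.
classify_probe() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    case $rc in
        0)   echo "ok" ;;          # command returned success in time
        124) echo "timed out" ;;   # killed by timeout(1)
        *)   echo "failed" ;;      # command returned an error by itself
    esac
}
```

With one of the iptables rules in place, something like
  classify_probe 20 wget -q -O /dev/null http://localhost/server-status
should print "failed" for REJECT and "timed out" for DROP. The URL is an
assumption; use whatever your monitor actually checks.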

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
