Hi,

On Wed, Apr 04, 2012 at 07:42:12PM +0200, David Gubler wrote:
> Hi Dejan,
> 
> On 04.04.2012 17:56, Dejan Muhamedagic wrote:
> > The timeout is a timeout, wherever it happens.
> 
> Unfortunately not! If the monitor operation times out and Pacemaker 
> moves on, the wget process (and thus the whole monitor process) will 
> keep running. In fact, it may still be running many minutes after the 
> timeout happened. And since the monitor (at least in case of the apache 
> resource agent) can't be run twice in parallel, this effectively 
> prevents further monitor operations until wget has timed out. And that's 
> exactly where we get a problem.

Hmm, the process running the monitor operation should be removed
(killed) by lrmd on timeout. If that doesn't happen, then you
just hit a jackpot bug!
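
To illustrate what "removed on timeout" should look like, here is a
minimal Python sketch. It is not lrmd's actual code, and the name
run_with_timeout is made up for this example. It assumes a POSIX
system and that the monitor's children (e.g. wget) stay in the
process group they were started in, so killing the group takes them
down too.

```python
import os
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and kill its whole process group if it exceeds timeout_s.

    Returns a tuple (exit_code, timed_out). Hypothetical helper for
    illustration only; POSIX-specific.
    """
    # start_new_session=True makes the child the leader of a new session
    # and process group, so its pid is also the process group id.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        # Kill the child and anything it spawned in the same group,
        # then reap the child so no zombie is left behind.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()
        return proc.returncode, True
```

Calling run_with_timeout(["sleep", "30"], 1) should come back after
about a second with timed_out set, and no sleep process left running.
If leftover wget processes survive the timeout on your system, that
is the behaviour which would indicate the bug.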

> > So, you want the resource agent to notice while running monitor
> > that it can now talk to the server?
> Yes, I want automatic recovery. The resource agent should notice when 
> apache is back and working again. And that works fine with a patched 
> apache resource agent.

Hmm, I thought we were past this... and I still don't see the
patch :)

Cheers,

Dejan

> >> On a side note:
> >> The apache resource agent allows one to supply a config file, where one can
> >> override the parameters for curl/wget. But the implementation here is
> >> bogus, because even if you supply this file, it always does a default
> >> test with default parameters first, so this is useless in this case...
> >> (I consider this behavior to be a bug).
> > If you use a config test file, you'd need to define a monitor
> > with depth 10. The depth 0 monitor (default) is always testing
> > the statusurl.
> Yes, I figured that, but it's beside the point. If I use depth 10, it 
> will first do the simple (depth 0) test anyway (!), and after that the 
> advanced (depth 10) test. And since the simple test doesn't have a 
> useful timeout for wget, it will still stall for a long time if apache 
> doesn't respond, and it is irrelevant what the advanced test does.
> 
> @simple tests: Even though we run a complex web application behind 
> apache (apache acts as a load balancer using mod_jk), I don't want a 
> more complex test than fetching /server-status on localhost. This simple 
> test already shows that apache is working and has threads available for 
> clients to connect. Failover for our application servers is done by 
> mod_jk, I don't need Heartbeat/Pacemaker for that. Think of it as 
> independent failover at each layer: Virtual IPs with Heartbeat/Pacemaker 
> for failover between Apaches, mod_jk for failover between Tomcats, 
> mmm_monitor for failover between MySQL servers.
> 
> 
> Best regards,
> 
> David
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems