Re: [Pound Mailing List] when backend hangs

Xiwen Cheng Thu, 07 May 2009 02:22:07 -0700

Thanks for the links and insights. From what I've read, I'm not quite
satisfied with the proposed solutions. That's just me, of course; one
cannot satisfy everybody's desires.


For now I'm focusing on finding out why the data source becomes
unavailable (bad autofs with wildcard mappings!) on an irregular basis.
It is really painful to troubleshoot a problem one cannot reproduce.

After that I think I'll write a patch that implements the "efficient
monitor". For now the general idea is to omit unnecessary URI checks
when determining the liveness of a backend.

Does anyone still have this patch?
http://www.apsis.ch/pound/pound_list/archive/2005/2005-11/1131177343000
The source is gone :(

I decided to read the pound manpage more thoroughly and came across this
snippet in the HIGH-AVAILABILITY section:
> The clients that happen upon a dead backend server will just receive a
> 503 Service Unavailable message.
Does this mean that if the HAport check fails, the backend is marked as
dead, and all clients (say, using session type IP) on that backend will
receive a 503 error? Does this only apply to ongoing (STATE_ESTABLISHED)
connections of these clients, with new connections made by these clients
being forwarded to other backends in the pool?

Best regards,
Xiwen

On Wed, May 06, 2009 at 09:54:16AM -0400, Joe Gooch wrote:
> The man page documents the behavior under "HIGH-AVAILABILITY".... Basically 
> it polls that port every "Alive" seconds.
> 
> But the idea is that the port you check with HAPort is not the same port as 
> the HTTP daemon.

> True, but if your data source isn't available, your backend isn't going to be 
> able to serve the data, right?
> 
> In my case, one request running a long report may mean that one request takes 
> a while to complete, but my other request threads are behaving as normal.  
> Unless all of my request threads are full, which Pound isn't going to know, 
> because the vast majority of the requests are succeeding.  Plus, since I'm 
> using session affinity, I would want Pound to be *very* sure that none of the 
> requests are going to succeed to that backend before breaking sessions.
> 
> It's entirely possible that dynamic scaling might help you, as that tries to 
> use timeout values to determine the better/best backend for any given 
> connection.
> 
> It's also possible that your HAPort script doesn't just check NFS, or 
> datasources.  It could also run a simple HTTP request against the backend to 
> verify it responds.  I think the flexibility of the system was the reason it 
> was done this way...  Since all applications/backends are very specific to 
> their use, it's hard to implement a solution that would work for everyone.
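
To make the HAPort-script idea above concrete, here is a minimal sketch in
Python. It assumes Pound treats a successful TCP connect to the HAport as
"alive" (as the HIGH-AVAILABILITY section describes), so the script only
listens while both checks pass. The function names, the data path and the
URL are hypothetical and not part of Pound.

```python
import http.client
import os
import socket
import time
import urllib.request


def data_source_ok(path):
    """Return True if the data directory can be listed.

    Caveat: on a hung NFS/autofs mount this call itself may block, so a
    real script would have to run it under its own timeout.
    """
    try:
        os.listdir(path)
        return True
    except OSError:
        return False


def http_ok(url, timeout=5.0):
    """Return True if a GET to url answers 200 within timeout seconds."""
    # Bypass any configured proxy: the probe should hit the backend directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts and HTTP error statuses.
        return False


def run_ha_port(port, path, url, interval=10.0):
    """Accept connections on port only while the backend looks healthy.

    Runs forever; path and url are whatever the local setup uses.
    """
    srv = None
    while True:
        if not (data_source_ok(path) and http_ok(url)):
            if srv is not None:
                srv.close()          # probes now get "connection refused"
                srv = None
            time.sleep(interval)
            continue
        if srv is None:
            srv = socket.create_server(("", port))  # Python 3.8+
            srv.settimeout(interval)
        try:
            conn, _ = srv.accept()   # the probe only needs the connect
            conn.close()
        except socket.timeout:
            pass                     # no probe arrived; re-check health
```

Something like run_ha_port(8081, "/data", "http://127.0.0.1:8080/") would
then run on each backend, with HAport pointing at 8081. Those values are
made up for illustration.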

> I don't really see any reason this couldn't be done.  It just means 
> thr_resurect() in svc.c will need some additional code, such that if the 
> connect succeeds, it sends a GET request to a URL (if defined in the config), 
> and then it would need to know success conditions. (HTTP status code of 200?  
> Response in n seconds or less?)  Might be worth it if Robert weighed in.
> 
> Then again, this was suggested 11/05/2005.
> http://www.apsis.ch/pound/pound_list/archive/2005/2005-11/1131177343000
> 
> I think the difference in your case is that it would only check the Alive URL 
> to resurrect, not regularly.
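
The resurrect check described above (connect first, then GET a URL if one
is configured, then apply success conditions) could look roughly like the
sketch below. This is Python modelling the logic only, not Pound's C code
in svc.c; backend_alive, alive_path and max_seconds are hypothetical
names, and "status 200 within n seconds" is just one possible success
condition.

```python
import http.client
import socket
import time


def backend_alive(host, port, alive_path=None, max_seconds=5.0):
    """Decide whether a backend marked dead may be resurrected."""
    start = time.monotonic()

    # Step 1: the plain connect check.
    try:
        with socket.create_connection((host, port), timeout=max_seconds):
            pass
    except OSError:
        return False

    # No alive URL configured: a successful connect is enough.
    if alive_path is None:
        return True

    # Step 2: send a GET and apply the success conditions.
    remaining = max_seconds - (time.monotonic() - start)
    if remaining <= 0:
        return False
    conn = http.client.HTTPConnection(host, port, timeout=remaining)
    try:
        conn.request("GET", alive_path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()

    elapsed = time.monotonic() - start
    return status == 200 and elapsed <= max_seconds
```

With alive_path left as None this reduces to the connect-only check, so a
configuration without an alive URL would behave as before.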
> 
> > > This question comes up a lot. I'm sure there are plenty of
> > > examples in the list archives.
> > Couldn't find them. Maybe I should look harder.
> 
> http://osdir.com/ml/web.pound.general/2006-03/msg00055.html
> http://www.apsis.ch/pound/pound_list/archive/2006/2006-06/1151100017000/index_html?fullMode=1
> http://www.apsis.ch/pound/pound_list/archive/2006/2006-12/1165505787000
> 
> > > It is interesting however that kill_be does not log that it
> > > is killing a backend... That should likely happen.
> > Indeed. Maybe someone else can shed some light on this kind
> > of behaviour?
> 
> I've put a patch on my site that should add additional log messages.
> 
> http://users.k12system.com/mrwizard/pound/pound24.html



-- 
--
Xiwen Cheng
System Administrator            ;" Enthusiasm is contagious,
Mathematical Institute          ;  but hype is a disease. "
Leiden University               ;E-mail: [email protected]
Niels Bohrweg 1 K210            ;Office: (+31) 715277134
2333 CA Leiden                  ;Mobile: (+31) 611119991
The Netherlands                 ;GPG Key id: 194F572B
++


