> -----Original Message-----
> From: Xiwen Cheng [mailto:[email protected]]
> Sent: Thursday, May 07, 2009 4:55 AM
> To: [email protected]
> Subject: Re: [Pound Mailing List] when backend hangs
>
> Thanks for the links and insights. From what I've read, I'm not quite
> satisfied with the proposed solutions. That's just me of course; it's
> true one cannot satisfy everybody's desires.
I understand... That's why I have 8 patches on my pound page. :)

> For now I'm focussing on solving the problem of why the data source
> becomes unavailable (bad autofs with wildcard mappings!) on an
> irregular basis. Really painful to troubleshoot a problem one cannot
> reproduce.
>
> After that I think I'll write a patch that implements the "efficient
> monitor". For now the global idea is to omit unnecessary URI checks to
> backends when determining the liveness of a backend.

Sounds like a good plan!

> Anyone still have this patch:
> http://www.apsis.ch/pound/pound_list/archive/2005/2005-11/1131177343000
> The source is gone :(

I don't have 1.9.4, but I do have 1.9.1. I believe the patch will apply
against that. The patch itself downloaded from the archive page for me,
no problem.

http://users.k12system.com/mrwizard/pound/Pound-1.9.1.tgz

For 2.4, you'd likely want to move the AliveURI check into the SERVICE
structure.

> I decided to read the pound manpage more thoroughly and came across
> this snippet in the HIGH-AVAILABILITY section:
> > The clients that happen upon a dead backend server will just
> > receive a 503 Service Unavailable message.
> Does this mean: if the HAport check fails, the backend is marked as
> dead, and all clients (say, using session type IP) on that backend
> will receive a 503 error? Does this apply only to ongoing
> (STATE_ESTABLISHED) connections of these clients, while new
> connections made by these clients will be forwarded to other backends
> in the pool?

All that logic is in svc.c, in get_backend(). Looks to me like:

1) If there are no available backends, it uses the Emergency backend,
   or sends a 503.
2) If you have session affinity, it pulls the existing session. If none
   exists, it creates one with a random backend.
3) If not using session affinity, it chooses a backend at random from
   the list of alive backends.
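Case 3 above can be sketched roughly like this. To be clear, this is a simplified illustration, not Pound's actual code: the BACKEND struct and pick_backend() here are my own assumptions, and the real get_backend() in svc.c also deals with sessions, locking, and the Emergency backend.

```c
/* Simplified sketch of random alive-backend selection (case 3 above).
 * BACKEND and pick_backend() are illustrative assumptions, not the
 * real structures/functions from Pound's svc.c. */
#include <assert.h>
#include <stdlib.h>

typedef struct backend {
    int alive;  /* nonzero if this backend is considered alive */
} BACKEND;

/* Return a random alive backend from the pool, or NULL if none is
 * alive (the caller would then fall back to the Emergency backend or
 * answer 503, as in case 1 above). */
static BACKEND *pick_backend(BACKEND *pool, int n)
{
    int i, alive = 0, pick;

    for (i = 0; i < n; i++)
        if (pool[i].alive)
            alive++;
    if (alive == 0)
        return NULL;            /* Emergency backend or 503 */
    pick = rand() % alive;      /* index among the alive backends */
    for (i = 0; i < n; i++)
        if (pool[i].alive && pick-- == 0)
            return &pool[i];
    return NULL;                /* not reached */
}
```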
If the selected backend fails, it calls kill_be(), which in addition to
marking the backend dead will ALSO clear all sessions using that
backend. Thus, those sessions would be recreated on the next pass. Since
that's in a loop, the failure is trapped and another backend is tried
immediately.

Also in svc.c, in do_resurect (which runs every Alive seconds), the
first thing it does is check all alive servers to make sure they're
still alive, using the HAport. If that check fails, the backend is
marked dead; if there's no HAport, it skips the check. Then it checks
the already dead backends to see if they should now be alive, using the
HAport if available, or the backend port if not.

So, in short: if the HAport check fails, the backend is marked dead. If
the HAport check never succeeds, the backend stays dead. If a client is
part of a new session, it gets a random backend. When a backend is
marked dead, sessions using the old backend are deleted, causing those
clients to get a new backend. So clients shouldn't ever make it to a
dead backend.

Take care!
Joe

--
To unsubscribe send an email with subject unsubscribe to
[email protected]. Please contact [email protected] for questions.
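P.S. The session-cleanup behaviour described above could be sketched like this. Again, this is only a rough illustration under my own assumptions: the SESSION table, the fields, and this kill_be() are simplified stand-ins for what Pound's svc.c actually does.

```c
/* Simplified sketch of kill_be() clearing sessions, as described
 * above.  SESSION, the fixed-size table, and this kill_be() are
 * illustrative assumptions, not Pound's real implementation. */
#include <assert.h>
#include <stddef.h>

typedef struct backend {
    int alive;  /* nonzero if this backend is considered alive */
} BACKEND;

typedef struct session {
    unsigned long  client;  /* e.g. the client IP for session type IP */
    BACKEND       *be;      /* backend this session is pinned to */
    int            in_use;  /* nonzero if this slot holds a session */
} SESSION;

#define MAX_SESS 16
static SESSION sessions[MAX_SESS];

/* Mark a backend dead and drop every session pinned to it, so those
 * clients get a fresh session (and an alive backend) on their next
 * request. */
static void kill_be(BACKEND *be)
{
    int i;

    be->alive = 0;
    for (i = 0; i < MAX_SESS; i++)
        if (sessions[i].in_use && sessions[i].be == be)
            sessions[i].in_use = 0;
}
```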
