Pound will only mark a backend dead if the TCP connection to the backend fails. (For instance, I add an iptables rule on the backend to REJECT connections to the http port when doing maintenance.) Similarly, the resurrect check only tests whether a TCP connection to the backend can be made.
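To make the distinction concrete, here is a small sketch in Python of what a connect-only check can and cannot see. It is not Pound's code; `tcp_connect_ok` and `demo` are made-up names for illustration. A listener that never accepts or answers still passes the check, while a closed port fails it.

```python
import socket


def tcp_connect_ok(host, port, timeout=2.0):
    """Return True if a TCP connection can be established at all."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def demo():
    # Stand-in for a "hung" backend: the port is listening, but nothing
    # ever calls accept() or sends a response.
    hung = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    hung.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    hung.listen(1)
    port = hung.getsockname()[1]

    # The kernel completes the handshake for a listening socket, so a
    # connect-only check reports the hung backend as alive.
    seen_while_hung = tcp_connect_ok("127.0.0.1", port)

    # Stand-in for a "died" backend: the port is released.
    hung.close()
    seen_after_close = tcp_connect_ok("127.0.0.1", port)

    return seen_while_hung, seen_after_close
```

Calling `demo()` returns the check result for the hung listener first and for the closed port second.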
What you're talking about would happen if the TCP connection succeeded and the httpd could not return data. This could also happen if a backend process were running and generating content, but took a long time to complete. (This happens a lot in my situation) I wouldn't want my backend to be marked dead because someone ran a large report.

Which is why the checks for life are so rudimentary in pound. But it's also why there's a HAPort directive. You can craft a simple perl script that listens on a different port, tries to read a dummy file from NFS on connect attempts, and runs an accept() call if the check succeeds. If it doesn't, the backend will be marked dead and stay dead until that check succeeds.

This question comes up a lot. I'm sure there are plenty of examples in the list archives.

It is interesting however that kill_be does not log that it is killing a backend... That should likely happen.

Take care!
Joe

> -----Original Message-----
> From: Xiwen Cheng [mailto:[email protected]]
> Sent: Monday, May 04, 2009 5:28 AM
> To: [email protected]
> Subject: Re: [Pound Mailing List] when backend hangs
>
> Bump.
>
> A scenario this might occur is: Say webdata are served over NFS, but if the NFS
> server becomes unresponsive, either it's a local(backend host) problem or
> the NFS server. As a result requests coming in for apache all stall until
> they time out.
>
> The bottomline is, I think the condition to resurrect a backend must be
> stricter.
>
> Anyone can provide more insight in this matter?
>
> Kind regards,
> Xiwen
>
> On Mon, Apr 06, 2009 at 03:13:19PM +0200, Xiwen Cheng wrote:
> > Product: Pound-2.4.4
> >
> > To avoid confusion I will define some terms up front:
> > * backend hangs: the backend is still active listening for connections
> >   but doesn't respond. As a result all requests time out.
> > * backend died: the backend processes are dead. Its associated ports were
> >   released back to the system.
> >
> > When a backend hangs, pound doesn't seem to mark it as "dead". But
> > later in time it resurrects the backend:
> > Apr 5 06:08:22 web pound: (4079d950) connect_nb: error after getsockopt: Connection timed out
> > Apr 5 06:08:22 web pound: (4079d950) backend XXX.XXX.XXX.XXX:80 connect: Connection timed out
> > Apr 5 06:08:36 web pound: (40185950) e500 response error read from XXX.XXX.XXX.XXX:80/GET /jun2001/ HTTP/1.1: Connection reset by peer (224.085 secs)
> > Apr 5 06:08:39 web pound: BackEnd XXX.XXX.XXX.XXX:80 resurrect
> > Apr 5 06:08:41 web pound: (405d6950) e500 response error read from XXX.XXX.XXX.XXX:80/GET / HTTP/1.1: Connection reset by peer (215.073 secs)
> >
> > Pound sees that the backend doesn't respond but doesn't mark it as dead.
> > No occurence of "dead" were found prior to "resurrect". Even though pound
> > reports the backend couldn't handle the requests (time outs or connection
> > reset by peer), pound still dispatches them to this faulty backend.
> >
> > A checklist based on the possible log-messages to trace what could have
> > happened in connect_nb():
> > Line        message                   Present in syslog?
> > svc.c:787   fcntl GETFL failed        n
> > svc.c:791   fcntl SETFL failed        n
> > svc.c:798   connect failed            n
> > svc.c:805   fcntl reSETFL failed      n
> > svc.c:817   poll timed out            n
> > svc.c:820   poll failed               n
> > svc.c:827   getsockopt failed         n
> > svc.c:833   fcntl reSETFL failed      n
> > svc.c:840   error after getsockopt    y**
> >
> > ** see log-snippet in the beginning
> >
> > connect_nb() returned -1 several times when it was called, but why isn't
> > the backend marked as dead?
> > --
> --
> Xiwen Cheng
> System Administrator    ;" Enthusiasm is contagious,
> Mathematical Institute  ;  but hype is a disease. "
> Leiden University       ;E-mail: [email protected]
> Niels Bohrweg 1 K210    ;Office: (+31) 715277134
> 2333 CA Leiden          ;Mobile: (+31) 611119991
> The Netherlands         ;GPG Key id: 194F572B
> ++
>
>
> --
> To unsubscribe send an email with subject unsubscribe to [email protected].
> Please contact [email protected] for questions.
>

--
To unsubscribe send an email with subject unsubscribe to [email protected].
Please contact [email protected] for questions.

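For the archives, a rough sketch of the kind of HAPort helper described in the reply, written in Python rather than Perl. Everything named here (`file_readable`, `HealthPort`, the probe path and port in the usage note) is hypothetical. One assumption worth stating: on common TCP stacks the kernel completes the handshake for a listening socket before accept() is ever called, so this sketch closes the listener while the check fails instead of merely skipping accept(). The file read runs in a helper thread with a timeout, since a read from a hung NFS mount may block indefinitely.

```python
import socket
import threading
import time


def file_readable(path, timeout=2.0):
    """Return True if one byte of `path` can be read within `timeout` seconds."""
    result = []

    def probe():
        try:
            with open(path, "rb") as fh:
                fh.read(1)
            result.append(True)
        except OSError:
            result.append(False)

    # Daemon thread: a read stuck on a hung mount must not block the caller.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout)
    return bool(result) and result[0]


class HealthPort:
    """Listen on a port only while `check()` succeeds (hypothetical helper)."""

    def __init__(self, host, port, check):
        self.host, self.port, self.check = host, port, check
        self.listener = None

    def _open(self):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((self.host, self.port))
        sock.listen(16)
        sock.settimeout(0.2)                 # keep accept() from blocking for long
        self.port = sock.getsockname()[1]    # remember the real port if 0 was given
        self.listener = sock

    def step(self):
        """Run the check once, then open or close the listener to match."""
        healthy = self.check()
        if healthy and self.listener is None:
            self._open()
        elif not healthy and self.listener is not None:
            self.listener.close()            # further connects are now refused
            self.listener = None
        if self.listener is not None:
            try:
                conn, _ = self.listener.accept()   # drain one pending probe
                conn.close()
            except socket.timeout:
                pass
        return healthy


def serve(health_port, interval=1.0):
    """Run the check forever; intended to be started on each backend host."""
    while True:
        health_port.step()
        time.sleep(interval)
```

Usage would be something like `serve(HealthPort("0.0.0.0", 8081, lambda: file_readable("/mnt/nfs/.probe")))` on each backend, with that backend's HAPort in the pound config pointing at the same port. The path and port number are placeholders.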