Hi Alexey,

On Wed, Jul 13, 2011 at 09:50:25PM +0400, Alexey Vlasov wrote:
> On Wed, Jul 13, 2011 at 07:57:05AM +0200, Willy Tarreau wrote:
> > >
> > > I've got such a scheme on the shared hosting:
> > >                       +- apache_pool1
> > >                       |
> > > apache_fe -> haproxy -|- apache_pool2
> > >                       |
> > >                       +- apache_pool3
> > > ...
> >
> > you should have at least "option http-server-close" in your config,
>
> I don't know whether it is important or not but for some reasons
> keep-alive is switched off everywhere in Apache.

OK, this is very common anyway. I just wanted to ensure we were not
missing something. Still, "option http-server-close" will actively
track the data exchanges on the connection and will be able to close
the server connection as soon as haproxy has received all the data,
which substantially reduces the number of concurrent connections on
each server. But that's an optimization point, it's not needed right
now, so let's ignore it for now.

> > > 2. haproxy access.log:
> > > Jul 12 22:28:04 l19 haproxy_aux2_pools[4944]: 111.111.111.111:42001
> > > [12/Jul/2011:22:28:02.281] backend_pool1 backend_pool1/pool1
> > > 0/0/0/-1/2084 502 204 - - SH-- 24/6/6/6/0 0/0 {clientvhost.com:9099} "GET
> > > /?option=com_sobi2&sobi2Task=sobi2Details&sobi2Id=80&default=80&Itemid=7
> > > HTTP/1.1"
> > >
> >
> > The "SH" flags indicate that the server has reset the connection while
> > responding. Looking closer, the server waited 2 seconds before doing
> > that. Do you know if it is possible that the log was emitted just before
> > a process crashed ? Since Apache automatically restarts missing processes,
> > it's quite common to see application bugs causing silent crashes.
>
> Today I once again looked closely the logs, and understood that I was
> wrong.
>
> Apache_pool does not process the request (item 3 of my previous letter),
> and returns nothing, neither 200-th code, nor any other. I just made a
> mistake.

OK, thanks for the clarification. Don't worry, I too often report
erroneous diagnoses after a first look, because it's very easy to
mismatch a log with a request or a network trace :-)

> In moments of 502-th errors there's nothing going on with Apache, in any
> case I have found nothing strange , no falling, no restarts. But after
> some moments the same queries are normally performed.

Indeed that's very strange.

> > Alternatively, something between haproxy and the application might reset
> > the connection once in a while, without the application being aware of
> > it.
> > The application finally responds and logs, but the connection's
> > already dead.
> > Do you have anything in the path which might NAT the traffic, or do you
> > have any shared IP address on the network which might randomly jump for
> > a short period ?
>
> I have nothing in common of all these, such a usual LAMP server for a
> shared hosting.

Fine.

> All traffic between the haproxy <-> apache_pools goes through the lo
> interface, so I just exclude the impact of iptables, and I don't have
> anything more.

If you have iptables loaded, it will affect the loopback just like any
other interface. The issue with iptables is that it is often shipped
with low settings for conntrack, and above a few hundred connections
per second (even on the loopback), the table fills up and no new
connection can be established until some entries expire. When properly
tuned this problem doesn't happen, but usually it's easier to disable
it than to tune it.

> > Otherwise you might have to start tcpdump so that we find out what's
> > precisely happening
>
> I managed to catch this moment, tcpdumps in an attachment.
>
> The first file, this is a session between apache_fe and haproxy, and to
> mind it's ok with it. And the second dump has really something strange
> to show, look, may be it can tell something to you.

Ah what you captured is excellent ! Look :

    apache_fe               haproxy               apache_pool

 12:05:02.543    ----> SYN
                 SYN/ACK <----
                 ----> ACK
                 ----> GET
                                       ----> SYN
                                       SYN/ACK <----
                                       ----> ACK
                                       ----> REQ
                                       ACK <----
                 ACK <----
 12:05:04.463                          FIN <----
                                       ----> RST
                 502 <----
                 ----> FIN
                 FIN <----
                 ----> ACK

So what this means : apache_fe sends a complete, correct request to
haproxy, which forwards it to the apache pool. Nothing happens for 1.9
seconds. Then the apache server closes the connection without sending
anything on it, and haproxy returns the 502 to apache_fe.

In my opinion there is no reason for an Apache server to close a
connection without saying anything.
So either the process simply dies, or there is a module on it doing
nasty things and forcing the connection to close without doing
anything. Just a hint, could you check if there's an updated version
for it ? Maybe this is just a known bug that has recently been fixed ?

> > BTW, what version are you running ?
>
> 1.4.8

OK. If this is a distro backport with all fixes, it's fine. Otherwise
you should consider updating it since a number of issues with cookies
and chunked-encoding have been fixed since. That's unrelated to your
current issue so there's no emergency though.

Regards,
Willy
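
PS: in case it helps when going through many such log lines, here is a
minimal sketch of how the two interesting fields of the line quoted
above could be decoded. It assumes the default haproxy HTTP log layout,
where the five timers are Tq/Tw/Tc/Tr/Tt and the termination state is
the 4-character field that follows them. The function names are made up
for this illustration and are not part of haproxy.

```python
# Illustrative helpers (hypothetical names, not part of haproxy): decode
# the timers and the termination state of one haproxy HTTP log line.
TIMER_NAMES = ("Tq", "Tw", "Tc", "Tr", "Tt")


def decode_log_fields(line):
    """Return (timers, flags) found in one haproxy HTTP log line.

    Assumption: the timers are the first whitespace-separated field made
    of five '/'-separated integers, and the termination state is the
    first 4-character field after it made only of letters and dashes.
    """
    timers = None
    flags = None
    for field in line.split():
        if timers is None:
            parts = field.split("/")
            if len(parts) == 5 and all(p.lstrip("+-").isdigit() for p in parts):
                timers = dict(zip(TIMER_NAMES, (int(p) for p in parts)))
        elif flags is None:
            if len(field) == 4 and all(c.isalpha() or c == "-" for c in field):
                flags = field
    return timers, flags


def server_closed_without_response(timers, flags):
    """True when the server side ended the session before any response.

    Tr == -1 means no complete response headers were received from the
    server; 'S' means the session was ended on the server side, and 'H'
    means haproxy was still waiting for the response headers.
    """
    return timers["Tr"] == -1 and flags.startswith("SH")


# The log line quoted earlier in this thread, joined onto one line.
LOG_LINE = (
    "Jul 12 22:28:04 l19 haproxy_aux2_pools[4944]: 111.111.111.111:42001 "
    "[12/Jul/2011:22:28:02.281] backend_pool1 backend_pool1/pool1 "
    "0/0/0/-1/2084 502 204 - - SH-- 24/6/6/6/0 0/0 {clientvhost.com:9099} "
    '"GET /?option=com_sobi2&sobi2Task=sobi2Details&sobi2Id=80'
    '&default=80&Itemid=7 HTTP/1.1"'
)

if __name__ == "__main__":
    timers, flags = decode_log_fields(LOG_LINE)
    print(timers, flags, server_closed_without_response(timers, flags))
```

On that line, Tr is -1 and Tt is 2084 ms, which is consistent with the
roughly two seconds of silence visible in the network trace before the
server closed the connection.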