RE: slow tcp handshake
You mention the loopback interface. You could be running out of port numbers for the connections.

What's your /proc/sys/net/ipv4/ip_local_port_range?

What does netstat -s | grep -i list show on the server?

-----Original Message-----
From: David Birdsong [mailto:david.birds...@gmail.com]
Sent: Wednesday, October 21, 2009 6:36 AM
To: haproxy
Subject: slow tcp handshake

This isn't haproxy related, but this list is so knowledgeable on network problems.

I'm troubleshooting our slow webserver and I've drilled down to a TCP handshake taking up to 10 seconds. The handshake doesn't really start until the client sends its 3rd SYN. The first 2 SYNs are completely ignored; the 3rd is ACKed a full 10 seconds after the first SYN is sent. After this, read times are fast. This happens over the loopback interface.

Can an app get backed up in its listen queue and affect some sort of SYN queue, or will the kernel handle the handshake irrespective of the server's listen queue?

I've searched all over the internets, and I'm plumb out of ideas.

syn_cookies are disabled
ip_tables unloaded
/proc/sys/net/ipv4/tcp_max_syn_backlog was set to 1024 and active connections to the server never rose above 960, so I thought this might be it... but I doubled it and it had no effect

Fedora 8
2.6.26.8-57.fc8
Web server is lighttpd

No virus found in this incoming message.
Checked by AVG - www.avg.com
Version: 8.5.422 / Virus Database: 270.14.11/2430 - Release Date: 10/20/09 18:42:00
RE: slow tcp handshake
You could bump your range up. It might help if you have a high connection rate and not just a high number of connections.

echo 1024 61000 > /proc/sys/net/ipv4/ip_local_port_range

Good that nothing shows, as most 0 values are not printed. You could check for anything else that looks strange under netstat -s.

-----Original Message-----
From: David Birdsong [mailto:david.birds...@gmail.com]
Sent: Wednesday, October 21, 2009 7:07 AM
To: John Lauro
Cc: haproxy
Subject: Re: slow tcp handshake

On Wed, Oct 21, 2009 at 3:51 AM, John Lauro john.la...@covenanteyes.com wrote:
> You mention loopback interface. You could be running out of port numbers for the connections. What's your /proc/sys/net/ipv4/ip_local_port_range?

cat /proc/sys/net/ipv4/ip_local_port_range
32768 61000

> What's netstat -s | grep -i list show on the server?

Nothing at all; there is no "list" to match in that output.

Also, I've disabled tcp_sack with no effect.
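To put numbers on the two ranges mentioned above, here is a minimal sketch that counts the local ports a range provides. The helper names `parse_port_range` and `port_count` are made up for illustration; the path is the one quoted in the thread, and the fallback string is simply the value David reported, used when the file is not available.

```python
from pathlib import Path

PORT_RANGE_PATH = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text):
    """Parse the two whitespace-separated integers from ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def port_count(low, high):
    """Number of local ports in the inclusive range [low, high]."""
    return high - low + 1


if __name__ == "__main__":
    # Fall back to the value reported in this thread when the file is absent
    # (for example when not running on Linux).
    if PORT_RANGE_PATH.exists():
        text = PORT_RANGE_PATH.read_text()
    else:
        text = "32768 61000"
    low, high = parse_port_range(text)
    print(f"{low}-{high}: {port_count(low, high)} local ports")
```

With the reported range of 32768 61000 this gives 28233 ports; the suggested 1024 61000 gives 59977.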
RE: slow tcp handshake
You may also want to check ulimit -n prior to running your server. It may default to 1024 on your distro, and if lighttpd doesn't automatically increase it for you, that could be your problem.
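The same limit that ulimit -n reports can also be read from inside a process. The following is only a sketch using Python's standard `resource` module (Unix only); the function name `nofile_limits` is made up for illustration. It reports the limits of the process that runs it, which are inherited from the launching shell, and not those of an already running lighttpd.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    # A value equal to resource.RLIM_INFINITY means "unlimited".
    print(f"soft={soft} hard={hard}")
```

Run from the same shell that starts the server, the soft value should match what ulimit -n prints there.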
Re: slow tcp handshake
On Wed, Oct 21, 2009 at 4:34 AM, John Lauro john.la...@covenanteyes.com wrote:
> You may also want to check ulimit -n prior to running your server. It may default to 1024 on your distro, and if lighttpd doesn't automatically increase it for you, that could be your problem.

I ran into an out-of-fds problem before with lighttpd. It is quite verbose about running out of file descriptors.
Re: slow tcp handshake
On Wed, Oct 21, 2009 at 4:31 AM, John Lauro john.la...@covenanteyes.com wrote:
> You could bump your range up. It might help if you have a high connection rate and not just a high number of connections.

I don't have a high connection rate. My connection over localhost is just intended to mimic user behavior.

It boils down to: is it the OS, or lighttpd, or both? Can a backed-up listen queue delay a TCP handshake? My guess would be that the OS would start sending RSTs when a server isn't clearing out its listen queue.
Re: slow tcp handshake
On Wed, Oct 21, 2009 at 10:27:23AM -0700, David Birdsong wrote:
> I don't have a high connection rate. My connection over localhost is just intended to mimic user behavior.
>
> It boils down to: is it the OS, or lighttpd, or both? Can a backed-up listen queue delay a TCP handshake? My guess would be that the OS would start sending RSTs when a server isn't clearing out its listen queue.

When a backlog is full, the system simply drops the SYNs; that's what allows the client to try again. There are still applications around which do listen(fd, 5) because that was an example in many school manuals. That limits the backlog to 5 entries...

Also, check your somaxconn sysctl. It's also a limit on the SYN backlog size, and that one is generally set to 128 by default. You need to restart the application bound to the listening port when you change somaxconn or tcp_max_syn_backlog for them to take effect.

Hoping this helps,
Willy
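The two points in this reply can be sketched as a little arithmetic. This is only an illustration under stated assumptions, not a description of any particular kernel: the function names `effective_backlog` and `syn_send_times` are made up, and the 3-second initial retransmission timeout with doubling is an assumption about the client's TCP stack that the thread itself does not state.

```python
def effective_backlog(requested, somaxconn):
    """Backlog left after the listen() argument is capped at somaxconn."""
    return min(requested, somaxconn)


def syn_send_times(attempts, initial_timeout=3.0):
    """Seconds, measured from the first SYN, at which each SYN is sent.

    Assumes the retransmission timeout starts at `initial_timeout` and
    doubles after every unanswered SYN (an assumption, see the lead-in).
    """
    times = []
    now = 0.0
    timeout = initial_timeout
    for _ in range(attempts):
        times.append(now)
        now += timeout
        timeout *= 2
    return times


if __name__ == "__main__":
    # listen(fd, 5) with somaxconn at its usual default of 128
    print(effective_backlog(5, 128))
    # a large requested backlog is still capped by somaxconn
    print(effective_backlog(1024, 128))
    # first SYN plus two retransmissions
    print(syn_send_times(3))
```

Under that timing assumption, a client whose first two SYNs are dropped sends its third SYN 9 seconds after the first, which is in line with the "up to 10 seconds" in the original report. It also shows why raising tcp_max_syn_backlog alone may change nothing if somaxconn or the application's own listen() argument is the smaller limit.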