RE: slow tcp handshake

2009-10-21 Thread John Lauro
You mention the loopback interface.  You could be running out of port numbers
to use for the connections.
What's your /proc/sys/net/ipv4/ip_local_port_range?


What does netstat -s | grep -i list show on the server?
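
If netstat isn't handy, the same listen-queue counters can be read from /proc/net/netstat. A rough sketch, assuming the usual layout of that file (a line of names followed by a line of values) and the TcpExt counter names ListenOverflows and ListenDrops. The helper name and the sample numbers are made up for illustration.

```python
def listen_counters(netstat_text):
    """Pull ListenOverflows and ListenDrops out of /proc/net/netstat-style text."""
    lines = netstat_text.splitlines()
    # The file comes in pairs: a header line of counter names, then a line
    # of values, both starting with the same prefix (e.g. "TcpExt:").
    for names_line, values_line in zip(lines[::2], lines[1::2]):
        if not names_line.startswith("TcpExt:"):
            continue
        names = names_line.split()[1:]
        values = [int(v) for v in values_line.split()[1:]]
        table = dict(zip(names, values))
        return {k: table.get(k, 0) for k in ("ListenOverflows", "ListenDrops")}
    # No TcpExt section found: report zeros rather than failing.
    return {"ListenOverflows": 0, "ListenDrops": 0}


# Made-up sample; on a Linux box pass open("/proc/net/netstat").read() instead.
SAMPLE = (
    "TcpExt: SyncookiesSent ListenOverflows ListenDrops\n"
    "TcpExt: 0 17 17\n"
    "IpExt: InNoRoutes InTruncatedPkts\n"
    "IpExt: 0 0\n"
)

print(listen_counters(SAMPLE))
```

Non-zero values that keep growing while the slow handshakes happen would point at the listen queue rather than at port exhaustion.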



 -Original Message-
 From: David Birdsong [mailto:david.birds...@gmail.com]
 Sent: Wednesday, October 21, 2009 6:36 AM
 To: haproxy
 Subject: slow tcp handshake
 
 This isn't haproxy related, but this list is so knowledgeable on
 network problems.
 
 I'm troubleshooting our slow webserver and I've drilled down to a TCP
 handshake taking up to 10 seconds.  The handshake doesn't actually
 start until the client sends its 3rd SYN.  The first 2 SYNs
 are completely ignored; the 3rd is ACKed a full 10 seconds after the
 first SYN is sent.  After this, read times are fast.
 
 This happens over the loopback interface.
 
 Can an app get backed up in its listen queue and affect some sort of
 SYN queue, or will the kernel handle the handshake irrespective of the
 server's listen queue?
 
 I've searched all over the internets, and I'm plumb out of ideas.
 
 syn_cookies are disabled
 ip_tables unloaded
 /proc/sys/net/ipv4/tcp_max_syn_backlog was set to 1024 and active
 connections to the server never rose above 960, so I thought this might be
 it...but I doubled it and it had no effect
 
 
 Fedora 8 2.6.26.8-57.fc8
 Web server is lighttpd
 
 No virus found in this incoming message.
 Checked by AVG - www.avg.com
 Version: 8.5.422 / Virus Database: 270.14.11/2430 - Release Date:
 10/20/09 18:42:00




RE: slow tcp handshake

2009-10-21 Thread John Lauro
You could bump your range up.  It might help if you have a high connection
rate and not just a high number of connections.

echo 1024 61000 > /proc/sys/net/ipv4/ip_local_port_range


Good that nothing shows, as most 0 values are not printed.  You could check
for anything else that looks strange under netstat -s
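
To see what the change buys you, here is a small sketch that counts the ports in a range as written in ip_local_port_range. It assumes both ends of the range are inclusive; the function name is just for illustration.

```python
def ephemeral_port_count(range_text):
    """Count the ports in an ip_local_port_range-style 'low high' string."""
    low, high = (int(x) for x in range_text.split())
    # Assumption: both ends of the range are usable, hence the +1.
    return high - low + 1


print(ephemeral_port_count("32768 61000"))  # the range reported in this thread -> 28233
print(ephemeral_port_count("1024 61000"))   # the wider range suggested above -> 59977
```

On a live system the input would come from reading /proc/sys/net/ipv4/ip_local_port_range.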

 -Original Message-
 From: David Birdsong [mailto:david.birds...@gmail.com]
 Sent: Wednesday, October 21, 2009 7:07 AM
 To: John Lauro
 Cc: haproxy
 Subject: Re: slow tcp handshake
 
 On Wed, Oct 21, 2009 at 3:51 AM, John Lauro
 john.la...@covenanteyes.com wrote:
  You mention the loopback interface.  You could be running out of port
  numbers to use for the connections.
  What's your /proc/sys/net/ipv4/ip_local_port_range?
 cat /proc/sys/net/ipv4/ip_local_port_range
 32768 61000
 
 
 
 
  What does netstat -s | grep -i list show on the server?
 nothing at all, no list to match on that output
 
 
 
 
 also, i've disabled tcp_sack with no effect
 




RE: slow tcp handshake

2009-10-21 Thread John Lauro
You may also want to check ulimit -n prior to running your server.  It may
default to 1024 on your distro, and if lighttpd doesn't automatically
increase it for you, that could be your problem.
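
The same limit can be checked from inside a process. A minimal sketch using Python's standard resource module; the helper names and the connection counts in the example are illustrative only.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def would_exceed(soft_limit, needed_fds):
    """True if needed_fds descriptors would not fit under the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return needed_fds > soft_limit


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft, "hard limit:", hard)
    # Illustrative numbers: 2000 descriptors against a 1024 limit.
    print(would_exceed(1024, 2000))
```

Keep in mind a server needs descriptors for log files and files being served as well as for sockets, so the usable connection count is lower than the raw limit.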





Re: slow tcp handshake

2009-10-21 Thread David Birdsong
On Wed, Oct 21, 2009 at 4:34 AM, John Lauro john.la...@covenanteyes.com wrote:
 You may also want to check ulimit -n prior to running your server.  It may
 default to 1024 on your distro, and if lighttpd doesn't automatically
 increase it for you, that could be your problem.
I ran into an out-of-fds problem before with lighttpd.  It is quite
verbose about running out of file descriptors.







Re: slow tcp handshake

2009-10-21 Thread David Birdsong
On Wed, Oct 21, 2009 at 4:31 AM, John Lauro john.la...@covenanteyes.com wrote:
 You could bump your range up.  It might help if you have a high connection
 rate and not just a high number of connections.

I don't have a high connection rate.  My connection over localhost is
just intended to mimic user behavior.  i

It boils down to: is it the OS, lighttpd, or both?

Can a backed-up listen queue delay a TCP handshake?  My guess would be
that the OS would start sending RSTs when a server isn't clearing out
its listen queue.

 echo 1024 61000 > /proc/sys/net/ipv4/ip_local_port_range


 Good that nothing shows, as most 0 values are not printed.  You could check
 for anything else that looks strange under netstat -s






Re: slow tcp handshake

2009-10-21 Thread Willy Tarreau
On Wed, Oct 21, 2009 at 10:27:23AM -0700, David Birdsong wrote:
 On Wed, Oct 21, 2009 at 4:31 AM, John Lauro john.la...@covenanteyes.com 
 wrote:
  You could bump your range up.  It might help if you have a high connection
  rate and not just a high number of connections.
 
 I don't have a high connection rate.  My connection over localhost is
 just intended to mimic user behavior.  i
 
 It boils down to: is it the OS, lighttpd, or both?
 
 Can a backed-up listen queue delay a TCP handshake?  My guess would be
 that the OS would start sending RSTs when a server isn't clearing out
 its listen queue.

When a backlog is full, the system simply drops the SYNs; that's what
allows the client to try again. There are still applications around
that call listen(fd, 5) because that was the example in many school
manuals. That limits the backlog to 5 entries ...
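
To put numbers on the retries, here is a minimal sketch of when a client re-sends its SYN if the earlier ones are silently dropped. It assumes an initial retransmit timeout of 3 seconds that doubles after each attempt, which was the classic default; the exact value on this kernel is an assumption, and the function name is made up.

```python
def syn_send_times(initial_timeout, attempts):
    """Seconds (from the first SYN) at which each SYN attempt is sent."""
    times = []
    now = 0.0
    timeout = initial_timeout
    for _ in range(attempts):
        times.append(now)
        now += timeout   # wait one timeout with no SYN/ACK coming back
        timeout *= 2     # exponential backoff before the next retry
    return times


print(syn_send_times(3.0, 3))  # [0.0, 3.0, 9.0]
```

Under those assumptions, two dropped SYNs put the third one at about 9 seconds, which is in line with the roughly 10 second handshake reported at the start of the thread.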

Also, check your somaxconn sysctl (/proc/sys/net/core/somaxconn). It's
also a limit on the SYN backlog size, and that one is generally set to
128 by default. You need to restart the application bound to the
listening port when you change somaxconn or tcp_max_syn_backlog for the
change to take effect.
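
As a rough illustration of how the two limits combine: on Linux a listen() backlog larger than somaxconn is silently truncated to somaxconn, so the value that counts is the smaller of the two. The sketch below assumes that rule; the helper names are made up, and 128 is only used as a fallback when the sysctl can't be read.

```python
def effective_backlog(requested, somaxconn):
    """Backlog actually applied when listen(fd, requested) is called."""
    return min(requested, somaxconn)


def read_somaxconn(path="/proc/sys/net/core/somaxconn", fallback=128):
    """Read the somaxconn sysctl, or return the fallback if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return fallback


if __name__ == "__main__":
    limit = read_somaxconn()
    for requested in (5, 1024):
        print(requested, "->", effective_backlog(requested, limit))
```

So an application asking for 1024 still gets 128 with the default setting, and one calling listen(fd, 5) gets 5 no matter how high the sysctls are raised.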

Hoping this helps,
Willy