Hi Krishna,

On Thu, Mar 09, 2017 at 12:03:19PM +0530, Krishna Kumar (Engineering) wrote:
> Hi Willy,
> 
> We use HAProxy as a Forward Proxy (I know this is not the intended
> application for HAProxy) to access outside world from within the DC, and
> this requires setting a source port range for return traffic to reach the
> correct box from which a connection was established. On our production
> boxes, we
> see around 500 "no free ports" errors per day, but this could increase to
> about 120K errors during big sale events. The reason for this is due to
> connect getting a EADDRNOTAVAIL error, since an earlier closed socket
> may be in last-ack state, as it may take some time for the remote server to
> send the final ack.
> 
> The attached patch reduces the number of errors by attempting more ports,
> if they are available.
> 
> Please review, and let me know if this sounds reasonable to implement.

Well, while the patch looks clean, I'm really not convinced it's the correct
approach. Normally you should simply be using the "retries" parameter to
increase the number of connect retries. There's nothing wrong with setting
it to a really high value if needed. Doesn't it work in your case?

Also, a few other points:
  - when the remote server sends the FIN with the last segment, your
    connection ends up in CLOSE_WAIT state. Haproxy then closes as
    well, sending a FIN and your socket ends up in LAST_ACK waiting
    for the server to respond. You may instead ask haproxy to close
    with an RST by setting "option nolinger" in the backend. The port
    will then always be free locally. The side effect is that if the
    RST is lost, the SYN of a new outgoing connection may get an ACK
    instead of a SYN-ACK as a reply. Your side will respond to that
    ACK with an RST and try again. As a result all connections will
    work, with some taking slightly longer (typically 1 second).

  - 500 outgoing ports is a very low value. You should keep in mind
    that nowadays most servers use 60 seconds FIN_WAIT/TIME_WAIT
    delays (the remote server remains in FIN_WAIT1 while waiting for
    your ACK, then enters TIME_WAIT when receiving your FIN). So with
    only 500 ports, you can *safely* support only 500/60 = 8 connections
    per second. Fortunately in practice it doesn't work like this
    since most of the time connections are correctly closed. But if
    you start to run into serious trouble, you need to understand
    that you can very quickly reach some limits. And 500 outgoing
    ports means you don't expect to support more than 500 concurrent
    conns per proxy, which seems quite low.
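
To make the 500/60 arithmetic above easier to redo with your own numbers,
here is a rough Python sketch. The function names are made up for
illustration, and it assumes the worst case where every closed connection
keeps its source port busy for the full delay (60 seconds here), which as
noted is more pessimistic than what usually happens in practice.

```python
import math


def safe_connection_rate(ports: int, hold_seconds: float = 60.0) -> float:
    """Worst-case sustainable new connections per second.

    Assumes each connection pins one source port for hold_seconds
    after it is closed (hypothetical worst case, not a measurement).
    """
    if ports <= 0 or hold_seconds <= 0:
        raise ValueError("ports and hold_seconds must be positive")
    return ports / hold_seconds


def ports_needed(rate_per_second: float, hold_seconds: float = 60.0) -> int:
    """Source ports required to sustain rate_per_second in that worst case."""
    if rate_per_second < 0 or hold_seconds <= 0:
        raise ValueError("rate must be >= 0 and hold_seconds positive")
    return math.ceil(rate_per_second * hold_seconds)
```

With 500 ports, safe_connection_rate(500) gives a bit more than 8
connections per second, and ports_needed() can be used the other way
around to size the range for an expected peak rate.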

Thus normally what you're experiencing should be dealt with using
configuration only:
  - increase retries setting
  - possibly enable option nolinger (backend only, never on a frontend)
  - try to increase the available source port ranges.
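
As a minimal sketch only, the three settings above could look roughly like
this in a backend. The backend name, server name, addresses and the
retries value are placeholders I made up, not taken from your setup, and
the port range has to match whatever your return path actually allows:

```
backend outbound
    # Allow more connect attempts before reporting a failure
    # (value picked arbitrarily for the example).
    retries 10

    # Close server-side connections with an RST so the local port
    # is released immediately (backend only).
    option nolinger

    # Hypothetical addresses; "source" on the server line accepts
    # a port range, here widened as far as possible.
    server out1 192.0.2.10:80 source 198.51.100.5:1025-65000
```
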

Regards,
Willy
