Hi Terje,

On Wed, Dec 05, 2012 at 09:33:19AM +0100, Borgen, Terje wrote:
> Hi Willy,
> Thanks for Your quick response.
> I think You might be onto something here. We have a similar setup with
> haproxy using port 80 and have never experienced this problem in that
> environment.

OK.

> /proc/sys/net/ipv4/ip_local_port_range says 32768-61000, so nothing special
> here. We have another similar problem when restarting the Jetty-servers on
> the same server. We always get an error saying that the port is in use and we
> have to wait one minute before it can start again. The Jetty ports (as You
> can see in the config) are also outside the ip_local_port_range. But this
> might be another problem since it happens every restart.

Yes, that is typically a listening port bound without SO_REUSEADDR. Very
common, in fact.
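To illustrate the SO_REUSEADDR point, here is a minimal sketch in Python: after a restart, the old connections linger in TIME_WAIT for about a minute, and bind() fails unless the socket is created with SO_REUSEADDR set. The port choice and helper name are just for illustration.

```python
import socket

def make_listener(port):
    """Create a listening socket that can rebind immediately after a restart."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Without this, bind() raises "Address already in use" while old
    # connections from the previous process sit in TIME_WAIT (~60s).
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("0.0.0.0", port))
    s.listen(128)
    return s
```

If Jetty's connector is not setting this option (or its equivalent), that would explain having to wait a minute before it can start again.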

> Some additional info:
> - We have two identical servers running apache http server, haproxy and jetty
> servers. Most of the traffic hits the main server, and the reload problem
> has never happened on the failover server. So this problem might be
> "traffic-related".
> - For one week we changed the inter-parameter on the clusters from default
> 2000 to 60000 leaving rise/fall as default. In that period the problem never
> occurred. 

OK, I see. The health checks are creating too many time-wait sockets.
This issue was fixed very recently (in 1.5-dev14): haproxy now closes
health check sockets with a TCP reset, thus avoiding the TIME_WAIT. I'm
pretty sure they're the ones causing the issue, as I experienced a
similar one recently (which is why I fixed it :-)).
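To give a rough sense of the scale, here is a back-of-the-envelope estimate of steady-state TIME_WAIT sockets from checks alone, assuming each check opens one connection that lingers in TIME_WAIT for the typical 60 seconds. The server count of 10 is purely illustrative.

```python
TIME_WAIT_SECONDS = 60  # typical Linux TIME_WAIT duration

def timewait_sockets(inter_ms, num_servers):
    """Approximate steady-state TIME_WAIT sockets from health checks."""
    checks_per_second = 1000 / inter_ms
    return checks_per_second * TIME_WAIT_SECONDS * num_servers

# default "inter 2000" vs the 60000 used during the quiet week
print(timewait_sockets(2000, 10))   # ~300 lingering sockets
print(timewait_sockets(60000, 10))  # ~10 lingering sockets
```

That 30x reduction would explain why raising "inter" to 60000 made the problem disappear for a week.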

I have not backported this yet as I wanted to keep an observation period.

However, you can try something: put "option nolinger" in your BACKENDS,
not your frontends, otherwise some clients will experience truncated
responses! All backend connections (including checks) will then be closed
by a reset, and you should see far fewer TIME_WAIT sockets between haproxy
and the servers.
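As a sketch of where the option goes, assuming a typical backend section (the backend name, server names and addresses below are made up):

```
backend jetty_servers
    option nolinger          # close backend connections (incl. checks) with RST
    server jetty1 10.0.0.1:8080 check inter 2000 rise 2 fall 3
    server jetty2 10.0.0.2:8080 check inter 2000 rise 2 fall 3
```

The key point is that "option nolinger" sits in the backend section only; placing it in a frontend would reset client-facing connections too.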

Regards,
Willy
