Hi Cyril,

On Wed, Apr 24, 2013 at 10:11:05PM +0200, Cyril Bonté wrote:
> Hi Willy,
> 
> On 05/12/2012 22:43, Willy Tarreau wrote:
> >OK, I see. The health checks are causing too many time-wait sockets.
> >This issue was very recently fixed (in 1.5-dev14) as haproxy now closes
> >health check sockets with a TCP reset, thus avoiding the TIME_WAIT. I'm
> >pretty sure they're the one causing the issue as I've experienced a
> >similar one recently (reason why I fixed it :-)).
> >
> >I have not backported this yet as I wanted to keep an observation period.
> 
> This week at work, one team upgraded to haproxy 1.4.23.
> Their configuration embeds a proxy for rsyslog backends in TCP, with TCP 
> checks. Once upgraded, rsyslog started to log periodic errors:
> "netstream session will be closed due to error" caused by TCP resets.
> It's not necessarily a big issue but they're wondering whether other TCP
> services might be affected too. They'll check all of them soon.

Thanks for this feedback. It would be nice if they could get a network
trace, as I suspect that rsyslog has time to accept the connection before
the reset arrives, which is not really desired. Or could they try adding
"option tcp-smart-connect" to the backend to see if the issue disappears?
It should, because the sequence will then be SYN, SYN/ACK, RST, which the
application never sees, and if so I think we should always enable this for
pure TCP checks.
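
To illustrate that sequence, here is a rough sketch in C of what such a
probe could look like. It is not haproxy's code: tcp_check_probe() is a
hypothetical helper, and I am assuming Linux, where clearing TCP_QUICKACK
before connect() lets the kernel delay the final handshake ACK. Closing
with a zero linger timeout then makes close() send a RST instead of a FIN,
so the client side never enters TIME_WAIT.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not taken from haproxy.
 * Returns 0 if the port accepts connections, -1 otherwise. */
int tcp_check_probe(const struct sockaddr_in *addr)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

#ifdef TCP_QUICKACK
    /* Linux only: ask the kernel not to ACK the SYN/ACK immediately.
     * Best effort, so the return value is deliberately ignored. */
    int zero = 0;
    setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &zero, sizeof(zero));
#endif

    /* Linger enabled with a zero timeout: close() aborts the connection
     * with a RST, so no TIME_WAIT socket is left on our side. */
    struct linger nolinger = { .l_onoff = 1, .l_linger = 0 };
    setsockopt(fd, SOL_SOCKET, SO_LINGER, &nolinger, sizeof(nolinger));

    int ret = connect(fd, (const struct sockaddr *)addr, sizeof(*addr));
    close(fd);
    return ret == 0 ? 0 : -1;
}
```

Whether the server application notices the connection still depends on
timing and on the kernel, which is why a trace would be the real proof.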

> Aside from their analysis, I wonder if it's possible to introduce a new
> option "check-nolinger". This would let users choose one or the other 
> behaviour.

I'd rather avoid this for the reasons which led us to this fix. Having
the client close first in TCP is always problematic and we don't want
to end up with thousands of TIME_WAIT sockets which cannot be reopened
for 2 minutes.

Best regards,
Willy

