Hi Richard,
On Mon, Jan 16, 2017 at 11:40:57PM +1300, Richard Gray wrote:
> Hi All,
>
> I'm using HAProxy 1.5.14 (the packaged version on CentOS 7.2) to front an
> IMAP proxy service, and I've noticed I'm getting quite a lot of connections
> in FIN_WAIT_2. For example, here are the totals for each state on my system
> right now:
>
> $ netstat -an | awk '/tcp/ {print $6}' | sort | uniq -c
> 255 CLOSE_WAIT
> 1 CLOSING
> 1802 ESTABLISHED
> 82 FIN_WAIT1
> 514 FIN_WAIT2
> 8 LAST_ACK
> 11 LISTEN
> 2 SYN_RECV
> 315 TIME_WAIT
>
> What seems to be happening is that the backend is closing the connection,
> leading HAProxy to close the connection to the client by sending a FIN. The
> client-side connection goes to FIN_WAIT_1, and then on receiving an ACK, to
> FIN_WAIT_2.
Absolutely.
> It appears though, that some clients are not sending a FIN in return,
> resulting in the FIN_WAIT_2 connection hanging around until it times
> out.
Yes, that's very common, especially in mobile environments, where a moving
terminal can lose signal, disconnect, then reconnect from another address,
leaving the previous connection in an undefined state.
> I notice here that the connection takes 35 minutes to time out once entering
> FIN_WAIT_2, which is the value I'm setting for 'timeout tunnel'.
This is very strange, because the kernel normally applies a much shorter
FIN_WAIT2 timeout to orphaned connections (tcp_fin_timeout, 60 seconds by
default), so maybe your sysctl is set that high?
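To check the value mentioned above, here is a minimal sketch that reads the sysctl through procfs. It assumes a Linux host where the setting is exposed at /proc/sys/net/ipv4/tcp_fin_timeout; the helper name read_fin_timeout is only illustrative.

```python
from pathlib import Path

# Standard procfs location of the sysctl on Linux (assumed available here).
FIN_TIMEOUT_PATH = "/proc/sys/net/ipv4/tcp_fin_timeout"


def read_fin_timeout(path=FIN_TIMEOUT_PATH):
    """Return the FIN_WAIT2 timeout in seconds, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not Linux, no permission, or unexpected file content.
        return None


if __name__ == "__main__":
    value = read_fin_timeout()
    if value is None:
        print("tcp_fin_timeout is not readable on this system")
    else:
        print(f"tcp_fin_timeout = {value}s")
```

If the value printed is close to the 35 minutes observed, the sysctl is the likely culprit; if it is the default, the long-lived FIN_WAIT2 sockets are probably not orphans yet.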
> I've tried
> setting 'timeout client-fin' to 30s to mitigate this issue, but it doesn't
> seem to have any effect. Can someone confirm whether timeout client-fin
> applies to FIN_WAIT_2, or if perhaps I'm not using the option correctly? I
> also wonder if the nolinger option might be effective in this case.
With an HTTP client it most often changes nothing: because the protocol is
asymmetric, haproxy knows that it can completely close the connection in
this case, so it doesn't wait for the client to close. It just performs its
close(), and the connection becomes an orphan that is handled by the system.
In your case it's different, since you're in TCP mode. After the client-fin
timeout, haproxy will actually close, but if nothing is left in the socket
buffers (which is the case in FIN_WAIT2), the connection state doesn't
change: it remains an orphan and the system keeps waiting for the client
to send its FIN. You may notice that even if you kill haproxy these
connections will still be there. The only case where haproxy can get the
system to kill a connection is when it still has something to send
(data or FIN), because with lingering disabled the system then destroys the
socket.
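To illustrate what "disabling lingering" means at the socket level, here is a small loopback sketch. It is a generic demonstration of SO_LINGER with a zero timeout, not haproxy's actual code, and the function names are made up for the example. With l_onoff=1 and l_linger=0, close() on an established connection aborts it (the peer sees a reset) instead of going through the normal FIN sequence.

```python
import socket
import struct


def close_with_reset(sock):
    """Close a TCP socket abortively by enabling SO_LINGER with a zero timeout."""
    # struct linger { int l_onoff; int l_linger; } on Linux and the BSDs.
    # (Windows uses two unsigned shorts, so this packing is an assumption.)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def observe_abortive_close():
    """Open a loopback connection, abort one side, report what the peer sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()
    peer.settimeout(5)  # avoid hanging forever if nothing arrives
    close_with_reset(client)
    try:
        data = peer.recv(1)
        # An empty read would mean an orderly close (FIN) was received.
        return "fin" if data == b"" else "data"
    except ConnectionResetError:
        # The connection was destroyed and a reset was sent to the peer.
        return "rst"
    finally:
        peer.close()
        listener.close()


if __name__ == "__main__":
    print(observe_abortive_close())
```

The point of the sketch is only that the socket is destroyed immediately on the closing side, so it never sits in FIN_WAIT1 or FIN_WAIT2 there.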
> Also, in case it is relevant, I should point out that I am using the 'usesrc
> clientip' option on my backend servers.
It will not have any impact here.
In your case I'd have a look at tcp_fin_timeout and possibly lower it, but
that's all. I wouldn't worry about this number of FIN_WAIT2 connections,
though I understand that at least the cause needs to be figured out and
possibly addressed.
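To keep an eye on the FIN_WAIT2 count while investigating, the netstat/awk tally quoted above can be sketched in Python. This assumes the usual Linux `netstat -an` column layout (state in the sixth field of tcp lines); count_tcp_states is a hypothetical helper name.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Tally TCP connection states from the text printed by `netstat -an`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected tcp line: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


if __name__ == "__main__":
    import subprocess

    # Requires the netstat binary (net-tools) to be installed.
    output = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    ).stdout
    for state, n in sorted(count_tcp_states(output).items()):
        print(f"{n:7d} {state}")
```

Running it periodically and watching whether FIN_WAIT2 keeps growing or stays flat should tell you whether this is a steady state or a leak.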
Regards,
Willy