Hi Richard,

On Thu, Jan 19, 2017 at 05:36:59PM +1300, Richard Gray wrote:
> $ netstat -an | grep FIN_WA | grep 122.56.198.232:26734
> tcp        0      0 10.24.20.104:993      122.56.198.232:26734    FIN_WAIT2
> $ echo "show sess" | sudo socat unix-connect:/var/lib/haproxy/stats stdio | grep 122.56.198.232:26734
> 0x7f906fdb7050: proto=tcpv4 src=122.56.198.232:26734 fe=hosting-imaps_proxy-external-vip be=hosting-imaps_proxy-external-vip srv=imap1 ts=08 age=25m42s calls=3 rq[f=848202h,i=0,an=00h,rx=14m45s,wx=,ax=] rp[f=04a020h,i=0,an=00h,rx=,wx=,ax=] s0=[7,0h,fd=6684,ex=] s1=[7,8h,fd=10039,ex=] exp=14m45s
Here rp.f=04a020h (CF_WROTE_DATA | CF_AUTO_CLOSE | CF_SHUTW | CF_SHUTR),
meaning the response channel was closed. Despite this, exp=14m45s, so it
seems to ignore timeout client-fin. Thus I suspect we have an issue with
the way this timeout is applied.

> > - your idea of a race is interesting. I'm wondering what happens if
> > the connection spends more than 60s still attached to haproxy and
> > only then becomes orphaned. It could be possible that the kernel
> > timeout only strikes when the socket age reaches the timeout while
> > already being orphaned.
>
> In my testing, I find that most of the time, 'timeout client-fin' is
> applied. HAProxy sends a FIN to the client, which acks the FIN, but does
> not send a FIN of its own. The HAProxy session moves into the client-fin
> timeout period of 30 seconds. After 30 seconds, the HAProxy session is
> removed. The FIN_WAIT2 connection state is then maintained by the kernel
> for tcp_fin_timeout of 60 seconds, before being removed altogether.

OK. So if there's a bug, it's only in haproxy.

> Aha! Testing just now, and I managed to catch one.

Great :-)

> To test, I open a connection to the HAProxy VIP with my python test
> client. After 3 minutes, an idle timeout on the backend server causes
> the connection to be closed. My client doesn't close its end though,
> leaving the client side connection in FIN_WAIT2.
>
> Here's the HAProxy session at 2m19s:
>
> $ echo "show sess" | sudo socat unix-connect:/var/lib/haproxy/stats stdio | grep 146.185.142.165
> 0x7f9075ba98c0: proto=tcpv4 src=146.185.142.165:56198 fe=hosting-imap_proxy-external-vip be=hosting-imap_proxy-external-vip srv=imap1 ts=08 age=2m19s calls=2 rq[f=808000h,i=0,an=00h,rx=32m41s,wx=,ax=] rp[f=048202h,i=0,an=00h,rx=32m41s,wx=,ax=] s0=[7,8h,fd=14490,ex=] s1=[7,8h,fd=14919,ex=] exp=32m41s
>
> And then again at 3m16s, after the backend has closed its side of the
> connection:
>
> $ echo "show sess" | sudo socat unix-connect:/var/lib/haproxy/stats stdio | grep 146.185.142.165
> 0x7f9075ba98c0: proto=tcpv4 src=146.185.142.165:56198 fe=hosting-imap_proxy-external-vip be=hosting-imap_proxy-external-vip srv=imap1 ts=08 age=3m16s calls=3 rq[f=808000h,i=0,an=00h,rx=31m44s,wx=,ax=] rp[f=04a020h,i=0,an=00h,rx=,wx=,ax=] s0=[7,0h,fd=14490,ex=] s1=[7,8h,fd=14919,ex=] exp=31m44s
> $ netstat -an | grep 146.185.142.165:56198
> tcp        0      0 10.24.20.104:143      146.185.142.165:56198    FIN_WAIT2
>
> Note that the exp value is 31m44s, not ~14s as we'd expect if timeout
> client-fin had been applied.

Yes definitely, the same issue as above.
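As an aside, for anyone else wanting to reproduce this: a client that parks
a connection in that state can be tiny. Below is a minimal sketch of what
such a client might look like (Richard's actual python client isn't shown
in this thread, so the code, host and port are my assumptions):

import socket
import time

# Placeholder address: the IMAP VIP from the captures above.
HOST, PORT = "10.24.20.104", 143

s = socket.create_connection((HOST, PORT))
print("greeting:", s.recv(4096))

# Sit idle until the backend's ~3 minute idle timeout closes the
# connection; recv() returning b'' means we received the FIN.
data = s.recv(4096)
if not data:
    print("remote side closed; deliberately NOT closing our end")

# Hold the socket open without ever sending our own FIN, leaving the
# proxy's client-side connection stuck in FIN_WAIT2.
time.sleep(3600)

While it sleeps, the proxy-side socket should show up in netstat as
FIN_WAIT2, exactly as in the captures above.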
> To compare, here's what usually happens. At age=2m59s, just before the
> server disconnection:
>
> $ echo "show sess" | sudo socat unix-connect:/var/lib/haproxy/stats stdio | grep 146.185.142.165
> 0x7f907f8e1ef0: proto=tcpv4 src=146.185.142.165:56202 fe=hosting-imap_proxy-external-vip be=hosting-imap_proxy-external-vip srv=imap1 ts=08 age=2m59s calls=2 rq[f=808000h,i=0,an=00h,rx=32m,wx=,ax=] rp[f=048202h,i=0,an=00h,rx=32m,wx=,ax=] s0=[7,8h,fd=20385,ex=] s1=[7,8h,fd=20440,ex=] exp=32m
>
> And then at 3m:
>
> $ echo "show sess" | sudo socat unix-connect:/var/lib/haproxy/stats stdio | grep 146.185.142.165
> 0x7f907f8e1ef0: proto=tcpv4 src=146.185.142.165:56202 fe=hosting-imap_proxy-external-vip be=hosting-imap_proxy-external-vip srv=imap1 ts=08 age=3m calls=3 rq[f=808000h,i=0,an=00h,rx=29s,wx=,ax=] rp[f=04a020h,i=0,an=00h,rx=,wx=,ax=] s0=[7,0h,fd=20385,ex=] s1=[7,8h,fd=20440,ex=] exp=29s
>
> Here you can see exp=29s, suggesting that the client-fin timeout has
> kicked in.

Yep. And the flags are exactly the same!

> I still don't know what causes the two different behaviours. It's quite
> difficult to trigger the abnormal behaviour in my testing (it only
> happens maybe 1 time in 10), but with a session rate of up to 80/s on
> the production service, it occurs frequently.

OK, but 1/10 is already very common, even for testing. My difficulty when
reproducing issues usually consists in turning some occurrences from
1/1 billion into 1/1 million so that they can be observed :-)

> I don't have enough data to back this up yet, but I wonder if network
> latency might be involved somehow. For most of today, testing from a
> machine with a 3ms RTT to the service, I was unable to reproduce the
> problem. Just now though, I've tested from a VM in Amsterdam with a
> 350ms RTT, and have been able to reproduce the problem 2 times in 4
> attempts. Anecdata at this point, but it would lend itself to the race
> condition theory.

OK, that's very useful; I can test with this in mind. I suspect something:
we probably apply client-fin at the moment where we decide to close, but
possibly, if the session is woken up for another event type later, it gets
ignored. Or maybe this happens when the client pushes more data to the
server just after it receives the shutdown, and there we wouldn't consider
client-fin. I'll have to check with this in mind.
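To make that suspicion a bit more concrete, here is a toy model of the two
wakeup behaviours, in plain Python rather than haproxy's actual C (the
names, values and rearming logic are all my assumptions, not the real
code):

import time

TIMEOUT_CLIENT = 35 * 60   # seconds; illustrative, consistent with the ~35m expirations above
TIMEOUT_CLIENT_FIN = 30    # seconds; the 30s client-fin from this setup

class ToySession:
    """Toy model of expiration rearming. NOT haproxy's code."""

    def __init__(self):
        self.shutw_sent = False                 # have we shut down towards the client?
        self.exp = time.time() + TIMEOUT_CLIENT

    def shutdown_to_client(self):
        # Normal path: when we decide to close, client-fin is armed.
        self.shutw_sent = True
        self.exp = time.time() + TIMEOUT_CLIENT_FIN

    def wakeup_buggy(self):
        # Suspected race: a later wakeup (e.g. the client pushing more
        # data just after receiving the shutdown) recomputes the
        # expiration from the plain client timeout and forgets client-fin.
        self.exp = time.time() + TIMEOUT_CLIENT

    def wakeup_expected(self):
        # What we'd want instead: keep honouring client-fin once armed.
        base = TIMEOUT_CLIENT_FIN if self.shutw_sent else TIMEOUT_CLIENT
        self.exp = time.time() + base

If something like the buggy variant is what actually happens, a wakeup
arriving just after the shutdown would explain why the session silently
falls back to the long client timeout, as in the exp=31m44s capture.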
Thanks a lot. Once I have some clues, I may ask you for more info or to
test some patches.

Willy
