Dear all,
This is Soukin Bae working on Samsung Elec. Mobile Division.
we have a problem with tcp closing.
in shortly,
1. on 4-way handshking to close session
2. if ack pkt is not arrived from opposite side
3. then session can't be closed forever
in mobile device, condition 2 can be happend in various case.
like as turn off wifi or mobile data. or bad condition of air network, etc..
this could be occur in both side of connection.
when issue happened during active closing, the session remained with FIN_WAIT1
state.
and at passive closing, remained with LAST_ACK state.
---------------------------------------------------------------------------------------------
below is test result after wifi on/off repetition (without mobile data).
maybe 'Foreign Address' sent the fin-ack when wifi-off state.
so device coun't recieve ack pkt further, and the session is remained
permanently.
and their count is growing up. this is resource leak.
### turn on wifi
D:\Test>adb shell netstat -npWae
Proto Recv-Q Send-Q Local Address Foreign
Address State User Inode PID/Program Name
tcp 0 0 127.0.0.1:5037 0.0.0.0:*
LISTEN 0 36357 6907/adbd
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:58660
2404:6800:4008:c00::bc:5228 ESTABLISHED 10041 74347
6523/com.google.android.gms.persistent
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148
2404:6800:4004:800::2003:80 LAST_ACK 0 0 -
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:37512
64:ff9b::3444:f3dc:443 ESTABLISHED 10137 77447
9522/com.samsung.android.game.gos
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294
2404:6800:4005:80c::2004:443 LAST_ACK 0 0 -
tcp6 1 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260
64:ff9b::34d0:9421:80 LAST_ACK 0 0 -
### turn off wifi
D:\Test>adb shell netstat -npWae
Proto Recv-Q Send-Q Local Address Foreign
Address State User Inode PID/Program Name
tcp 0 0 127.0.0.1:5037 0.0.0.0:*
LISTEN 0 36357 6907/adbd
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35148
2404:6800:4004:800::2003:80 LAST_ACK 0 0 -
tcp6 0 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:49294
2404:6800:4005:80c::2004:443 LAST_ACK 0 0 -
tcp6 1 0 2001:2d8:ed1c:de1c:bd94:fe5:2d9a:d8e4:35260
64:ff9b::34d0:9421:80 LAST_ACK 0 0 -
---------------------------------------------------------------------------------------------
this is our analysis
when app finished using the socket(tcp session), it calls sock_close.
then tcp_close() makes sk->sk_state to LAST_ACK, and sock to SOCK_DEAD by
excute sock_orphan().
11-23 11:40:55.676 [5: Thread-44:11210] TCP: bsj: tcp_set_state: TCP
sk=ffffffc8a789c640, in:80092, State Close Wait -> Last ACK,
[2404:6800:4004:800::2003]
11-23 11:40:55.676 [5: Thread-44:11210] Call trace:
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8008ffdda8>]
tcp_set_state+0x1b8/0x1f0
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8008ffe3f8>]
tcp_close+0x484/0x534
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff800902f830>]
inet_release+0x60/0x74
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8009074238>]
inet6_release+0x30/0x48
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8008f35e4c>]
__sock_release+0x40/0x104
11-23 11:40:55.676 [5: Thread-44:11210] [<ffffff8008f3bab0>]
sock_close+0x18/0x28
11-23 11:40:55.678 [5: Thread-44:11210] TCP: bsj: sock_orphan: TCP
sk=ffffffc8a789c640, in:80092, State Last ACK, [2404:6800:4004:800::2003]
at this point, if the FIN_ACK comes, there's no problem. all is well~
but without that and when turn off wifi,
netd trying to close all the session by calling tcp_abort, sock_diag_destory.
11-23 11:41:38.463 [4: netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!!
: TCP sk=ffffffc8a789c640, in:0, State Last ACK, caller: <sock_diag_destroy>,
[2404:6800:4004:800::2003]
11-23 11:41:38.464 [4: netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!!
: TCP sk=ffffffc8a789b840, in:0, State Last ACK, caller: <sock_diag_destroy>,
[2404:6800:4005:80c::2004]
11-23 11:41:38.464 [4: netd: 5323] TCP: bsj: tcp_abort: SOCK_DEAD!!!
: TCP sk=ffffffc8a7899c40, in:0, State Last ACK, caller: <sock_diag_destroy>,
[64:ff9b::34d0:9421]
but because of this sock was already changed to SOCK_DEAT state by tcp_close(),
tcp_done() can't be excuted.
so this session can't be closed.
int tcp_abort(struct sock *sk, int err)
{
...
if (!sock_flag(sk, SOCK_DEAD)) { //// when SOCK_DEAD, tcp_done()
be skip.
...
sk->sk_error_report(sk);
if (tcp_need_reset(sk->sk_state))
tcp_send_active_reset(sk, GFP_ATOMIC);
tcp_done(sk);
}
...
return 0;
}
---------------------------------------------------------------------------------------------
wh thought that just sk_error_report have to reside in that condition,
SOCK_DEAD.
and send-reset and tcp_done should to be always.
we fixed it like as below, and confirmed that issue resolved.
please check this.
Best regards,
>From 30f8d02f9d2ca8820a527444260e6dcf4862db8a Mon Sep 17 00:00:00 2001
From: soukjin bae <[email protected]>
Date: Fri, 23 Nov 2018 15:56:53 +0900
Subject: [PATCH] net: close session always when tcp_abort
session before recieve the FIN_ACK couldn't be closed by tcp_abort
they will remained permanently. close them
Signed-off-by: soukjin bae <[email protected]>
Signed-off-by: geumhwan you <[email protected]>
---
net/ipv4/tcp.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9e6bc4d6daa7..faf4a8bbec8e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3792,11 +3792,12 @@ int tcp_abort(struct sock *sk, int err)
/* This barrier is coupled with smp_rmb() in tcp_poll() */
smp_wmb();
sk->sk_error_report(sk);
- if (tcp_need_reset(sk->sk_state))
- tcp_send_active_reset(sk, GFP_ATOMIC);
- tcp_done(sk);
}
+ if (tcp_need_reset(sk->sk_state))
+ tcp_send_active_reset(sk, GFP_ATOMIC);
+ tcp_done(sk);
+
bh_unlock_sock(sk);
local_bh_enable();
tcp_write_queue_purge(sk);
--
2.13.0