Clearing the sock TX queue in sk_set_socket() might cause unexpected
out-of-order transmit when called from sock_orphan(), as outstanding
packets can pick a different TX queue and bypass the ones already queued.
This is undesired in general. More specifically, it breaks the in-order
scheduling property guarantee for device-offloaded TLS sockets.
Remove the call to sk_tx_queue_clear() in sk_set_socket(), and add it
explicitly only where needed.
Fixes: e022f0b4a03f ("net: Introduce sk_tx_queue_mapping")
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Boris Pismenny <[email protected]>
---
include/net/sock.h | 1 -
net/core/sock.c | 2 ++
2 files changed, 2 insertions(+), 1 deletion(-)
Please queue to -stable.
Changes v1 -> v2:
Per Eric's suggestion, took sk_tx_queue_clear() out of sk_set_socket()
and called it explicitly where needed.
diff --git a/include/net/sock.h b/include/net/sock.h
index c53cc42b5ab9..3428619faae4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1848,7 +1848,6 @@ static inline int sk_rx_queue_get(const struct sock *sk)
static inline void sk_set_socket(struct sock *sk, struct socket *sock)
{
- sk_tx_queue_clear(sk);
sk->sk_socket = sock;
}
diff --git a/net/core/sock.c b/net/core/sock.c
index 94391da27754..d832c650287c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1767,6 +1767,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t
priority,
cgroup_sk_alloc(&sk->sk_cgrp_data);
sock_update_classid(&sk->sk_cgrp_data);
sock_update_netprioidx(&sk->sk_cgrp_data);
+ sk_tx_queue_clear(sk);
}
return sk;
@@ -1990,6 +1991,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const
gfp_t priority)
*/
sk_refcnt_debug_inc(newsk);
sk_set_socket(newsk, NULL);
+ sk_tx_queue_clear(newsk);
RCU_INIT_POINTER(newsk->sk_wq, NULL);
if (newsk->sk_prot->sockets_allocated)
--
1.8.3.1