Public bug reported:

[Impact]

MPTCP users who enable SO_KEEPALIVE after creating a socket but before
connecting it reported that their connections did not get keepalives
enabled.  This led to applications that maintain a pool of fresh
connections occasionally experiencing resets when their remote side
crashed or had the connection terminated by a proxy, which caused
additional latency and intermittent errors on the fraction of MPTCP
connections experiencing this type of disconnect.

To the reporting users, the problem appeared more nebulous because some
of the MPTCP subflow connection methods enabled keepalives successfully
via a different path.  Specifically, this issue only impacts connects
and fallbacks to TCP.  Joins are handled correctly and do not suffer
from this bug.
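
For illustration, the ordering that triggers the problem can be sketched
from userspace as below.  This is a minimal sketch, not the reporter's
actual client: the helper name open_keepalive_socket is hypothetical, and
the fallback constant 262 is the Linux value of IPPROTO_MPTCP, used only
when the Python socket module does not export it.

```python
import socket

# IPPROTO_MPTCP is 262 on Linux; newer Python versions export it directly.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_keepalive_socket(use_mptcp=True):
    """Create a stream socket and enable SO_KEEPALIVE *before* connecting.

    Hypothetical helper: setting the option on a not-yet-connected MPTCP
    socket is the ordering described in this report.
    """
    proto = IPPROTO_MPTCP if use_mptcp else socket.IPPROTO_TCP
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
    except OSError:
        # MPTCP may be unavailable on this system; use plain TCP instead.
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM,
                             socket.IPPROTO_TCP)
    # The option is set while the socket is still unconnected.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock
```

A caller would then connect the returned socket, for example with
sock.connect(("172.0.0.2", 12345)) to match the addresses in the test
plan, and check with ss whether a keepalive timer is shown.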

The problem is due to one particular code path forgetting to apply the
SOCK_KEEPOPEN flag to MPTCP subflows.  For connects (and fallback to
TCP), sync_socket_options is invoked while the socket is still closed.  The
tcp_set_keepalive function, which is the callback for sk_prot->keepalive
in this situation, ignores sockets that are closed or listening.
The other code that calls tcp_set_keepalive deals with this case by
setting SOCK_KEEPOPEN, which tcp_finish_connect uses to enable the
keepalives after connect.  However, since this path does not set that
flag, tcp_finish_connect does not enable keepalives on MPTCP sockets
that were not yet established when SO_KEEPALIVE was set.

It should be noted that a different function is invoked if the user sets
the keepalive after the connection is established, and in that case the
function iterates over all known subflows and correctly enables
keepalives.
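
The interaction described above can be illustrated with a toy model.
This is a simplified sketch and not kernel code: the function names are
borrowed from this report, but their bodies, the Subflow class, and the
propagate_flag parameter are assumptions that keep only the state check
and the flag handling.

```python
from dataclasses import dataclass

TCP_CLOSE, TCP_LISTEN, TCP_ESTABLISHED = "CLOSE", "LISTEN", "ESTABLISHED"


@dataclass
class Subflow:
    state: str = TCP_CLOSE
    keepopen: bool = False     # stands in for the SOCK_KEEPOPEN flag
    timer_armed: bool = False  # stands in for the keepalive timer


def tcp_set_keepalive(sk, val):
    """Model: closed or listening sockets are ignored."""
    if sk.state in (TCP_CLOSE, TCP_LISTEN):
        return
    sk.timer_armed = bool(val)


def tcp_finish_connect(sk):
    """Model: after connect, arm the timer only if the flag is set."""
    sk.state = TCP_ESTABLISHED
    if sk.keepopen:
        sk.timer_armed = True


def sync_socket_options(sk, msk_keepopen, propagate_flag):
    """Model of the sync path; propagate_flag=True mimics the fix."""
    tcp_set_keepalive(sk, msk_keepopen)  # no effect while sk is closed
    if propagate_flag:
        sk.keepopen = msk_keepopen


def connect_with_keepalive(propagate_flag):
    """Set keepalive before connect, then report whether the timer runs."""
    sk = Subflow()
    sync_socket_options(sk, True, propagate_flag)
    tcp_finish_connect(sk)
    return sk.timer_armed
```

In this model, connect_with_keepalive(False) leaves the timer unarmed,
which corresponds to the reported behavior, while
connect_with_keepalive(True) arms it at connect time.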

[Fix]

Cherry-pick upstream commit:

  648de37416b3 ("mptcp: sockopt: make sync_socket_options propagate
                SOCK_KEEPOPEN")

This patch applied cleanly to the Jammy, Noble, and Plucky git repos.

This commit has been picked up by stable 6.16, 6.12, 6.6, and 6.1 and is
in the queue for 5.15.

[Test Plan]

Applied and built the patch against all 3 repositories.  Additionally
tested by creating an MPTCP connection that falls back to TCP and then
validated with ss and tcpdump that keepalives were being sent after the
fix but not before.

before:

MPTCP connection via ss showing no keepalive set, despite SO_KEEPALIVE

  ESTAB 0      0      172.0.0.3:56960  172.0.0.2:12345 
users:(("client",pid=19349,fd=3)) uid:1000 ino:87093 sk:2 
cgroup:/user.slice/user-1000.slice/session-526.scope <->
           ts sack cubic wscale:7,7 rto:202 rtt:1.119/0.559 mss:8949 pmtu:9001 
rcvmss:536 advmss:8949 cwnd:10 bytes_acked:1 segs_out:2 segs_in:1 send 
639785523bps lastsnd:25714 lastrcv:25714 lastack:25714 pacing_rate 
1279571040bps delivered:1 app_limited rcv_space:56587 rcv_ssthresh:56587 
minrtt:1.119 snd_wnd:62643 tcp-ulp-mptcp flags:Mmc 
token:0000(id:0)/34af5eec(id:0) seq:0 sfseq:1 ssnoff:53d1efd8 maplen:0
  
Tcpdump shows no keepalives.

after:

ss shows keepalive timer for connection:

  ESTAB 0      0      172.0.0.3:56076  172.0.0.2:12345 
users:(("client",pid=980,fd=3)) timer:(keepalive,1.192ms,0) uid:1000 ino:6688 
sk:4 cgroup:/user.slice/user-1000.slice/session-4.scope <->
           ts sack cubic wscale:7,7 rto:202 rtt:1.12/0.56 mss:8949 pmtu:9001 
rcvmss:536 advmss:8949 cwnd:10 bytes_acked:1 segs_out:5 segs_in:4 send 
639214286bps lastsnd:16255 lastrcv:16255 lastack:807 pacing_rate 1278428568bps 
delivered:1 app_limited rcv_space:56587 rcv_ssthresh:56587 minrtt:1.12 
snd_wnd:62720 tcp-ulp-mptcp flags:Mmec token:0000(id:0)/c8bda35f(id:0) seq:0 
sfseq:1 ssnoff:f375a4ba maplen:0
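
The visible difference between the two ss records above is the
timer:(keepalive,...) field.  A check for it could be scripted roughly as
follows; the helper name has_keepalive_timer is hypothetical, and the
sketch assumes ss was run with timer information enabled (the -o option),
since the exact invocation is not given in this report.

```python
import re

# Matches the timer field as printed by ss, e.g. "timer:(keepalive,1.192ms,0)".
_KEEPALIVE_TIMER = re.compile(r"timer:\(keepalive,[^)]*\)")


def has_keepalive_timer(ss_record: str) -> bool:
    """Return True if one ss socket record shows a keepalive timer."""
    return _KEEPALIVE_TIMER.search(ss_record) is not None
```

Applied to the "before" record this returns False, and applied to the
"after" record it returns True.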
  
tcpdump shows keepalives:

  00:10:14.768246 ens5  Out IP 172.0.0.3.56076 > 172.0.0.2.12345: Flags [.], 
ack 1, win 491, options [nop,nop,TS val 644045863 ecr 4019247204,mptcp 8 dss 
ack 2595666209], length 0
  00:10:14.769235 ens5  In  IP 172.0.0.2.12345 > 172.0.0.3.56076: Flags [.], 
ack 1, win 490, options [nop,nop,TS val 4019252260 ecr 644030416,mptcp 8 dss 
ack 1822061057], length 0
  00:10:19.824246 ens5  Out IP 172.0.0.3.56076 > 172.0.0.2.12345: Flags [.], 
ack 1, win 491, options [nop,nop,TS val 644050919 ecr 4019252260,mptcp 8 dss 
ack 2595666209], length 0
  00:10:19.825239 ens5  In  IP 172.0.0.2.12345 > 172.0.0.3.56076: Flags [.], 
ack 1, win 490, options [nop,nop,TS val 4019257316 ecr 644030416,mptcp 8 dss 
ack 1822061057], length 0
  <...>

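The spacing of the probes in the capture above can be computed from the
tcpdump timestamps.  This is a small sketch with a hypothetical helper
name; it assumes all timestamps fall on the same day, so it does not
handle a capture that crosses midnight.

```python
from datetime import datetime


def probe_intervals(timestamps):
    """Return seconds between consecutive tcpdump timestamps (HH:MM:SS.ffffff)."""
    times = [datetime.strptime(ts, "%H:%M:%S.%f") for ts in timestamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Timestamps of the two outbound probes shown in the capture.
out_probes = ["00:10:14.768246", "00:10:19.824246"]
print(probe_intervals(out_probes))
```

For the two outbound packets shown, the interval is roughly five
seconds, which matches the difference between their TS val fields.
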
[Where problems could occur]

The fix is small and uses existing functions to sync flags to subflows
in an approach that's used by the surrounding code.  The potential for
regression seems generally low; however, since the bug has been present
since the introduction of SO_KEEPALIVE support for MPTCP in 5.12,
there's a chance that some user somewhere may have ended up relying on
this non-obvious behavior.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: patch patch-accepted-upstream

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2125444

Title:
  ensure mptcp keepalives are honored when set

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2125444/+subscriptions
