This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
tcp: rehash onto different local ECMP path on retransmit timeout

This commit updates ECMP path selection logic to use the socket's txhash
on RTO, PLB, and spurious retransmissions by invalidating the cached dst
and setting fl6->mp_hash for IPv6. It also moves txhash initialization
earlier in tcp_conn_request().

> diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
> index 37534e1168992..fc4b75de6af8e 100644
> --- a/net/ipv6/inet6_connection_sock.c
> +++ b/net/ipv6/inet6_connection_sock.c

[ ... ]

> @@ -70,6 +73,9 @@ struct dst_entry *inet6_csk_route_socket(struct sock *sk,
>       fl6->saddr = np->saddr;
>       fl6->flowlabel = np->flow_label;
>       IP6_ECN_flow_xmit(sk, fl6->flowlabel);
> +
> +     /* >> 1 for 31-bit mp_hash range matching nhc_upper_bound. */
> +     fl6->mp_hash = sk->sk_txhash >> 1;
>       fl6->flowi6_oif = sk->sk_bound_dev_if;
>       fl6->flowi6_mark = sk->sk_mark;
>       fl6->fl6_sport = inet->inet_sport;

Does this leave actively opened IPv6 connections and syncookie-created
sockets exposed to a spontaneous ECMP path change when the cached route is
invalidated for reasons unrelated to RTO or PLB?

The patch updates inet6_sk_rebuild_header() and inet6_csk_route_socket() to
use sk_txhash for ECMP path selection, but it appears to miss two critical
paths:

1. tcp_v6_connect() (active open): fl6.mp_hash is 0 during
   ip6_dst_lookup_flow(). sk_set_txhash(sk) is called only after the route
   is cached.

2. cookie_v6_check() (IPv6 syncookies): fl6.mp_hash is 0 during
   ip6_dst_lookup_flow(), despite treq->txhash having been generated in
   cookie_tcp_reqsk_alloc().

In both cases, the initial dst is selected via the standard 5-tuple hash,
but the socket is assigned a random txhash. If the route is later naturally
invalidated (e.g., via PMTU discovery), the rebuild functions will apply
sk_txhash >> 1. Because this new hash is uncorrelated with the 5-tuple, the
connection could unpredictably flap to a different ECMP path, potentially
breaking connections passing through stateful firewalls/NATs and causing
severe packet reordering.
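
To make the concern concrete, here is a minimal userspace sketch, not
kernel code. It assumes two equal-weight nexthops whose 31-bit upper
bounds are 0x3FFFFFFF and 0x7FFFFFFF, and it takes the flow-derived hash
used for the initial lookup as an input rather than computing it. The
helper names select_path(), mp_hash_from_txhash() and
path_changes_on_rebuild() are hypothetical and exist only for this
illustration.

```c
#include <stdint.h>

/*
 * Toy model of hash-threshold nexthop selection: return the index of the
 * first nexthop whose upper bound is >= the 31-bit hash. The bounds are
 * assumed to be sorted in ascending order.
 */
int select_path(uint32_t hash31, const uint32_t *upper_bound, int n)
{
	for (int i = 0; i < n; i++) {
		if (hash31 <= upper_bound[i])
			return i;
	}
	return n - 1; /* fall back to the last nexthop */
}

/* Mirrors the patch: drop one bit so the hash fits the 31-bit range. */
uint32_t mp_hash_from_txhash(uint32_t txhash)
{
	return txhash >> 1;
}

/*
 * Returns 1 if a route rebuild that switches from the flow-derived hash
 * (initial lookup, mp_hash left at 0) to sk_txhash >> 1 would land on a
 * different nexthop, 0 otherwise.
 */
int path_changes_on_rebuild(uint32_t flow_hash31, uint32_t txhash,
			    const uint32_t *upper_bound, int n)
{
	int before = select_path(flow_hash31, upper_bound, n);
	int after = select_path(mp_hash_from_txhash(txhash), upper_bound, n);

	return before != after;
}
```

With the assumed bounds, a flow hash of 0x10000000 selects nexthop 0,
while a txhash of 0x80000000 maps to 0x40000000 and selects nexthop 1, so
a rebuild would move the connection even though no RTO or PLB event
occurred. Because the two hashes are uncorrelated, roughly half of such
connections would be affected in this two-path model.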
-- 
pw-bot: cr
