When an ICMP Fragmentation Needed error is received for a tunneled IPVS
connection, ip_vs_in_icmp() recomputes the MTU that the original packet
can use by subtracting the tunnel overhead from the reported next-hop
MTU.

The current code always subtracts sizeof(struct iphdr), which is only
the IPIP overhead. For GUE and GRE tunnels, ipvs_udp_decap() and
ipvs_gre_decap() already compute the additional tunnel header length,
but that value is scoped to the decapsulation block and is lost before
the ICMP_FRAG_NEEDED handling. As a result, the ICMP error sent back to
the client advertises an MTU that is too large, so PMTUD can fail to
converge for GUE/GRE-tunneled real servers.

For example, with a reported next-hop MTU of 1400, a GUE tunnel
currently returns 1380 to the client. The correct value is 1368:

  1400 - sizeof(struct iphdr) - sizeof(struct udphdr) -
  sizeof(struct guehdr) = 1400 - 20 - 8 - 4 = 1368

Hoist the tunnel header length into the main ip_vs_in_icmp() scope and
subtract sizeof(struct iphdr) + ulen in the Fragmentation Needed path.
The IPIP path keeps ulen as 0, so its existing 1400 - 20 = 1380 result
is unchanged.
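
The arithmetic can be checked in user space with a minimal sketch. It
is not the kernel code: the helper name tunnel_adjusted_mtu() and the
constants are hypothetical, and the header sizes assume no IP options
and no optional GUE fields.

```c
#include <assert.h>

/* Illustrative constants, not the kernel's definitions: fixed header
 * sizes without IP options or optional GUE fields. */
enum {
	IPV4_HDR_LEN = 20,	/* sizeof(struct iphdr) */
	UDP_HDR_LEN  = 8,	/* sizeof(struct udphdr) */
	GUE_HDR_LEN  = 4,	/* sizeof(struct guehdr) */
	MIN_IPV4_MTU = 68,	/* floor used by the check in the patch */
};

/* Hypothetical helper mirroring the patched check: subtract the outer
 * IPv4 header plus ulen bytes of tunnel header, but only when the
 * reported MTU is above the 68-byte floor plus that overhead. */
unsigned int tunnel_adjusted_mtu(unsigned int mtu, unsigned int ulen)
{
	unsigned int overhead = IPV4_HDR_LEN + ulen;

	if (mtu > MIN_IPV4_MTU + overhead)
		mtu -= overhead;
	return mtu;
}

/* Worked values from the description: IPIP (ulen = 0) and GUE. */
void tunnel_adjusted_mtu_examples(void)
{
	assert(tunnel_adjusted_mtu(1400, 0) == 1380);
	assert(tunnel_adjusted_mtu(1400, UDP_HDR_LEN + GUE_HDR_LEN) == 1368);
	assert(tunnel_adjusted_mtu(88, 0) == 88);
}
```

A reported MTU at or below 68 + overhead is passed through unchanged,
which matches the guard in the Fragmentation Needed path.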

Fixes: 508f744c0de3 ("ipvs: strip udp tunnel headers from icmp errors")
Cc: [email protected]
Reported-by: Yizhou Zhao <[email protected]>
Reported-by: Yuxiang Yang <[email protected]>
Reported-by: Ao Wang <[email protected]>
Reported-by: Xuewei Feng <[email protected]>
Reported-by: Qi Li <[email protected]>
Reported-by: Ke Xu <[email protected]>
Assisted-by: Claude Code:GLM-5.2
Signed-off-by: Yizhou Zhao <[email protected]>
---
 net/netfilter/ipvs/ip_vs_core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index d40b404c1bf62..74c5bd8b5f48 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1765,8 +1765,9 @@ ip_vs_in_icmp(struct netns_ipvs *ipvs, struct sk_buff *skb, int *related,
        struct ip_vs_proto_data *pd;
        unsigned int offset, offset2, ihl, verdict;
        bool tunnel, new_cp = false;
        union nf_inet_addr *raddr;
        char *outer_proto = "IPIP";
+       int ulen = 0;
 
        *related = 1;
 
@@ -1831,7 +1832,6 @@ ip_vs_in_icmp(struct netns_ipvs *ipvs, struct sk_buff *skb, int *related,
                   /* Error for our tunnel must arrive at LOCAL_IN */
                   (skb_rtable(skb)->rt_flags & RTCF_LOCAL)) {
                __u8 iproto;
-               int ulen;
 
                /* Non-first fragment has no UDP/GRE header */
                if (unlikely(cih->frag_off & htons(IP_OFFSET)))
@@ -1936,8 +1936,8 @@ ip_vs_in_icmp(struct netns_ipvs *ipvs, struct sk_buff *skb, int *related,
                                if (dest_dst)
                                        mtu = dst_mtu(dest_dst->dst_cache);
                        }
-                       if (mtu > 68 + sizeof(struct iphdr))
-                               mtu -= sizeof(struct iphdr);
+                       if (mtu > 68 + sizeof(struct iphdr) + ulen)
+                               mtu -= sizeof(struct iphdr) + ulen;
                        info = htonl(mtu);
                }
                /* Strip outer IP, ICMP and IPIP/UDP/GRE, go to IP header of

