Currently, IPVS skips MTU checks for GSO packets by excluding them with
the !skb_is_gso(skb) condition. This creates problems when IPVS tunnel
mode encapsulates GSO packets with IPIP headers.

The issue manifests in two ways:

1. MTU violation after encapsulation:
   When a GSO packet passes through IPVS tunnel mode, the original MTU
   check is bypassed. After adding the IPIP tunnel header, the packet
   size may exceed the outgoing interface MTU, leading to unexpected
   fragmentation at the IP layer.

2. Fragmentation with problematic IP IDs:
   When net.ipv4.vs.pmtu_disc=1 and a GSO packet with multiple segments
   is fragmented after encapsulation, each segment gets a sequentially
   incremented IP ID (0, 1, 2, ...). This happens because:

   a) The GSO packet bypasses MTU check and gets encapsulated
   b) At __ip_finish_output, the oversized GSO packet is split into
      separate SKBs (one per segment), with IP IDs incrementing
   c) Each SKB is then fragmented again based on the actual MTU

   This sequential IP ID allocation differs from the expected behavior
   and can cause issues with fragment reassembly and packet tracking.
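
The size arithmetic behind the two cases above can be sketched in plain
userspace C. This is only an illustration, not kernel code: encap_len()
and ipv4_frag_count() are hypothetical helpers, and the sketch assumes a
20-byte outer IPv4 header without options and an MTU larger than that
header.

```c
#define IPV4_HDR_LEN 20u	/* assumption: IPv4 header without options */

/* Length of one segment after IPIP adds an outer IPv4 header. */
static unsigned int encap_len(unsigned int inner_len)
{
	return inner_len + IPV4_HDR_LEN;
}

/*
 * Number of IPv4 fragments needed to send pkt_len bytes (header
 * included) over a link with the given MTU. Every fragment except the
 * last must carry a payload that is a multiple of 8 bytes.
 */
static unsigned int ipv4_frag_count(unsigned int pkt_len, unsigned int mtu)
{
	unsigned int payload = pkt_len - IPV4_HDR_LEN;
	unsigned int per_frag = (mtu - IPV4_HDR_LEN) & ~7u;

	if (pkt_len <= mtu)
		return 1;	/* fits as is, no fragmentation */
	return (payload + per_frag - 1) / per_frag;
}
```

With these assumptions, a 1500-byte segment grows to 1520 bytes after
encapsulation and needs two fragments on a 1500-byte MTU link, while a
1480-byte segment still fits in one packet.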

Fix this by validating GSO packets with skb_gso_validate_network_len(),
which checks whether each segment will fit within the MTU once the
packet is segmented. If validation fails, send an ICMP Fragmentation
Needed message (ICMPv6 Packet Too Big for IPv6) so that PMTU discovery
can work.
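
The change to the "too big" condition can be modelled in userspace as
below. This is a simplified sketch, not the kernel implementation:
struct pkt_model and the model_*() helpers are hypothetical stand-ins,
and model_gso_fits() only approximates skb_gso_validate_network_len()
by comparing per-segment header plus payload length against the MTU.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the sk_buff fields the check reads. */
struct pkt_model {
	unsigned int len;	/* total length, like skb->len */
	unsigned int gso_size;	/* payload bytes per segment, 0 if not GSO */
	unsigned int hdr_len;	/* network + transport header bytes */
};

static bool model_is_gso(const struct pkt_model *p)
{
	return p->gso_size != 0;
}

/* Approximation: every segment, headers included, fits in the MTU. */
static bool model_gso_fits(const struct pkt_model *p, unsigned int mtu)
{
	return p->hdr_len + p->gso_size <= mtu;
}

/* Old condition: GSO packets are never reported as too big. */
static bool model_toobig_old(const struct pkt_model *p, unsigned int mtu)
{
	return p->len > mtu && !model_is_gso(p);
}

/* New condition: GSO packets pass only if their segments fit. */
static bool model_toobig_new(const struct pkt_model *p, unsigned int mtu)
{
	return p->len > mtu &&
	       !(model_is_gso(p) && model_gso_fits(p, mtu));
}
```

For example, with an MTU of 1480 (1500 minus an assumed 20-byte outer
header), a GSO packet whose segments are 40 + 1460 = 1500 bytes is
accepted by the old condition but rejected by the new one, whereas a
GSO packet with 40 + 1440 = 1480-byte segments passes both.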

Fixes: 4cdd34084d53 ("netfilter: nf_conntrack_ipv6: improve fragmentation handling")
Signed-off-by: Yingnan Zhang <[email protected]>
---
v3:
- Fixed compilation error (removed extra closing brace in IPv6 function)
- Fixed indentation to match kernel style

v2: https://lore.kernel.org/netdev/[email protected]/
v1: https://lore.kernel.org/netdev/[email protected]/
---
 net/netfilter/ipvs/ip_vs_xmit.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 3601eb86d..a4ca7cad0 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -111,8 +111,8 @@ __mtu_check_toobig_v6(const struct sk_buff *skb, u32 mtu)
                 */
                if (IP6CB(skb)->frag_max_size > mtu)
                        return true; /* largest fragment violate MTU */
-       }
-       else if (skb->len > mtu && !skb_is_gso(skb)) {
+       } else if (skb->len > mtu &&
+                  !(skb_is_gso(skb) && skb_gso_validate_network_len(skb, mtu))) {
                return true; /* Packet size violate MTU size */
        }
        return false;
@@ -232,8 +232,9 @@ static inline bool ensure_mtu_is_adequate(struct netns_ipvs *ipvs, int skb_af,
                        return true;
 
                if (unlikely(ip_hdr(skb)->frag_off & htons(IP_DF) &&
-                            skb->len > mtu && !skb_is_gso(skb) &&
-                            !ip_vs_iph_icmp(ipvsh))) {
+                            skb->len > mtu && !ip_vs_iph_icmp(ipvsh) &&
+                            !(skb_is_gso(skb) &&
+                              skb_gso_validate_network_len(skb, mtu)))) {
                        icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                                  htonl(mtu));
                        IP_VS_DBG(1, "frag needed for %pI4\n",
-- 
2.51.0

