>Number:         146628
>Category:       kern
>Synopsis:       [patch] TCP does not clear DF when MTU is below a threshold
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun May 16 10:00:14 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Matthew Luckie
>Release:        FreeBSD 8.0-STABLE i386
>Organization:
>Environment:
System: FreeBSD mylar.luckie.org.nz 8.0-STABLE FreeBSD 8.0-STABLE #3: Sun May 16 21:31:15 NZST 2010 [email protected]:/usr/src/sys/i386/compile/mylar i386

>Description:

FreeBSD, like most operating systems, will not lower a TCP connection's
segment size below a threshold in response to an ICMP needfrag.  In
FreeBSD's case, this is 512 bytes.  If a needfrag with a next-hop MTU
of 256 is received, FreeBSD reduces the connection's segment size to
512 bytes and then resends the presumed missing packet, but without
first clearing the DF bit.  If the path MTU is in fact less than 512
bytes, the resent packet is dropped again and FreeBSD gets another
needfrag, which it ignores.  The patch below causes subsequent segments
to be sent without the DF bit set, and does not change FreeBSD's
default behaviour of refusing to reduce its segment size below a
defined threshold.
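
The intended decision can be sketched in isolation as below.  This is
an illustrative userland model, not kernel code: the function name, the
struct, and the SKETCH_* constants are hypothetical, and the header
length and minmss values are assumptions standing in for
sizeof(struct tcpiphdr) and the tcp_minmss sysctl.  Only the 296 floor
and the 512-byte default come from this report and its patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only. */
#define SKETCH_TCPIP_HDRLEN 40   /* assumed IPv4 + TCP header size, no options */
#define SKETCH_MINMSS       216  /* assumed value of the minmss tunable */
#define SKETCH_MSSDFLT      512  /* default segment size quoted in the report */
#define SKETCH_MTU_FLOOR    296  /* literal floor used in the patched code */

struct pmtu_decision {
	int  mtu;     /* MTU the connection will use from now on */
	bool set_df;  /* whether later segments still carry DF */
};

/*
 * Model of the patched needfrag handling: if the reported next-hop MTU
 * is below the floor, fall back to the default MTU and stop setting DF
 * so that routers may fragment; otherwise accept the reported MTU and
 * leave the DF setting unchanged.
 */
static struct pmtu_decision
pmtu_on_needfrag(int reported_mtu, bool df_currently_set)
{
	struct pmtu_decision d;
	int floor = SKETCH_MINMSS + SKETCH_TCPIP_HDRLEN;

	assert(reported_mtu >= 0);
	if (floor < SKETCH_MTU_FLOOR)
		floor = SKETCH_MTU_FLOOR;

	d.mtu = reported_mtu;
	d.set_df = df_currently_set;
	if (reported_mtu < floor) {
		/* Too small to honour: keep the default size, clear DF. */
		d.mtu = SKETCH_MSSDFLT + SKETCH_TCPIP_HDRLEN;
		d.set_df = false;
	}
	return d;
}
```

Under these assumptions, a needfrag reporting MTU 256 leaves the
connection at a 552-byte MTU (512-byte segments) with DF cleared, while
a needfrag reporting MTU 1400 is honoured and DF stays set.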

>How-To-Repeat:

Install net/scamper, then run:

scamper -F ipfw -I "tbit -M 256 -u '<url on webserver>' -i <ip address>"

>Fix:

--- patch-pmtud begins here ---
--- tcp_var.h.orig      2009-08-03 20:13:06.000000000 +1200
+++ tcp_var.h   2010-05-14 21:03:42.000000000 +1200
@@ -234,6 +234,7 @@
 #define        TF_ECN_PERMIT   0x4000000       /* connection ECN-ready */
 #define        TF_ECN_SND_CWR  0x8000000       /* ECN CWR in queue */
 #define        TF_ECN_SND_ECE  0x10000000      /* ECN ECE in queue */
+#define        TF_IPDF         0x20000000      /* set the DF bit */
 
 #define IN_FASTRECOVERY(tp)    (tp->t_flags & TF_FASTRECOVERY)
 #define ENTER_FASTRECOVERY(tp) tp->t_flags |= TF_FASTRECOVERY
--- tcp_subr.c.orig     2009-08-03 20:13:06.000000000 +1200
+++ tcp_subr.c  2010-05-16 21:26:50.000000000 +1200
@@ -656,7 +656,9 @@
                tlen += sizeof (struct tcpiphdr);
                ip->ip_len = tlen;
                ip->ip_ttl = V_ip_defttl;
-               if (V_path_mtu_discovery)
+               if (tp != NULL && tp->t_flags & TF_IPDF)
+                       ip->ip_off |= IP_DF;
+               else if (tp == NULL && V_path_mtu_discovery)
                        ip->ip_off |= IP_DF;
        }
        m->m_len = tlen;
@@ -757,6 +759,9 @@
                tp->t_flags = (TF_REQ_SCALE|TF_REQ_TSTMP);
        if (V_tcp_do_sack)
                tp->t_flags |= TF_SACK_PERMIT;
+       if (V_path_mtu_discovery)
+               tp->t_flags |= TF_IPDF;
+
        TAILQ_INIT(&tp->snd_holes);
        tp->t_inpcb = inp;      /* XXX */
        /*
@@ -1361,9 +1366,11 @@
                                            if (mtu < max(296, V_tcp_minmss
                                                 + sizeof(struct tcpiphdr)))
                                                mtu = 0;
-                                           if (!mtu)
+                                           if (!mtu) {
                                                mtu = V_tcp_mssdflt
                                                 + sizeof(struct tcpiphdr);
+                                               tp->t_flags &= ~TF_IPDF;
+                                           }
                                            /*
                                             * Only cache the the MTU if it
                                             * is smaller than the interface
--- tcp_syncache.c.orig 2010-05-16 21:30:21.000000000 +1200
+++ tcp_syncache.c      2010-05-16 21:31:00.000000000 +1200
@@ -779,6 +779,9 @@
        if (sc->sc_flags & SCF_ECN)
                tp->t_flags |= TF_ECN_PERMIT;
 
+       if (V_path_mtu_discovery)
+               tp->t_flags |= TF_IPDF;
+
        /*
         * Set up MSS and get cached values from tcp_hostcache.
         * This might overwrite some of the defaults we just set.
--- tcp_output.c.orig   2009-11-18 05:17:11.000000000 +1300
+++ tcp_output.c        2010-05-16 20:38:25.000000000 +1200
@@ -1181,7 +1181,7 @@
         * Section 2. However the tcp hostcache migitates the problem
         * so it affects only the first tcp connection with a host.
         */
-       if (V_path_mtu_discovery)
+       if (tp->t_flags & TF_IPDF)
                ip->ip_off |= IP_DF;
 
        error = ip_output(m, tp->t_inpcb->inp_options, NULL,
--- patch-pmtud ends here ---


>Release-Note:
>Audit-Trail:
>Unformatted: