Folks,

This is not limited to NetBSD-current; it affects all NetBSD releases with IPv6, and other BSD-based variants as well.
Applications that do not want to depend on Path MTU Discovery may set IPV6_USE_MIN_MTU=1, as defined in RFC 3542, to use IPV6_MMTU (1280). This works well for UDP as long as fragmentation is allowed and no other method to reduce message size is available.

However, TCP does not look at this option when it is set on a socket, so it may still send larger segments, especially when the peer advertises a larger MSS. In that case the segment is fragmented and delivery efficiency decreases. In some cases, if a middlebox refuses fragments (some boxes pass only the first fragment), the TCP session breaks.

There was a short discussion on the IETF 6man mailing list, but no conclusion was reached:

https://www.ietf.org/mail-archive/web/ipv6/current/msg24977.html

I understand that IPV6_USE_MIN_MTU may not be the best way to control this. However, no suitable knob is defined at the moment, and some TCP sessions currently fail over IPv6.

When IPV6_USE_MIN_MTU is set to 1 (IP6PO_MINMTU_ALL), the following are necessary to avoid IPv6 fragmentation:

- the advertised TCP MSS should be 1220 or less
- when a TCP MSS is advertised by the peer, it should be clipped to 1220 or less
- when a TCP segment is sent, its size should be 1220 or less

Note: 1220 is IPV6_MMTU minus the IPv6 and TCP header sizes (1280 - 40 - 20).

Enclosed is a quick, ugly patch for NetBSD-7 to mitigate this situation. It may not be complete.

-- Akira Kato, WIDE Project
*** tcp_input.c.ORG Sun Aug 2 15:08:15 2015
--- tcp_input.c Mon Nov 7 17:12:05 2016
***************
*** 4443,4448 ****
--- 4443,4461 ----
          sc->sc_ourmaxseg = tcp_mss_to_advertise(m->m_flags & M_PKTHDR ?
                                                  m->m_pkthdr.rcvif : NULL,
                                                  sc->sc_src.sa.sa_family);
+ #ifdef INET6
+         if (tp && tp->t_in6pcb && tp->t_in6pcb->in6p_outputopts) {
+                 if (tp->t_in6pcb->in6p_outputopts->ip6po_minmtu ==
+                     IP6PO_MINMTU_ALL) {
+                         sc->sc_ourmaxseg = min(sc->sc_ourmaxseg,
+                             IPV6_MMTU - sizeof(struct ip6_hdr)
+                             - sizeof(struct tcphdr));
+                         sc->sc_peermaxseg = min(sc->sc_peermaxseg,
+                             IPV6_MMTU - sizeof(struct ip6_hdr)
+                             - sizeof(struct tcphdr));
+                 }
+         }
+ #endif
          sc->sc_win = win;
          sc->sc_timebase = tcp_now - 1;  /* see tcp_newtcpcb() */
          sc->sc_timestamp = tb.ts_recent;
*** tcp_output.c.ORG Sun Aug 2 15:08:15 2015
--- tcp_output.c Tue Nov 1 08:29:38 2016
***************
*** 1124,1129 ****
--- 1124,1138 ----
                  tp->snd_nxt = tp->iss;
                  tp->t_ourmss = tcp_mss_to_advertise(synrt != NULL ?
                                                      synrt->rt_ifp : NULL, af);
+ #ifdef INET6
+                 if (tp->t_in6pcb && tp->t_in6pcb->in6p_outputopts) {
+                         if (tp->t_in6pcb->in6p_outputopts->ip6po_minmtu ==
+                             IP6PO_MINMTU_ALL)
+                                 tp->t_ourmss = min(tp->t_ourmss,
+                                     IPV6_MMTU - sizeof(struct ip6_hdr)
+                                     - sizeof(struct tcphdr));
+                 }
+ #endif
                  if ((tp->t_flags & TF_NOOPT) == 0 && OPT_FITS(4)) {
                          opt[0] = TCPOPT_MAXSEG;
                          opt[1] = 4;
*** tcp_subr.c.ORG Thu Feb 26 12:44:13 2015
--- tcp_subr.c Thu Nov 3 09:25:45 2016
***************
*** 2001,2006 ****
--- 2001,2014 ----
          mss = tcp_mssdflt;
          if (offer)
                  mss = offer;
+ #ifdef INET6
+         if (tp->t_in6pcb && tp->t_in6pcb->in6p_outputopts) {
+                 if (tp->t_in6pcb->in6p_outputopts->ip6po_minmtu ==
+                     IP6PO_MINMTU_ALL)
+                         mss = min(mss, IPV6_MMTU - sizeof(struct ip6_hdr)
+                             - sizeof(struct tcphdr));
+         }
+ #endif
          mss = max(mss, 256);    /* sanity */
          tp->t_peermss = mss;
          mss -= tcp_optlen(tp);
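
For readers who want the arithmetic in isolation: all three hunks apply the same clamp, which can be sketched outside the kernel roughly as below. This is only an illustration, not part of the patch. The SKETCH_* constants and sketch_clamp_mss() are hypothetical names standing in for IPV6_MMTU, sizeof(struct ip6_hdr), sizeof(struct tcphdr) and the min() calls, and the fixed header sizes (40 and 20 bytes) assume no IPv6 extension headers and no TCP options.

```c
/*
 * Illustrative stand-ins for the kernel values used by the patch.
 * Assumption: a 40-byte fixed IPv6 header and a 20-byte TCP header
 * without options.
 */
enum {
    SKETCH_IPV6_MMTU   = 1280, /* minimum IPv6 link MTU */
    SKETCH_IP6_HDR_LEN = 40,   /* fixed IPv6 header */
    SKETCH_TCP_HDR_LEN = 20,   /* TCP header without options */
    SKETCH_MINMTU_MSS  = SKETCH_IPV6_MMTU - SKETCH_IP6_HDR_LEN
                         - SKETCH_TCP_HDR_LEN
};

/*
 * Clip an MSS value (ours or the peer's) when the socket asked for
 * the minimum MTU on all packets; otherwise leave it unchanged.
 * use_min_mtu_all is a hypothetical flag that models
 * ip6po_minmtu == IP6PO_MINMTU_ALL.
 */
unsigned int
sketch_clamp_mss(unsigned int mss, int use_min_mtu_all)
{
    if (use_min_mtu_all && mss > (unsigned int)SKETCH_MINMTU_MSS)
        return (unsigned int)SKETCH_MINMTU_MSS;
    return mss;
}
```

With the flag set, an MSS of 1440 offered for a 1500-byte Ethernet MTU is clipped to 1220, while a value already at or below 1220 passes through unchanged.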