On Wed, Jun 29, 2022 at 02:24:35PM +0200, Jan Klemkow wrote: > Hi, > > This diff introduces the sending side of TSO to our TCP/IP stack. > If the hardware has TSO capabilities tcp_output() will send huge TCP > segments down the stack to the interface. ip{6}_output() will ignore > the size is greater then eny MTU in this case. > > I tested it with IPv4, IPv6 and VLAN. VLAN sending is not offloaded > now, because this interface does not inherited the TSO capability. > I will do this in a later diff. > > If you have an ix(4) NIC of 82599 or newer, just enable TSO with > ifconfig(8): > > # ifconfig ix0 tso > > Thanks for testing, > Jan
A few comments below: > Index: netinet/tcp_input.c > =================================================================== > RCS file: /mount/openbsd/cvs/src/sys/netinet/tcp_input.c,v > retrieving revision 1.375 > diff -u -p -r1.375 tcp_input.c > --- netinet/tcp_input.c 4 Jan 2022 06:32:39 -0000 1.375 > +++ netinet/tcp_input.c 28 Jun 2022 17:12:43 -0000 > @@ -2851,6 +2851,15 @@ tcp_mss(struct tcpcb *tp, int offer) > mssopt = ifp->if_mtu - iphlen - sizeof(struct tcphdr); > mssopt = max(tcp_mssdflt, mssopt); > } > + > + if (ISSET(ifp->if_xflags, IFXF_TSO)) { > + tp->t_flags |= TF_TSO; > + > + if (ifp->if_hw_tsomax < MAXMCLBYTES) > + tp->t_tsomax = ifp->if_hw_tsomax; > + else > + tp->t_tsomax = MAXMCLBYTES; > + } Why is there a limit on MAXMCLBYTES? I guess the card must support chained buffers because a 64k mbuf is not linearly mapped. > out: > if_put(ifp); > /* > Index: sys/mbuf.h > =================================================================== > RCS file: /mount/openbsd/cvs/src/sys/sys/mbuf.h,v > retrieving revision 1.254 > diff -u -p -r1.254 mbuf.h > --- sys/mbuf.h 14 Feb 2022 04:33:18 -0000 1.254 > +++ sys/mbuf.h 28 Jun 2022 17:29:00 -0000 > @@ -133,6 +133,7 @@ struct pkthdr { > u_int16_t ph_flowid; /* pseudo unique flow id */ > u_int16_t csum_flags; /* checksum flags */ > u_int16_t ether_vtag; /* Ethernet 802.1p+Q vlan tag */ > + u_int16_t ph_mss; /* max. seg. size. */ > u_int ph_rtableid; /* routing table id */ > u_int ph_ifidx; /* rcv interface index */ > u_int8_t ph_loopcnt; /* mbuf is looping in kernel */ Please move the two u_int fields above the u_int16_t block so that the pkthdr packs nicely. You add a 5th uint16_t and so that would insert an extra pad. > @@ -226,13 +227,14 @@ struct mbuf { > #define M_IPV6_DF_OUT 0x1000 /* don't fragment outgoing IPv6 > */ > #define M_TIMESTAMP 0x2000 /* ph_timestamp is set */ > #define M_FLOWID 0x4000 /* ph_flowid is set */ > +#define M_TCP_TSO 0x8000 /* TCP Segmentation Offload > needed */ > > #ifdef _KERNEL > #define MCS_BITS \ > ("\20\1IPV4_CSUM_OUT\2TCP_CSUM_OUT\3UDP_CSUM_OUT\4IPV4_CSUM_IN_OK" \ > "\5IPV4_CSUM_IN_BAD\6TCP_CSUM_IN_OK\7TCP_CSUM_IN_BAD\10UDP_CSUM_IN_OK" \ > > "\11UDP_CSUM_IN_BAD\12ICMP_CSUM_OUT\13ICMP_CSUM_IN_OK\14ICMP_CSUM_IN_BAD" \ > - "\15IPV6_NODF_OUT" "\16TIMESTAMP" "\17FLOWID") > + "\15IPV6_NODF_OUT" "\16TIMESTAMP" "\17FLOWID" "\20TCP_TSO") > #endif > > /* mbuf types */ > What does happen when pf reroutes the packets to a non TSO capable interface? I think this should be done like HW CSUM offloading where the stack fixies up packets in ip_output() when it realizes that the HW does not support it. Only then this becomes generally more usable. Also doing TSO LSO in ip_output would allow to use LRO / LSO in forwarding and in many more places. It will also improve performance since we can batch send TCP packets. -- :wq Claudio