On Tue, May 30, 2023 at 09:08:45PM +0200, Alexander Bluhm wrote:
> Hi,
> 
> Currently packets sent over loopback interface get their checksum
> calculated twice.  In the output path it is set and during TCP/IP
> input it is calculated again to be compared with the previous value.
> 
> This can be avoided by claiming that lo(4) supports hardware checksum
> offloading.  For each packet convert the flag that the checksum
> should be calculated to the flag that it has been checked successfully.

so rather than calculate it twice, don't calculate it at all?

i love it.
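
to spell out the flag conversion outside the kernel, here is a minimal standalone sketch. the SK_* names, their bit values and sk_loop_csum_flags() are made up for illustration; they are not the definitions from sys/mbuf.h, and the real change lives in if_input_local() in the diff below.

```c
#include <stdint.h>

/* Hypothetical flag bits: names and values are placeholders, not sys/mbuf.h. */
enum {
	SK_IPV4_CSUM_OUT   = 1 << 0,	/* IPv4 header checksum still to be computed */
	SK_TCP_CSUM_OUT    = 1 << 1,	/* TCP checksum still to be computed */
	SK_UDP_CSUM_OUT    = 1 << 2,	/* UDP checksum still to be computed */
	SK_ICMP_CSUM_OUT   = 1 << 3,	/* ICMP checksum still to be computed */
	SK_IPV4_CSUM_IN_OK = 1 << 4,	/* IPv4 header checksum verified */
	SK_TCP_CSUM_IN_OK  = 1 << 5,	/* TCP checksum verified */
	SK_UDP_CSUM_IN_OK  = 1 << 6,	/* UDP checksum verified */
	SK_ICMP_CSUM_IN_OK = 1 << 7	/* ICMP checksum verified */
};

#define SK_CSUM_OUT_MASK \
	(SK_IPV4_CSUM_OUT | SK_TCP_CSUM_OUT | SK_UDP_CSUM_OUT | SK_ICMP_CSUM_OUT)

/*
 * Translate the checksum flags of a packet looped back locally.
 * The "to be computed" flags are kept, so a packet that ends up being
 * forwarded still gets a real checksum, and each one is mirrored into
 * the matching "verified" flag so the input path skips verification.
 * Everything outside the OUT mask is dropped, like after a header reset.
 */
uint16_t
sk_loop_csum_flags(uint16_t flags, int is_ipv4)
{
	uint16_t keep = flags & SK_CSUM_OUT_MASK;
	uint16_t out = keep;

	if (keep & SK_TCP_CSUM_OUT)
		out |= SK_TCP_CSUM_IN_OK;
	if (keep & SK_UDP_CSUM_OUT)
		out |= SK_UDP_CSUM_IN_OK;
	if (keep & SK_ICMP_CSUM_OUT)
		out |= SK_ICMP_CSUM_IN_OK;
	/* The IPv4 header flag only makes sense for AF_INET input. */
	if (is_ipv4 && (keep & SK_IPV4_CSUM_OUT))
		out |= SK_IPV4_CSUM_IN_OK;

	return out;
}
```

a packet that was never marked for checksum calculation gets no IN_OK flag in this sketch, so it would still be verified the usual way on input.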

> In a simple test on a vmm guest I see between 30% and 60% increase
> of throughput over lo0.
> 
> A drawback is that "tcpdump -ni lo0 -v" reports invalid checksum.
> But people are used to that with physical interfaces and hardware
> offloading.

we could export these csum flags as part of the bpf header so we can
teach tcpdump to shut up in this situation.
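
for anyone wondering why tcpdump complains: it recomputes the Internet checksum (RFC 1071) over the captured bytes, and with offloading the field in those bytes was never filled in. a rough userland sketch, where sk_in_cksum() is a hypothetical helper and not the kernel's in_cksum(), and a zeroed field only stands in for "not computed":

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum over a byte buffer, treating the data as
 * big-endian 16-bit words. Returns the one's complement of the folded
 * one's complement sum.
 *
 * Run over a header whose checksum field is zero, the result is the
 * value to store in that field. Run over a header with a correct
 * checksum already stored, the result is 0. Any other result is what a
 * capture tool would report as a bad checksum.
 */
uint16_t
sk_in_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	while (len > 1) {
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}
	if (len == 1)		/* odd trailing byte is padded with zero */
		sum += (uint32_t)buf[0] << 8;

	while (sum >> 16)	/* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

a header captured before any checksum was stored gives a nonzero result here, which is the "invalid checksum" case; exporting the csum flags through bpf would let tcpdump tell that apart from real corruption.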

> The question is, does it break corner cases?  Please test with pf
> route-to, IPsec, bridging, IPv6 and other setups where loopback
> might be involved.

who knows how bridge works? otherwise i think it should be ok.
hopefully.

> 
> bluhm
> 
> Index: net/if.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/if.c,v
> retrieving revision 1.697
> diff -u -p -r1.697 if.c
> --- net/if.c  16 May 2023 14:32:54 -0000      1.697
> +++ net/if.c  30 May 2023 12:33:42 -0000
> @@ -778,7 +778,7 @@ if_input(struct ifnet *ifp, struct mbuf_
>  int
>  if_input_local(struct ifnet *ifp, struct mbuf *m, sa_family_t af)
>  {
> -     int keepflags;
> +     int keepflags, keepcksum;
>  
>  #if NBPFILTER > 0
>       /*
> @@ -796,11 +796,26 @@ if_input_local(struct ifnet *ifp, struct
>       }
>  #endif
>       keepflags = m->m_flags & (M_BCAST|M_MCAST);
> +     /*
> +      * Preserve outgoing checksum flags, in case the packet is
> +      * forwarded to another interface.  Then the checksum, which
> +      * is now incorrect, will be calculated before sending.
> +      */
> +     keepcksum = m->m_pkthdr.csum_flags & (M_IPV4_CSUM_OUT |
> +         M_TCP_CSUM_OUT | M_UDP_CSUM_OUT | M_ICMP_CSUM_OUT);
>       m_resethdr(m);
>       m->m_flags |= M_LOOP | keepflags;
> +     m->m_pkthdr.csum_flags = keepcksum;
>       m->m_pkthdr.ph_ifidx = ifp->if_index;
>       m->m_pkthdr.ph_rtableid = ifp->if_rdomain;
>  
> +     if (ISSET(keepcksum, M_TCP_CSUM_OUT))
> +             m->m_pkthdr.csum_flags |= M_TCP_CSUM_IN_OK;
> +     if (ISSET(keepcksum, M_UDP_CSUM_OUT))
> +             m->m_pkthdr.csum_flags |= M_UDP_CSUM_IN_OK;
> +     if (ISSET(keepcksum, M_ICMP_CSUM_OUT))
> +             m->m_pkthdr.csum_flags |= M_ICMP_CSUM_IN_OK;
> +
>       ifp->if_opackets++;
>       ifp->if_obytes += m->m_pkthdr.len;
>  
> @@ -809,6 +824,8 @@ if_input_local(struct ifnet *ifp, struct
>  
>       switch (af) {
>       case AF_INET:
> +             if (ISSET(keepcksum, M_IPV4_CSUM_OUT))
> +                     m->m_pkthdr.csum_flags |= M_IPV4_CSUM_IN_OK;
>               ipv4_input(ifp, m);
>               break;
>  #ifdef INET6
> Index: net/if_loop.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_loop.c,v
> retrieving revision 1.93
> diff -u -p -r1.93 if_loop.c
> --- net/if_loop.c     21 Oct 2022 14:20:03 -0000      1.93
> +++ net/if_loop.c     30 May 2023 11:48:55 -0000
> @@ -173,6 +173,9 @@ loop_clone_create(struct if_clone *ifc, 
>       ifp->if_mtu = LOMTU;
>       ifp->if_flags = IFF_LOOPBACK | IFF_MULTICAST;
>       ifp->if_xflags = IFXF_CLONED;
> +     ifp->if_capabilities = IFCAP_CSUM_IPv4 |
> +         IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4 |
> +         IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
>       ifp->if_rtrequest = lortrequest;
>       ifp->if_ioctl = loioctl;
>       ifp->if_input = loinput;
> 
