On Wed, Sep 17, 2014 at 4:27 AM, Stefano Garzarella
<stefanogarzare...@gmail.com> wrote:
> Much of the advantage of TSO comes from crossing the network stack only
> once per (large) segment instead of once per 1500-byte frame.
> GSO does the same both for segmentation (TCP) and fragmentation (UDP)
> by doing these operations as late as possible.

My initial impression is that this is a layering violation.  Code like
this gives me pause:

+	eh = mtod(m, struct ether_vlan_header *);
+	if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) {
+		eh_len = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN;
+	} else {
+		eh_len = ETHER_HDR_LEN;
+	}
+	return gso_dispatch(ifp, m, eh_len);

If someone adds QinQ support, this code must be updated.  When vxlan
support comes in, we must update this code or else the outer UDP
packet gets fragmented instead of the inner TCP payload being
segmented.  As more tunneling protocols get added to FreeBSD, the
dispatch code for GSO gets uglier and uglier.
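To make that concern concrete, here is a hypothetical userspace sketch (all names and the QinQ handling are my own illustration, not code from the patch) of how such a parser accretes cases as encapsulations are added:

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: each new encapsulation forces another case into
 * the GSO dispatch parser. */
#define ETHERTYPE_VLAN		0x8100
#define ETHERTYPE_QINQ		0x88a8	/* 802.1ad outer tag */
#define ETHER_HDR_LEN		14
#define ETHER_VLAN_ENCAP_LEN	4

/* Return the L2 header length for a raw Ethernet frame. */
static int
l2_header_len(const uint8_t *frame)
{
	uint16_t etype = (uint16_t)(frame[12] << 8 | frame[13]);
	int len = ETHER_HDR_LEN;

	if (etype == ETHERTYPE_VLAN) {
		len += ETHER_VLAN_ENCAP_LEN;
	} else if (etype == ETHERTYPE_QINQ) {
		/* outer 802.1ad tag + inner 802.1Q tag */
		len += 2 * ETHER_VLAN_ENCAP_LEN;
	}
	/* vxlan, GRE, ... would each demand yet more cases here, plus
	 * logic to find the *inner* TCP payload to segment. */
	return len;
}
```

Every protocol the stack learns to tunnel has to be mirrored here, which is exactly the layering smell.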

It seems to me that the real problem that we are trying to solve is a
lack of batching in the kernel.  Currently the network stack operates
on the mbuf (packet) boundary.  It seems to me that we could introduce
a "packet group" concept that is guaranteed to have the same L3 and L2
endpoint.  In the transmit path, we would initially have a single
(potentially oversized) packet in the group.  When TCP segments the
packet, it would add each packet to the packet group and pass it down
the stack.  Because we guarantee that the endpoints are the same for
every packet in the group, the L3 code can do a single routing table
lookup and the L2 code can do a single l2table lookup for the entire
group.

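A minimal sketch of what a packet group might look like (every name here is invented for illustration; a real design would be built around struct mbuf and the actual route/l2table types):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "packet group": packets sharing one L3/L2 endpoint
 * travel together, so the routing table and l2table lookup results
 * can be cached once per group instead of recomputed per packet. */
struct pkt {				/* stands in for struct mbuf */
	struct pkt	*next;
};

struct pkt_group {
	struct pkt	*head;		/* chain of same-endpoint packets */
	struct pkt	**tailp;	/* O(1) append during segmentation */
	int		count;
	void		*ro;		/* cached L3 (route) lookup result */
	void		*l2;		/* cached L2 (l2table) lookup result */
};

static void
pkt_group_init(struct pkt_group *pg)
{
	pg->head = NULL;
	pg->tailp = &pg->head;
	pg->count = 0;
	pg->ro = NULL;
	pg->l2 = NULL;
}

/* TCP segmentation would call this once per segment it produces,
 * then hand the whole group down the stack in one pass. */
static void
pkt_group_append(struct pkt_group *pg, struct pkt *p)
{
	p->next = NULL;
	*pg->tailp = p;
	pg->tailp = &p->next;
	pg->count++;
}
```

In the transmit path the group would start with one oversized packet; segmentation appends the pieces, and the L3/L2 code fills `ro` and `l2` once for everything in the chain.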
The disadvantages of packet groups would be that:
a) You have to touch a lot more code in a lot more places to take
advantage of the concept.
b) TSO inherently has the same layering problems, so if we're going
to solve that problem for tunneling protocols anyway, GSO might well
be able to take advantage of the same solution.
freebsd-current@freebsd.org mailing list