CVSROOT:        /cvs
Module name:    src
Changes by:     bl...@cvs.openbsd.org   2025/05/07 08:10:19

Modified files:
        sys/net        : if.c if_var.h 
        sys/netinet    : tcp_input.c tcp_var.h 

Log message:
Cache socket lock during TCP input.

Parallel TCP input has been running for a few days now and looks quite
stable.  The final step is to implement caching of the socket lock.
Without large receive offloading (LRO) in the driver layer, it is
very likely that consecutive TCP segments are in the input queue.
This leads to contention on the socket lock between TCP input and
the socket receive syscall from userland.
With this commit, ip_deliver() moves all TCP packets that are in
the softnet queue temporarily to a TCP queue.  This queue is per
softnet thread so no locking is needed.  Finally in the same shared
netlock context, tcp_input_mlist() processes all TCP packets.  It
keeps a pointer to the socket lock.  tcp_input_solocked() switches
the lock only when the TCP stream changes.  A bunch of packets are
processed and placed into the socket receive buffer under the same
lock.  Then soreceive() can copy huge chunks to userland.  The
contention of the socket lock is gone.
On a 4 core machine I see an improvement of 12% to 22% with 10
parallel TCP streams.  When testing with only a single TCP stream,
throughput increases by 38% to 100%.
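
The lock caching described above can be illustrated with a small
userland sketch.  This is not the kernel code from this commit: the
names sketch_sock, sketch_pkt and sketch_input_list() are hypothetical,
and the lock is modelled by a plain counter instead of a real socket
lock.  It only shows the idea of keeping a pointer to the current lock
and switching it when the stream changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket; a flag and counter model the lock. */
struct sketch_sock {
	int	locked;		/* 1 while the modelled lock is held */
	int	acquisitions;	/* how often the lock was taken */
	size_t	rcv_bytes;	/* bytes placed into the receive buffer */
};

/* Hypothetical stand-in for a queued TCP segment. */
struct sketch_pkt {
	struct sketch_sock	*so;	/* stream this segment belongs to */
	size_t			 len;	/* payload length */
	struct sketch_pkt	*next;	/* next segment in the batch */
};

static void
sketch_lock(struct sketch_sock *so)
{
	assert(!so->locked);
	so->locked = 1;
	so->acquisitions++;
}

static void
sketch_unlock(struct sketch_sock *so)
{
	assert(so->locked);
	so->locked = 0;
}

/*
 * Process a batch of segments.  The lock of the previous segment's
 * socket is kept and only switched when the stream changes.
 * Returns the total number of lock acquisitions for the batch.
 */
static int
sketch_input_list(struct sketch_pkt *head)
{
	struct sketch_sock *cached = NULL;
	struct sketch_pkt *p;
	int taken = 0;

	for (p = head; p != NULL; p = p->next) {
		if (p->so != cached) {
			if (cached != NULL)
				sketch_unlock(cached);
			sketch_lock(p->so);
			cached = p->so;
			taken++;
		}
		/* Append to the receive buffer under the cached lock. */
		cached->rcv_bytes += p->len;
	}
	if (cached != NULL)
		sketch_unlock(cached);
	return taken;
}
```

In this sketch, a batch of three segments for one stream followed by
one segment for another stream takes a lock twice, where locking per
segment would take it four times.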

tested by Mark Patruck a while ago; OK mvs@
