CVSROOT:	/cvs
Module name:	src
Changes by:	d...@cvs.openbsd.org	2023/07/05 22:55:05

Modified files:
	sys/net        : if.c if_pfsync.c if_pfsync.h netisr.h pf.c
	                 pf_ioctl.c pf_norm.c pfvar.h pfvar_priv.h
	sys/netinet    : in_proto.c ip_ipsp.h

Log message:
big update to pfsync to try and clean up locking in particular.

moving pf forward has been a real struggle, and pfsync has been a
constant source of pain. we have been papering over the problems
for a while now, but it reached the point that it needed a fundamental
restructure, which is what this diff is.

the big headliner changes in this diff are:

- pfsync specific locks

this is the whole reason for this diff.

rather than rely on NET_LOCK or KERNEL_LOCK or whatever, pfsync now
has its own locks to protect its internal data structures. this
is important because pfsync runs a bunch of timeouts and tasks to
push pfsync packets out on the wire, or when it's handling requests
generated by incoming pfsync packets, both of which happen outside
pf itself running. having pfsync specific locks around pfsync data
structures makes the mutations of these data structures a lot more
explicit and auditable.

- partitioning

to enable future parallelisation of the network stack, this rewrite
includes support for pfsync to partition states into different
"slices". these slices run independently, ie, the states collected
by one slice are serialised into a separate packet from the states
collected and serialised by another slice.

states are mapped to pfsync slices based on the pf state hash, which
is the same hash that the rest of the network stack and multiq
hardware use.

- no more pfsync called from netisr

pfsync used to be called from netisr to try and bundle packets, but
now that there's multiple pfsync slices this doesn't make sense.
instead it uses tasks in softnet tqs.

- improved bulk transfer handling

there's shiny new state machines around both the bulk transmit and
receive handling. pfsync used to do horrible things to carp demotion
counters, but now it is very predictable and returns the counters
back where they started.

- better tdb handling

the tdb handling was pretty hairy, but hrvoje has kicked this around
a lot with ipsec and sasyncd and we've found and fixed a bunch of
issues as a result of that testing.

- mpsafe pf state purges

this was committed previously, but because the locks pfsync relied
on weren't clear this just caused a ton of bugs. as part of this
diff it's now reliable, and moves a big chunk of work out from under
KERNEL_LOCK, which in turn improves the responsiveness and throughput
of a firewall even if you're not using pfsync.

there's a bunch of other little changes along the way, but the above
are the big ones.

hrvoje has done performance testing with this diff and notes a big
improvement when pfsync is not in use. performance when pfsync is
enabled is about the same, but i'm hoping the slices means we can
scale along with pf as it improves.

lots (months) of testing by me and hrvoje on pfsync boxes
tests and ok sashan@
deraadt@ says this is a good time to put it in
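
for readers who want a picture of the partitioning idea above (states
mapped to slices by the pf state hash, each slice collecting its own
states), here is a minimal userland sketch. every name in it
(ex_state, ex_slice, EX_NSLICES, ex_slice_index, ex_slice_enqueue) is
hypothetical and is not taken from if_pfsync.c. the power-of-two slice
count and the mask are assumptions made for illustration only, and the
per-slice lock is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the identifiers in if_pfsync.c. */
#define EX_NSLICES	8	/* assumed slice count, a power of two */

struct ex_state {
	uint32_t	 hash;	/* stands in for the pf state hash */
	struct ex_state	*next;	/* link in the owning slice's queue */
};

struct ex_slice {
	/* a real slice would also carry its own lock */
	struct ex_state	*head;	/* states waiting to be serialised */
	size_t		 len;	/* number of queued states */
};

/* Map a state hash to a slice index (assumed: simple mask). */
static unsigned int
ex_slice_index(uint32_t hash)
{
	return (hash & (EX_NSLICES - 1));
}

/* Queue a state on the slice its hash selects and return that slice. */
static struct ex_slice *
ex_slice_enqueue(struct ex_slice slices[EX_NSLICES], struct ex_state *st)
{
	struct ex_slice *s = &slices[ex_slice_index(st->hash)];

	assert(st->next == NULL);	/* must not already be queued */
	st->next = s->head;
	s->head = st;
	s->len++;

	return (s);
}
```

because the same hash always selects the same slice, every update for
a given state lands in one queue, and different slices never need to
touch each other's data when building their packets.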