On 29/09/2015, at 11:14 PM, Alexandr Nedvedicky wrote: > Hello, > > I've tried Richard's patch on sparc. I took a brief look at its source code. > It's essentially what PF is doing on Solaris.
Thanks for doing that. Just for the record, I look at PF's patched checksum handling as follows. Before Henning's work, the checksum computation was spread (pseudo-header) over different parts of the code and PF would face partially checksummed packets. But there was no state indicating what the checksum covered, so when PF altered the packet it could not determine whether or not it should also modify the checksum, resulting in the rdr-to-localhost issue. This problem goes away now that checksums either cover what they ought or are flagged for computation. Now when PF alters a packet it need only modify the checksum as appropriate for the protocol. And although that's unnecessary when the checksum is as-yet uncomputed, it does no harm. > The approach we take on Solaris is as follows: > [...] > The things are getting pretty wild in 2b, when PF is doing PBR (policy based > routing) on outbound packets. Consider situation when IP stack routes packet > via NIC, which is able to calculate chksum in HW. IP stack sets flags and > fields and passes packet to PF. PF changes interface, where packet is bound > to, > to NIC, which is not able to calculate checksum, so the HW-cksum flags set by > IP stack are no longer valid. In this case we always revert to calculation > in SW. As I see it in OpenBSD (maybe Solaris differs) the M_*_CSUM_OUT flags decouple checksum policy ("whether to") from checksum mechanism ("how to"). So I don't think of these as HW-cksum flags but as indicating that a needed checksum is as-yet uncomputed. As that's not specific to an interface, they cannot become invalid if the output interface changes. What they allow is the choice of checksumming method, whether by software or by harware (or by small furry animals trained in ones-complement addition), to be left until late in the output path when the capabilities of the output interface are known. > I currently have small suggestion to improve Richard's patch. The macro in > PF_ALGNMNT() in pfvar.h uses modulo: > > #define PF_HI (true) > #define PF_LO (!PF_HI) > #define PF_ALGNMNT(off) (((off) % 2) == 0 ? PF_HI : PF_LO) > > I think we can get away with simple and operation (& 1), which will be faster > than % on many platforms. > > #define PF_ALGNMNT(off) (((off) & 1) == 0 ? PF_HI : PF_LO) I've no strong feelings either way here. I note that gcc emits the same code in either case on ppc and i386. best, Richard.