On 29/09/2015, at 11:14 PM, Alexandr Nedvedicky wrote:

> Hello,
> 
> I've tried Richard's patch on sparc. I took a brief look at its source code.
> It's essentially what PF is doing on Solaris.

Thanks for doing that.

Just for the record, I look at PF's patched checksum handling as follows.

Before Henning's work, the checksum computation was spread (pseudo-header)
over different parts of the code and PF would face partially checksummed
packets. But there was no state indicating what the checksum covered, so when
PF altered the packet it could not determine whether or not it should also
modify the checksum, resulting in the rdr-to-localhost issue.

This problem goes away now that checksums either cover what they ought or are
flagged for computation. Now when PF alters a packet it need only modify the
checksum as appropriate for the protocol. And although that's unnecessary
when the checksum is as-yet uncomputed, it does no harm.

> The approach we take on Solaris is as follows:
> [...]

> The things are getting pretty wild in 2b, when PF is doing PBR (policy based
> routing) on outbound packets. Consider situation when IP stack routes packet
> via NIC, which is able to calculate chksum in HW.  IP stack sets flags and
> fields and passes packet to PF. PF changes interface, where packet is bound 
> to,
> to NIC, which is not able to calculate checksum, so the HW-cksum flags set by
> IP stack are no longer valid. In this case we always revert to calculation
> in SW.

As I see it in OpenBSD (maybe Solaris differs) the M_*_CSUM_OUT flags decouple
checksum policy ("whether to") from checksum mechanism ("how to"). So I don't
think of these as HW-cksum flags but as indicating that a needed checksum is
as-yet uncomputed. As that's not specific to an interface, they cannot become 
invalid if the output interface changes.

What they allow is the choice of checksumming method, whether by software or
by harware (or by small furry animals trained in ones-complement addition),
to be left until late in the output path when the capabilities of the output
interface are known.

> I currently have small suggestion to improve Richard's patch. The macro in
> PF_ALGNMNT() in pfvar.h uses modulo:
> 
>    #define PF_HI (true)
>    #define PF_LO (!PF_HI)
>    #define PF_ALGNMNT(off) (((off) % 2) == 0 ? PF_HI : PF_LO)
> 
> I think we can get away with simple and operation (& 1), which will be faster
> than % on many platforms.
> 
>    #define PF_ALGNMNT(off) (((off) & 1) == 0 ? PF_HI : PF_LO)

I've no strong feelings either way here. I note that gcc emits the same code
in either case on ppc and i386. 

best, 
Richard. 


Reply via email to