Garrett left out the real motivation for my strong objection:

primarily:
 - UDP's special handling of a zero checksum field (zero on the wire
means "no checksum was computed") means that packets which checksum to 0
can be undetectably corrupted downstream unless the UDP checksum is
correctly encoded as all-ones (the ones-complement -0).

secondarily:
 - Using an all-ones (-0) TCP checksum can cause some real-world
systems to drop your packet, causing your connections to hang.  See:

6533773 tcp checksum 0xFFFF (-0) used instead of 0x0000 (+0) since
Solaris 10
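
To make the two cases concrete, here is a minimal sketch in C of the
final step of the checksum, assuming the caller already has the folded
16-bit ones-complement sum.  The names ocsum() and cksum_field() are made
up for illustration; they are not existing stack or driver functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Ones-complement sum of a buffer (big-endian 16-bit words), added to an
 * initial value and folded to 16 bits.  Illustrative helper only.
 */
static uint16_t
ocsum(const uint8_t *buf, size_t len, uint32_t sum)
{
	assert(buf != NULL || len == 0);

	while (len > 1) {
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}
	if (len == 1)
		sum += (uint32_t)buf[0] << 8;	/* pad odd byte with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)sum);
}

/*
 * Hypothetical helper: pick the value to store in the checksum field.
 * UDP must never send 0x0000 for a computed checksum; for everything
 * else this sketch normalizes -0 (0xffff) to +0 (0x0000).
 */
static uint16_t
cksum_field(uint16_t folded_sum, int is_udp)
{
	uint16_t ck = (uint16_t)~folded_sum;

	if (is_udp)
		return (ck == 0x0000 ? 0xffff : ck);
	return (ck == 0xffff ? 0x0000 : ck);
}
```

The point of the sketch is that the last step has to know the protocol:
the same folded sum of 0xffff becomes 0xffff on the wire for UDP and
0x0000 for TCP.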

On Mon, 2007-07-30 at 19:06 -0700, Garrett D'Amore wrote:
> 1) the stack interface hcksum_retrieve() does not differentiate between
> TCP and UDP for partial checksums.
> 
> 2) the "software" workaround for tiny packets in qfe (and that I mostly
> borrowed for hme) is broken.

2a) since cutting & pasting code, especially code as subtle and as easy
to screw up as the ones-complement checksum, is bad, the hcksum_* family
of helper functions needs a "hcksum_punt()" function for drivers to call
when they determine that, for whatever reason, their hardware can't cope
with checksumming a particular packet.  
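
A rough sketch of what such a helper might do, to show the shape of the
idea rather than a proposed implementation.  Everything here is an
assumption: hcksum_punt_sketch() does not exist, it works on one flat
buffer instead of an mblk chain, it assumes the stack has already seeded
the checksum field with the pseudo-header sum (as I understand partial
checksum to work), and the is_udp argument is something the current
interface does not provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * HYPOTHETICAL helper, not an existing hcksum_* function.
 * Finish a partial checksum in software: sum the bytes in [start, end),
 * complement, and store the result at offset stuff.
 */
static void
hcksum_punt_sketch(uint8_t *pkt, size_t start, size_t stuff, size_t end,
    int is_udp)
{
	uint32_t sum = 0;
	uint16_t ck;
	size_t i;

	/* The checksum field must lie on a 16-bit boundary inside the range. */
	assert(start <= stuff && stuff + 2 <= end);
	assert((stuff - start) % 2 == 0);

	/* The field at stuff is included: it holds the seeded partial sum. */
	for (i = start; i + 1 < end; i += 2)
		sum += ((uint32_t)pkt[i] << 8) | pkt[i + 1];
	if (i < end)
		sum += (uint32_t)pkt[i] << 8;	/* odd trailing byte */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	ck = (uint16_t)~sum;
	if (is_udp && ck == 0x0000)
		ck = 0xffff;		/* UDP: 0 would mean "no checksum" */
	else if (!is_udp && ck == 0xffff)
		ck = 0x0000;		/* others: avoid sending -0 */

	pkt[stuff] = (uint8_t)(ck >> 8);
	pkt[stuff + 1] = (uint8_t)(ck & 0xff);
}
```

Having one copy of this in the framework means the zero handling gets
fixed in one place rather than once per driver.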

> 3) we do not know what the Sun hardware (hme, eri/gem, ce, nxge) do in
> the situation... do they return +0, or -0 (0xffff)?

I think you mean "send" rather than "return".

> 4) unless the hardware is examining the packet contents to learn if the
> data is TCP or UDP, then it is *wrong* for either TCP or UDP.

more properly, it is either wrong for UDP, or wrong for everything else
(ICMP uses the same algorithm as TCP).

> So, I need answers from folks with access to knowledge about the
> hardware (#4 above).  Once we know which way the hardware behaves, we
> can decide what to do about TCP and UDP checksum offload.  

it seems like it ought to be straightforward to answer this
experimentally.  
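
One way to set up such an experiment is to craft a UDP datagram whose
checksum computes to zero and see what each NIC puts on the wire.  The
sketch below (hypothetical names, plain user-space C) computes a 16-bit
filler word for the payload, given the ones-complement sum of everything
else the checksum covers.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones-complement sum. */
static uint16_t
fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)sum);
}

/*
 * Given the sum of the pseudo-header, the UDP header (checksum field
 * zeroed) and the rest of the payload, return the payload word that
 * drives the total to 0xffff, so the computed checksum is 0x0000.
 * The filler's two bytes must already be counted in the UDP length.
 */
static uint16_t
zero_cksum_filler(uint32_t sum_of_rest)
{
	uint16_t s = fold16(sum_of_rest);
	uint16_t filler = (uint16_t)~s;

	/* s plus its complement is always 0xffff in ones-complement. */
	assert(fold16((uint32_t)s + filler) == 0xffff);
	return (filler);
}
```

Send such a datagram through each interface with offload enabled and
capture it on a different machine (snoop on the sender won't show what
the hardware wrote); the checksum field will show either 0x0000 or
0xffff.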

> Probably,
> what we need to do is turn off checksum offload for UDP, because I
> *think* it is right for TCP.

I think we only have to worry about UDP partial-checksum transmit
offload.  

> The fix for 6587116 (which may impact PIT testing) is blocked until I
> can figure out how to proceed.  My code reviewer has denied me from
> taking the same course of action used in qfe, as a result of this issue.

my recommendation was instead to disable checksum offload in hme until
we figure this out.

                                                - Bill





_______________________________________________
networking-discuss mailing list