On Mon, 2007-07-30 at 22:32 -0400, Bill Sommerfeld wrote:
> Garrett left out the real motivation for my strong objection:
>
> primarily:
> - UDP's special handling of the zero checksum means that packets which
> checksum to 0 can be undetectably corrupted downstream unless the UDP
> checksum is correctly encoded as all-ones (the ones-complement -0).
>
> secondarily:
> - Using an all-ones (-0) TCP checksum can cause some real-world
> systems to drop your packet, causing your connections to hang. See:
>
> 6533773 tcp checksum 0xFFFF (-0) used instead of 0x0000 (+0) since
> Solaris 10
Yes, these are the consequences I didn't spell out clearly.
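
To make the +0/-0 distinction above concrete, here is a minimal user-space
sketch in C. It is not taken from any driver or from the Solaris stack; the
function names (ocsum_add, ocsum_fold, cksum_finish_tcp, cksum_finish_udp)
are made up for illustration. It assumes the usual ones-complement sum over
16-bit big-endian words, and shows that the only difference between the two
protocols is the last step: UDP must never put 0x0000 on the wire, since
that value means "no checksum was computed", while TCP (and ICMP) send the
complement unchanged.

```c
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running sum as 16-bit big-endian words. */
static uint32_t
ocsum_add(const uint8_t *p, size_t len, uint32_t sum)
{
	while (len > 1) {
		sum += ((uint32_t)p[0] << 8) | p[1];
		p += 2;
		len -= 2;
	}
	if (len == 1)
		sum += (uint32_t)p[0] << 8;	/* odd byte, zero padded */
	return (sum);
}

/* Fold the carries back in (end-around carry). */
static uint16_t
ocsum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return ((uint16_t)sum);
}

/* TCP and ICMP: the complement goes on the wire as is. */
static uint16_t
cksum_finish_tcp(uint32_t sum)
{
	return ((uint16_t)~ocsum_fold(sum));
}

/* UDP: a computed 0x0000 (+0) must be sent as 0xFFFF (-0). */
static uint16_t
cksum_finish_udp(uint32_t sum)
{
	uint16_t ck = (uint16_t)~ocsum_fold(sum);

	return (ck == 0 ? 0xFFFF : ck);
}
```

For data whose words sum to 0xFFFF, cksum_finish_tcp() yields 0x0000 and
cksum_finish_udp() yields 0xFFFF. For every other input the two agree, which
is why hardware that does not know the protocol can only be wrong in this
one corner case.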
>
> On Mon, 2007-07-30 at 19:06 -0700, Garrett D'Amore wrote:
> > 1) the stack interface hcksum_retrieve() does not differentiate between
> > TCP and UDP for partial checksums.
> >
> > 2) the "software" workaround for tiny packets in qfe (and that I mostly
> > borrowed for hme) is broken.
>
> 2a) since cutting & pasting code, especially code as subtle and as easy
> to screw up as the ones-complement checksum, is bad, the hcksum_* family
> of helper functions needs a "hcksum_punt()" function for drivers to call
> when they determine that, for whatever reason, their hardware can't cope
> with checksumming a particular packet.
heh. I was copying from what I *believed* to be a known good
implementation.
But yes, having a simple function, hcksum_punt() taking an mblk_t *,
would be much cleaner. I hated copying that code as a matter of
principle.
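
As a rough illustration of what such a helper might do, here is a
user-space sketch in C. No hcksum_punt() exists; the name
hcksum_punt_sketch, its signature, and the flat buffer standing in for an
mblk_t are all hypothetical. It assumes the usual partial-checksum
contract: the stack has already seeded the checksum field with the
pseudo-header sum, the checksum covers everything from the start offset to
the end of the packet, and the result is stored at the stuff offset. It
also assumes the caller can say whether the packet is UDP, which is exactly
the information the current interface does not provide.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical software fallback for a packet the hardware cannot
 * checksum. pkt/len describe a flat buffer (a stand-in for an mblk_t),
 * start is the offset where summing begins, stuff is the offset of the
 * 16-bit checksum field. Returns 0 on success, -1 on bad offsets.
 */
static int
hcksum_punt_sketch(uint8_t *pkt, size_t len, size_t start, size_t stuff,
    int is_udp)
{
	uint32_t sum = 0;
	uint16_t ck;
	size_t i;

	if (start > len || stuff < start || stuff + 2 > len)
		return (-1);

	/* Sum 16-bit big-endian words, including the seeded field. */
	for (i = start; i + 1 < len; i += 2)
		sum += ((uint32_t)pkt[i] << 8) | pkt[i + 1];
	if (i < len)
		sum += (uint32_t)pkt[i] << 8;	/* odd byte, zero padded */

	/* End-around carry. */
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);

	ck = (uint16_t)~sum;
	if (is_udp && ck == 0)
		ck = 0xFFFF;	/* UDP: +0 means "no checksum", send -0 */

	pkt[stuff] = (uint8_t)(ck >> 8);
	pkt[stuff + 1] = (uint8_t)(ck & 0xFF);
	return (0);
}
```

A real version would have to walk the mblk_t chain and work out the
protocol itself; the point here is only that the UDP special case lives in
one place instead of being pasted into each driver.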
>
> > 3) we do not know what the Sun hardware (hme, eri/gem, ce, nxge) do in
> > the situation... do they return +0, or -0 (0xffff)?
>
> I think you mean "send" rather than "return".
Yes.
>
> > 4) unless the hardware is examining the packet contents to learn if the
> > data is TCP or UDP, then it is *wrong* for either TCP or UDP.
>
> more properly, it is either wrong for UDP, or wrong for everything else
> (ICMP uses the same algorithm as TCP).
>
> > So, I need answers from folks with access to knowledge about the
> > hardware (#4 above). Once we know which way the hardware behaves, we
> > can decide what to do about TCP and UDP checksum offload.
>
> it seems like it ought to be straightforward to answer this
> experimentally.
I don't have access to all the hardware at hand, and I really would
prefer to know what the hardware engineers *designed* it to do.
Experimentation may give clues, but what happens on edge cases is
important too... e.g. what about for SCTP? What about for ICMP? Etc.
>
> > Probably,
> > what we need to do is turn off checksum offload for UDP, because I
> > *think* it is right for TCP.
>
> I think we only have to worry about UDP partial-checksum transmit
> offload.
>
> > The fix for 6587116 (which may impact PIT testing) is blocked until I
> > can figure out how to proceed. My code reviewer has barred me from
> > taking the same course of action used in qfe, as a result of this issue.
>
> my recommendation was instead to disable checksum offload in hme until
> we figure this out.
Yes, and I'm stating that, if hme is wrong here, then I think all
drivers are wrong. The hideous explanatory comment I'd have at the top
of the code explaining this problem makes me really really want not to
put back the code until we have some clearer picture.
In particular, either
a) all Sun NICs are wrong (or at least so I hypothesize, unless there
is some parsing of the packet going on that I'm unaware of!)
or
b) the stack is wrong and should not be trying to perform partial UDP
checksum offload (at least on these NICs)
I _think_ it works out that for receive there is no problem... the
checksum will be verified fine ... but unfortunately we have no way in
the stack to register to provide rx checksum offload without also
offering to provide tx checksum offload.
-- Garrett
_______________________________________________
networking-discuss mailing list