On 14/09/2015, at 11:51 PM, Henning Brauer wrote:
> * Martin Pieuchot <m...@openbsd.org> [2015-09-11 13:54]:
>> On 11/09/15(Fri) 13:28, Henning Brauer wrote:
>>> Ryan pointed me to this diff and we briefly discussed it; we remain
>>> convinced that the in-tree approach is better than this.
>> Could you elaborate why?
> 
> [ elaboration, quoted further down ] 
> 
> And given that, the approach that has less and simpler code and makes
> better use of offloading wins.

Well, I agree that less and simpler code is better, all else being equal. 
In fact if you'd like my opinion, the in-tree approach actually places a
greater burden of complexity on the programmer. The patch needs one line 
to alter a value

        rewrite += pf_change_16(pd, &pd->hdr.icmp->icmp_id, icmpid);

; the in-tree code requires all packet alterations to occur between 
two calls, analogous to the way locking must enclose a critical section:

        if (pd->csum_status == PF_CSUM_UNKNOWN)
               pf_check_proto_cksum(pd, pd->off, pd->tot_len - pd->off,
                   pd->proto, pd->af); 
           [...]
           header->field = new_value; 
           [...]
        pf_cksum(pd, pd->m);

This is the stuff mild headaches are made of. Especially as 1) pf_cksum()
usually occurs elsewhere and as 2) the idiom tempts programmers to rely on
existing pf_check_proto_cksum() calls. This last occurs three places in the
existing code. I presume these are correct but it is much easier to avoid the
question by replacing them with the one-liner.

The new interface also provides a place to put all the altered-value guards,
replacing, e.g.:


        if (icmpid != pd->hdr.icmp->icmp_id) {
                if (pd->csum_status == PF_CSUM_UNKNOWN)
                        pf_check_proto_cksum(pd, pd->off,
                            pd->tot_len - pd->off, pd->proto,
                            pd->af);
                pd->hdr.icmp->icmp_id = icmpid;
                rewrite = 1;
        }

with:

        rewrite += pf_change_16(pd, &pd->hdr.icmp->icmp_id, icmpid);

This saves code. Here are some numbers. First, non-comment[0] 
lines of code for all affected files:

            pf.c:1.944   pfvar.h:1.420   pf_norm.c:1.182
current     5724         1613            1030 
patched     5682         1614            1028 
            ----         ----            ----
             -88           +1              -2

(amd64)     pf.o          pf_norm.o
current     123520 bytes  30816 bytes
patched     123584 bytes  30912 bytes
            -----         -----
              +64           +96

(i386)      pf.o          pf_norm.o 
current     95904 bytes   21096 bytes
patched     94956 bytes   21156 bytes
            -----         -----
             -948           -60

The patch's numbers include the new pf_change_* interface and the checksum 
modification code. So the proposed interface pays its way. Ok, let's move 
on to handling checksums:

> Well we've been thru it more than once; the argument presented here
> was that modifying the cksum instead of verify + recalc is better as
> it wouldn't hide cksum mismatches if the cksum engines on the NICs we
> offload to misbehave. After many years with the verify + recalc
> approach I think it is pretty safe to say that this is of no
> concern...

I'm sorry, that's not what I'm saying. Even supposing perfection of every
offload engine, offload engines have no monopoly on faults; regenerated
checksums also mask bus, stack and memory faults in pf-altered packets.

Regeneration always hides faults in /something/, unfortunately. 
pf_check_proto_cksum() takes care to mitigate that by preventing it from
hiding faults in the routers a packet passes through --- except for the current
one, which it can't.

That is, either router faults are a concern or they're not. pf can't
have it both ways, concerned to detect faults in every router other than 
the one on which it is running. The proposed patch fixes that.

And, as I wrote in a previous post, we know that routers corrupt data:

>> Right now my home firewall shows 30 TCP segments dropped for bad checksums.
>> As checks at least as strong are used by every sane link-layer this
>> virtually implies the dropped packets suffered router or end-point faults.

Another (home) router I administer was seeing IIRC five a day on average over
42 days, something we expect of a globe-spanning internetwork. Passing one of
these damaged segments to the user sufficies to break MACs and drop secure
connections.

Nonetheless, one might still argue that the reduced reliability is acceptable.
If there were some compelling upside, perhaps, maybe it might be, but I've
seen no evidence of performance benefits and the proposed patch addresses the
code complexity from which the more reliable method suffered.


Lastly, I don't want to sound like I'm trying to beat up on your checksum
work. I know very well how much effort is involved and it's very much
appreciated; you've cleaned up the code considerably. The patch on the table
builds on that and it would be far uglier without you having moved checksum
calculation to the output paths, stopped partially checksummed packets passing
through the stack, eliminated the possibility of rdr-to-localhost errors, and
generally simplified things all over the place.

I also appreciate that you aim to avoid knarly problems by doing something
completely different. Too often people just grit their teeth and keep chewing
mindlessly on the knot, or tie a bigger one to workaround it. That they don't
around here is one of the reasons I like OpenBSD.


best, 
Richard. 

[0]

#!/bin/sh
gcc -P -fpreprocessed -dD -E $1

[1] pf.c:1.944

Index: net/pf.c
===================================================================
--- net.orig/pf.c
+++ net/pf.c
@@ -5092,21 +5092,24 @@ pf_test_state_icmp(struct pf_pdesc *pd,
 #ifdef INET6
                                case AF_INET6:
                                        m_copyback(pd->m, pd->off,
                                            sizeof(struct icmp6_hdr),
                                            pd->hdr.icmp6, M_NOWAIT);
                                        m_copyback(pd2.m, ipoff2, sizeof(h2_6),
                                            &h2_6, M_NOWAIT);
                                        break;
 #endif /* INET6 */
                                }
-                               uh.uh_sum = 0;
+                               /* Avoid recomputing quoted UDP checksum.
+                                * note: udp6 0 csum invalid per rfc2460 p27.
+                                * but presumed nothing cares in this context */
+                               pf_change_16(pd, &uh.uh_sum, 0);
                                m_copyback(pd2.m, pd2.off, sizeof(uh), &uh,
                                    M_NOWAIT);
                                copyback = 1;
                        }
                        break;
                }
                case IPPROTO_ICMP: {
                        struct icmp             iih;
 
                        if (pd2.af != AF_INET) {
@@ -5158,21 +5161,22 @@ pf_test_state_icmp(struct pf_pdesc *pd,
                                            pd->hdr.icmp6, M_NOWAIT);
                                        if (pf_change_icmp_af(pd->m, ipoff2,
                                            pd, &pd2, &nk->addr[sidx],
                                            &nk->addr[didx], pd->af, nk->af))
                                                return (PF_DROP);
                                        pd->proto = IPPROTO_ICMPV6;
                                        if (pf_translate_icmp_af(nk->af, &iih))
                                                return (PF_DROP);
                                        if (virtual_type == htons(ICMP_ECHO) &&
                                            nk->port[iidx] != iih.icmp_id)
-                                               iih.icmp_id = nk->port[iidx];
+                                               pf_change_16(pd, &iih.icmp_id,
+                                                   nk->port[iidx]);
                                        m_copyback(pd2.m, pd2.off, ICMP_MINLEN,
                                            &iih, M_NOWAIT);
                                        pd->m->m_pkthdr.ph_rtableid =
                                            nk->rdomain;
                                        pd->destchg = 1;
                                        PF_ACPY(&pd->nsaddr,
                                            &nk->addr[pd2.sidx], nk->af);
                                        PF_ACPY(&pd->ndaddr,
                                            &nk->addr[pd2.didx], nk->af);
                                        pd->naf = nk->af;
@@ -5270,21 +5274,22 @@ pf_test_state_icmp(struct pf_pdesc *pd,
                                        if (pf_change_icmp_af(pd->m, ipoff2,
                                            pd, &pd2, &nk->addr[sidx],
                                            &nk->addr[didx], pd->af, nk->af))
                                                return (PF_DROP);
                                        pd->proto = IPPROTO_ICMP;
                                        if (pf_translate_icmp_af(nk->af, &iih))
                                                return (PF_DROP);
                                        if (virtual_type ==
                                            htons(ICMP6_ECHO_REQUEST) &&
                                            nk->port[iidx] != iih.icmp6_id)
-                                               iih.icmp6_id = nk->port[iidx];
+                                               pf_change_16(pd, &iih.icmp6_id,
+                                                   nk->port[iidx]);
                                        m_copyback(pd2.m, pd2.off,
                                            sizeof(struct icmp6_hdr), &iih,
                                            M_NOWAIT);
                                        pd->m->m_pkthdr.ph_rtableid =
                                            nk->rdomain;
                                        pd->destchg = 1;
                                        PF_ACPY(&pd->nsaddr,
                                            &nk->addr[pd2.sidx], nk->af);
                                        PF_ACPY(&pd->ndaddr,
                                            &nk->addr[pd2.didx], nk->af);










 



> 

Reply via email to