On Mon, Oct 19, 2020 at 12:33:25PM +0100, Stuart Henderson wrote: > On 2020/10/19 19:53, David Gwynne wrote: > > On Mon, Oct 19, 2020 at 09:34:31AM +0100, Stuart Henderson wrote: > > > On 2020/10/19 15:35, David Gwynne wrote: > > > > every few years i try and use route-to in pf, and every time it > > > > goes badly. i tried it again last week in a slightly different > > > > setting, and actually tried to understand the sharp edges i hit > > > > this time instead of giving up. it turns out there are 2 or 3 > > > > different things together that have cause me trouble, which is why > > > > the diff below is so big. > > > > > > I used to route-to/reply-to quite a lot at places with poor internet > > > connections to split traffic between lines (mostly those have better > > > connections now so I don't need it as often). It worked as I expected - > > > but I only ever used it with the interface specified. > > > > cool. did it work beyond the first packet in a connection? > > It must have done. The webcams would have utterly broken the rest of > traffic if it hadn't :)
> > > > I mostly used it with pppoe interfaces so the peer address was unknown > > > at ruleset load time. (I was lucky and had static IPs my side, but the > > > ISP side was variable). I relied on the fact that once packets are > > > directed at a point-point interface there's only one place for them to > > > go. I didn't notice that ":peer" might be useful here (and the syntax > > > 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't > > > have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc > > > (though actually I think it works with $any_random_address@pppoeX). > > > > yes. i was trying to use it with peers over ethernet, and always > > struggled with the syntax. > > > > > > the first and i would argue most fundamental problem is a semantic > > > > problem. if you ask a random person who has some clue about networks > > > > and routing what they would expect the "argument" to route-to or > > > > reply-to to be, they would say "a nexthop address" or "a gateway > > > > address". eg, say i want to force packets to a specific backend > > > > server without using NAT, i would write a rule like this: > > > > > > > > n_servers="192.0.2.128/27" > > > > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > > > > > > > pfctl will happily parse this, shove it into the kernel, let you read > > > > the rules back out again with pfctl -sr, and it all looks plausible, but > > > > it turns out that it's using the argument to route-to as an interface > > > > name. because rulesets can refer to interfaces that don't exist yet, pf > > > > just passes the IP address around as a string, hoping i'll plug in an > > > > interface with a driver name that looks like an ip address. i spent > > > > literally a day trying to figure out why a rule like this wasn't > > > > working. > > > > > > I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's > > > supposed to work. So either doc or implementation bug there. > > > > im leaning toward implementation bug. > > > > > route = ( "route-to" | "reply-to" | "dup-to" ) > > > ( routehost | "{" routehost-list "}" ) > > > [ pooltype ] > > > > > > routehost-list = routehost [ [ "," ] routehost-list ] > > > > > > routehost = host | host "@" interface-name | > > > "(" interface-name [ address [ "/" mask-bits ] ] ")" > > > > > > > the second problem is that the pf_route calls from pfsync don't > > > > have all the information it is supposed to have. more specifically, > > > > an ifp pointer isn't set which leads to a segfault. the ifp pointer > > > > isn't set because pfsync doesnt track which interface a packet is > > > > going out, it assumes the ip layer will get it right again later, or a > > > > rule provided something usable. > > > > > > > > the third problem is that pf_route relies on information from rules to > > > > work correctly. this is a problem in a pfsync environment because you > > > > cannot have the same ruleset on both firewalls 100% of the time, which > > > > means you cannot have route-to/reply-to behave consistently on a pair of > > > > firwalls 100% of the time. > > > > > > I didn't run into this because pppoe(4) and pfsync/carp don't really > > > go well together, but ouch! > > > > > > > all of this together makes things work pretty obviously and smoothly. > > > > in my opinion anyway. route-to now works more like rdr-to, it just > > > > feels like it changes the address used for the route lookup rather > > > > than changing the actual IP address in the packet. it also works > > > > predictably in a pfsync pair, which is great from the point of view of > > > > high availability. > > > > > > > > the main caveat is that it's not backward compatible. if you're already > > > > using route-to, you will need to tweak your rules to have them parse. > > > > however, i doubt anyone is using this stuff because it feels very broken > > > > to me. > > > > > > Do you expect this to work with a bracketed "address" to defer lookup > > > until rule evaluation time? i.e. > > > > > > pass out proto tcp to any port 22 route-to (pppoe1:peer) > > > > in my opinion route-to should be able to take whatever rdr-to accepts. > > however, i just tried it and it doesnt currently work, but i'm sure i > > can figure it out. i think i tried it with the wrong pfctl. it does appear to work as we want. > > > > > I think that will be all that's needed to allow converting the pppoe > > > use case. I don't have a multiple pppoe setup handy but I can probably > > > hack together some sort of test. > > > > > > I've also used route-to with squid "transparent" proxying (shown in > > > the pkg-readme), I don't do that any more but I can put a squid test > > > together easily enough. > > > > it looks doable. > > > > my changes will break route-to with "no state" rules though. should > > i try and keep those working too? > > I don't think I ever used that myself. aja@ added the "no state" in > squid's pkg-readme (to avoid state problems when hairpinning the > intercepted packets back to a machine on a subnet alongside the client, > squid replying to the client directly), no idea if he's still using that, > I would guess probably not. > > If I understand the problem correctly I think this case could use "keep > state (sloppy)" instead anyway. cool. here's a tweaked diff. i also checked that it did the thing sashan@ and i were talking about where the route-to address is used as a destination, and the gateway for that destination is used for the routing. i am feeling very warm and fuzzy about this diff at the moment. Index: sbin/pfctl/parse.y =================================================================== RCS file: /cvs/src/sbin/pfctl/parse.y,v retrieving revision 1.704 diff -u -p -r1.704 parse.y --- sbin/pfctl/parse.y 1 Oct 2020 14:02:08 -0000 1.704 +++ sbin/pfctl/parse.y 19 Oct 2020 23:19:22 -0000 @@ -276,6 +276,7 @@ struct filter_opts { struct redirspec nat; struct redirspec rdr; struct redirspec rroute; + u_int8_t rt; /* scrub opts */ int nodf; @@ -284,15 +285,6 @@ struct filter_opts { int randomid; int max_mss; - /* route opts */ - struct { - struct node_host *host; - u_int8_t rt; - u_int8_t pool_opts; - sa_family_t af; - struct pf_poolhashkey *key; - } route; - struct { u_int32_t limit; u_int32_t seconds; @@ -517,7 +509,6 @@ int parseport(char *, struct range *r, i %type <v.host> ipspec xhost host dynaddr host_list %type <v.host> table_host_list tablespec %type <v.host> redir_host_list redirspec -%type <v.host> route_host route_host_list routespec %type <v.os> os xos os_list %type <v.port> portspec port_list port_item %type <v.uid> uids uid_list uid_item @@ -974,7 +965,7 @@ anchorrule : ANCHOR anchorname dir quick YYERROR; } - if ($9.route.rt) { + if ($9.rt) { yyerror("cannot specify route handling " "on anchors"); YYERROR; @@ -1842,37 +1833,13 @@ pfrule : action dir logquick interface decide_address_family($7.src.host, &r.af); decide_address_family($7.dst.host, &r.af); - if ($8.route.rt) { - if (!r.direction) { + if ($8.rt) { + if ($8.rt != PF_DUPTO && !r.direction) { yyerror("direction must be explicit " "with rules that specify routing"); YYERROR; } - r.rt = $8.route.rt; - r.route.opts = $8.route.pool_opts; - if ($8.route.key != NULL) - memcpy(&r.route.key, $8.route.key, - sizeof(struct pf_poolhashkey)); - } - if (r.rt) { - decide_address_family($8.route.host, &r.af); - if ((r.route.opts & PF_POOL_TYPEMASK) == - PF_POOL_NONE && ($8.route.host->next != NULL || - $8.route.host->addr.type == PF_ADDR_TABLE || - DYNIF_MULTIADDR($8.route.host->addr))) - r.route.opts |= PF_POOL_ROUNDROBIN; - if ($8.route.host->next != NULL) { - if (!PF_POOL_DYNTYPE(r.route.opts)) { - yyerror("address pool option " - "not supported by type"); - YYERROR; - } - } - /* fake redirspec */ - if (($8.rroute.rdr = calloc(1, - sizeof(*$8.rroute.rdr))) == NULL) - err(1, "$8.rroute.rdr"); - $8.rroute.rdr->host = $8.route.host; + r.rt = $8.rt; } if (expand_divertspec(&r, &$8.divert)) @@ -2136,30 +2103,14 @@ filter_opt : USER uids { sizeof(filter_opts.nat.pool_opts)); filter_opts.nat.pool_opts.staticport = 1; } - | ROUTETO routespec pool_opts { - filter_opts.route.host = $2; - filter_opts.route.rt = PF_ROUTETO; - filter_opts.route.pool_opts = $3.type | $3.opts; - memcpy(&filter_opts.rroute.pool_opts, &$3, - sizeof(filter_opts.rroute.pool_opts)); - if ($3.key != NULL) - filter_opts.route.key = $3.key; + | ROUTETO routespec { + filter_opts.rt = PF_ROUTETO; } - | REPLYTO routespec pool_opts { - filter_opts.route.host = $2; - filter_opts.route.rt = PF_REPLYTO; - filter_opts.route.pool_opts = $3.type | $3.opts; - if ($3.key != NULL) - filter_opts.route.key = $3.key; - } - | DUPTO routespec pool_opts { - filter_opts.route.host = $2; - filter_opts.route.rt = PF_DUPTO; - filter_opts.route.pool_opts = $3.type | $3.opts; - memcpy(&filter_opts.rroute.pool_opts, &$3, - sizeof(filter_opts.rroute.pool_opts)); - if ($3.key != NULL) - filter_opts.route.key = $3.key; + | REPLYTO routespec { + filter_opts.rt = PF_REPLYTO; + } + | DUPTO routespec { + filter_opts.rt = PF_DUPTO; } | not RECEIVEDON if_item { if (filter_opts.rcv) { @@ -3733,122 +3684,21 @@ pool_opt : BITMASK { } ; -route_host : STRING { - /* try to find @if0 address specs */ - if (strrchr($1, '@') != NULL) { - if (($$ = host($1, pf->opts)) == NULL) { - yyerror("invalid host for route spec"); - YYERROR; - } - free($1); - } else { - $$ = calloc(1, sizeof(struct node_host)); - if ($$ == NULL) - err(1, "route_host: calloc"); - $$->ifname = $1; - $$->addr.type = PF_ADDR_NONE; - set_ipmask($$, 128); - $$->next = NULL; - $$->tail = $$; - } - } - | STRING '/' STRING { - char *buf; - - if (asprintf(&buf, "%s/%s", $1, $3) == -1) - err(1, "host: asprintf"); - free($1); - if (($$ = host(buf, pf->opts)) == NULL) { - /* error. "any" is handled elsewhere */ - free(buf); - yyerror("could not parse host specification"); - YYERROR; - } - free(buf); - } - | '<' STRING '>' { - if (strlen($2) >= PF_TABLE_NAME_SIZE) { - yyerror("table name '%s' too long", $2); - free($2); - YYERROR; - } - $$ = calloc(1, sizeof(struct node_host)); - if ($$ == NULL) - err(1, "host: calloc"); - $$->addr.type = PF_ADDR_TABLE; - if (strlcpy($$->addr.v.tblname, $2, - sizeof($$->addr.v.tblname)) >= - sizeof($$->addr.v.tblname)) - errx(1, "host: strlcpy"); - free($2); - $$->next = NULL; - $$->tail = $$; - } - | dynaddr '/' NUMBER { - struct node_host *n; - - if ($3 < 0 || $3 > 128) { - yyerror("bit number too big"); - YYERROR; - } - $$ = $1; - for (n = $1; n != NULL; n = n->next) - set_ipmask(n, $3); - } - | '(' STRING host ')' { - struct node_host *n; - - $$ = $3; - /* XXX check masks, only full mask should be allowed */ - for (n = $3; n != NULL; n = n->next) { - if ($$->ifname) { - yyerror("cannot specify interface twice " - "in route spec"); - YYERROR; - } - if (($$->ifname = strdup($2)) == NULL) - errx(1, "host: strdup"); - } - free($2); - } - ; - -route_host_list : route_host optweight optnl { - if ($2 > 0) { - struct node_host *n; - for (n = $1; n != NULL; n = n->next) - n->weight = $2; - } - $$ = $1; - } - | route_host_list comma route_host optweight optnl { - if ($1->af == 0) - $1->af = $3->af; - if ($1->af != $3->af) { - yyerror("all pool addresses must be in the " - "same address family"); +routespec : redirspec pool_opts { + struct redirection *redir; + if (filter_opts.rt != PF_NOPFROUTE) { + yyerror("cannot respecify " + "route-to/reply-to/dup-to"); YYERROR; } - $1->tail->next = $3; - $1->tail = $3->tail; - if ($4 > 0) { - struct node_host *n; - for (n = $3; n != NULL; n = n->next) - n->weight = $4; - } - $$ = $1; - } - ; - -routespec : route_host optweight { - if ($2 > 0) { - struct node_host *n; - for (n = $1; n != NULL; n = n->next) - n->weight = $2; - } - $$ = $1; + redir = calloc(1, sizeof(*redir)); + if (redir == NULL) + err(1, "routespec calloc"); + redir->host = $1; + filter_opts.rroute.rdr = redir; + memcpy(&filter_opts.rroute.pool_opts, &$2, + sizeof(filter_opts.rroute.pool_opts)); } - | '{' optnl route_host_list '}' { $$ = $3; } ; timeout_spec : STRING NUMBER @@ -4709,7 +4559,7 @@ expand_rule(struct pf_rule *r, int keepr error += collapse_redirspec(&r->rdr, r, rdr, 0); error += collapse_redirspec(&r->nat, r, nat, 0); - error += collapse_redirspec(&r->route, r, rroute, 1); + error += collapse_redirspec(&r->route, r, rroute, 0); /* disallow @if in from or to for the time being */ if ((src_host->addr.type == PF_ADDR_ADDRMASK && @@ -5955,7 +5805,7 @@ filteropts_to_rule(struct pf_rule *r, st yyerror("af-to can only be used with direction in"); return (1); } - if ((opts->marker & FOM_AFTO) && opts->route.rt) { + if ((opts->marker & FOM_AFTO) && opts->rt) { yyerror("af-to cannot be used together with " "route-to, reply-to, dup-to"); return (1); Index: sys/net/if_pfsync.c =================================================================== RCS file: /cvs/src/sys/net/if_pfsync.c,v retrieving revision 1.278 diff -u -p -r1.278 if_pfsync.c --- sys/net/if_pfsync.c 24 Aug 2020 15:30:58 -0000 1.278 +++ sys/net/if_pfsync.c 19 Oct 2020 23:19:22 -0000 @@ -612,7 +612,8 @@ pfsync_state_import(struct pfsync_state st->rtableid[PF_SK_STACK] = ntohl(sp->rtableid[PF_SK_STACK]); /* copy to state */ - bcopy(&sp->rt_addr, &st->rt_addr, sizeof(st->rt_addr)); + st->rt_addr = sp->rt_addr; + st->rt = sp->rt; st->creation = getuptime() - ntohl(sp->creation); st->expire = getuptime(); if (ntohl(sp->expire)) { @@ -643,7 +644,6 @@ pfsync_state_import(struct pfsync_state st->rule.ptr = r; st->anchor.ptr = NULL; - st->rt_kif = NULL; st->pfsync_time = getuptime(); st->sync_state = PFSYNC_S_NONE; @@ -1843,6 +1843,7 @@ pfsync_undefer(struct pfsync_deferral *p { struct pfsync_softc *sc = pfsyncif; struct pf_pdesc pdesc; + struct pf_state *s = pd->pd_st; NET_ASSERT_LOCKED(); @@ -1852,35 +1853,33 @@ pfsync_undefer(struct pfsync_deferral *p TAILQ_REMOVE(&sc->sc_deferrals, pd, pd_entry); sc->sc_deferred--; - CLR(pd->pd_st->state_flags, PFSTATE_ACK); + CLR(s->state_flags, PFSTATE_ACK); if (drop) m_freem(pd->pd_m); else { - if (pd->pd_st->rule.ptr->rt == PF_ROUTETO) { + if (s->rt == PF_ROUTETO) { if (pf_setup_pdesc(&pdesc, - pd->pd_st->key[PF_SK_WIRE]->af, - pd->pd_st->direction, pd->pd_st->rt_kif, + s->key[PF_SK_WIRE]->af, + s->direction, s->kif, pd->pd_m, NULL) != PF_PASS) { m_freem(pd->pd_m); goto out; } - switch (pd->pd_st->key[PF_SK_WIRE]->af) { + switch (s->key[PF_SK_WIRE]->af) { case AF_INET: - pf_route(&pdesc, - pd->pd_st->rule.ptr, pd->pd_st); + pf_route(&pdesc, s); break; #ifdef INET6 case AF_INET6: - pf_route6(&pdesc, - pd->pd_st->rule.ptr, pd->pd_st); + pf_route6(&pdesc, s); break; #endif /* INET6 */ default: - unhandled_af(pd->pd_st->key[PF_SK_WIRE]->af); + unhandled_af(s->key[PF_SK_WIRE]->af); } pd->pd_m = pdesc.m; } else { - switch (pd->pd_st->key[PF_SK_WIRE]->af) { + switch (s->key[PF_SK_WIRE]->af) { case AF_INET: ip_output(pd->pd_m, NULL, NULL, 0, NULL, NULL, 0); @@ -1892,12 +1891,12 @@ pfsync_undefer(struct pfsync_deferral *p break; #endif /* INET6 */ default: - unhandled_af(pd->pd_st->key[PF_SK_WIRE]->af); + unhandled_af(s->key[PF_SK_WIRE]->af); } } } out: - pf_state_unref(pd->pd_st); + pf_state_unref(s); pool_put(&sc->sc_pool, pd); } Index: sys/net/pf.c =================================================================== RCS file: /cvs/src/sys/net/pf.c,v retrieving revision 1.1094 diff -u -p -r1.1094 pf.c --- sys/net/pf.c 24 Jul 2020 18:17:15 -0000 1.1094 +++ sys/net/pf.c 19 Oct 2020 23:19:22 -0000 @@ -1122,12 +1122,6 @@ pf_find_state(struct pf_pdesc *pd, struc } *state = s; - if (pd->dir == PF_OUT && s->rt_kif != NULL && s->rt_kif != pd->kif && - ((s->rule.ptr->rt == PF_ROUTETO && - s->rule.ptr->direction == PF_OUT) || - (s->rule.ptr->rt == PF_REPLYTO && - s->rule.ptr->direction == PF_IN))) - return (PF_PASS); return (PF_MATCH); } @@ -1186,7 +1180,8 @@ pf_state_export(struct pfsync_state *sp, /* copy from state */ strlcpy(sp->ifname, st->kif->pfik_name, sizeof(sp->ifname)); - memcpy(&sp->rt_addr, &st->rt_addr, sizeof(sp->rt_addr)); + sp->rt = st->rt; + sp->rt_addr = st->rt_addr; sp->creation = htonl(getuptime() - st->creation); expire = pf_state_expires(st); if (expire <= getuptime()) @@ -3433,29 +3428,13 @@ pf_set_rt_ifp(struct pf_state *s, struct struct pf_rule *r = s->rule.ptr; int rv; - s->rt_kif = NULL; - if (!r->rt) + if (r->rt == PF_NOPFROUTE) return (0); - switch (af) { - case AF_INET: - rv = pf_map_addr(AF_INET, r, saddr, &s->rt_addr, NULL, sns, - &r->route, PF_SN_ROUTE); - break; -#ifdef INET6 - case AF_INET6: - rv = pf_map_addr(AF_INET6, r, saddr, &s->rt_addr, NULL, sns, - &r->route, PF_SN_ROUTE); - break; -#endif /* INET6 */ - default: - rv = 1; - } - - if (rv == 0) { - s->rt_kif = r->route.kif; - s->natrule.ptr = r; - } + rv = pf_map_addr(af, r, saddr, &s->rt_addr, NULL, sns, + &r->route, PF_SN_ROUTE); + if (rv == 0) + s->rt = r->rt; return (rv); } @@ -5986,15 +5965,13 @@ pf_rtlabel_match(struct pf_addr *addr, s /* pf_route() may change pd->m, adjust local copies after calling */ void -pf_route(struct pf_pdesc *pd, struct pf_rule *r, struct pf_state *s) +pf_route(struct pf_pdesc *pd, struct pf_state *s) { struct mbuf *m0, *m1; struct sockaddr_in *dst, sin; struct rtentry *rt = NULL; struct ip *ip; struct ifnet *ifp = NULL; - struct pf_addr naddr; - struct pf_src_node *sns[PF_SN_MAX]; int error = 0; unsigned int rtableid; @@ -6004,11 +5981,11 @@ pf_route(struct pf_pdesc *pd, struct pf_ return; } - if (r->rt == PF_DUPTO) { + if (s->rt == PF_DUPTO) { if ((m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT)) == NULL) return; } else { - if ((r->rt == PF_REPLYTO) == (r->direction == pd->dir)) + if ((s->rt == PF_REPLYTO) == (s->direction == pd->dir)) return; m0 = pd->m; } @@ -6021,44 +5998,31 @@ pf_route(struct pf_pdesc *pd, struct pf_ ip = mtod(m0, struct ip *); - memset(&sin, 0, sizeof(sin)); - dst = &sin; - dst->sin_family = AF_INET; - dst->sin_len = sizeof(*dst); - dst->sin_addr = ip->ip_dst; - rtableid = m0->m_pkthdr.ph_rtableid; - if (pd->dir == PF_IN) { if (ip->ip_ttl <= IPTTLDEC) { - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pf_send_icmp(m0, ICMP_TIMXCEED, ICMP_TIMXCEED_INTRANS, 0, - pd->af, r, pd->rdomain); + pd->af, s->rule.ptr, pd->rdomain); goto bad; } ip->ip_ttl -= IPTTLDEC; } - if (s == NULL) { - memset(sns, 0, sizeof(sns)); - if (pf_map_addr(AF_INET, r, - (struct pf_addr *)&ip->ip_src, - &naddr, NULL, sns, &r->route, PF_SN_ROUTE)) { - DPFPRINTF(LOG_ERR, - "%s: pf_map_addr() failed", __func__); - goto bad; - } + memset(&sin, 0, sizeof(sin)); + dst = &sin; + dst->sin_family = AF_INET; + dst->sin_len = sizeof(*dst); + dst->sin_addr.s_addr = s->rt_addr.v4.s_addr; + rtableid = m0->m_pkthdr.ph_rtableid; - if (!PF_AZERO(&naddr, AF_INET)) - dst->sin_addr.s_addr = naddr.v4.s_addr; - ifp = r->route.kif ? - r->route.kif->pfik_ifp : NULL; - } else { - if (!PF_AZERO(&s->rt_addr, AF_INET)) - dst->sin_addr.s_addr = - s->rt_addr.v4.s_addr; - ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; + rt = rtalloc(sintosa(dst), RT_RESOLVE, rtableid); + if (!rtisvalid(rt)) { + ipstat_inc(ips_noroute); + goto bad; } + + ifp = if_get(rt->rt_ifidx); if (ifp == NULL) goto bad; @@ -6074,12 +6038,6 @@ pf_route(struct pf_pdesc *pd, struct pf_ } ip = mtod(m0, struct ip *); } - - rt = rtalloc(sintosa(dst), RT_RESOLVE, rtableid); - if (!rtisvalid(rt)) { - ipstat_inc(ips_noroute); - goto bad; - } /* A locally generated packet may have invalid source address. */ if ((ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET && (ifp->if_flags & IFF_LOOPBACK) == 0) @@ -6105,9 +6063,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ */ if (ip->ip_off & htons(IP_DF)) { ipstat_inc(ips_cantfrag); - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pf_send_icmp(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, - ifp->if_mtu, pd->af, r, pd->rdomain); + ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain); goto bad; } @@ -6131,8 +6089,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ ipstat_inc(ips_fragmented); done: - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pd->m = NULL; + if_put(ifp); rtfree(rt); return; @@ -6144,15 +6103,13 @@ bad: #ifdef INET6 /* pf_route6() may change pd->m, adjust local copies after calling */ void -pf_route6(struct pf_pdesc *pd, struct pf_rule *r, struct pf_state *s) +pf_route6(struct pf_pdesc *pd, struct pf_state *s) { struct mbuf *m0; struct sockaddr_in6 *dst, sin6; struct rtentry *rt = NULL; struct ip6_hdr *ip6; struct ifnet *ifp = NULL; - struct pf_addr naddr; - struct pf_src_node *sns[PF_SN_MAX]; struct m_tag *mtag; unsigned int rtableid; @@ -6162,11 +6119,11 @@ pf_route6(struct pf_pdesc *pd, struct pf return; } - if (r->rt == PF_DUPTO) { + if (s->rt == PF_DUPTO) { if ((m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT)) == NULL) return; } else { - if ((r->rt == PF_REPLYTO) == (r->direction == pd->dir)) + if ((s->rt == PF_REPLYTO) == (s->direction == pd->dir)) return; m0 = pd->m; } @@ -6178,42 +6135,31 @@ pf_route6(struct pf_pdesc *pd, struct pf } ip6 = mtod(m0, struct ip6_hdr *); - memset(&sin6, 0, sizeof(sin6)); - dst = &sin6; - dst->sin6_family = AF_INET6; - dst->sin6_len = sizeof(*dst); - dst->sin6_addr = ip6->ip6_dst; - rtableid = m0->m_pkthdr.ph_rtableid; - if (pd->dir == PF_IN) { if (ip6->ip6_hlim <= IPV6_HLIMDEC) { - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pf_send_icmp(m0, ICMP6_TIME_EXCEEDED, ICMP6_TIME_EXCEED_TRANSIT, 0, - pd->af, r, pd->rdomain); + pd->af, s->rule.ptr, pd->rdomain); goto bad; } ip6->ip6_hlim -= IPV6_HLIMDEC; } - if (s == NULL) { - memset(sns, 0, sizeof(sns)); - if (pf_map_addr(AF_INET6, r, (struct pf_addr *)&ip6->ip6_src, - &naddr, NULL, sns, &r->route, PF_SN_ROUTE)) { - DPFPRINTF(LOG_ERR, - "%s: pf_map_addr() failed", __func__); - goto bad; - } - if (!PF_AZERO(&naddr, AF_INET6)) - pf_addrcpy((struct pf_addr *)&dst->sin6_addr, - &naddr, AF_INET6); - ifp = r->route.kif ? r->route.kif->pfik_ifp : NULL; - } else { - if (!PF_AZERO(&s->rt_addr, AF_INET6)) - pf_addrcpy((struct pf_addr *)&dst->sin6_addr, - &s->rt_addr, AF_INET6); - ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; + memset(&sin6, 0, sizeof(sin6)); + dst = &sin6; + dst->sin6_family = AF_INET6; + dst->sin6_len = sizeof(*dst); + pf_addrcpy((struct pf_addr *)&dst->sin6_addr, &s->rt_addr, AF_INET6); + rtableid = m0->m_pkthdr.ph_rtableid; + + rt = rtalloc(sin6tosa(dst), RT_RESOLVE, rtableid); + if (!rtisvalid(rt)) { + ip6stat_inc(ip6s_noroute); + goto bad; } + + ifp = if_get(rt->rt_ifidx); if (ifp == NULL) goto bad; @@ -6231,11 +6177,7 @@ pf_route6(struct pf_pdesc *pd, struct pf if (IN6_IS_SCOPE_EMBED(&dst->sin6_addr)) dst->sin6_addr.s6_addr16[1] = htons(ifp->if_index); - rt = rtalloc(sin6tosa(dst), RT_RESOLVE, rtableid); - if (!rtisvalid(rt)) { - ip6stat_inc(ip6s_noroute); - goto bad; - } + /* A locally generated packet may have invalid source address. */ if (IN6_IS_ADDR_LOOPBACK(&ip6->ip6_src) && (ifp->if_flags & IFF_LOOPBACK) == 0) @@ -6253,15 +6195,16 @@ pf_route6(struct pf_pdesc *pd, struct pf ifp->if_output(ifp, m0, sin6tosa(dst), rt); } else { ip6stat_inc(ip6s_cantfrag); - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pf_send_icmp(m0, ICMP6_PACKET_TOO_BIG, 0, - ifp->if_mtu, pd->af, r, pd->rdomain); + ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain); goto bad; } done: - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pd->m = NULL; + if_put(ifp); rtfree(rt); return; @@ -6271,7 +6214,6 @@ bad: } #endif /* INET6 */ - /* * check TCP checksum and set mbuf flag * off is the offset where the protocol header starts @@ -7287,14 +7229,14 @@ done: pd.m = NULL; break; default: - if (r->rt) { + if (s && s->rt) { switch (pd.af) { case AF_INET: - pf_route(&pd, r, s); + pf_route(&pd, s); break; #ifdef INET6 case AF_INET6: - pf_route6(&pd, r, s); + pf_route6(&pd, s); break; #endif /* INET6 */ } Index: sys/net/pfvar.h =================================================================== RCS file: /cvs/src/sys/net/pfvar.h,v retrieving revision 1.496 diff -u -p -r1.496 pfvar.h --- sys/net/pfvar.h 24 Aug 2020 15:30:58 -0000 1.496 +++ sys/net/pfvar.h 19 Oct 2020 23:19:22 -0000 @@ -133,7 +133,7 @@ enum { PFTM_TCP_FIRST_PACKET, PFTM_TCP_O */ #define PF_FRAG_ENTRY_LIMIT 64 -enum { PF_NOPFROUTE, PF_ROUTETO, PF_DUPTO, PF_REPLYTO }; +enum { PF_NOPFROUTE = 0 , PF_ROUTETO, PF_DUPTO, PF_REPLYTO }; enum { PF_LIMIT_STATES, PF_LIMIT_SRC_NODES, PF_LIMIT_FRAGS, PF_LIMIT_TABLES, PF_LIMIT_TABLE_ENTRIES, PF_LIMIT_PKTDELAY_PKTS, PF_LIMIT_MAX }; @@ -762,7 +762,6 @@ struct pf_state { struct pf_sn_head src_nodes; struct pf_state_key *key[2]; /* addresses stack and wire */ struct pfi_kif *kif; - struct pfi_kif *rt_kif; u_int64_t packets[2]; u_int64_t bytes[2]; int32_t creation; @@ -797,6 +796,7 @@ struct pf_state { u_int16_t if_index_out; pf_refcnt_t refcnt; u_int16_t delay; + u_int8_t rt; }; /* @@ -852,7 +852,7 @@ struct pfsync_state { u_int8_t proto; u_int8_t direction; u_int8_t log; - u_int8_t pad0; + u_int8_t rt; u_int8_t timeout; u_int8_t sync_flags; u_int8_t updates; @@ -1798,8 +1798,8 @@ int pf_state_key_attach(struct pf_state_ int pf_translate(struct pf_pdesc *, struct pf_addr *, u_int16_t, struct pf_addr *, u_int16_t, u_int16_t, int); int pf_translate_af(struct pf_pdesc *); -void pf_route(struct pf_pdesc *, struct pf_rule *, struct pf_state *); -void pf_route6(struct pf_pdesc *, struct pf_rule *, struct pf_state *); +void pf_route(struct pf_pdesc *, struct pf_state *); +void pf_route6(struct pf_pdesc *, struct pf_state *); void pf_init_threshold(struct pf_threshold *, u_int32_t, u_int32_t); int pf_delay_pkt(struct mbuf *, u_int);