On Fri, Jan 08, 2021 at 04:43:39PM +0100, Alexandr Nedvedicky wrote: > Hello, > > </snip> > > ---------------------------- > > revision 1.294 > > date: 2003/01/02 01:56:56; author: dhartmei; state: Exp; lines: +27 -49; > > When route-to/reply-to is used in combination with address translation, > > pf_test() may be called twice for the same packet. In this case, make > > sure the translation is only applied in the second call. This solves > > the problem with state insert failures where the second pf_test() call > > tried to insert another state entry after the first call's translation. > > ok henning@, mcbride@, thanks to Joe Nall for additional testing. > > ---------------------------- > > > > I have tested your diffs in my setup, they all pass. I have not > > tested the scenario mentioned in the commit message. Note that the > > address translation implementation in 2003 was different from what > > we have now. And sasha@'s analysis shows that the current code is > > wrong in other use cases. > > > > I've completely forgot there was a change in NAT. Therefore I could > not understand the commit message. > > </snip> > > > > > The only way to find out is to commit it. It reduces comlexity that > > noone understands. > > > > OK bluhm@ to remove the check > > > > Please leave the "if (pd->kif->pfik_ifp != ifp)" around pf_test() > > in pf_route() as it is for now. > > I agree with bluhm@ here. we should proceed with small steps in such > case and let things to settle down before making next move.
ok. i don't know how to split up the rest of the change though. here's an updated diff that includes the rest of the kernel changes and the pfctl and pf.conf tweaks. it's probably useful for me to try and explain at a high level what i think the semantics should be, otherwise we might end up arguing about which bits of the current config i broke. so, from an extremely high level point of view, and apologies if this is condescending, pf sits between the network stack and an interface that a packet travels on. for connections handled by the local box, this means packets come from the stack and get an output interface selected by a route lookup, then pf checks it, and then it goes out the selected interface. replies come into an interface, get checked by pf, and then enter the stack. when forwarding, a packet comes into an interface, pf checks it, the stack does a route lookup to pick an interface, pf checks it again, and then it goes out the interface. so what does it mean when route-to (or reply-to) gets involved? i'm saying that when route-to is applied to a packet, pf takes the packet away from the stack and immediately forwards it toward to specified destination address. for a packet entering the system, ie, when the packet is going from the interface into the stack, route-to should pretend that it is forwarding the packet and basically push it straight out an interface. however, like normal forwarding via the stack, there might be some policy on packets leaving that interface that you want to apply, so pf should run pf_test in that situation so the policy can be applied. this is especially useful if you need to apply nat-to when packets leave a particular interface. however, if you route-to when a packet is on the way out of the stack, i'm arguing that pf should not run again against that packet. currently route-to rules run pf_test again if the interface the packet is routed out of changes, which means pf runs multiple times against a packet if rules keep changing which interface it goes out. this means there's loop prevention in pf to mitigate against this, and weird potentials for multiple states to be created when nat gets involved. for simplicity, both in terms of reasoning and code i think pf should only be run once when a packet enters the system, and only once when it leaves the system. the only reason i can come up with for running pf_test multiple times when route-to changes the outgoing interface is so you can check the packet with "pass out on $new_if" type rules. we don't rerun pf again when nat/rdr changes addresses, so this feels inconsistent to me. i also don't think route-to is used much. getting basic functionality working is surprisingly hard, so the complicated possibilities in the current code are almost certainly not taken advantage of. we're going to break existing configurations anyway, so if we can agree that pf only runs twice even if route-to gets involved, then i'm not going to feel bad about breaking something this complicated anyway. this also breaks the ability to do route-to without states. is there a reason to do that apart from the DSR type things? did we agree that those use cases could be handled by sloppy states instead? lastly, the "argument" or address specified with route-to (and reply-to and dup-to) is a destination address, not a next-hop. this has been discussed on the lists a couple of times before, so i won't go over it again, except to reiterate that it allows pf to force "sticky" path selection while opening up the possibility for ecmp and failover for where that path traverses. Index: sbin/pfctl/parse.y =================================================================== RCS file: /cvs/src/sbin/pfctl/parse.y,v retrieving revision 1.708 diff -u -p -r1.708 parse.y --- sbin/pfctl/parse.y 12 Jan 2021 00:10:34 -0000 1.708 +++ sbin/pfctl/parse.y 22 Jan 2021 07:33:29 -0000 @@ -276,6 +276,7 @@ struct filter_opts { struct redirspec nat; struct redirspec rdr; struct redirspec rroute; + u_int8_t rt; /* scrub opts */ int nodf; @@ -284,15 +285,6 @@ struct filter_opts { int randomid; int max_mss; - /* route opts */ - struct { - struct node_host *host; - u_int8_t rt; - u_int8_t pool_opts; - sa_family_t af; - struct pf_poolhashkey *key; - } route; - struct { u_int32_t limit; u_int32_t seconds; @@ -372,7 +364,7 @@ void expand_label(char *, size_t, cons struct node_port *, u_int8_t); int expand_divertspec(struct pf_rule *, struct divertspec *); int collapse_redirspec(struct pf_pool *, struct pf_rule *, - struct redirspec *rs, u_int8_t); + struct redirspec *rs, int); int apply_redirspec(struct pf_pool *, struct pf_rule *, struct redirspec *, int, struct node_port *); void expand_rule(struct pf_rule *, int, struct node_if *, @@ -518,7 +510,6 @@ int parseport(char *, struct range *r, i %type <v.host> ipspec xhost host dynaddr host_list %type <v.host> table_host_list tablespec %type <v.host> redir_host_list redirspec -%type <v.host> route_host route_host_list routespec %type <v.os> os xos os_list %type <v.port> portspec port_list port_item %type <v.uid> uids uid_list uid_item @@ -975,7 +966,7 @@ anchorrule : ANCHOR anchorname dir quick YYERROR; } - if ($9.route.rt) { + if ($9.rt) { yyerror("cannot specify route handling " "on anchors"); YYERROR; @@ -1843,37 +1834,13 @@ pfrule : action dir logquick interface decide_address_family($7.src.host, &r.af); decide_address_family($7.dst.host, &r.af); - if ($8.route.rt) { - if (!r.direction) { + if ($8.rt) { + if ($8.rt != PF_DUPTO && !r.direction) { yyerror("direction must be explicit " "with rules that specify routing"); YYERROR; } - r.rt = $8.route.rt; - r.route.opts = $8.route.pool_opts; - if ($8.route.key != NULL) - memcpy(&r.route.key, $8.route.key, - sizeof(struct pf_poolhashkey)); - } - if (r.rt) { - decide_address_family($8.route.host, &r.af); - if ((r.route.opts & PF_POOL_TYPEMASK) == - PF_POOL_NONE && ($8.route.host->next != NULL || - $8.route.host->addr.type == PF_ADDR_TABLE || - DYNIF_MULTIADDR($8.route.host->addr))) - r.route.opts |= PF_POOL_ROUNDROBIN; - if ($8.route.host->next != NULL) { - if (!PF_POOL_DYNTYPE(r.route.opts)) { - yyerror("address pool option " - "not supported by type"); - YYERROR; - } - } - /* fake redirspec */ - if (($8.rroute.rdr = calloc(1, - sizeof(*$8.rroute.rdr))) == NULL) - err(1, "$8.rroute.rdr"); - $8.rroute.rdr->host = $8.route.host; + r.rt = $8.rt; } if (expand_divertspec(&r, &$8.divert)) @@ -2137,30 +2104,14 @@ filter_opt : USER uids { sizeof(filter_opts.nat.pool_opts)); filter_opts.nat.pool_opts.staticport = 1; } - | ROUTETO routespec pool_opts { - filter_opts.route.host = $2; - filter_opts.route.rt = PF_ROUTETO; - filter_opts.route.pool_opts = $3.type | $3.opts; - memcpy(&filter_opts.rroute.pool_opts, &$3, - sizeof(filter_opts.rroute.pool_opts)); - if ($3.key != NULL) - filter_opts.route.key = $3.key; + | ROUTETO routespec { + filter_opts.rt = PF_ROUTETO; } - | REPLYTO routespec pool_opts { - filter_opts.route.host = $2; - filter_opts.route.rt = PF_REPLYTO; - filter_opts.route.pool_opts = $3.type | $3.opts; - if ($3.key != NULL) - filter_opts.route.key = $3.key; - } - | DUPTO routespec pool_opts { - filter_opts.route.host = $2; - filter_opts.route.rt = PF_DUPTO; - filter_opts.route.pool_opts = $3.type | $3.opts; - memcpy(&filter_opts.rroute.pool_opts, &$3, - sizeof(filter_opts.rroute.pool_opts)); - if ($3.key != NULL) - filter_opts.route.key = $3.key; + | REPLYTO routespec { + filter_opts.rt = PF_REPLYTO; + } + | DUPTO routespec { + filter_opts.rt = PF_DUPTO; } | not RECEIVEDON if_item { if (filter_opts.rcv) { @@ -3743,122 +3694,21 @@ pool_opt : BITMASK { } ; -route_host : STRING { - /* try to find @if0 address specs */ - if (strrchr($1, '@') != NULL) { - if (($$ = host($1, pf->opts)) == NULL) { - yyerror("invalid host for route spec"); - YYERROR; - } - free($1); - } else { - $$ = calloc(1, sizeof(struct node_host)); - if ($$ == NULL) - err(1, "route_host: calloc"); - $$->ifname = $1; - $$->addr.type = PF_ADDR_NONE; - set_ipmask($$, 128); - $$->next = NULL; - $$->tail = $$; - } - } - | STRING '/' STRING { - char *buf; - - if (asprintf(&buf, "%s/%s", $1, $3) == -1) - err(1, "host: asprintf"); - free($1); - if (($$ = host(buf, pf->opts)) == NULL) { - /* error. "any" is handled elsewhere */ - free(buf); - yyerror("could not parse host specification"); - YYERROR; - } - free(buf); - } - | '<' STRING '>' { - if (strlen($2) >= PF_TABLE_NAME_SIZE) { - yyerror("table name '%s' too long", $2); - free($2); - YYERROR; - } - $$ = calloc(1, sizeof(struct node_host)); - if ($$ == NULL) - err(1, "host: calloc"); - $$->addr.type = PF_ADDR_TABLE; - if (strlcpy($$->addr.v.tblname, $2, - sizeof($$->addr.v.tblname)) >= - sizeof($$->addr.v.tblname)) - errx(1, "host: strlcpy"); - free($2); - $$->next = NULL; - $$->tail = $$; - } - | dynaddr '/' NUMBER { - struct node_host *n; - - if ($3 < 0 || $3 > 128) { - yyerror("bit number too big"); - YYERROR; - } - $$ = $1; - for (n = $1; n != NULL; n = n->next) - set_ipmask(n, $3); - } - | '(' STRING host ')' { - struct node_host *n; - - $$ = $3; - /* XXX check masks, only full mask should be allowed */ - for (n = $3; n != NULL; n = n->next) { - if ($$->ifname) { - yyerror("cannot specify interface twice " - "in route spec"); - YYERROR; - } - if (($$->ifname = strdup($2)) == NULL) - errx(1, "host: strdup"); - } - free($2); - } - ; - -route_host_list : route_host optweight optnl { - if ($2 > 0) { - struct node_host *n; - for (n = $1; n != NULL; n = n->next) - n->weight = $2; - } - $$ = $1; - } - | route_host_list comma route_host optweight optnl { - if ($1->af == 0) - $1->af = $3->af; - if ($1->af != $3->af) { - yyerror("all pool addresses must be in the " - "same address family"); +routespec : redirspec pool_opts { + struct redirection *redir; + if (filter_opts.rt != PF_NOPFROUTE) { + yyerror("cannot respecify " + "route-to/reply-to/dup-to"); YYERROR; } - $1->tail->next = $3; - $1->tail = $3->tail; - if ($4 > 0) { - struct node_host *n; - for (n = $3; n != NULL; n = n->next) - n->weight = $4; - } - $$ = $1; - } - ; - -routespec : route_host optweight { - if ($2 > 0) { - struct node_host *n; - for (n = $1; n != NULL; n = n->next) - n->weight = $2; - } - $$ = $1; + redir = calloc(1, sizeof(*redir)); + if (redir == NULL) + err(1, "routespec calloc"); + redir->host = $1; + filter_opts.rroute.rdr = redir; + memcpy(&filter_opts.rroute.pool_opts, &$2, + sizeof(filter_opts.rroute.pool_opts)); } - | '{' optnl route_host_list '}' { $$ = $3; } ; timeout_spec : STRING NUMBER @@ -4478,7 +4328,7 @@ expand_divertspec(struct pf_rule *r, str int collapse_redirspec(struct pf_pool *rpool, struct pf_rule *r, - struct redirspec *rs, u_int8_t allow_if) + struct redirspec *rs, int routing) { struct pf_opt_tbl *tbl = NULL; struct node_host *h, *hprev = NULL; @@ -4494,6 +4344,15 @@ collapse_redirspec(struct pf_pool *rpool r->naf = rs->af; for (h = rs->rdr->host; h != NULL; h = h->next) { + if (routing) { + if (h->addr.type == PF_ADDR_DYNIFTL && + h->addr.iflags != PFI_AFLAG_PEER) { + yyerror("route spec requires :peer with " + "dynamic interface addresses"); + return (1); + } + } + /* set rule address family if redirect spec has one */ if (rs->af && !r->af && !af) { /* swap address families for af-to */ @@ -4515,7 +4374,7 @@ collapse_redirspec(struct pf_pool *rpool if (!r->af && af && af != h->af) { yyerror("%s spec contains addresses with " "different address families", - allow_if ? "routing" : "translation"); + routing ? "routing" : "translation"); return (1); } } else if (h->af) { /* af-to case */ @@ -4526,7 +4385,7 @@ collapse_redirspec(struct pf_pool *rpool if (rs->af && rs->af != h->af) { yyerror("%s spec contains addresses that " "don't match target address family", - allow_if ? "routing" : "translation"); + routing ? "routing" : "translation"); return (1); } } @@ -4541,8 +4400,9 @@ collapse_redirspec(struct pf_pool *rpool if (naddr == 0) { /* the first host */ rpool->addr = h->addr; - if (!allow_if && h->ifname) { - yyerror("@if not permitted for translation"); + if (h->ifname) { + yyerror("@if not permitted for %s", + routing ? "routing" : "translation"); return (1); } if (h->ifname && strlcpy(rpool->ifname, h->ifname, @@ -4564,8 +4424,9 @@ collapse_redirspec(struct pf_pool *rpool "not supported for translation or routing"); return (1); } - if (!allow_if && h->ifname) { - yyerror("@if not permitted for translation"); + if (h->ifname) { + yyerror("@if not permitted for %s", + routing ? "routing" : "translation"); return (1); } if (hprev) { @@ -4596,7 +4457,7 @@ collapse_redirspec(struct pf_pool *rpool r->af = af; if (!naddr) { yyerror("af mismatch in %s spec", - allow_if ? "routing" : "translation"); + routing ? "routing" : "translation"); return (1); } if (tbl) { @@ -5992,7 +5853,7 @@ filteropts_to_rule(struct pf_rule *r, st yyerror("af-to can only be used with direction in"); return (1); } - if ((opts->marker & FOM_AFTO) && opts->route.rt) { + if ((opts->marker & FOM_AFTO) && opts->rt) { yyerror("af-to cannot be used together with " "route-to, reply-to, dup-to"); return (1); Index: share/man/man5/pf.conf.5 =================================================================== RCS file: /cvs/src/share/man/man5/pf.conf.5,v retrieving revision 1.585 diff -u -p -r1.585 pf.conf.5 --- share/man/man5/pf.conf.5 7 Dec 2020 08:29:41 -0000 1.585 +++ share/man/man5/pf.conf.5 22 Jan 2021 07:33:29 -0000 @@ -1113,8 +1113,8 @@ the incoming connection arrived through .It Cm route-to The .Cm route-to -option routes the packet to the specified interface with an optional address -for the next hop. +option routes the packet to the specified destination address instead +of the destination address in the packet header. When a .Cm route-to rule creates state, only packets that pass in the same direction as the @@ -2858,8 +2858,7 @@ ifspec = ( [ "!" ] ( interface-n interface-list = [ "!" ] ( interface-name | interface-group ) [ [ "," ] interface-list ] route = ( "route-to" | "reply-to" | "dup-to" ) - ( routehost | "{" routehost-list "}" ) - [ pooltype ] + ( redirhost | "{" redirhost-list "}" ) af = "inet" | "inet6" protospec = "proto" ( proto-name | proto-number | @@ -2878,14 +2877,11 @@ host = [ "!" ] ( address [ "we address [ "/" mask-bits ] [ "weight" number ] | "<" string ">" ) redirhost = address [ "/" mask-bits ] -routehost = host | host "@" interface-name | - "(" interface-name [ address [ "/" mask-bits ] ] ")" address = ( interface-name | interface-group | "(" ( interface-name | interface-group ) ")" | hostname | ipv4-dotted-quad | ipv6-coloned-hex ) host-list = host [ [ "," ] host-list ] redirhost-list = redirhost [ [ "," ] redirhost-list ] -routehost-list = routehost [ [ "," ] routehost-list ] port = "port" ( unary-op | binary-op | "{" op-list "}" ) portspec = "port" ( number | name ) [ ":" ( "*" | number | name ) ] Index: sys/conf/GENERIC =================================================================== RCS file: /cvs/src/sys/conf/GENERIC,v retrieving revision 1.273 diff -u -p -r1.273 GENERIC --- sys/conf/GENERIC 30 Sep 2020 14:51:17 -0000 1.273 +++ sys/conf/GENERIC 22 Jan 2021 07:33:30 -0000 @@ -82,6 +82,7 @@ pseudo-device msts 1 # MSTS line discipl pseudo-device endrun 1 # EndRun line discipline pseudo-device vnd 4 # vnode disk devices pseudo-device ksyms 1 # kernel symbols device +pseudo-device kstat #pseudo-device dt # Dynamic Tracer # clonable devices Index: sys/net/if_pfsync.c =================================================================== RCS file: /cvs/src/sys/net/if_pfsync.c,v retrieving revision 1.281 diff -u -p -r1.281 if_pfsync.c --- sys/net/if_pfsync.c 18 Jan 2021 18:29:19 -0000 1.281 +++ sys/net/if_pfsync.c 22 Jan 2021 07:33:31 -0000 @@ -613,6 +613,7 @@ pfsync_state_import(struct pfsync_state /* copy to state */ st->rt_addr = sp->rt_addr; + st->rt = sp->rt; st->creation = getuptime() - ntohl(sp->creation); st->expire = getuptime(); if (ntohl(sp->expire)) { @@ -643,7 +644,6 @@ pfsync_state_import(struct pfsync_state st->rule.ptr = r; st->anchor.ptr = NULL; - st->rt_kif = NULL; st->pfsync_time = getuptime(); st->sync_state = PFSYNC_S_NONE; @@ -1860,7 +1860,7 @@ pfsync_undefer(struct pfsync_deferral *p if (drop) m_freem(pd->pd_m); else { - if (st->rule.ptr->rt == PF_ROUTETO) { + if (st->rt == PF_ROUTETO) { if (pf_setup_pdesc(&pdesc, st->key[PF_SK_WIRE]->af, st->direction, st->kif, pd->pd_m, NULL) != PF_PASS) { @@ -1869,11 +1869,11 @@ pfsync_undefer(struct pfsync_deferral *p } switch (st->key[PF_SK_WIRE]->af) { case AF_INET: - pf_route(&pdesc, st->rule.ptr, st); + pf_route(&pdesc, st); break; #ifdef INET6 case AF_INET6: - pf_route6(&pdesc, st->rule.ptr, st); + pf_route6(&pdesc, st); break; #endif /* INET6 */ default: Index: sys/net/pf.c =================================================================== RCS file: /cvs/src/sys/net/pf.c,v retrieving revision 1.1101 diff -u -p -r1.1101 pf.c --- sys/net/pf.c 19 Jan 2021 22:22:23 -0000 1.1101 +++ sys/net/pf.c 22 Jan 2021 07:33:31 -0000 @@ -1180,6 +1180,7 @@ pf_state_export(struct pfsync_state *sp, /* copy from state */ strlcpy(sp->ifname, st->kif->pfik_name, sizeof(sp->ifname)); + sp->rt = st->rt; sp->rt_addr = st->rt_addr; sp->creation = htonl(getuptime() - st->creation); expire = pf_state_expires(st); @@ -3430,16 +3431,13 @@ pf_set_rt_ifp(struct pf_state *s, struct struct pf_rule *r = s->rule.ptr; int rv; - s->rt_kif = NULL; if (!r->rt) return (0); rv = pf_map_addr(af, r, saddr, &s->rt_addr, NULL, sns, &r->route, PF_SN_ROUTE); - if (rv == 0) { - s->rt_kif = r->route.kif; - s->natrule.ptr = r; - } + if (rv == 0) + s->rt = r->rt; return (rv); } @@ -5963,15 +5961,13 @@ pf_rtlabel_match(struct pf_addr *addr, s /* pf_route() may change pd->m, adjust local copies after calling */ void -pf_route(struct pf_pdesc *pd, struct pf_rule *r, struct pf_state *s) +pf_route(struct pf_pdesc *pd, struct pf_state *s) { struct mbuf *m0, *m1; struct sockaddr_in *dst, sin; struct rtentry *rt = NULL; struct ip *ip; struct ifnet *ifp = NULL; - struct pf_addr naddr; - struct pf_src_node *sns[PF_SN_MAX]; int error = 0; unsigned int rtableid; @@ -5981,11 +5977,11 @@ pf_route(struct pf_pdesc *pd, struct pf_ return; } - if (r->rt == PF_DUPTO) { + if (s->rt == PF_DUPTO) { if ((m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT)) == NULL) return; } else { - if ((r->rt == PF_REPLYTO) == (r->direction == pd->dir)) + if ((s->rt == PF_REPLYTO) == (s->direction == pd->dir)) return; m0 = pd->m; } @@ -5998,48 +5994,40 @@ pf_route(struct pf_pdesc *pd, struct pf_ ip = mtod(m0, struct ip *); - memset(&sin, 0, sizeof(sin)); - dst = &sin; - dst->sin_family = AF_INET; - dst->sin_len = sizeof(*dst); - dst->sin_addr = ip->ip_dst; - rtableid = m0->m_pkthdr.ph_rtableid; - if (pd->dir == PF_IN) { if (ip->ip_ttl <= IPTTLDEC) { - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pf_send_icmp(m0, ICMP_TIMXCEED, ICMP_TIMXCEED_INTRANS, 0, - pd->af, r, pd->rdomain); + pd->af, s->rule.ptr, pd->rdomain); goto bad; } ip->ip_ttl -= IPTTLDEC; } - if (s == NULL) { - memset(sns, 0, sizeof(sns)); - if (pf_map_addr(AF_INET, r, - (struct pf_addr *)&ip->ip_src, - &naddr, NULL, sns, &r->route, PF_SN_ROUTE)) { - DPFPRINTF(LOG_ERR, - "%s: pf_map_addr() failed", __func__); - goto bad; - } + memset(&sin, 0, sizeof(sin)); + dst = &sin; + dst->sin_family = AF_INET; + dst->sin_len = sizeof(*dst); + dst->sin_addr = s->rt_addr.v4; + rtableid = m0->m_pkthdr.ph_rtableid; - if (!PF_AZERO(&naddr, AF_INET)) - dst->sin_addr.s_addr = naddr.v4.s_addr; - ifp = r->route.kif ? - r->route.kif->pfik_ifp : NULL; - } else { - if (!PF_AZERO(&s->rt_addr, AF_INET)) - dst->sin_addr.s_addr = - s->rt_addr.v4.s_addr; - ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; + rt = rtalloc(sintosa(dst), RT_RESOLVE, rtableid); + if (!rtisvalid(rt)) { + ipstat_inc(ips_noroute); + goto bad; } + + ifp = if_get(rt->rt_ifidx); if (ifp == NULL) goto bad; - if (pd->kif->pfik_ifp != ifp) { + /* A locally generated packet may have invalid source address. */ + if ((ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET && + (ifp->if_flags & IFF_LOOPBACK) == 0) + ip->ip_src = ifatoia(rt->rt_ifa)->ia_addr.sin_addr; + + if (pd->dir == PF_IN) { if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) goto bad; else if (m0 == NULL) @@ -6052,16 +6040,6 @@ pf_route(struct pf_pdesc *pd, struct pf_ ip = mtod(m0, struct ip *); } - rt = rtalloc(sintosa(dst), RT_RESOLVE, rtableid); - if (!rtisvalid(rt)) { - ipstat_inc(ips_noroute); - goto bad; - } - /* A locally generated packet may have invalid source address. */ - if ((ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET && - (ifp->if_flags & IFF_LOOPBACK) == 0) - ip->ip_src = ifatoia(rt->rt_ifa)->ia_addr.sin_addr; - in_proto_cksum_out(m0, ifp); if (ntohs(ip->ip_len) <= ifp->if_mtu) { @@ -6082,9 +6060,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ */ if (ip->ip_off & htons(IP_DF)) { ipstat_inc(ips_cantfrag); - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pf_send_icmp(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, - ifp->if_mtu, pd->af, r, pd->rdomain); + ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain); goto bad; } @@ -6108,8 +6086,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ ipstat_inc(ips_fragmented); done: - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pd->m = NULL; + if_put(ifp); rtfree(rt); return; @@ -6121,15 +6100,13 @@ bad: #ifdef INET6 /* pf_route6() may change pd->m, adjust local copies after calling */ void -pf_route6(struct pf_pdesc *pd, struct pf_rule *r, struct pf_state *s) +pf_route6(struct pf_pdesc *pd, struct pf_state *s) { struct mbuf *m0; struct sockaddr_in6 *dst, sin6; struct rtentry *rt = NULL; struct ip6_hdr *ip6; struct ifnet *ifp = NULL; - struct pf_addr naddr; - struct pf_src_node *sns[PF_SN_MAX]; struct m_tag *mtag; unsigned int rtableid; @@ -6139,11 +6116,11 @@ pf_route6(struct pf_pdesc *pd, struct pf return; } - if (r->rt == PF_DUPTO) { + if (s->rt == PF_DUPTO) { if ((m0 = m_dup_pkt(pd->m, max_linkhdr, M_NOWAIT)) == NULL) return; } else { - if ((r->rt == PF_REPLYTO) == (r->direction == pd->dir)) + if ((s->rt == PF_REPLYTO) == (s->direction == pd->dir)) return; m0 = pd->m; } @@ -6155,46 +6132,42 @@ pf_route6(struct pf_pdesc *pd, struct pf } ip6 = mtod(m0, struct ip6_hdr *); - memset(&sin6, 0, sizeof(sin6)); - dst = &sin6; - dst->sin6_family = AF_INET6; - dst->sin6_len = sizeof(*dst); - dst->sin6_addr = ip6->ip6_dst; - rtableid = m0->m_pkthdr.ph_rtableid; - if (pd->dir == PF_IN) { if (ip6->ip6_hlim <= IPV6_HLIMDEC) { - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pf_send_icmp(m0, ICMP6_TIME_EXCEEDED, ICMP6_TIME_EXCEED_TRANSIT, 0, - pd->af, r, pd->rdomain); + pd->af, s->rule.ptr, pd->rdomain); goto bad; } ip6->ip6_hlim -= IPV6_HLIMDEC; } - if (s == NULL) { - memset(sns, 0, sizeof(sns)); - if (pf_map_addr(AF_INET6, r, (struct pf_addr *)&ip6->ip6_src, - &naddr, NULL, sns, &r->route, PF_SN_ROUTE)) { - DPFPRINTF(LOG_ERR, - "%s: pf_map_addr() failed", __func__); - goto bad; - } - if (!PF_AZERO(&naddr, AF_INET6)) - pf_addrcpy((struct pf_addr *)&dst->sin6_addr, - &naddr, AF_INET6); - ifp = r->route.kif ? r->route.kif->pfik_ifp : NULL; - } else { - if (!PF_AZERO(&s->rt_addr, AF_INET6)) - pf_addrcpy((struct pf_addr *)&dst->sin6_addr, - &s->rt_addr, AF_INET6); - ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; + memset(&sin6, 0, sizeof(sin6)); + dst = &sin6; + dst->sin6_family = AF_INET6; + dst->sin6_len = sizeof(*dst); + dst->sin6_addr = s->rt_addr.v6; + rtableid = m0->m_pkthdr.ph_rtableid; + + if (IN6_IS_SCOPE_EMBED(&dst->sin6_addr)) + dst->sin6_addr.s6_addr16[1] = htons(ifp->if_index); + rt = rtalloc(sin6tosa(dst), RT_RESOLVE, rtableid); + if (!rtisvalid(rt)) { + ip6stat_inc(ip6s_noroute); + goto bad; } + + ifp = if_get(rt->rt_ifidx); if (ifp == NULL) goto bad; - if (pd->kif->pfik_ifp != ifp) { + /* A locally generated packet may have invalid source address. */ + if (IN6_IS_ADDR_LOOPBACK(&ip6->ip6_src) && + (ifp->if_flags & IFF_LOOPBACK) == 0) + ip6->ip6_src = ifatoia6(rt->rt_ifa)->ia_addr.sin6_addr; + + if (pd->dir == PF_IN) { if (pf_test(AF_INET6, PF_OUT, ifp, &m0) != PF_PASS) goto bad; else if (m0 == NULL) @@ -6206,18 +6179,6 @@ pf_route6(struct pf_pdesc *pd, struct pf } } - if (IN6_IS_SCOPE_EMBED(&dst->sin6_addr)) - dst->sin6_addr.s6_addr16[1] = htons(ifp->if_index); - rt = rtalloc(sin6tosa(dst), RT_RESOLVE, rtableid); - if (!rtisvalid(rt)) { - ip6stat_inc(ip6s_noroute); - goto bad; - } - /* A locally generated packet may have invalid source address. */ - if (IN6_IS_ADDR_LOOPBACK(&ip6->ip6_src) && - (ifp->if_flags & IFF_LOOPBACK) == 0) - ip6->ip6_src = ifatoia6(rt->rt_ifa)->ia_addr.sin6_addr; - in6_proto_cksum_out(m0, ifp); /* @@ -6230,15 +6191,16 @@ pf_route6(struct pf_pdesc *pd, struct pf ifp->if_output(ifp, m0, sin6tosa(dst), rt); } else { ip6stat_inc(ip6s_cantfrag); - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pf_send_icmp(m0, ICMP6_PACKET_TOO_BIG, 0, - ifp->if_mtu, pd->af, r, pd->rdomain); + ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain); goto bad; } done: - if (r->rt != PF_DUPTO) + if (s->rt != PF_DUPTO) pd->m = NULL; + if_put(ifp); rtfree(rt); return; @@ -7282,11 +7244,11 @@ done: if (r->rt) { switch (pd.af) { case AF_INET: - pf_route(&pd, r, s); + pf_route(&pd, s); break; #ifdef INET6 case AF_INET6: - pf_route6(&pd, r, s); + pf_route6(&pd, s); break; #endif /* INET6 */ } Index: sys/net/pfvar.h =================================================================== RCS file: /cvs/src/sys/net/pfvar.h,v retrieving revision 1.498 diff -u -p -r1.498 pfvar.h --- sys/net/pfvar.h 12 Jan 2021 00:10:34 -0000 1.498 +++ sys/net/pfvar.h 22 Jan 2021 07:33:31 -0000 @@ -762,7 +762,6 @@ struct pf_state { struct pf_sn_head src_nodes; struct pf_state_key *key[2]; /* addresses stack and wire */ struct pfi_kif *kif; - struct pfi_kif *rt_kif; u_int64_t packets[2]; u_int64_t bytes[2]; int32_t creation; @@ -797,6 +796,7 @@ struct pf_state { u_int16_t if_index_out; pf_refcnt_t refcnt; u_int16_t delay; + u_int8_t rt; }; /* @@ -852,7 +852,7 @@ struct pfsync_state { u_int8_t proto; u_int8_t direction; u_int8_t log; - u_int8_t pad0; + u_int8_t rt; u_int8_t timeout; u_int8_t sync_flags; u_int8_t updates; @@ -1798,8 +1798,8 @@ int pf_state_key_attach(struct pf_state_ int pf_translate(struct pf_pdesc *, struct pf_addr *, u_int16_t, struct pf_addr *, u_int16_t, u_int16_t, int); int pf_translate_af(struct pf_pdesc *); -void pf_route(struct pf_pdesc *, struct pf_rule *, struct pf_state *); -void pf_route6(struct pf_pdesc *, struct pf_rule *, struct pf_state *); +void pf_route(struct pf_pdesc *, struct pf_state *); +void pf_route6(struct pf_pdesc *, struct pf_state *); void pf_init_threshold(struct pf_threshold *, u_int32_t, u_int32_t); int pf_delay_pkt(struct mbuf *, u_int);