Re: [External] : Re: pf route-to issues
On 2021/01/26 09:29, Alexandr Nedvedicky wrote: > Hello, > > > > > > > > > > > I'm not sure if proposed scenario real. Let's assume there > > > is a PF box with three NICs running on this awkward set up > > > > > > em1 ... 192.168.1.10 > > > > > > em0 > > > > > > em2 ... 192.168.1.10 > > > > > > em0 is attached to LAN em1 and em2 are facing to internet which is > > > reachable with two different physical lines. both lines are connected > > > via > > > equipment, which uses fixed IP address 192.168.1.10 and PF admin has > > > no way to change that. > > > > in this scenario are em1 and em2 connected to the same ethernet > > segment and ip subnet? > > em1 and em2 are connected to different wires (broadcast domains). > > and I agree this set up is really awkward. not sure if it will work > because it also depends on how ARP table is organized. I think it > works on Solaris, but I have not tried that with OpenBSD. I don't know about ethernet/ARP, but that does work with pppoe. FWIW this is how I was doing things. (I'm not doing this any more) PRIO_LINES="81.187.8.161@pppoe4 81.187.218.235@pppoe5" PRIO_LINES_V6="fe80::@pppoe4 fe80::@pppoe5" .. pass in log quick on {lan, wifi} proto udp to 193.35.128.0/20 port {500, 4500} tag uma-orange route-to {$PRIO_LINES} pass in log quick on {lan, wifi} proto {esp, ah} to 193.35.128.0/20 tag uma-orange route-to {$PRIO_LINES} pass out log quick on aaisp tagged uma-orange nat-to 81.187.41.148/30 random sticky-address static-port pass in quick inet proto {tcp,udp} from {(self), 10.71.12.5, 192.168.10.5} to !10/8 \ port domain tag priority-dns route-to {$PRIO_LINES} pass in quick inet6 proto {tcp,udp} from {(self), 2001:8b0:6412:11::5} to !2001:8b0:6412::/48 \ port domain tag priority-dns route-to {$PRIO_LINES_V6} pass out on aaisp tagged priority-dns .. In these I used the "my side" address on the pppoe interfaces because the "ISP side" address was always 81.187.81.187. AFAIK PF didn't care about the actual address when it was point-to-point and just a dummy address worked. (Same with the standard route table too, but that got changed, I think maybe with ART). > > the other is to add add routes "beyond" 192.168.1.10 and route-to > > them: > > > > # route add 127.0.1.10 192.168.1.10 -ifp em1 > > # route add 127.0.2.10 192.168.1.10 -ifp em2 > > > > then you can write rules like this: > > > > pass in on em0 from 172.16.0.0/16 route-to 127.0.1.10 > > pass in on em0 from 172.17.0.0/16 route-to 127.0.2.10 That sounds like a usable hack. Maybe even better than before (this route should go down with the interface, so the traffic wouldn't end up blackholed like it did with route-to ip@iface). I think that's easier to deal with than rtables if you have incoming traffic to services running on the router itself. > yes I understand that. I'm not able to judge how many currently working > configurations will remain broken after we kill 'address@interface' form. > I'm sure many deployments will get fixed with simple tweak: > echo 'address@interface'|cut -d @ -f 1 standard ethernet: address@em0 -> address ppp, different isps:address@pppoe0 -> (pppoe0:peer) ppp with same isps, or conflicting ethernet: rtable or "routes beyond" hack dhcp, rtadv:ETOOHARD, don't think it worked before without scripted pf.conf changes anyway. use rtable.
Re: [External] : Re: pf route-to issues
Hello, On Tue, Jan 26, 2021 at 12:33:25PM +0100, Alexander Bluhm wrote: > On Tue, Jan 26, 2021 at 10:39:30AM +1000, David Gwynne wrote: > > > But what about dup-to? The packet is duplicated for both directions. > > > I guess the main use case for dup-to is implementing a monitor port. > > > There you have to pass packets stateless, otherwise it would not > > > work anyway. The strange semantics is not related to this diff. > > > > are you saying i should skip pf_test for all dup-to generated packets? > > I am not sure. > > When we have an in dup-to rule, the incoming packets in request > direction are dupped and tested with the out ruleset. The reply > packets for this state are also dupped, but not tested when they > leave the dup interface. > > This is inconsistent and cannot work statefully. Stateful filtering > with dupped packets does not make sense anyway. The only working > config is "pass out on dup-interface no state". > > Do we think this rule should be required? > > 1. No packet should leave an interface without a rule. > > if (pd->dir == PF_IN || s->rt == PF_DUPTO) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > 2. The config says we want a monitor port. We risk that the >original packet and the dupped packet match the same rule. >Stateful filtering cannot work, we do not expect reply packets >for the dups. > > if (pd->dir == PF_IN && s->rt != PF_DUPTO) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > 3. Some sort of problem was there before, but different. Don't >address it now. > another option would be to mark duped packet as GENERATED to short circuit pf_test() here: 6871 6872 if ((*m0)->m_pkthdr.pf.flags & PF_TAG_GENERATED) 6873 return (PF_PASS); 6874 perhaps excluding those changes from current diff is good. this seems to be separate issue anyway. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, On Tue, Jan 26, 2021 at 10:39:30AM +1000, David Gwynne wrote: > On Mon, Jan 25, 2021 at 04:19:11PM +0100, Alexander Bluhm wrote: > > On Fri, Jan 22, 2021 at 06:07:59PM +1000, David Gwynne wrote: > > > --- sys/conf/GENERIC 30 Sep 2020 14:51:17 - 1.273 > > > +++ sys/conf/GENERIC 22 Jan 2021 07:33:30 - > > > @@ -82,6 +82,7 @@ pseudo-device msts1 # MSTS line discipl > > > pseudo-deviceendrun 1 # EndRun line discipline > > > pseudo-devicevnd 4 # vnode disk devices > > > pseudo-deviceksyms 1 # kernel symbols device > > > +pseudo-devicekstat > > > #pseudo-device dt # Dynamic Tracer > > > > > > # clonable devices > > > > This is an unrelated chunk. > > oh yeah... > > > > +pf_route(struct pf_pdesc *pd, struct pf_state *s) > > ... > > > + if (pd->dir == PF_IN) { > > > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > > > Yes, this is the correct logic. When the packet comes in, pf > > overrides forwarding, tests the out rules, and sends it. For > > outgoing packets on out route-to rules we have already tested the > > rules. It also works for reply-to the other way around. > > yep. I'm suggesting to keep current if () in. and change it with follow up commit, which will adjust match rule for route-to/reply-to. would it be OK with you, David? > > > But what about dup-to? The packet is duplicated for both directions. > > I guess the main use case for dup-to is implementing a monitor port. > > There you have to pass packets stateless, otherwise it would not > > work anyway. The strange semantics is not related to this diff. > > are you saying i should skip pf_test for all dup-to generated packets? this makes perfect sense for me. > > > We are reaching a state where this diff can go in. I just startet > > a regress run with it. OK bluhm@ > > hopefully i fixed the pfctl error messages up so the regress tests arent > too unhappy. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, > > > goto bad; > > > > here we do 'goto bad', which does not call if_put(). > > yes it does. the whole chunk with the diff applied is: > > done: > if (s->rt != PF_DUPTO) > pd->m = NULL; > if_put(ifp); > rtfree(rt); > return; > > bad: > m_freem(m0); > goto done; > } > > bad drops the mbuf and then goes to done. yes you are absolutely right, I need glasses. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, > > i'll need help with "match on em0 route-to $if_em0_peer". or we can do > that later separately? may be can we keep this line in pf_route() untouched at least for now: 6041 6042 if (pd->kif->pfik_ifp != ifp) { 6043 if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) 6044 goto bad; 6045 else if (m0 == NULL) 6046 goto done; 6047 if (m0->m_len < sizeof(struct ip)) { 6048 DPFPRINTF(LOG_ERR, 6049 "%s: m0->m_len < sizeof(struct ip)", __func__); 6050 goto bad; 6051 } 6052 ip = mtod(m0, struct ip *); 6053 } 6054 I think if () at line 6042 does not hurt pfsync(4). This should be removed with commit, which will introduce the 'match ... route-to'. There should be more detailed explanation in my response to email from bluhm@. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, > > > > > > I'm not sure if proposed scenario real. Let's assume there > > is a PF box with three NICs running on this awkward set up > > > > em1 ... 192.168.1.10 > > > > em0 > > > > em2 ... 192.168.1.10 > > > > em0 is attached to LAN em1 and em2 are facing to internet which is > > reachable with two different physical lines. both lines are connected > > via > > equipment, which uses fixed IP address 192.168.1.10 and PF admin has > > no way to change that. > > in this scenario are em1 and em2 connected to the same ethernet > segment and ip subnet? em1 and em2 are connected to different wires (broadcast domains). and I agree this set up is really awkward. not sure if it will work because it also depends on how ARP table is organized. I think it works on Solaris, but I have not tried that with OpenBSD. > > > the 'address@interface' syntax is the only way to define rules: > > > > pass in on em0 from 172.16.0.0/16 route-to 192.168.1.10@em1 > > pass in on em0 from 172.17.0.0/16 route-to 192.168.1.10@em2 > > > > regardless of how much real such scenario is I believe it can > > currently work. > > this is a very awkward configuration. while i think what it's trying > to do is useful, how it is expressed relies on something i want to > break (fix?). yes I completely agree. I believe this 'awkward thinking' stems back to ipf (ipfilter). I found my ancient notes to find some details on this. My notes refer to ipf.conf(5) manpage [1]. If I understand the purpose of 'address@interface' syntax right, it allows us to forward packets using explicit pair [ next-hop IP, interface ]. Thus firewall can emulate a route, which might be missing in routing table. It just 'blindly' sends matching packet to next-hop using desired interface assuming that 'next-hop' host will do the right thing with such packet. > > one of the original reasons i wanted to break this kind of config > is because pfsync hasn't got a way to exchange interace information, > and different firewalls are going to have different interface > topologies anyway. one of the reasons to only use a destination/next > hop as the argument to route-to rules was so pfsync would work. yes I understand that. I'm not able to judge how many currently working configurations will remain broken after we kill 'address@interface' form. I'm sure many deployments will get fixed with simple tweak: echo 'address@interface'|cut -d @ -f 1 but I'm not sure how many people do still need to order desired NIC. > > i'm pretty sure this is broken at the moment because of bugs in the > routing code. it is possible to configure routes to 192.168.1.10 > via both em1 and em2 if net.inet.ip.multipath is set to 1, but im > sure the llinfo (arp and rtable) part of this kind of multipath > route setup does not work reliably. i guess i should try and get > my fixes for this into the tree. > > there are two alternate ways i can think of to do this. the first > is to configure an rtable for each interface: > > # route -T 1 add default 192.168.1.10 -ifp em1 > # route -T 2 add default 192.168.1.10 -ifp em2 > > then you could write rules like this: > > pass in on em0 from 172.16.0.0/16 rtable 1 > pass in on em0 from 172.17.0.0/16 rtable 2 > > the other is to add add routes "beyond" 192.168.1.10 and route-to > them: > > # route add 127.0.1.10 192.168.1.10 -ifp em1 > # route add 127.0.2.10 192.168.1.10 -ifp em2 > > then you can write rules like this: > > pass in on em0 from 172.16.0.0/16 route-to 127.0.1.10 > pass in on em0 from 172.17.0.0/16 route-to 127.0.2.10 > > this will likely hit the same bugs in the rtable/arp code i referred > to above though. > > also, note that i haven't tested either of these. > I think this sounds like a plan how to deal with edge cases, which can't get fixed with simple 'cut -d @ -f 1' thanks and regards sashan [1] https://www.freebsd.org/cgi/man.cgi?query=ipf.conf&sektion=5&apropos=0 (search for reply-to)
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 05:38:40PM +0100, Alexandr Nedvedicky wrote: > Hello, > > > On Mon, Jan 25, 2021 at 03:21:29PM +0100, Alexander Bluhm wrote: > > Hi, > > > > Some personal thoughts. I am happy when pf route-to gets simpler. > > Especially I have never understood what this address@interface > > syntax is used for. > > > > I cannot estimate what configuration is used by our cutomers in > > many installations. Simple syntax change address@interface -> > > address of next hob should be no problem. Slight semantic changes > > have to be dealt with. Current packet flow is complicated and may > > be inspired by old NAT behavior. As long it becomes more sane and > > easier to understand, we should change it. > > > I'm not sure if proposed scenario real. Let's assume there > is a PF box with three NICs running on this awkward set up > > em1 ... 192.168.1.10 > > em0 > > em2 ... 192.168.1.10 > > em0 is attached to LAN em1 and em2 are facing to internet which is > reachable with two different physical lines. both lines are connected via > equipment, which uses fixed IP address 192.168.1.10 and PF admin has > no way to change that. in this scenario are em1 and em2 connected to the same ethernet segment and ip subnet? > the 'address@interface' syntax is the only way to define rules: > > pass in on em0 from 172.16.0.0/16 route-to 192.168.1.10@em1 > pass in on em0 from 172.17.0.0/16 route-to 192.168.1.10@em2 > > regardless of how much real such scenario is I believe it can > currently work. this is a very awkward configuration. while i think what it's trying to do is useful, how it is expressed relies on something i want to break (fix?). one of the original reasons i wanted to break this kind of config is because pfsync hasn't got a way to exchange interace information, and different firewalls are going to have different interface topologies anyway. one of the reasons to only use a destination/next hop as the argument to route-to rules was so pfsync would work. i'm pretty sure this is broken at the moment because of bugs in the routing code. it is possible to configure routes to 192.168.1.10 via both em1 and em2 if net.inet.ip.multipath is set to 1, but im sure the llinfo (arp and rtable) part of this kind of multipath route setup does not work reliably. i guess i should try and get my fixes for this into the tree. there are two alternate ways i can think of to do this. the first is to configure an rtable for each interface: # route -T 1 add default 192.168.1.10 -ifp em1 # route -T 2 add default 192.168.1.10 -ifp em2 then you could write rules like this: pass in on em0 from 172.16.0.0/16 rtable 1 pass in on em0 from 172.17.0.0/16 rtable 2 the other is to add add routes "beyond" 192.168.1.10 and route-to them: # route add 127.0.1.10 192.168.1.10 -ifp em1 # route add 127.0.2.10 192.168.1.10 -ifp em2 then you can write rules like this: pass in on em0 from 172.16.0.0/16 route-to 127.0.1.10 pass in on em0 from 172.17.0.0/16 route-to 127.0.2.10 this will likely hit the same bugs in the rtable/arp code i referred to above though. also, note that i haven't tested either of these. cheers, dlg
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 06:17:02PM +0100, Alexandr Nedvedicky wrote: > Hello, > > pf_route() might leak a refence to ifp. oh no :( > > Index: sys/net/pf.c > > === > > RCS file: /cvs/src/sys/net/pf.c,v > > retrieving revision 1.1101 > > diff -u -p -r1.1101 pf.c > > --- sys/net/pf.c19 Jan 2021 22:22:23 - 1.1101 > > +++ sys/net/pf.c22 Jan 2021 07:33:31 - > > > > > @@ -5998,48 +5994,40 @@ pf_route(struct pf_pdesc *pd, struct pf_ > > > > ip = mtod(m0, struct ip *); > > > > > + > > + ifp = if_get(rt->rt_ifidx); > > if (ifp == NULL) > > goto bad; > > here we get a reference to ifp. yep. > > @@ -6082,9 +6060,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ > > */ > > if (ip->ip_off & htons(IP_DF)) { > > ipstat_inc(ips_cantfrag); > > - if (r->rt != PF_DUPTO) > > + if (s->rt != PF_DUPTO) > > pf_send_icmp(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, > > - ifp->if_mtu, pd->af, r, pd->rdomain); > > + ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain); > > goto bad; > > here we do 'goto bad', which does not call if_put(). yes it does. the whole chunk with the diff applied is: done: if (s->rt != PF_DUPTO) pd->m = NULL; if_put(ifp); rtfree(rt); return; bad: m_freem(m0); goto done; } bad drops the mbuf and then goes to done.
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 03:21:29PM +0100, Alexander Bluhm wrote: > Hi, > > Some personal thoughts. I am happy when pf route-to gets simpler. > Especially I have never understood what this address@interface > syntax is used for. even after staring at it for so long, i still don't get it. i do think it was a reimplementation of an ipfilter thing, but i don't think it makes a lot of sense in ipfilter either. > I cannot estimate what configuration is used by our cutomers in > many installations. Simple syntax change address@interface -> > address of next hob should be no problem. Slight semantic changes > have to be dealt with. Current packet flow is complicated and may > be inspired by old NAT behavior. As long it becomes more sane and > easier to understand, we should change it. route-to $destination, not $next_hop... the biggest change we have to agree on at the moment is whether we're changing the semantic of "pf runs when a packet goes over an interface" to "pf runs when a packet goes in or out of the stack". this affects whether pf_test runs again when route-to changes the interface. > But I don't like artificial restrictions. We don't know all use > cases. reply-to and route-to could be used for both in and out > rules. I have used them for strange divert-to on bridge setups. > It should stay that way. i don't think it's complicated to support route-to and reply-to on both in and out rules. we've already found that there's use cases for reply-to on inbound rules, doing things on bridges just adds to that. it could be used on tpmr(4) too... > It would be nice to keep state-less route-to. I have found a special > case with that in the code of our product. But it looks like dead > code, so I would not object to remove state-less route-to for now. ok. thank you. > bluhm
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 02:30:46PM +0100, Alexandr Nedvedicky wrote: > Hello, > > ... > > > > i dont understand the kif lifetimes though. can we just point a > > pdesc at an arbitrary kif or do we need ot reference count? > > > > as long as we don't release NET_LOCK() (or PF_LOCK() in near future), > the reference to kif should remain valid. > > kif is very own PF's abstraction of network interfaces. there are > two ways how one can create a kif: > > there is either rule loaded, which refers to interface, which > is not plumbed yet. > > the new interface gets plumbed and PF must have a kif for it. > > the reference count currently used makes sure we destroy kif when > either interface is gone or when the rule, which refers kif is > gone. > > hope I remember details above right. ok. sounds simple enough... > > > I think this might be way to go. > > > > > > My only concern now is that such change is subtle. I mean the > > > > > > pass out ... route-to > > > > > > will change behavior in sense that current code will dispatch > > > packet to new interface and run pf_test() again. Once your diff > > > will be in the same rule will be accepted, but will bring entirely > > > different behaviour: just dispatch packet to new interface. > > > > yeah. > > > > the counter example is that i was extremely surprised when i > > discovered that pf_test gets run again when the outgoing interface > > is changed with a route-to rule. > > surprised because you've forgot about current model? which is: > run pf_test() whenever packet crosses the interface? forgot isn't the right word. i was in the room when henning and mcbride reworked pf and came up with the stack and wire side key terminology and semantics, and i've spent a lot of time looking at the ip input and output code where pf_test is called close to the stack. it just wasn't obvious to me that pf filtered over an interface rather than filtered in and out of the stack. apart from route-to in pf, i'm not sure it is a meaningful difference either. > > there's subtlety either way, we're just figuring out which one we're > > going for. > > yes exactly. there are trade offs either way. > > > > > > I think this is acceptable. If this will cause a friction we can > > > always > > > adjust the code in follow up commit to allow state-less > > > route-to/reply-to > > > with no support from pfsync(4). > > > > if we're going to support route-to on match rules i think this will be > > easy to implement. > > > > I think there must be some broader consent on model change > from current which is run pf_test() for every NIC crossing > to new way, which is run pf_test() at most two times. agreed. > > > > > > lastly, the "argument" or address specified with route-to (and > > > > > > reply-to and dup-to) is a destination address, not a next-hop. this > > > > > > has been discussed on the lists a couple of times before, so i won't > > > > > > go over it again, except to reiterate that it allows pf to force > > > > > > "sticky" path selection while opening up the possibility for ecmp > > > > > > and failover for where that path traverses. > > > > > > > > > > I keep forgetting about it as I still stick to current > > > > > interpretation. > > > > > > > > > > > > > > > I've seen changes to pfctl. Diff below still allows rule: > > > > > > > > > > pass in on net0 from 192.168.1.0/24 to any route-to > > > > > 10.10.10.10@em0 > > > > > > > > Is there use case for the @interface syntax apart from the current > > > > route-to rules? If not, we can just delete it. > > > > > > perhaps I'm still not quite on the same page as you then. I also > > > had no time to entirely test you diff. The way I understand your > > > effort is to change route-to behavior such it will be using > > > a destination instead of next-hop@interface. Or are you planning > > > to keep current form ('route-to next-hop@interface') working? > > > > if we ignore route-to, what's the use case for the interface part of > > address@interface? it doesnt seem to be accepted as part of an address > > in other parts of the grammar: > > > > dlg@kbuild ~$ echo pass in from 192.168.0.0@vmx0 | sudo pfctl -nf - > > stdin:1: @if syntax not permitted in from or to > > stdin:1: skipping rule due to errors > > stdin:1: rule expands to no valid combination > > dlg@kbuild ~$ echo pass from 192.168.0.0@vmx0 | sudo pfctl -nf - > > stdin:1: @if syntax not permitted in from or to > > stdin:1: skipping rule due to errors > > stdin:1: rule expands to no valid combination > > dlg@kbuild ~$ echo pass nat-to 192.168.0.0@vmx0 | sudo pfctl -nf - > > stdin:1: @if not permitted > > stdin:1: nat-to and rdr-to require a direction > > stdin:1: skipping rule due to errors > > stdin:1: rule expands to no valid combination > > dlg@kbuild ~$ echo pass nat-to 192.168.0.0@vmx0 | sudo pfc
Re: [External] : Re: pf route-to issues
Hello, pf_route() might leak a refence to ifp. > Index: sys/net/pf.c > === > RCS file: /cvs/src/sys/net/pf.c,v > retrieving revision 1.1101 > diff -u -p -r1.1101 pf.c > --- sys/net/pf.c 19 Jan 2021 22:22:23 - 1.1101 > +++ sys/net/pf.c 22 Jan 2021 07:33:31 - > @@ -5998,48 +5994,40 @@ pf_route(struct pf_pdesc *pd, struct pf_ > > ip = mtod(m0, struct ip *); > > + > + ifp = if_get(rt->rt_ifidx); > if (ifp == NULL) > goto bad; here we get a reference to ifp. > > - if (pd->kif->pfik_ifp != ifp) { > + /* A locally generated packet may have invalid source address. */ > + if ((ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET && > + (ifp->if_flags & IFF_LOOPBACK) == 0) > + ip->ip_src = ifatoia(rt->rt_ifa)->ia_addr.sin_addr; > + > + if (pd->dir == PF_IN) { > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > goto bad; > else if (m0 == NULL) > @@ -6052,16 +6040,6 @@ pf_route(struct pf_pdesc *pd, struct pf_ > ip = mtod(m0, struct ip *); > } > > - rt = rtalloc(sintosa(dst), RT_RESOLVE, rtableid); > - if (!rtisvalid(rt)) { > - ipstat_inc(ips_noroute); > - goto bad; > - } > - /* A locally generated packet may have invalid source address. */ > - if ((ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET && > - (ifp->if_flags & IFF_LOOPBACK) == 0) > - ip->ip_src = ifatoia(rt->rt_ifa)->ia_addr.sin_addr; > - > in_proto_cksum_out(m0, ifp); > > if (ntohs(ip->ip_len) <= ifp->if_mtu) { > @@ -6082,9 +6060,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ >*/ > if (ip->ip_off & htons(IP_DF)) { > ipstat_inc(ips_cantfrag); > - if (r->rt != PF_DUPTO) > + if (s->rt != PF_DUPTO) > pf_send_icmp(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, > - ifp->if_mtu, pd->af, r, pd->rdomain); > + ifp->if_mtu, pd->af, s->rule.ptr, pd->rdomain); > goto bad; here we do 'goto bad', which does not call if_put(). > } > > @@ -6108,8 +6086,9 @@ pf_route(struct pf_pdesc *pd, struct pf_ > ipstat_inc(ips_fragmented); > > done: > - if (r->rt != PF_DUPTO) > + if (s->rt != PF_DUPTO) > pd->m = NULL; > + if_put(ifp); > rtfree(rt); > return; > pf_route6() suffers from the same issue. thanks and regards sashan
Re: [External] : Re: pf route-to issues
Hello, On Mon, Jan 25, 2021 at 03:21:29PM +0100, Alexander Bluhm wrote: > Hi, > > Some personal thoughts. I am happy when pf route-to gets simpler. > Especially I have never understood what this address@interface > syntax is used for. > > I cannot estimate what configuration is used by our cutomers in > many installations. Simple syntax change address@interface -> > address of next hob should be no problem. Slight semantic changes > have to be dealt with. Current packet flow is complicated and may > be inspired by old NAT behavior. As long it becomes more sane and > easier to understand, we should change it. I'm not sure if proposed scenario real. Let's assume there is a PF box with three NICs running on this awkward set up em1 ... 192.168.1.10 em0 em2 ... 192.168.1.10 em0 is attached to LAN em1 and em2 are facing to internet which is reachable with two different physical lines. both lines are connected via equipment, which uses fixed IP address 192.168.1.10 and PF admin has no way to change that. the 'address@interface' syntax is the only way to define rules: pass in on em0 from 172.16.0.0/16 route-to 192.168.1.10@em1 pass in on em0 from 172.17.0.0/16 route-to 192.168.1.10@em2 regardless of how much real such scenario is I believe it can currently work. > > But I don't like artificial restrictions. We don't know all use > cases. reply-to and route-to could be used for both in and out > rules. I have used them for strange divert-to on bridge setups. > It should stay that way. > OK I agree. regards sashan
Re: [External] : Re: pf route-to issues
hello, > > > +pf_route(struct pf_pdesc *pd, struct pf_state *s) > ... > > + if (pd->dir == PF_IN) { > > if (pf_test(AF_INET, PF_OUT, ifp, &m0) != PF_PASS) > > Yes, this is the correct logic. When the packet comes in, pf > overrides forwarding, tests the out rules, and sends it. For > outgoing packets on out route-to rules we have already tested the > rules. It also works for reply-to the other way around. the change above changes the behavior from 'do pf_test() when packet crosses interface' to 'call pf_test() at most two times'. the change essentially breaks the behavior I've mentioned earlier. table { 10.10.10.10, 172.16.1.1 } pass out on em0 from 192.168.1.0/24 to any route-to pass out on em1 from 192.168.1.0 to any nat-to (em1) pass out on em2 all the story goes as follows: destination/next-hop 10.10.10.10 is reached via em1 destination/next-hop 172.16.1.1 is reached via em2 for arbitrary reason we have to do NAT on em1 interface. with current PF things do work. trading current 'if (...)' for 'if (pd->dir == PF_IN)' breaks it. The NAT won't happen. I think this can be solved by using 'match ... route-to ...' rule suggested by David. However... I actually start to wonder does it hurt us to keep current behavior? Why do we want to change it? I'm getting worried this particular tid-bit will bring more harm than good. It also does not help much pfsync(4), if I understand things right. > > But what about dup-to? The packet is duplicated for both directions. > I guess the main use case for dup-to is implementing a monitor port. > There you have to pass packets stateless, otherwise it would not > work anyway. The strange semantics is not related to this diff. this is a good point. thanks and regards sahan
Re: [External] : Re: pf route-to issues
Hi, Some personal thoughts. I am happy when pf route-to gets simpler. Especially I have never understood what this address@interface syntax is used for. I cannot estimate what configuration is used by our cutomers in many installations. Simple syntax change address@interface -> address of next hob should be no problem. Slight semantic changes have to be dealt with. Current packet flow is complicated and may be inspired by old NAT behavior. As long it becomes more sane and easier to understand, we should change it. But I don't like artificial restrictions. We don't know all use cases. reply-to and route-to could be used for both in and out rules. I have used them for strange divert-to on bridge setups. It should stay that way. It would be nice to keep state-less route-to. I have found a special case with that in the code of our product. But it looks like dead code, so I would not object to remove state-less route-to for now. bluhm
Re: [External] : Re: pf route-to issues
Hello, > > > > If I understand the idea right, then basically 'match out on em0' > > figures out the new 'outbound interface' so either 'pass out on em1...' > > or > > 'pass out on em2...' will kick in. In other words: > > > > depending on the destination picked up from table, > > the route-to action will override the em0 interface to > > either em1 or em2. > > yes. > > i dont understand the kif lifetimes though. can we just point a > pdesc at an arbitrary kif or do we need ot reference count? > as long as we don't release NET_LOCK() (or PF_LOCK() in near future), the reference to kif should remain valid. kif is very own PF's abstraction of network interfaces. there are two ways how one can create a kif: there is either rule loaded, which refers to interface, which is not plumbed yet. the new interface gets plumbed and PF must have a kif for it. the reference count currently used makes sure we destroy kif when either interface is gone or when the rule, which refers kif is gone. hope I remember details above right. > > I think this might be way to go. > > > > My only concern now is that such change is subtle. I mean the > > > > pass out ... route-to > > > > will change behavior in sense that current code will dispatch > > packet to new interface and run pf_test() again. Once your diff > > will be in the same rule will be accepted, but will bring entirely > > different behaviour: just dispatch packet to new interface. > > yeah. > > the counter example is that i was extremely surprised when i > discovered that pf_test gets run again when the outgoing interface > is changed with a route-to rule. surprised because you've forgot about current model? which is: run pf_test() whenever packet crosses the interface? > > there's subtlety either way, we're just figuring out which one we're > going for. yes exactly. there are trade offs either way. > > I think this is acceptable. If this will cause a friction we can always > > adjust the code in follow up commit to allow state-less > > route-to/reply-to > > with no support from pfsync(4). > > if we're going to support route-to on match rules i think this will be > easy to implement. > I think there must be some broader consent on model change from current which is run pf_test() for every NIC crossing to new way, which is run pf_test() at most two times. > > > > > > > > > > > > > lastly, the "argument" or address specified with route-to (and > > > > > reply-to and dup-to) is a destination address, not a next-hop. this > > > > > has been discussed on the lists a couple of times before, so i won't > > > > > go over it again, except to reiterate that it allows pf to force > > > > > "sticky" path selection while opening up the possibility for ecmp > > > > > and failover for where that path traverses. > > > > > > > > I keep forgetting about it as I still stick to current > > > > interpretation. > > > > > > > > > > > > I've seen changes to pfctl. Diff below still allows rule: > > > > > > > > pass in on net0 from 192.168.1.0/24 to any route-to 10.10.10.10@em0 > > > > > > Is there use case for the @interface syntax apart from the current > > > route-to rules? If not, we can just delete it. > > > > perhaps I'm still not quite on the same page as you then. I also > > had no time to entirely test you diff. The way I understand your > > effort is to change route-to behavior such it will be using > > a destination instead of next-hop@interface. Or are you planning > > to keep current form ('route-to next-hop@interface') working? > > if we ignore route-to, what's the use case for the interface part of > address@interface? it doesnt seem to be accepted as part of an address > in other parts of the grammar: > > dlg@kbuild ~$ echo pass in from 192.168.0.0@vmx0 | sudo pfctl -nf - > stdin:1: @if syntax not permitted in from or to > stdin:1: skipping rule due to errors > stdin:1: rule expands to no valid combination > dlg@kbuild ~$ echo pass from 192.168.0.0@vmx0 | sudo pfctl -nf - > stdin:1: @if syntax not permitted in from or to > stdin:1: skipping rule due to errors > stdin:1: rule expands to no valid combination > dlg@kbuild ~$ echo pass nat-to 192.168.0.0@vmx0 | sudo pfctl -nf - > stdin:1: @if not permitted > stdin:1: nat-to and rdr-to require a direction > stdin:1: skipping rule due to errors > stdin:1: rule expands to no valid combination > dlg@kbuild ~$ echo pass nat-to 192.168.0.0@vmx0 | sudo pfctl -nf - > > if route-to isn't going to use it, can we cut the @interface part out of > fthe address parser? > that's also my understanding. the form 192.168.0.0@net0 is currently used by route-to et.al. only. So we can eventually let it go. however there is a strong objection from Sven, which wants it to keep it around. > > unless we don't
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 01:11:35PM +0100, Alexandr Nedvedicky wrote: > Hello, > > > > > > > > I understand that simple is better here, so I won't object > > > if we will lean towards simplified model above. However I still > > > would like to share my view on current PF. > > > > > > the way I understand how things (should) work currently is fairly > > > simple: > > > > > > we always run pf_test() as packet crosses interface. > > > packet can cross interface either in outbound or > > > inbound direction. > > > > That's how I understand the current code. I'm proposing that we change > > the semantics so they are: > > > > - we always run pf_test as a packet enters or leaves the network stack. > > - pf is able to filter or apply policy based on various attributes > > of the packet such as addresses and ports, but also metadata about > > the packet such as the current prio, or the interface it came > > from or is going to. > > - changing a packet or it's metadata does not cause a rerun of pf_test. > > - route-to on an incoming packet basically bypasses the default > > stack processing with a "fast route" out of the stack. > > > > > this way we can always create a complex route-to loops, > > > however it can also solve some route-to vs. NAT issues. > > > consider those fairly innocent rules: > > > > > > 8<---8<---8<--8< > > > table { 10.10.10.10, 172.16.1.1 } > > > > > > pass out on em0 from 192.168.1.0/24 to any route-to > > > pass out on em1 from 192.168.1.0 to any nat-to (em1) > > > pass out on em2 all > > > 8<---8<---8<--8< > > > > > > Rules above should currently work, but will stop if we will > > > go with simplified model. > > > > The entries in make the packet go out em1 and em2? > > yes they do. let's say 10.10.10.10 is reached over em1, 172.16.1.1 is > reached over em2. sorry I have not specified that in my earlier email. npz. > > > > > I'm ok with breaking configs like that. We don't run pf_test again for > > other changes to the packet, so if we do want to support something like > > that I think we should make the following work: > > > > # pf_pdesc kif is em0 > > match out on em0 from 192.168.1.0/24 to any route-to > > # pf_pdesc kif is now em1 > > pass out on em1 from 192.168.1.0 to any nat-to (em1) > > pass out on em2 all > > > > This is more in line with how NAT rules operate. > > If I understand the idea right, then basically 'match out on em0' > figures out the new 'outbound interface' so either 'pass out on em1...' or > 'pass out on em2...' will kick in. In other words: > > depending on the destination picked up from table, > the route-to action will override the em0 interface to > either em1 or em2. yes. i dont understand the kif lifetimes though. can we just point a pdesc at an arbitrary kif or do we need ot reference count? > I think this might be way to go. > > My only concern now is that such change is subtle. I mean the > > pass out ... route-to > > will change behavior in sense that current code will dispatch > packet to new interface and run pf_test() again. Once your diff > will be in the same rule will be accepted, but will bring entirely > different behaviour: just dispatch packet to new interface. yeah. the counter example is that i was extremely surprised when i discovered that pf_test gets run again when the outgoing interface is changed with a route-to rule. there's subtlety either way, we're just figuring out which one we're going for. > > > I'll be OK with your simplified model if it will make things > > > more explicit: > > > > > > route-to option should be applied on inbound rules > > > only > > > > This would restrict how we currently write rules. See below about how we > > would be using it. > > > > > reply-to option should be applied on outbound rule > > > only > > > > I'm using reply-to on inbound rules. On these boxes I have a service > > (it's a dns resolver running unbound) that is accessible only via > > gre(4) tunnels, and I need the replies to those connections to go > > out the same interface they came in on. I'm running an older version of > > my diff, so I can have rules like this to make it work: > > > > pass in quick on gre0 reply-to gre0:peer > > pass in quick on gre1 reply-to gre1:peer > > > > The DNS traffic isn't going through this box, the replies that > > unbound is generating match the state created by the inbound rule. > > > > If I'm remembering correctly, sthen@ had a similar use case. > > you are right, I did not think much of a local bound traffic. > in this case reply-to needs to be kept as-is. > > > > > > dup-to option can go either way (in/out) > > > > Yep. > > > > > does it make sense? IMO yes, because doing route-to > > >
Re: [External] : Re: pf route-to issues
Hello, > > > > I understand that simple is better here, so I won't object > > if we will lean towards simplified model above. However I still > > would like to share my view on current PF. > > > > the way I understand how things (should) work currently is fairly > > simple: > > > > we always run pf_test() as packet crosses interface. > > packet can cross interface either in outbound or > > inbound direction. > > That's how I understand the current code. I'm proposing that we change > the semantics so they are: > > - we always run pf_test as a packet enters or leaves the network stack. > - pf is able to filter or apply policy based on various attributes > of the packet such as addresses and ports, but also metadata about > the packet such as the current prio, or the interface it came > from or is going to. > - changing a packet or it's metadata does not cause a rerun of pf_test. > - route-to on an incoming packet basically bypasses the default > stack processing with a "fast route" out of the stack. > > > this way we can always create a complex route-to loops, > > however it can also solve some route-to vs. NAT issues. > > consider those fairly innocent rules: > > > > 8<---8<---8<--8< > > table { 10.10.10.10, 172.16.1.1 } > > > > pass out on em0 from 192.168.1.0/24 to any route-to > > pass out on em1 from 192.168.1.0 to any nat-to (em1) > > pass out on em2 all > > 8<---8<---8<--8< > > > > Rules above should currently work, but will stop if we will > > go with simplified model. > > The entries in make the packet go out em1 and em2? yes they do. let's say 10.10.10.10 is reached over em1, 172.16.1.1 is reached over em2. sorry I have not specified that in my earlier email. > > I'm ok with breaking configs like that. We don't run pf_test again for > other changes to the packet, so if we do want to support something like > that I think we should make the following work: > > # pf_pdesc kif is em0 > match out on em0 from 192.168.1.0/24 to any route-to > # pf_pdesc kif is now em1 > pass out on em1 from 192.168.1.0 to any nat-to (em1) > pass out on em2 all > > This is more in line with how NAT rules operate. If I understand the idea right, then basically 'match out on em0' figures out the new 'outbound interface' so either 'pass out on em1...' or 'pass out on em2...' will kick in. In other words: depending on the destination picked up from table, the route-to action will override the em0 interface to either em1 or em2. I think this might be way to go. My only concern now is that such change is subtle. I mean the pass out ... route-to will change behavior in sense that current code will dispatch packet to new interface and run pf_test() again. Once your diff will be in the same rule will be accepted, but will bring entirely different behaviour: just dispatch packet to new interface. > > > I'll be OK with your simplified model if it will make things > > more explicit: > > > > route-to option should be applied on inbound rules > > only > > This would restrict how we currently write rules. See below about how we > would be using it. > > > reply-to option should be applied on outbound rule > > only > > I'm using reply-to on inbound rules. On these boxes I have a service > (it's a dns resolver running unbound) that is accessible only via > gre(4) tunnels, and I need the replies to those connections to go > out the same interface they came in on. I'm running an older version of > my diff, so I can have rules like this to make it work: > > pass in quick on gre0 reply-to gre0:peer > pass in quick on gre1 reply-to gre1:peer > > The DNS traffic isn't going through this box, the replies that > unbound is generating match the state created by the inbound rule. > > If I'm remembering correctly, sthen@ had a similar use case. you are right, I did not think much of a local bound traffic. in this case reply-to needs to be kept as-is. > > > dup-to option can go either way (in/out) > > Yep. > > > does it make sense? IMO yes, because doing route-to > > on outbound path feels unnatural to me. > > I agree that it feels a bit unnatural, but so far all the route-to rules > I've been writing have been on pass out rules. That could be peculiar to > my setup, but we generally allow packets in on our external links, and > apply policy on the outbound interface heading towards the relevant > service. eg: > > block > pass in on $if_external > pass out on $if_webservers proto tcp to port { http https } > pass out on $if_relays proto { tcp udp } to port domain > > We'd be sprinkling route-to on these pass out rules to tie connections > to specific backends. > > > > > > > > > > > > >
Re: [External] : Re: pf route-to issues
On Sun, Jan 24, 2021 at 11:51 PM David Gwynne wrote: > On Mon, Jan 25, 2021 at 02:50:12AM +0100, Alexandr Nedvedicky wrote: > > Hello, > > > > > > > > ok. i don't know how to split up the rest of the change though. > > > > > > here's an updated diff that includes the rest of the kernel changes and > > > the pfctl and pf.conf tweaks. > > > > > > it's probably useful for me to try and explain at a high level what > > > i think the semantics should be, otherwise we might end up arguing > about > > > which bits of the current config i broke. > > > > > > so, from an extremely high level point of view, and apologies if > > > this is condescending, pf sits between the network stack and an > > > interface that a packet travels on. for connections handled by the > > > local box, this means packets come from the stack and get an output > > > interface selected by a route lookup, then pf checks it, and then > > > it goes out the selected interface. replies come into an interface, > > > get checked by pf, and then enter the stack. when forwarding, a > > > packet comes into an interface, pf checks it, the stack does a route > > > lookup to pick an interface, pf checks it again, and then it goes > > > out the interface. > > > > > > so what does it mean when route-to (or reply-to) gets involved? i'm > > > saying that when route-to is applied to a packet, pf takes the packet > > > away from the stack and immediately forwards it toward to specified > > > destination address. for a packet entering the system, ie, when the > > > packet is going from the interface into the stack, route-to should > > > pretend that it is forwarding the packet and basically push it > > > straight out an interface. however, like normal forwarding via the > > > stack, there might be some policy on packets leaving that interface > that > > > you want to apply, so pf should run pf_test in that situation so the > > > policy can be applied. this is especially useful if you need to apply > > > nat-to when packets leave a particular interface. > > > > > > however, if you route-to when a packet is on the way out of the > > > stack, i'm arguing that pf should not run again against that packet. > > > currently route-to rules run pf_test again if the interface the packet > > > is routed out of changes, which means pf runs multiple times against a > > > packet if rules keep changing which interface it goes out. this means > > > there's loop prevention in pf to mitigate against this, and weird > > > potentials for multiple states to be created when nat gets involved. > > > > > > for simplicity, both in terms of reasoning and code i think pf should > > > only be run once when a packet enters the system, and only once when it > > > leaves the system. the only reason i can come up with for running > > > pf_test multiple times when route-to changes the outgoing interface is > > > so you can check the packet with "pass out on $new_if" type rules. we > > > don't rerun pf again when nat/rdr changes addresses, so this feels > > > inconsistent to me. > > > > I understand that simple is better here, so I won't object > > if we will lean towards simplified model above. However I still > > would like to share my view on current PF. > > > > the way I understand how things (should) work currently is fairly > simple: > > > > we always run pf_test() as packet crosses interface. > > packet can cross interface either in outbound or > > inbound direction. > > That's how I understand the current code. I'm proposing that we change > the semantics so they are: > > - we always run pf_test as a packet enters or leaves the network stack. > - pf is able to filter or apply policy based on various attributes > of the packet such as addresses and ports, but also metadata about > the packet such as the current prio, or the interface it came > from or is going to. > - changing a packet or it's metadata does not cause a rerun of pf_test. > - route-to on an incoming packet basically bypasses the default > stack processing with a "fast route" out of the stack. > > > this way we can always create a complex route-to loops, > > however it can also solve some route-to vs. NAT issues. > > consider those fairly innocent rules: > > > > 8<---8<---8<--8< > > table { 10.10.10.10, 172.16.1.1 } > > > > pass out on em0 from 192.168.1.0/24 to any route-to > > pass out on em1 from 192.168.1.0 to any nat-to (em1) > > pass out on em2 all > > 8<---8<---8<--8< > > > > Rules above should currently work, but will stop if we will > > go with simplified model. > > The entries in make the packet go out em1 and em2? > > I'm ok with breaking configs like that. We don't run pf_test again for > other changes to the packet, so if we do want to support something like > that I think we should make the following work: > > # pf_pdesc kif is em0 >
Re: [External] : Re: pf route-to issues
On Mon, Jan 25, 2021 at 02:50:12AM +0100, Alexandr Nedvedicky wrote: > Hello, > > > > > ok. i don't know how to split up the rest of the change though. > > > > here's an updated diff that includes the rest of the kernel changes and > > the pfctl and pf.conf tweaks. > > > > it's probably useful for me to try and explain at a high level what > > i think the semantics should be, otherwise we might end up arguing about > > which bits of the current config i broke. > > > > so, from an extremely high level point of view, and apologies if > > this is condescending, pf sits between the network stack and an > > interface that a packet travels on. for connections handled by the > > local box, this means packets come from the stack and get an output > > interface selected by a route lookup, then pf checks it, and then > > it goes out the selected interface. replies come into an interface, > > get checked by pf, and then enter the stack. when forwarding, a > > packet comes into an interface, pf checks it, the stack does a route > > lookup to pick an interface, pf checks it again, and then it goes > > out the interface. > > > > so what does it mean when route-to (or reply-to) gets involved? i'm > > saying that when route-to is applied to a packet, pf takes the packet > > away from the stack and immediately forwards it toward to specified > > destination address. for a packet entering the system, ie, when the > > packet is going from the interface into the stack, route-to should > > pretend that it is forwarding the packet and basically push it > > straight out an interface. however, like normal forwarding via the > > stack, there might be some policy on packets leaving that interface that > > you want to apply, so pf should run pf_test in that situation so the > > policy can be applied. this is especially useful if you need to apply > > nat-to when packets leave a particular interface. > > > > however, if you route-to when a packet is on the way out of the > > stack, i'm arguing that pf should not run again against that packet. > > currently route-to rules run pf_test again if the interface the packet > > is routed out of changes, which means pf runs multiple times against a > > packet if rules keep changing which interface it goes out. this means > > there's loop prevention in pf to mitigate against this, and weird > > potentials for multiple states to be created when nat gets involved. > > > > for simplicity, both in terms of reasoning and code i think pf should > > only be run once when a packet enters the system, and only once when it > > leaves the system. the only reason i can come up with for running > > pf_test multiple times when route-to changes the outgoing interface is > > so you can check the packet with "pass out on $new_if" type rules. we > > don't rerun pf again when nat/rdr changes addresses, so this feels > > inconsistent to me. > > I understand that simple is better here, so I won't object > if we will lean towards simplified model above. However I still > would like to share my view on current PF. > > the way I understand how things (should) work currently is fairly simple: > > we always run pf_test() as packet crosses interface. > packet can cross interface either in outbound or > inbound direction. That's how I understand the current code. I'm proposing that we change the semantics so they are: - we always run pf_test as a packet enters or leaves the network stack. - pf is able to filter or apply policy based on various attributes of the packet such as addresses and ports, but also metadata about the packet such as the current prio, or the interface it came from or is going to. - changing a packet or it's metadata does not cause a rerun of pf_test. - route-to on an incoming packet basically bypasses the default stack processing with a "fast route" out of the stack. > this way we can always create a complex route-to loops, > however it can also solve some route-to vs. NAT issues. > consider those fairly innocent rules: > > 8<---8<---8<--8< > table { 10.10.10.10, 172.16.1.1 } > > pass out on em0 from 192.168.1.0/24 to any route-to > pass out on em1 from 192.168.1.0 to any nat-to (em1) > pass out on em2 all > 8<---8<---8<--8< > > Rules above should currently work, but will stop if we will > go with simplified model. The entries in make the packet go out em1 and em2? I'm ok with breaking configs like that. We don't run pf_test again for other changes to the packet, so if we do want to support something like that I think we should make the following work: # pf_pdesc kif is em0 match out on em0 from 192.168.1.0/24 to any route-to # pf_pdesc kif is now em1 pass out on em1 from 192.168.1.0 to any nat-to (em1) pass out on em2 all This is more in line with how NAT rules operate. > I'll be OK with
Re: [External] : Re: pf route-to issues
Hello, > > ok. i don't know how to split up the rest of the change though. > > here's an updated diff that includes the rest of the kernel changes and > the pfctl and pf.conf tweaks. > > it's probably useful for me to try and explain at a high level what > i think the semantics should be, otherwise we might end up arguing about > which bits of the current config i broke. > > so, from an extremely high level point of view, and apologies if > this is condescending, pf sits between the network stack and an > interface that a packet travels on. for connections handled by the > local box, this means packets come from the stack and get an output > interface selected by a route lookup, then pf checks it, and then > it goes out the selected interface. replies come into an interface, > get checked by pf, and then enter the stack. when forwarding, a > packet comes into an interface, pf checks it, the stack does a route > lookup to pick an interface, pf checks it again, and then it goes > out the interface. > > so what does it mean when route-to (or reply-to) gets involved? i'm > saying that when route-to is applied to a packet, pf takes the packet > away from the stack and immediately forwards it toward to specified > destination address. for a packet entering the system, ie, when the > packet is going from the interface into the stack, route-to should > pretend that it is forwarding the packet and basically push it > straight out an interface. however, like normal forwarding via the > stack, there might be some policy on packets leaving that interface that > you want to apply, so pf should run pf_test in that situation so the > policy can be applied. this is especially useful if you need to apply > nat-to when packets leave a particular interface. > > however, if you route-to when a packet is on the way out of the > stack, i'm arguing that pf should not run again against that packet. > currently route-to rules run pf_test again if the interface the packet > is routed out of changes, which means pf runs multiple times against a > packet if rules keep changing which interface it goes out. this means > there's loop prevention in pf to mitigate against this, and weird > potentials for multiple states to be created when nat gets involved. > > for simplicity, both in terms of reasoning and code i think pf should > only be run once when a packet enters the system, and only once when it > leaves the system. the only reason i can come up with for running > pf_test multiple times when route-to changes the outgoing interface is > so you can check the packet with "pass out on $new_if" type rules. we > don't rerun pf again when nat/rdr changes addresses, so this feels > inconsistent to me. I understand that simple is better here, so I won't object if we will lean towards simplified model above. However I still would like to share my view on current PF. the way I understand how things (should) work currently is fairly simple: we always run pf_test() as packet crosses interface. packet can cross interface either in outbound or inbound direction. this way we can always create a complex route-to loops, however it can also solve some route-to vs. NAT issues. consider those fairly innocent rules: 8<---8<---8<--8< table { 10.10.10.10, 172.16.1.1 } pass out on em0 from 192.168.1.0/24 to any route-to pass out on em1 from 192.168.1.0 to any nat-to (em1) pass out on em2 all 8<---8<---8<--8< Rules above should currently work, but will stop if we will go with simplified model. I'll be OK with your simplified model if it will make things more explicit: route-to option should be applied on inbound rules only reply-to option should be applied on outbound rule only dup-to option can go either way (in/out) does it make sense? IMO yes, because doing route-to on outbound path feels unnatural to me. > > this also breaks the ability to do route-to without states. is there a > reason to do that apart from the DSR type things? did we agree that > those use cases could be handled by sloppy states instead? If I remember correct we need to make 'keep state' mandatory for route-to so it can work well with pfsync(4), right? > > lastly, the "argument" or address specified with route-to (and > reply-to and dup-to) is a destination address, not a next-hop. this > has been discussed on the lists a couple of times before, so i won't > go over it again, except to reiterate that it allows pf to force > "sticky" path selection while opening up the possibility for ecmp > and failover for where that path traverses. I keep forgetting about it as I still stick to current interpretation. I've seen changes to pfctl. Diff below still allows rule: pass in on net0 from 192.168.1.0/24 to any route