On Mon, Jan 25, 2021 at 02:50:12AM +0100, Alexandr Nedvedicky wrote:
> Hello,
>
> >
> > ok. i don't know how to split up the rest of the change though.
> >
> > here's an updated diff that includes the rest of the kernel changes and
> > the pfctl and pf.conf tweaks.
> >
> > it's probably useful for me to try and explain at a high level what
> > i think the semantics should be, otherwise we might end up arguing about
> > which bits of the current config i broke.
> >
> > so, from an extremely high level point of view, and apologies if
> > this is condescending, pf sits between the network stack and an
> > interface that a packet travels on. for connections handled by the
> > local box, this means packets come from the stack and get an output
> > interface selected by a route lookup, then pf checks it, and then
> > it goes out the selected interface. replies come into an interface,
> > get checked by pf, and then enter the stack. when forwarding, a
> > packet comes into an interface, pf checks it, the stack does a route
> > lookup to pick an interface, pf checks it again, and then it goes
> > out the interface.
> >
> > so what does it mean when route-to (or reply-to) gets involved? i'm
> > saying that when route-to is applied to a packet, pf takes the packet
> > away from the stack and immediately forwards it toward the specified
> > destination address. for a packet entering the system, ie, when the
> > packet is going from the interface into the stack, route-to should
> > pretend that it is forwarding the packet and basically push it
> > straight out an interface. however, like normal forwarding via the
> > stack, there might be some policy on packets leaving that interface that
> > you want to apply, so pf should run pf_test in that situation so the
> > policy can be applied. this is especially useful if you need to apply
> > nat-to when packets leave a particular interface.
> >
> > however, if you route-to when a packet is on the way out of the
> > stack, i'm arguing that pf should not run again against that packet.
> > currently route-to rules run pf_test again if the interface the packet
> > is routed out of changes, which means pf runs multiple times against a
> > packet if rules keep changing which interface it goes out. this means
> > there's loop prevention in pf to mitigate against this, and weird
> > potentials for multiple states to be created when nat gets involved.
> >
> > for simplicity, both in terms of reasoning and code i think pf should
> > only be run once when a packet enters the system, and only once when it
> > leaves the system. the only reason i can come up with for running
> > pf_test multiple times when route-to changes the outgoing interface is
> > so you can check the packet with "pass out on $new_if" type rules. we
> > don't rerun pf again when nat/rdr changes addresses, so this feels
> > inconsistent to me.
>
> I understand that simple is better here, so I won't object
> if we will lean towards simplified model above. However I still
> would like to share my view on current PF.
>
> the way I understand how things (should) work currently is fairly simple:
>
>     we always run pf_test() as packet crosses interface.
>     packet can cross interface either in outbound or
>     inbound direction.

That's how I understand the current code. I'm proposing that we change
the semantics so they are:

- we always run pf_test as a packet enters or leaves the network stack.
- pf is able to filter or apply policy based on various attributes
  of the packet such as addresses and ports, but also metadata about
  the packet such as the current prio, or the interface it came
  from or is going to.
- changing a packet or its metadata does not cause a rerun of pf_test.
- route-to on an incoming packet basically bypasses the default
  stack processing with a "fast route" out of the stack.

> this way we can always create complex route-to loops,
> however it can also solve some route-to vs. NAT issues.
> consider those fairly innocent rules:
>
> --------8<---------------8<---------------8<------------------8<--------
>     table <hops> { 10.10.10.10, 172.16.1.1 }
>
>     pass out on em0 from 192.168.1.0/24 to any route-to <hops>
>     pass out on em1 from 192.168.1.0 to any nat-to (em1)
>     pass out on em2 all
> --------8<---------------8<---------------8<------------------8<--------
>
> Rules above should currently work, but will stop if we will
> go with simplified model.

The entries in <hops> make the packet go out em1 and em2?

I'm ok with breaking configs like that. We don't run pf_test again for
other changes to the packet, so if we do want to support something
like that I think we should make the following work:

    # pf_pdesc kif is em0
    match out on em0 from 192.168.1.0/24 to any route-to <hops>
    # pf_pdesc kif is now em1
    pass out on em1 from 192.168.1.0 to any nat-to (em1)
    pass out on em2 all

This is more in line with how NAT rules operate.

> I'll be OK with your simplified model if it will make things
> more explicit:
>
>     route-to option should be applied on inbound rules
>     only

This would restrict how we currently write rules. See below about how
we would be using it.

>     reply-to option should be applied on outbound rule
>     only

I'm using reply-to on inbound rules.
On these boxes I have a service (it's a dns resolver running unbound)
that is accessible only via gre(4) tunnels, and I need the replies to
those connections to go out the same interface they came in on. I'm
running an older version of my diff, so I can have rules like this to
make it work:

    pass in quick on gre0 reply-to gre0:peer
    pass in quick on gre1 reply-to gre1:peer

The DNS traffic isn't going through this box, the replies that unbound
is generating match the state created by the inbound rule.

If I'm remembering correctly, sthen@ had a similar use case.

>     dup-to option can go either way (in/out)

Yep.

> does it make sense? IMO yes, because doing route-to
> on outbound path feels unnatural to me.

I agree that it feels a bit unnatural, but so far all the route-to
rules I've been writing have been on pass out rules. That could be
peculiar to my setup, but we generally allow packets in on our external
links, and apply policy on the outbound interface heading towards the
relevant service. eg:

    block
    pass in on $if_external
    pass out on $if_webservers proto tcp to port { http https }
    pass out on $if_relays proto { tcp udp } to port domain

We'd be sprinkling route-to on these pass out rules to tie connections
to specific backends.

>
> </snip>
>
> >
> > this also breaks the ability to do route-to without states. is there a
> > reason to do that apart from the DSR type things? did we agree that
> > those use cases could be handled by sloppy states instead?
>
> If I remember correctly we need to make 'keep state' mandatory
> for route-to so it can work well with pfsync(4), right?

That's correct.

> >
> > lastly, the "argument" or address specified with route-to (and
> > reply-to and dup-to) is a destination address, not a next-hop.
> > this has been discussed on the lists a couple of times before, so i won't
> > go over it again, except to reiterate that it allows pf to force
> > "sticky" path selection while opening up the possibility for ecmp
> > and failover for where that path traverses.
>
> I keep forgetting about it as I still stick to current interpretation.
>
>
> I've seen changes to pfctl. Diff below still allows rule:
>
>     pass in on net0 from 192.168.1.0/24 to any route-to 10.10.10.10@em0

Is there a use case for the @interface syntax apart from the current
route-to rules? If not, we can just delete it.

> it also allows rule:
>
>     pass in on net0 from 192.168.1.0/24 to any route-to em0
>
> I think we don't want to support those two anymore, is that correct?

em0 gets resolved to the addresses on the interface. It's a silly
config, but it's not wrong.

    $ echo pass in on vmx0 from 192.168.1.0/24 to any route-to vmx0 | pfctl -vnf -
    pass in on vmx0 inet from 192.168.1.0/24 to any flags S/SA route-to 192.0.2.34

It does raise the question of what pf_route should do if it resolves
something with RTF_LOCAL set. Or RTF_BLACKHOLE and RTF_REJECT for that
matter.

dlg

>
> thanks and
> regards
> sashan
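
For anyone following the thread, the disagreement over the <hops> example comes down to how many times pf_test runs for one packet. The Python below is only a toy model of the two sets of semantics as they are described in this thread, not pf code. Rule, match, current_model and proposed_model are made-up names. It assumes the <hops> entry sends the packet out em1 (asked as a question above, not confirmed), that the packet arrived on a hypothetical interface em3, and that the stack's route lookup picked em0.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    direction: str                        # "in" or "out"
    iface: str                            # interface the rule matches on
    route_to_iface: Optional[str] = None  # assumed egress interface for route-to


def match(rules: List[Rule], direction: str, iface: str) -> Optional[Rule]:
    """Last matching rule wins, as with non-quick pf rules."""
    hit = None
    for rule in rules:
        if rule.direction == direction and rule.iface == iface:
            hit = rule
    return hit


def current_model(rules, in_iface, stack_out_iface, max_loops=4):
    """pf_test reruns whenever route-to changes the outgoing interface."""
    calls: List[Tuple[str, str]] = [("in", in_iface)]
    rule = match(rules, "in", in_iface)
    out_iface = rule.route_to_iface if rule and rule.route_to_iface else stack_out_iface
    for _ in range(max_loops):            # crude stand-in for loop prevention
        calls.append(("out", out_iface))
        rule = match(rules, "out", out_iface)
        if not rule or not rule.route_to_iface or rule.route_to_iface == out_iface:
            break
        out_iface = rule.route_to_iface
    return calls


def proposed_model(rules, in_iface, stack_out_iface):
    """pf_test runs once on the way in and once on the way out."""
    calls: List[Tuple[str, str]] = [("in", in_iface)]
    rule = match(rules, "in", in_iface)
    out_iface = rule.route_to_iface if rule and rule.route_to_iface else stack_out_iface
    calls.append(("out", out_iface))
    # route-to on the outbound rule may still redirect the packet,
    # but it does not trigger another pf_test pass.
    return calls


# The three "pass out" rules quoted in the thread, reduced to this model.
rules = [
    Rule("out", "em0", route_to_iface="em1"),  # route-to <hops>, assumed via em1
    Rule("out", "em1"),                        # the nat-to (em1) rule
    Rule("out", "em2"),
]

print(current_model(rules, "em3", "em0"))   # em1 rule is evaluated
print(proposed_model(rules, "em3", "em0"))  # em1 rule is never evaluated
```

In this model the nat-to rule on em1 is only reached under the current semantics, which is the breakage being discussed; the "match out ... route-to" form suggested above would be one way to get the same effect within a single pass.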