Re: Issue with pf route-to and routing tables
On Mon, 15 Apr 2024, at 21:33, Thomas wrote:
> Hi all,
>
> I'm greatly enjoying OpenBSD and have it on most of my devices as I try to set up my "perfect lab". I would like some feedback / thoughts about one behaviour which I don't quite get.
>
> I have a VM for the world-facing side of my network. I have a wireguard network to link it up to a home router and other devices. My wireguard traffic is coming onto my VM through wg0.
>
> On my home router, I'm redirecting all wifi traffic to wg0 using the routing tables like so:
>
> default      192.168.0.1  wg0
> IP_VM        IP_Gateway   bse0
> 192.168.0.1  wg0          wg0
>
> And natting outbound traffic on wg0 like so:
>
> pass out on wg0 from $int_if:network nat-to wg0
>
> I wanted to try out using route-to on my VM instead of using a different rdomain, or just to try something else. I have another wireguard tunnel, wg1, to relay my internal traffic further.
>
> I did not touch the routing tables at all and have something like:
>
> pass in on wg0 inet from wg0:network to !wg0:network route-to wg1
> pass out on wg1 nat-to wg1
>
> Works like a charm. Now what I don't get is that for troubleshooting purposes, I needed to send some traffic to the world on my VM (instead of onward through wg1) and I initially tried:
>
> pass in log on wg0 inet from wg0:network to !wg0:network route-to vio0
> pass out log on $vio0 nat-to $vio0
>
> Routing tables:
>
> default     IP_Gateway   vio0
> IP_Gateway  MAC_Gateway  vio0
>
> But this does not work. Removing "route-to vio0" does work, eg.
>
> pass in log on wg0 inet from wg0:network to !wg0:network #route-to vio0
> pass out log on vio0 nat-to vio0

Never mind, I forgot to check this mailing list and read that I needed to put the gateway address on this line:

pass in log on wg0 inet from wg0:network to !wg0:network route-to IP_GATEWAY

I suppose the remaining oddity is that this works with wg1, which may be a corner case of the wireguard interface: it's assigned xxx.xxx.xxx.xxx/32 by the VPN provider, so destination address = source address?

One side question as I consider using rdomain. man 4 rdomain gives as an example:

A pf.conf(5) snippet to block incoming port 80, and nat-to and move to rtable 0 on interface em1:

block in on rdomain 4 proto tcp to any port 80
match out on rdomain 4 to !$internal_net nat-to (em1) rtable 0

Should it not be "match in" in the 2nd line? man 5 pf.conf reads:

rtable number
    Used to select an alternate routing table for the routing lookup. Only effective before the route lookup happened, i.e. when filtering inbound.

Or does it work because it's a match statement?

Thanks all,
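If I read the post-6.9 route-to behaviour right (this is my reading, not confirmed in this thread), route-to now takes a next-hop address rather than an interface; an interface name is resolved to that interface's own address, which only happens to be a usable next hop on a point-to-point tunnel like wg1 with its /32. On a broadcast interface the real gateway has to be named, which would explain the fix above. A sketch, with IP_GATEWAY as a placeholder for the actual next hop on vio0:

```
# route-to wants a next-hop address; "route-to vio0" resolves to vio0's own
# address, which is not a gateway on a broadcast network
pass in log on wg0 inet from wg0:network to !wg0:network route-to IP_GATEWAY
pass out log on vio0 nat-to vio0
```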
Issue with pf route-to and routing tables
Hi all,

I'm greatly enjoying OpenBSD and have it on most of my devices as I try to set up my "perfect lab". I would like some feedback / thoughts about one behaviour which I don't quite get.

I have a VM for the world-facing side of my network. I have a wireguard network to link it up to a home router and other devices. My wireguard traffic is coming onto my VM through wg0.

On my home router, I'm redirecting all wifi traffic to wg0 using the routing tables like so:

default      192.168.0.1  wg0
IP_VM        IP_Gateway   bse0
192.168.0.1  wg0          wg0

And natting outbound traffic on wg0 like so:

pass out on wg0 from $int_if:network nat-to wg0

I wanted to try out using route-to on my VM instead of using a different rdomain, or just to try something else. I have another wireguard tunnel, wg1, to relay my internal traffic further.

I did not touch the routing tables at all and have something like:

pass in on wg0 inet from wg0:network to !wg0:network route-to wg1
pass out on wg1 nat-to wg1

Works like a charm. Now what I don't get is that for troubleshooting purposes, I needed to send some traffic to the world on my VM (instead of onward through wg1) and I initially tried:

pass in log on wg0 inet from wg0:network to !wg0:network route-to vio0
pass out log on $vio0 nat-to $vio0

Routing tables:

default     IP_Gateway   vio0
IP_Gateway  MAC_Gateway  vio0

But this does not work. Removing "route-to vio0" does work, eg.

pass in log on wg0 inet from wg0:network to !wg0:network #route-to vio0
pass out log on vio0 nat-to vio0

I'm guessing that this may have to be since it's routed "twice"? Eg. routed-to and a second time with the default route of the routing tables? So I understand why route-to is not necessary in this case, but I would think route-to should still work, and that means I don't get how it's working? I've tried using pflog0 to check the above rules but cannot see any difference: in both cases, it's passing in on wg0 through vio0 and the src IP is rewritten to the VM public IP.
I'm thinking of more complex rules to split traffic from wg0 between wg1 and vio0 based on ports, and using route-to vio0 seemed the easiest way to do so.

Thanks in advance,
Thomas
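A rough sketch of such a port-based split, building on the rules quoted above (the port list and IP_GATEWAY are placeholders; note that in pf the last matching rule wins, so the more specific rule comes second):

```
# default: relay everything from wg0 onward through wg1
pass in on wg0 inet from wg0:network to !wg0:network route-to wg1
# but send web traffic straight out via vio0's next hop
pass in on wg0 inet proto tcp from wg0:network to !wg0:network port { http, https } route-to IP_GATEWAY
# matching outbound NAT on both exits
pass out on wg1 nat-to wg1
pass out on vio0 nat-to vio0
```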
Re: pf nat64 rule not matching
> I don't think there is at present. There are no "only use v4" or "only use v6" addresses modifiers, and pf isn't figuring out for itself that it only makes sense to use addresses from the relevant family for af-to translation addresses (although it _does_ do this for nat-to).

Good to know. I was able to get this working by using ($wan) instead of ($wan:0), fwiw.

> Ah I meant that the router should not use the local unbound dns64 resolver for its own traffic - otherwise it won't be able to reach v4 hosts because there won't be anything to handle the translation. Either point it off-machine (ISP or public resolver) or run another local resolver for its own traffic.

Ah, that makes sense. I was totally doing this. *facepalm* I've changed it to use Quad9. Thanks for the follow-up!

> Please keep replies on the mailing list.

My bad! Still getting used to the `mail` client and how this mailing list operates in general, and I see now the default behavior is to do a reply-all that includes your personal email in addition to the mailing list. Apologies!
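A minimal form of the fix reported above, using the $wan macro from the original pf.conf (a sketch, not the poster's literal ruleset): the parenthesized interface is re-evaluated at runtime, so a dynamic ISP address is picked up automatically without hard-coding.

```
# af-to takes the v4 translation address from the (dynamic) wan interface;
# ($wan) rather than ($wan:0) was what worked in this thread
pass in inet6 to 64:ff9b::/96 af-to inet from ($wan)
```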
Re: pf nat64 rule not matching
On 2024-03-15, Evan Sherwood wrote:
>
> Is there a way to configure this without hard-coding my IPv4 address? I do not think my IPv4 address from my ISP is static, thus my original interest in the ($wan:0) form.

I don't think there is at present. There are no "only use v4" or "only use v6" addresses modifiers, and pf isn't figuring out for itself that it only makes sense to use addresses from the relevant family for af-to translation addresses (although it _does_ do this for nat-to).

>> Regarding the other rules and tests, the ::1 rule is wrong, packets outgoing on the network won't have a ::1 address, try "!received-on any", and packets sourced from the router itself won't hit the af-to rule so tests need to be from another machine (and probably best use different DNS servers not doing dns64 on the router).
>
> Thanks for this follow-up. You're right that I was trying to only target traffic that originated from the router itself with this rule. I had figured out that the tests needed to be from another machine, though that did take me a while.
>
> What are the reasons for doing dns64 on a different machine?

Ah I meant that the router should not use the local unbound dns64 resolver for its own traffic - otherwise it won't be able to reach v4 hosts because there won't be anything to handle the translation. Either point it off-machine (ISP or public resolver) or run another local resolver for its own traffic.

-- 
Please keep replies on the mailing list.
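For context, the local unbound dns64 setup being discussed is a small unbound.conf fragment along these lines (a sketch; the prefix is the well-known NAT64 prefix used throughout this thread):

```
server:
    # synthesize AAAA records for v4-only names using the NAT64 prefix
    module-config: "dns64 validator iterator"
    dns64-prefix: 64:ff9b::/96
```

The router's own /etc/resolv.conf would then point at a different, non-dns64 resolver, since traffic sourced from the router never hits the af-to rule.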
Re: pf nat64 rule not matching
> Try changing ($wan:0) to ($wan) and see what happens.

Huh, that worked! Thanks!
Re: pf nat64 rule not matching
Try changing ($wan:0) to ($wan) and see what happens.
Re: pf nat64 rule not matching
> Can you try if the same happens with a more specific rule (for testing)?
>
> i.e.:
>
> pass in on igc3 inet6 from "put actual v6 prefix here" to 64:ff9b::/96 af-to inet from "actual IP on igc0"/32

This worked! Specifically, I think the ($wan:0) was the problem. I could've sworn I tried this with the actual IP and it wasn't working before, but I might've deleted the inet6 at that point, so maybe I created a new problem then... which you also pointed out:

> I am suspecting that the missing inet6 may lead to some confusion.

Is there a way to configure this without hard-coding my IPv4 address? I do not think my IPv4 address from my ISP is static, thus my original interest in the ($wan:0) form.

> Alternatively, remove the block rules; URPF may be an issue here, if you lack a route for the /96.

I had tried commenting out all of the block rules and saw no change. Tcpdump also showed no blocks, fwiw.

> Regarding the other rules and tests, the ::1 rule is wrong, packets outgoing on the network won't have a ::1 address, try "!received-on any", and packets sourced from the router itself won't hit the af-to rule so tests need to be from another machine (and probably best use different DNS servers not doing dns64 on the router).

Thanks for this follow-up. You're right that I was trying to only target traffic that originated from the router itself with this rule. I had figured out that the tests needed to be from another machine, though that did take me a while.

What are the reasons for doing dns64 on a different machine?
Re: pf nat64 rule not matching
On 2024-03-15, Tobias Fiebig via misc wrote:
>
> Moin,
>> # perform nat64 (NOT WORKING)
>> pass in to 64:ff9b::/96 af-to inet from ($wan:0)
>
> Can you try if the same happens with a more specific rule (for testing)?
>
> i.e.:
>
> pass in on igc3 inet6 from "put actual v6 prefix here" to 64:ff9b::/96 af-to inet from "actual IP on igc0"/32

"actual IP on igc0" is a good idea. If I try a similar rule without () using an interface with v4+v6 addresses, pfctl rejects it due to af mismatch.

> I am suspecting that the missing inet6 may lead to some confusion. Alternatively, remove the block rules; URPF may be an issue here, if you lack a route for the /96.

"match log (matches)" and "tcpdump -nei pflog0" are your friends for figuring out which rules are used. I suspect the urpf too.

Regarding the other rules and tests, the ::1 rule is wrong, packets outgoing on the network won't have a ::1 address, try "!received-on any", and packets sourced from the router itself won't hit the af-to rule so tests need to be from another machine (and probably best use different DNS servers not doing dns64 on the router).
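That debugging workflow, spelled out as a sketch (tcpdump flags: -n no name resolution, -e show the pflog header including which rule matched, -i interface):

```
# in pf.conf: log every rule each packet matches, not just the final one
match log (matches)

# then reload and watch pf's decisions live from a shell:
#   pfctl -f /etc/pf.conf
#   tcpdump -nei pflog0
```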
Re: pf nat64 rule not matching
Moin,

> # perform nat64 (NOT WORKING)
> pass in to 64:ff9b::/96 af-to inet from ($wan:0)

Can you try if the same happens with a more specific rule (for testing)?

i.e.:

pass in on igc3 inet6 from "put actual v6 prefix here" to 64:ff9b::/96 af-to inet from "actual IP on igc0"/32

I am suspecting that the missing inet6 may lead to some confusion. Alternatively, remove the block rules; URPF may be an issue here, if you lack a route for the /96.

A minimal (== based on the default pf.conf) config working for me:

```
# $OpenBSD: pf.conf,v 1.55 2017/12/03 20:40:04 sthen Exp $
#
# See pf.conf(5) and /etc/examples/pf.conf

set skip on lo

block return    # block stateless traffic
pass            # establish keep-state

# By default, do not permit remote connections to X11
block return in on ! lo0 proto tcp to port 6000:6010

# Port build user does not need network
block return out log proto {tcp udp} user _pbuild

pass in on vio0 inet6 from 2a06:d1c0:deac:1:d5:64:a115:1 to 2a06:d1c7:a:4764::/96 af-to inet from 193.104.168.184/29 random
```

With best regards,
Tobias
pf nat64 rule not matching
Hello,

I'm trying to get a basic OpenBSD NAT64 router setup. I'm following along with these instructions:

- https://blog.obtusenet.com/dns64-nat64-on-openbsd/

My unbound instance looks like it's correctly configured and returning correct IPv6 addresses, so that's good.

# dig ipv4.google.com +short
ipv4.l.google.com.
64:ff9b::8efa:bc0e

However, the pf rule using af-to does not appear to do anything and I haven't been able to figure out why. When I try to ping6, I get 100% packet loss. I inspected packets through tcpdump (after adding "log" to everything in pf.conf) and nothing seems to be getting blocked, though it also appears the 64:ff9b::/96 addresses are not being translated either; I think the packets are passing through pf unchanged (the rule doesn't apply, but I don't know why).

Here is my entire pf.conf:

wan = "igc0"
trusted = "igc1"
untrusted = "igc2"
iot = "igc3"
cerberus_ssh = "36285"

table <martians> persist file "/etc/martians"

set block-policy drop
set loginterface egress
set skip on lo0

block in log quick from urpf-failed
block in log quick on egress from <martians> to any
block return out log quick on egress from any to <martians>
block return log all
pass

# allow IPv6 PD from ISP
pass in inet6 proto udp from fe80::/10 port dhcpv6-server to fe80::/10 port dhcpv6-client no state

# allow ICMPv6 traffic (necessary for IPv6 to work)
pass inet6 proto icmp6 all

# perform nat64 (NOT WORKING)
pass in to 64:ff9b::/96 af-to inet from ($wan:0)

# allow outbound queries from local unbound and NTP
pass out inet6 proto { tcp, udp } from ::1 to port { domain, ntp }

# allow DNS & NTP queries from the iot network
pass in on $iot proto { tcp, udp } from $iot:network to port { domain, ntp }

# allow ssh, http, & https
pass inet6 proto tcp to port { ssh, http, https, $cerberus_ssh }

I have IP forwarding turned on:

# sysctl | grep forwarding
net.inet.ip.forwarding=1
net.inet.ip.mforwarding=0
net.inet6.ip6.forwarding=1
net.inet6.ip6.mforwarding=1

I have an IPv4 and IPv6 address for igc0 via autoconf.
Here's a rough sketch of my network topology:

+-----------+
| ISP modem |
+-----------+
      |
      | igc0
+---------------------------+
| cerberus (OpenBSD router) |
+---------------------------+
   igc1    igc2    igc3
    |       |       |
    |      ...     ...
+-------------------------+
| vulpes (OpenBSD client) |
+-------------------------+

From both vulpes and cerberus, ping6 ipv4.google.com hangs and never returns.

I tried substituting ($wan:0) for my actual IPv4 address assigned to igc0, but I got no change in behavior. I read in the man page that :0 does not include aliases when used on an interface. When I print the rules out using pfctl -vvsr, it gets expanded to (igc0:0:1), which looks weird and I don't understand why. My understanding is that it should be "... af-to inet from IPV4_ADDRESS_OF_WAN_IF", but I don't know if (igc0:0:1) is the IPv4 address of igc0, and I can't figure out how to verify if that's right... or even if that's the problem in the first place and I'm chasing a red herring.

I feel like I'm missing something, but I can't see it. The Book of PF doesn't have any information on NAT64 that I could see, and the man page for pf.conf shows an example of what I'm already doing with no additional instructions. I've found maybe 3 articles about NAT64 on OpenBSD through searching, but none give me any more context or clues beyond the one I mentioned earlier. I'd appreciate any help I could get!

Evan

Here's my dmesg:

OpenBSD 7.4 (GENERIC.MP) #1397: Tue Oct 10 09:02:37 MDT 2023
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8332189696 (7946MB)
avail mem = 8059916288 (7686MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.5 @ 0x75d9f000 (122 entries)
bios0: vendor American Megatrends International, LLC.
version "ALN4L102" date 11/08/2023 bios0: Default string Default string efi0 at bios0: UEFI 2.8 efi0: American Megatrends rev 0x5001a acpi0 at bios0: ACPI 6.4 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP FIDT SSDT SSDT SSDT SSDT SSDT HPET APIC MCFG SSDT UEFI RTCT PSDS NHLT LPIT SSDT SSDT DBGP DBG2 SSDT DMAR FPDT SSDT SSDT SSDT SSDT TPM2 PHAT WSMT acpi0: wakeup devices PEGP(S4) PEGP(S4) PEGP(S4) SIO1(S3) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) RP12(S4) PXSX(S4) RP13(S4) PXSX(S4) RP14(S4) PXSX(S4) [...] acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 1920 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbu
Re: 10gbps pf nat firewall ix to mcx
j...@openbsd.org [j...@openbsd.org] wrote:
> On Sun, Feb 11, 2024 at 10:42:32AM -0800, Chris Cappuccio wrote:
> > huh, after i migrated nat fw from 82599 (ix) with LRO on (default) to a CX4121A (mcx) flashed to latest nvidia firmware and now i'm getting 900mbps on single tcp throughput
> > (endpoints still using lro on em and ix)
>
> em(4) does not support the LRO feature, just TSO with mglocker's diff.
>
> > and very consistently getting close to the full 1gbps throughput on single tcp connections now instead of slower and slightly varying results. guess i should go back and test ix with LRO off on the pf box.
>
> Sorry, I don't get your problem. You changed your firewall NICs from ix(4) to mcx(4) and the throughput got slower? Or, is the speed varying between 0.9 gbps and 1.0 gbps?

it got faster, notably faster and more consistent TCP performance as tested with an ix sender, through the mcx firewall, to a 1Gbps em endpoint, 1500 byte normal mtu, all default settings across the board

i would have to test more to understand what was going on, but this took me by surprise

chris
Re: 10gbps pf nat firewall ix to mcx
On Sun, Feb 11, 2024 at 10:42:32AM -0800, Chris Cappuccio wrote:
> huh, after i migrated nat fw from 82599 (ix) with LRO on (default) to a CX4121A (mcx) flashed to latest nvidia firmware and now i'm getting 900mbps on single tcp throughput
> (endpoints still using lro on em and ix)

em(4) does not support the LRO feature, just TSO with mglocker's diff.

> and very consistently getting close to the full 1gbps throughput on single tcp connections now instead of slower and slightly varying results. guess i should go back and test ix with LRO off on the pf box.

Sorry, I don't get your problem. You changed your firewall NICs from ix(4) to mcx(4) and the throughput got slower? Or, is the speed varying between 0.9 gbps and 1.0 gbps?
10gbps pf nat firewall ix to mcx
huh, after i migrated nat fw from 82599 (ix) with LRO on (default) to a CX4121A (mcx) flashed to latest nvidia firmware, i'm now getting 900mbps on single tcp throughput

(endpoints still using lro on em and ix)

and very consistently getting close to the full 1gbps throughput on single tcp connections now, instead of slower and slightly varying results. guess i should go back and test ix with LRO off on the pf box.
Allowing i2p bittorrent traffic in a transparently proxied environment with pf
I have set up a transparent Tor proxy with the following pf ruleset: https://paste.c-net.org/WharfSeasick

Most importantly, it routes all TCP and DNS traffic through the Tor network. Now I want another rule for I2P bittorrent, meaning a rule for traffic that must be routed through I2P AND must be bittorrent traffic AND doesn't go through Tor. I got the I2P and not-Tor part insofar as I established:

pass out proto { tcp udp } user _i2pd

but my problem is that I can't be sure this traffic is bittorrent rather than a hypothetical attacker. Ideally, I thought, it would be to have a tag for bittorrent like I have for DNS and TCP. A tag is no guarantee that traffic is legit, but it would be an approximation. If my understanding of tags is correct, it would be safer to assume traffic tagged "bittorrent" is really bittorrent, as opposed to traffic only having a certain port number. If I'm mistaken and tags aren't safer and more practical, is there any other solution? Is there any way to make a rule to ensure traffic passed out by this rule will be only bittorrent?

Thanks in advance
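For what it's worth, pf tags only carry information that your own ruleset attached earlier; pf never inspects payloads, so a tag is exactly as trustworthy as the rule that set it (here, a user match). A sketch along those lines ("I2P" is an arbitrary tag name; _i2pd is the daemon user assumed above):

```
# tag anything the i2pd daemon user sends; the tag proves "matched that
# rule", not "is really bittorrent"
match out proto { tcp udp } user _i2pd tag I2P
pass out quick tagged I2P
```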
Re: pf queues
> On Thu, Nov 30, 2023 at 03:55:49PM +0300, 4 wrote:
>>
>> "cbq can entirely be expressed in it" ok. so how do i set priorities for queues in hfsc for my local (not for a router above that knows nothing about my existence. tos is an absolutely unviable concept in the real world) pf-router? i don't see a word about it in man pf.conf
>>
> In my reply to the initial message in this thread, I gave you the references that spell this out fairly clearly.
> And you're dead wrong about the pf.conf man page. Unless of course you are trying to look this up on a system that still runs something that is by now roughly a decade out of date.

i don't understand what you're pointing at, because "prio" and "hfsc" are different independent mechanisms, not two parts of one whole. in cbq these were two parts of the same mechanism, cbq could simultaneously slice and prioritize traffic
Re: pf queues
On 2023/12/01 15:57, 4 wrote:
> >But CBQ doesn't help anyway, you still have this same problem.
> the problem when both from below and from above can be told to you "go and fuck yourself" can't be solved, but cbq gives us two mechanisms we need- priorities and traffic restriction. nothing more can be done. but and less will not suit us

If you still don't see how priorities in CBQ can't help, there's no point me replying any more.
Re: pf queues
> On 2023-12-01, 4 wrote:
> I don't know why you are going on about SMT here.

i'm talking about not sacrificing functionality for the sake of hypothetical performance. the slides say that using queues degrades performance by 10%. and you're saying there won't be anything in the queues until an overload event occurs. as i understand it, these are interrelated things ;)

> And there is no way to tell when the upstream router has forwarded the packets.

and we don't need to know that. the only way to find out when an overload "occurred" is to set some threshold value lower than the theoretical bandwidth of the interface and look at when the actual speed on the interface exceeds this threshold. and then we will put packets in queues, but not earlier (so that our slaves don't get too tired, right?). but this has nothing to do with when overload actually happens rather than in our imagination. in most cases there is no bond between what we have assumed and what is actually happening (because there is no feedback. yes, there is ecn, but it doesn't work). i don't like this algorithm because it's a non-working algorithm.

but an algorithm with priorities, where we ALWAYS (and not only when an imaginary overload occurred) put packets in the queues, where we ALWAYS send packets with a higher priority first, and all the others only when there are no packets with a higher priority in the queue - this algorithm is working. i.e. we always use queues, despite the loss of 10% performance. what will happen on the overloaded upstream router is not our problem. our area of responsibility is to put the packets more important to us into our network card first. but this requires a constantly working (and not only when an imaginary overload has occurred) priority mechanism. that's why i say that "prio" is much more useful than "hfsc".

but it is also possible that traffic as important to us as ssh can take our entire channel, and we don't want that. and that's exactly where we need to limit the maximum queue speed. there may also be a situation where at least some number of packets should be guaranteed to go through some queue, icmp for example, and here we need hfsc, since priorities alone cannot solve this problem. or we need cbq, which could do it all at once. and for all this to work well, it is i who must plan all this competently and prudently - this is my area of responsibility. and look, i need priorities and speed limits for this, but i don't need to know how the upstream router is doing. if he has problems, he will send me confirmations of receipt less often or he will simply discard my packets. but that's his business, not mine. and in the same way my router will deal with clients on my local network.

> BTW, HFSC with bandwidth and max set to the same values should be the same thing as CBQ.

except that hfsc does not have a priority mechanism.

ps:
> But CBQ doesn't help anyway, you still have this same problem.

the problem when both from below and from above you can be told "go and fuck yourself" can't be solved, but cbq gives us the two mechanisms we need - priorities and traffic restriction. nothing more can be done, but less will not suit us
Re: pf queues
On Fri, 1 Dec 2023 04:56:40 +0300 4 wrote:
> match proto icmp set prio(6 7) queue(6-fly 7-ack)
> how is this supposed to work at all? i.e. packets are placed both in prio's queues 6/7 (in theory priorities and queues are the same thing), and in hfsc's queues 6-fly/7-ack at once?

I am not sure I understand what you don't understand here. Straight from manpage:

https://man.openbsd.org/pf.conf#set~2
If two priorities are given, TCP ACKs with no data payload and packets which have a TOS of lowdelay will be assigned to the second one.

https://man.openbsd.org/pf.conf#set~3
If two queues are given, packets which have a TOS of lowdelay and TCP ACKs with no data payload will be assigned to the second one.

ICMP is not the best example, but the syntax works. I guess the rule you quoted results in behaviour where all the ICMP packets get priority of 6 and get assigned to queue 6-fly, even though the idea was to have requests with priority of 6 assigned to queue 6-fly, and replies with priority of 7 to queue 7-ack. But then again perhaps it works the latter way, if icmp replies have TOS of lowdelay.

If this was TCP, payload would get priority of 6 and assigned to queue 6-fly, while ACKs would get priority of 7 and assigned to queue 7-ack.

Anyway, after years of usage, and a lot of frustration in the beginning, I find the current approach more flexible, because in HFSC queue and priority have to be the same, while in current pf we can set it to be exactly like HFSC, but also have different priorities within the same queue, or different queues for the same priority.

At this point I only miss the ability to see prio values somewhere in monitoring tools like systat. The only way to get the answers is to test, write the ruleset wisely, and observe systat. If someone knows of some others please let me know, I am by no means "an expert on pf queueing", just a guy who tries to tame his employer's network for quite some time now.

Regards,
-- 
Before enlightenment - chop wood, draw water.
After enlightenment - chop wood, draw water.

Marko Cupać
https://www.mimar.rs/
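The prio/queue split described above can be sketched roughly like this (queue names, bandwidths, interface and ports are placeholders, not the poster's actual ruleset; the queues must be defined on the egress interface before rules can reference them):

```
# define queues on the outbound interface (bandwidth figures are made up)
queue main on em0 bandwidth 90M max 90M
queue 6-fly parent main bandwidth 60M
queue 7-ack parent main bandwidth 20M
queue 0-rest parent main bandwidth 10M default

# interactive traffic: payload to prio 6 / queue 6-fly, empty ACKs and
# lowdelay packets to prio 7 / queue 7-ack
match out proto tcp to port ssh set prio (6, 7) set queue (6-fly, 7-ack) tag queued
# anything not explicitly matched: lowest prio, default queue
match out ! tagged queued set prio 0 set queue 0-rest
```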
Re: pf queues
On 2023-12-01, 4 wrote:
>> On 2023-11-30, 4 wrote:
>>> we can simply calculate such a basic thing as the flow rate by dividing the number of bytes in the past packets by the time. we can control the speed through delays in sending packets. this is one side of the question. as for the sequence, priorities work here. yes, we will send packets with a higher priority until there are no such packets left in a queue, and then we will send packets from queues with a lower priority. priorities are a sequence, not a share of the total piece of the pie, and we don't need to know anything about the pie.
>
>> But unless you are sending more traffic than the *interface* speed, you will be sending it out on receipt, there won't be any delays in sending packets to the next-hop modem/router.
>
>> There won't *be* any packets in the queue on the PF machine to send in priority order.
>
> ok. that is, for the sake of some 10% performance (not so long ago Theo turned off smt, and wanted to remove its support altogether. but smt it's significantly more than 10% of performance) you use queues only when the channel overload, that you are not able to reliably detect, but only assume about its occurrence? there's nothing easier! just put packets in the queue at all times :D

I don't know why you are going on about SMT here. But some workloads are demonstrably *slower* if SMT is used (the scheduler just treats them as full cores, when it would probably be better to only permit threads of the same process to share SMTs on the same core). And of course there are the known problems that became very apparent with the CPU vulnerabilities that became widely known *after* OpenBSD disabled SMT by default.

But anyway back to packets. The only constraint on transmitting packets from the OpenBSD machine is the network interface facing the next-hop router. Say that is a 1Gbps interface. Say you have 200Mbps of traffic to forward from other interfaces. And that the upstream connection can handle something between 100Mbps and 200Mbps but you don't know how much. And there is no way to tell when the upstream router has forwarded the packets.

BTW, HFSC with bandwidth and max set to the same values should be the same thing as CBQ. But CBQ doesn't help anyway, you still have this same problem.

The only thing I can think of that might possibly help is to delay all packets ("set delay") and use prio. I haven't tested to see if that actually works but maybe.

If you want real controls on the PF box you need to cap to the *minimum* bandwidth and lose anything above that. Or cap somewhere between the two picked as a trade-off between lost capacity and not always doing anything useful.
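The "cap to the minimum bandwidth" suggestion could look like this in pf.conf (a sketch for the 1Gbps-interface / 100-200Mbps-upstream scenario above; interface name, queue names and figures are placeholders):

```
# cap egress at the worst-case upstream capacity so the backlog (and thus
# prioritisation) happens on this box rather than in the upstream router
queue outq on em0 bandwidth 100M max 100M
queue pri parent outq bandwidth 40M
queue std parent outq bandwidth 60M default

# interactive traffic goes into the favoured queue
pass out on em0 proto tcp to port ssh set queue pri
```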
Re: pf queues
> On 2023-11-30, 4 wrote:
>> we can simply calculate such a basic thing as the flow rate by dividing the number of bytes in the past packets by the time. we can control the speed through delays in sending packets. this is one side of the question. as for the sequence, priorities work here. yes, we will send packets with a higher priority until there are no such packets left in a queue, and then we will send packets from queues with a lower priority. priorities are a sequence, not a share of the total piece of the pie, and we don't need to know anything about the pie.

> But unless you are sending more traffic than the *interface* speed, you will be sending it out on receipt, there won't be any delays in sending packets to the next-hop modem/router.

> There won't *be* any packets in the queue on the PF machine to send in priority order.

ok. that is, for the sake of some 10% performance (not so long ago Theo turned off smt, and wanted to remove its support altogether. but smt it's significantly more than 10% of performance) you use queues only when the channel is overloaded, which you are not able to reliably detect, but only assume about its occurrence? there's nothing easier! just put packets in the queue at all times :D
Re: pf queues
On 2023-11-30, 4 wrote:
> we can simply calculate such a basic thing as the flow rate by dividing the number of bytes in the past packets by the time. we can control the speed through delays in sending packets. this is one side of the question. as for the sequence, priorities work here. yes, we will send packets with a higher priority until there are no such packets left in a queue, and then we will send packets from queues with a lower priority. priorities are a sequence, not a share of the total piece of the pie, and we don't need to know anything about the pie.

But unless you are sending more traffic than the *interface* speed, you will be sending it out on receipt, there won't be any delays in sending packets to the next-hop modem/router.

There won't *be* any packets in the queue on the PF machine to send in priority order.
Re: pf queues
> On Wed, 29 Nov 2023 00:12:02 +0300 4 wrote:
>> i haven't used queues for a long time, but now there is a need. previously, queues had not only a hierarchy, but also a priority. now there is no priority, only the hierarchy exists.

> It took me quite some time to wrap my head around this, having been accustomed to HFSC up until 5.5. One can probably find a lot of my emails in misc@ archives from that time.

> Nowadays I am matching traffic to prio and queue by protocol and/or destination port only. Anything not explicitly matched goes to lowest prio queue and logged even when passed, so I can inspect if there are any types of traffic which should be put into appropriate prio / queue. All the ACKs except those in lowest prio queue get highest (7) priority, stuff in lowest prio have lowest prio for ACKs as well.

> # QUEUE MATCHES
> match proto icmp set prio ( 6 7 ) queue ( 6-fly 7-ack ) tag queued

match proto icmp set prio(6 7) queue(6-fly 7-ack)

how is this supposed to work at all? i.e. packets are placed both in prio's queues 6/7 (in theory priorities and queues are the same thing), and in hfsc's queues 6-fly/7-ack at once? i am surprised that this rule does not cause a syntax error. it looks interesting (i didn't know it was possible. is this definitely not a bug? :D), but still i don't understand the intent %\

i need to think about it and experiment. thank you, it was very valuable information!
Re: pf queues
> On 11/29/23 6:47 PM, Stuart Henderson wrote: >> On 2023-11-29, Daniel Ouellet wrote: >>>> yes, all this can be made without hierarchy, only with priorities (because >>>> hierarchy is priorities), but who and why decided that eight would be >>>> enough? the one who created cbq- he created it for practical tasks. but >>>> this "hateful eight" and this "flat-earth"- i don't understand what use >>>> they are, they can't even solve such a simplified task :\ >>>> so what am i missing? >>> >>> man pf.conf >>> >>> Look for set tos. Just a few lines below set prio in the man page, >>> >>> You can have more than 8 if you need/have to. >> > Only useful if devices upstream of the PF router know their available >> bandwidth and can do some QoS themselves. >> > Same can be said for CoS as well. You can only control what's going out of > your own network. After that, as soon as it reaches your ISP or whatnot, you > have no clue if they reset everything or not. > At a minimum ToS can cross routers, CoS not so much unless it is built for it. > Either way, your QoS will kick in when bandwidth is starving, so if you don't > know that, what's the point... i do not understand how qos and all its components relate to my question, since first we need a working mechanism that would be able to restrict and prioritize traffic (i.e. cbq is needed), and only then we can put something into this mechanism based on qos values. i.e. qos here, in principle, cannot be a solution to the problem. we have the separate independent mechanism "prio", which can prioritize traffic with limited opportunity (only eight queues), but does not know how to restrict it, and we have the separate independent mechanism "hfsc", which can restrict traffic, but does not know how to prioritize it (although it is claimed that it can, i do not see how to do it). what happens on a provider's hardware is beyond parentheses and generally matters no more than the weather on Mars.
so how the hell can we make cbq from hfsc? let's answer this question, because the slides claim that the answer exists
Re: pf queues
f wireguard tunnels on external interfaces etc. I once had the privilege to sit with Henning, author of the 'pf megapatch' who introduced the new queueing mechanism. I complained that the new stuff was not well documented, and asked if he could explain it better to me. He said something along the lines of "I have no idea. It works for me. All I know is in the manpage and the code is available in CVS. Try if it works for you. If it doesn't and you know what should be improved, send a patch". Upon hearing this, I was enlightened :) I hope the above will be helpful. Best regards, -- Before enlightenment - chop wood, draw water. After enlightenment - chop wood, draw water. Marko Cupać https://www.mimar.rs/
Re: pf queues
On Thu, 2023-11-30 at 15:55 +0300, 4 wrote: > "cbq can entirely be expressed in it" ok. so how do i set priorities > for queues in hfsc You stack HFSC with link-share service curves with linkshare criterion 1:0 - or in pf.conf(5) terms: "bandwidth 1" and "bandwidth 0". Or you do not configure queuing at all, as the default one supports the "prio" argument. > for my local(not for a router above that knows nothing about my > existence. Your local interface will be at 1G or something similar. There is little chance that there will be any queuing at all.
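Spelling out the "bandwidth 1"/"bandwidth 0" suggestion as a sketch (interface and queue names are assumptions): a sibling with link-share 0 is only served when its link-share-1 siblings have nothing backlogged, which behaves like a strict priority tier.

```pf.conf
queue root on em0 bandwidth 100M max 100M
queue q_hi parent root bandwidth 1           # served first while backlogged
queue q_lo parent root bandwidth 0 default   # drains only when q_hi is empty
```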
Re: pf queues
On 11/29/23 6:47 PM, Stuart Henderson wrote: On 2023-11-29, Daniel Ouellet wrote: yes, all this can be made without hierarchy, only with priorities (because hierarchy is priorities), but who and why decided that eight would be enough? the one who created cbq- he created it for practical tasks. but this "hateful eight" and this "flat-earth"- i don't understand what use they are, they can't even solve such a simplified task :\ so what am i missing? man pf.conf Look for set tos. Just a few lines below set prio in the man page, You can have more than 8 if you need/have to. Only useful if devices upstream of the PF router know their available bandwidth and can do some QoS themselves. Same can be said for CoS as well. You can only control what's going out of your own network. After that, as soon as it reaches your ISP or whatnot, you have no clue if they reset everything or not. At a minimum ToS can cross routers, CoS not so much unless it is built for it. Either way, your QoS will kick in when bandwidth is starving, so if you don't know that, what's the point...
Re: pf queues
> On Thu, Nov 30, 2023 at 02:57:23PM +0300, 4 wrote: >> so what happened to cbq? why such the powerful and useful thing was removed? >> or Theo delete it precisely because it was too good for obsd? %D > Actually, the new queueing system was done by Henning, planned as far back > as (at least) 2012 (https://quigon.bsws.de/papers/2012/bsdcan/), finally > available to the general public in OpenBSD 5.5 two years later. > ALTQ support was removed from OpenBSD in time for the OpenBSD 5.6 release > (November 2014). > So, it's been a while and whatever you were running most certainly needed > an upgrade anyway. "cbq can entirely be expressed in it" ok. so how do i set priorities for queues in hfsc for my local(not for a router above that knows nothing about my existence. tos is an absolutely unviable concept in the real world) pf-router? i don't see a word about it in man pf.conf
Re: pf queues
On Thu, Nov 30, 2023 at 03:55:49PM +0300, 4 wrote: > > "cbq can entirely be expressed in it" ok. so how do i set priorities for > queues in hfsc for my local(not for a router above that knows nothing about > my existence. tos is an absolutely unviable concept in the real world) > pf-router? i don't see a word about it in man pf.conf > In my reply to the initial message in this thread, I gave you the references that spell this out fairly clearly. And you're dead wrong about the pf.conf man page. Unless of course you are trying to look this up on a system that still runs something that is by now roughly a decade out of date. -- Peter N. M. Hansteen, member of the first RFC 1149 implementation team https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/ "Remember to set the evil bit on all malicious network traffic" delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.
Re: pf queues
> On 2023-11-29, 4 wrote: >> here is a simple task, there are millions of such tasks. there is an >> internet connection, and although it is declared as symmetrical 100mbit >> it's 100 for download, but for upload it depends on the time of day, so >> we can forget about the channel width and focus on the only thing that >> matters- priorities. > But wait. If you don't know how much bandwidth is available, everything > else goes out the window. > If you don't know how much bw is available in total, you can't decide > how much to allocate to each connection, so even the basic bandwidth > control can't really work, let alone prioritising access to the > available capacity. > Priorities work when you are trying to transmit more out of an interface > than the bandwidth available on that interface. > Say you have a box running PF with a 1Gb interface to a > (router/modem/whatever) with an uplink of somewhere between 100-200Mb. > If you use only priorities in PF, in that case they can only take effect when you have >>1Gb of traffic to send out. > If you queue with a max bw 200Mb, but only 100Mb is available on the > line at times, during those times all that happens is you defer any > queueing decisions to the (router/modem). > The only way to get bandwidth control out of PF in that case is to > either limit to _less than the guaranteed minimum_ (say 100Mb in that > example), losing capacity at other times. Or if you have some way to > fetch the real line speed at various times and adjust the queue speed > in the ruleset. >> --| >> --normal >> ---| >> ---low >> but hierarchy is not enough, we need priorities, since each of these three >> queues can contain other queues. for example, the "high" queue may contain, >> in addition to the "normal" queue, "icmp" and "ssh" queues, which are more >> important than the "normal" queue, in which, for example, we will have http, >> ftp and other non-critical traffic. 
therefore, we assign priority 0 to the >> "normal" queue, priority 1 to the "ssh" queue and limit its maximum >> bandwidth to 10mb(so that ssh does not eat up the entire channel when >> copying files), and assign priority 2 to the "icmp" queue(icmp is more >> important than ssh). i.e. icmp packets will leave first, then ssh packets, >> and then packets from the "normal" queue and its subqueues(or they won't >> leave if we don't restrict ssh and it eats up the entire channel) > if PF doesn't know the real bandwidth, it _can't_ defer sending lower- > priority traffic until after higher-prio has been sent, because it doesn't > know if it won't make it over the line... you're saying that it's impossible to manage traffic if we don't know the real bandwidth of the channel (in 99% of cases we don't know it, because it changes over time. tariffs with guaranteed speed are rare even in russia, and here things are much better with the availability and quality of the inet than the world average (speedtest has the statistics)), but in the end you describe the way to do it. are you kidding me? :D we can simply calculate such a basic thing as the flow rate by dividing the number of bytes in the past packets by the time. we can control the speed through delays in sending packets. this is one side of the question. as for the sequence, priorities work here. yes, we will send packets with a higher priority until there are no such packets left in a queue, and then we will send packets from queues with a lower priority. priorities are a sequence, not a share of the total piece of the pie, and we don't need to know anything about the pie. as for the minimum guaranteed bandwidth, if it is set, then just send packets as they appear, assuming that they have the highest priority. send while the rate of such packets does not exceed the guaranteed minimum; all packets above that should be sent based on the given priorities.
this is not socialism, where everyone will be fed, this is capitalism, where you will starve and die if you do not belong to the priority elite %D (yes, yes, i know that socialism and capitalism are not about that, but in practice these are their distinctive features). but this is how it should be in the matter of packets traffic. so, where am i wrong and why do we need to know the current bandwidth of the channel?
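The "measure bytes over time, then delay sends" idea from the message can be sketched in a few lines of Python. This illustrates the arithmetic only, not how PF's queueing actually works; the class name and interface are invented for the example:

```python
import time

class Pacer:
    """Naive software pacer: keep the average flow rate, measured as
    total bytes divided by elapsed time, at or under a configured cap
    by delaying each send until the average would be legal."""

    def __init__(self, rate_bps, clock=time.monotonic, sleep=time.sleep):
        self.rate_bytes = rate_bps / 8.0   # cap in bytes per second
        self.clock = clock                 # injectable for testing
        self.sleep = sleep
        self.start = None                  # time of the first send
        self.sent = 0                      # bytes sent so far

    def send(self, nbytes):
        now = self.clock()
        if self.start is None:
            self.start = now
        # earliest time at which (sent + nbytes) / elapsed <= rate
        ready = self.start + (self.sent + nbytes) / self.rate_bytes
        if ready > now:
            self.sleep(ready - now)
        self.sent += nbytes

# with a fake clock, sending 5 x 1000 bytes at an 8000 bit/s cap
# should take 5 simulated seconds
t = [0.0]
p = Pacer(8000, clock=lambda: t[0], sleep=lambda d: t.__setitem__(0, t[0] + d))
for _ in range(5):
    p.send(1000)
print(t[0])
```

The injectable clock and sleep make the pacing logic testable without real delays; a real sender would keep the defaults.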
Re: pf queues
On Thu, Nov 30, 2023 at 02:57:23PM +0300, 4 wrote: > so what happened to cbq? why such the powerful and useful thing was removed? > or Theo delete it precisely because it was too good for obsd? %D Actually, the new queueing system was done by Henning, planned as far back as (at least) 2012 (https://quigon.bsws.de/papers/2012/bsdcan/), finally available to the general public in OpenBSD 5.5 two years later. ALTQ support was removed from OpenBSD in time for the OpenBSD 5.6 release (November 2014). So, it's been a while and whatever you were running most certainly needed an upgrade anyway. -- Peter N. M. Hansteen
Re: pf queues
so what happened to cbq? why was such a powerful and useful thing removed? or did Theo delete it precisely because it was too good for obsd? %D
Re: pf queues
On 2023-11-29, Daniel Ouellet wrote: >> yes, all this can be make without hierarchy, only with priorities(because >> hierarchy it's priorities), but who and why decided that eight would be >> enough? the one who created cbq- he created it for practical tasks. but this >> "hateful eight" and this "flat-earth"- i don't understand what use they are, >> they can't even solve such the simplified task :\ >> so what am i missing? > > man pf.conf > > Look for set tos. Just a few lines below set prio in the man age, > > You can have more then 8 if you need/have to. Only useful if devices upstream of the PF router know their available bandwidth and can do some QoS themselves.
Re: pf queues
On 2023-11-29, 4 wrote: > here is a simple task, there are millions of such tasks. there is an > internet connection, and although it is declared as symmetrical 100mbit > it's 100 for download, but for upload it depends on the time of day, so > we can forget about the channel width and focus on the only thing that > matters- priorities. But wait. If you don't know how much bandwidth is available, everything else goes out the window. If you don't know how much bw is available in total, you can't decide how much to allocate to each connection, so even the basic bandwidth control can't really work, let alone prioritising access to the available capacity. Priorities work when you are trying to transmit more out of an interface than the bandwidth available on that interface. Say you have a box running PF with a 1Gb interface to a (router/modem/whatever) with an uplink of somewhere between 100-200Mb. If you use only priorities in PF, in that case they can only take effect when you have >1Gb of traffic to send out. If you queue with a max bw 200Mb, but only 100Mb is available on the line at times, during those times all that happens is you defer any queueing decisions to the (router/modem). The only way to get bandwidth control out of PF in that case is to either limit to _less than the guaranteed minimum_ (say 100Mb in that example), losing capacity at other times. Or if you have some way to fetch the real line speed at various times and adjust the queue speed in the ruleset. > --| > --normal > ---| > ---low > but hierarchy is not enough, we need priorities, since each of these three > queues can contain other queues. for example, the "high" queue may contain, > in addition to the "normal" queue, "icmp" and "ssh" queues, which are more > important than the "normal" queue, in which, for example, we will have http, > ftp and other non-critical traffic. 
therefore, we assign priority 0 to the > "normal" queue, priority 1 to the "ssh" queue and limit its maximum bandwidth > to 10mb (so that ssh does not eat up the entire channel when copying files), > and assign priority 2 to the "icmp" queue (icmp is more important than ssh). > i.e. icmp packets will leave first, then ssh packets, and then packets from > the "normal" queue and its subqueues (or they won't leave if we don't restrict > ssh and it eats up the entire channel) if PF doesn't know the real bandwidth, it _can't_ defer sending lower-priority traffic until after higher-prio has been sent, because it doesn't know if it won't make it over the line...
Re: pf queues
yes, all this can be made without hierarchy, only with priorities (because hierarchy is priorities), but who and why decided that eight would be enough? the one who created cbq- he created it for practical tasks. but this "hateful eight" and this "flat-earth"- i don't understand what use they are, they can't even solve such a simplified task :\ so what am i missing? man pf.conf Look for set tos. Just a few lines below set prio in the man page, You can have more than 8 if you need/have to.
Re: pf queues
> On Wed, Nov 29, 2023 at 12:12:02AM +0300, 4 wrote: >> i haven't used queues for a long time, but now there is a need. previously, >> queues had not only a hierarchy, but also a priority. now there is no >> priority, only the hierarchy exists. i was surprised, but i thought that >> this is quite in the way of Theo, and it is possible to simplify the queue >> mechanism only to the hierarchy, meaning that if a queue standing higher in >> the hierarchy, and he priority is higher. but in order for it to work this >> way, it is necessary to allow assigning packets to any queue, and not just >> to the last one, because when you assign only to the last queue in the >> hierarchy, then in practice it means that you have no hierarchy and no >> queues. and although the rule with the assignment to a queue above the last >> one is not syntactically incorrect, but in practice the assignment is not >> performed, and the packets fall into the default(last) queue. am i missing >> something or is it really idiocy that humanity has not seen yet? >> > How long ago is it that you did anything with queues? > the older ALTQ system was replaced by a whole new system back in OpenBSD 5.5 > (or actually, altq lived on as oldqeueue through 5.6), and the syntax is both > very different and in most things much simpler to deal with. > The most extensive treatment available is in The Book of PF, 3rd edition > (actually the introduction of the new queues was the reason for doing that > revision). 
If for some reason the book is out of reach, you can likely > glean most of the useful information from the relevant slides in the > PF tutorial https://home.nuug.no/~peter/pftutorial/ with the traffic > shaping part starting at https://home.nuug.no/~peter/pftutorial/#68 looks like i'm phenomenally dumb :(

queue rootq on $ext_if bandwidth 20M
queue main parent rootq bandwidth 20479K min 1M max 20479K qlimit 100
queue qdef parent main bandwidth 9600K min 6000K max 18M default
queue qweb parent main bandwidth 9600K min 6000K max 18M
queue qpri parent main bandwidth 700K min 100K max 1200K
queue qdns parent main bandwidth 200K min 12K burst 600K for 3000ms
queue spamd parent rootq bandwidth 1K min 0K max 1K qlimit 300

-- this is a flat model. no hierarchy here, because no priorities. it looks as if hierarchy exists, but this is "fake news" :\ i can't immediately come up with at least one task where such a thing would be needed.. probably no such task exists.

pass proto tcp to port ssh set prio 6

-- hard-coded eight queues/priorities and no bandwidth controls. but this case is at least useful, because priorities are much more important than bandwidth limits. i have a feeling that the person who came up with this is the Mad Hatter from Wonderland :\ what was wrong with the cbq engine where all was in one? here is a simple task, there are millions of such tasks. there is an internet connection, and although it is declared as symmetrical 100mbit it's 100 for download, but for upload it depends on the time of day, so we can forget about the channel width and focus on the only thing that matters- priorities. we make three queues, hierarchically connect them to one another:

root
-|
-high
--|
--normal
---|
---low

but hierarchy is not enough, we need priorities, since each of these three queues can contain other queues.
for example, the "high" queue may contain, in addition to the "normal" queue, "icmp" and "ssh" queues, which are more important than the "normal" queue, in which, for example, we will have http, ftp and other non-critical traffic. therefore, we assign priority 0 to the "normal" queue, priority 1 to the "ssh" queue and limit its maximum bandwidth to 10mb (so that ssh does not eat up the entire channel when copying files), and assign priority 2 to the "icmp" queue (icmp is more important than ssh). i.e. icmp packets will leave first, then ssh packets, and then packets from the "normal" queue and its subqueues (or they won't leave if we don't restrict ssh and it eats up the entire channel). now:

root
-|
-high[normal(0),ssh(1),icmp(2)]
--|
--normal[low(0),default(1),http(2),ftp(2)]
---|
---low[bittorrent(0),putin(0),vodka(0)]

yes, all this can be made without hierarchy, only with priorities (because hierarchy is priorities), but who and why decided that eight would be enough? the one who created cbq- he created it for practical tasks. but this "hateful eight" and this "flat-earth"- i don't understand what use they are, they can't even solve such a simplified task :\ so what am i missing?
Re: pf queues
On Wed, Nov 29, 2023 at 12:12:02AM +0300, 4 wrote: > i haven't used queues for a long time, but now there is a need. previously, > queues had not only a hierarchy, but also a priority. now there is no > priority, only the hierarchy exists. i was surprised, but i thought that this > is quite in the way of Theo, and it is possible to simplify the queue > mechanism only to the hierarchy, meaning that if a queue stands higher in > the hierarchy, its priority is higher. but in order for it to work this > way, it is necessary to allow assigning packets to any queue, and not just to > the last one, because when you assign only to the last queue in the > hierarchy, then in practice it means that you have no hierarchy and no > queues. and although the rule with the assignment to a queue above the last > one is not syntactically incorrect, in practice the assignment is not > performed, and the packets fall into the default (last) queue. am i missing > something or is it really idiocy that humanity has not seen yet? > How long ago is it that you did anything with queues? the older ALTQ system was replaced by a whole new system back in OpenBSD 5.5 (or actually, altq lived on as oldqueue through 5.6), and the syntax is both very different and in most things much simpler to deal with. The most extensive treatment available is in The Book of PF, 3rd edition (actually the introduction of the new queues was the reason for doing that revision). If for some reason the book is out of reach, you can likely glean most of the useful information from the relevant slides in the PF tutorial https://home.nuug.no/~peter/pftutorial/ with the traffic shaping part starting at https://home.nuug.no/~peter/pftutorial/#68 -- Peter N. M.
Hansteen
pf queues
i haven't used queues for a long time, but now there is a need. previously, queues had not only a hierarchy, but also a priority. now there is no priority, only the hierarchy exists. i was surprised, but i thought that this is quite in the way of Theo, and it is possible to simplify the queue mechanism only to the hierarchy, meaning that if a queue stands higher in the hierarchy, its priority is higher. but in order for it to work this way, it is necessary to allow assigning packets to any queue, and not just to the last one, because when you assign only to the last queue in the hierarchy, then in practice it means that you have no hierarchy and no queues. and although the rule with the assignment to a queue above the last one is not syntactically incorrect, in practice the assignment is not performed, and the packets fall into the default (last) queue. am i missing something or is it really idiocy that humanity has not seen yet?
Re: PF Rules for Dual Upstream Gateways
On 2023-11-22, Ian Timothy wrote: > Hello, > > I have two ISPs where one connection is primary and the other is > low-bandwidth for temporary failover only. ifstated handles the failover by > simply changing the default gateway. But under normal conditions I want to be > able to connect via either connection at any time without changing the > default gateway. > > A long time ago under the old pf syntax I had this in /etc/pf.conf which > worked fine, and as far as I can remember was the only thing needed to enable > this desired behavior: > > pass in on $wan1_if reply-to ( $wan1_if $wan1_gw ) > pass in on $wan2_if reply-to ( $wan2_if $wan2_gw ) > > But I’ve not been able to find the right way to do this under the new pf > syntax. From what I’ve been able to find this is supposedly does the same > thing, but no success so far: > > pass in on $wan1_if reply-to ($wan1_if:peer) > pass in on $wan2_if reply-to ($wan2_if:peer) The :peer syntax is for point-to-point interfaces (e.g. pppoe, maybe umb). > What am I missing? Or this there a better way to do this? As long as the gateway is at a known address (not a changing address from DHCP) this should do: pass in on $wan1_if reply-to $wan1_gw pass in on $wan2_if reply-to $wan2_gw You can also have a setup with multiple rtables, but in the simple case, reply-to is often easier. -- Please keep replies on the mailing list.
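Stuart's suggested rules as a complete, self-contained sketch; the interface names and gateway addresses are placeholders (RFC 5737 documentation addresses), not anything from this thread:

```pf.conf
wan1_if = "em0"
wan1_gw = "192.0.2.1"
wan2_if = "em1"
wan2_gw = "198.51.100.1"

# replies to inbound connections leave via the interface they arrived on,
# regardless of which gateway is currently the default route
pass in on $wan1_if reply-to $wan1_gw
pass in on $wan2_if reply-to $wan2_gw
```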
PF Rules for Dual Upstream Gateways
Hello, I have two ISPs where one connection is primary and the other is low-bandwidth for temporary failover only. ifstated handles the failover by simply changing the default gateway. But under normal conditions I want to be able to connect via either connection at any time without changing the default gateway. A long time ago under the old pf syntax I had this in /etc/pf.conf which worked fine, and as far as I can remember was the only thing needed to enable this desired behavior: pass in on $wan1_if reply-to ( $wan1_if $wan1_gw ) pass in on $wan2_if reply-to ( $wan2_if $wan2_gw ) But I’ve not been able to find the right way to do this under the new pf syntax. From what I’ve been able to find this supposedly does the same thing, but no success so far: pass in on $wan1_if reply-to ($wan1_if:peer) pass in on $wan2_if reply-to ($wan2_if:peer) What am I missing? Or is there a better way to do this?
Re: pf logging in ascii and send to remote syslog
Thnx, this seems to work better..
Re: pf logging in ascii and send to remote syslog
On Sat, Nov 11, 2023 at 06:32:26PM +0100, Daniele B. wrote: > > "Peter N. M. Hansteen" wrote: > > > something like the good old > > https://home.nuug.no/~peter/pf/newest/log2syslog.html should still > > work, I think. > > > > - Peter > > > To disable pflogd completely what to you consider best: > > ifconfig pflog0 down > > or > > pflogd_flags="-f /dev/null" > > > = Daniele Bonini > rcctl disable pflogd ? --
Re: pf logging in ascii and send to remote syslog
"Peter N. M. Hansteen" wrote: > something like the good old > https://home.nuug.no/~peter/pf/newest/log2syslog.html should still > work, I think. > > - Peter To disable pflogd completely what to you consider best: ifconfig pflog0 down or pflogd_flags="-f /dev/null" = Daniele Bonini
Re: pf logging in ascii and send to remote syslog
On 11.11.2023. 12:13, Stuart Henderson wrote: > On 2023-11-11, Peter N. M. Hansteen wrote: >> On Fri, Nov 10, 2023 at 08:23:54PM +0100, Hrvoje Popovski wrote: >>> what would be best way to log pf logs in ascii and sent it to remote >>> syslog ? I'm aware of pflow but I need ascii pf logs on remote syslog >>> server. >> >> something like the good old >> https://home.nuug.no/~peter/pf/newest/log2syslog.html >> should still work, I think. > > Or > https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/www/faq/pf/logging.html?rev=1.68#syslog > > If you don't need _all_ pf logs converting to syslog, you can create a > separate interface "echo up | doas tee /etc/hostname.pflog1" and use > "log to pflog1" on selected rules. > Thank you Peter and Stuart that's exactly what I need ...
Re: pf logging in ascii and send to remote syslog
On 2023-11-11, Peter N. M. Hansteen wrote: > On Fri, Nov 10, 2023 at 08:23:54PM +0100, Hrvoje Popovski wrote: >> what would be best way to log pf logs in ascii and sent it to remote >> syslog ? I'm aware of pflow but I need ascii pf logs on remote syslog >> server. > > something like the good old > https://home.nuug.no/~peter/pf/newest/log2syslog.html > should still work, I think. Or https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/www/faq/pf/logging.html?rev=1.68#syslog If you don't need _all_ pf logs converting to syslog, you can create a separate interface "echo up | doas tee /etc/hostname.pflog1" and use "log to pflog1" on selected rules.
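The separate-interface variant Stuart mentions boils down to two pieces: a hostname.if(5) file so pflog1 exists at boot, and "log to pflog1" on the rules of interest. A sketch (the ssh rule is just an illustrative example):

```sh
# one-time setup: create a second pflog interface, now and at boot
echo up | doas tee /etc/hostname.pflog1
doas sh /etc/netstart pflog1
```

Then in pf.conf, send only selected rules to the new interface, e.g. "pass in proto tcp to port ssh log to pflog1", and point the syslog-conversion pipeline at pflog1 instead of pflog0.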
Re: pf logging in ascii and send to remote syslog
On Fri, Nov 10, 2023 at 08:23:54PM +0100, Hrvoje Popovski wrote: > what would be best way to log pf logs in ascii and send it to remote > syslog ? I'm aware of pflow but I need ascii pf logs on remote syslog > server. something like the good old https://home.nuug.no/~peter/pf/newest/log2syslog.html should still work, I think. - Peter
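From memory, the log2syslog approach is a pipeline that decodes pflog0 as text and hands each line to syslog; the flags and facility below are my recollection of the usual recipe, so check Peter's page before relying on them:

```sh
# decode pf's packet log as text, line-buffered, and feed it to syslog
doas tcpdump -l -n -e -ttt -i pflog0 | doas logger -t pf -p local0.info
```

With a matching /etc/syslog.conf line such as "local0.info @loghost", the decoded entries end up on the remote collector.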
pf logging in ascii and send to remote syslog
Hi all, what would be the best way to log pf logs in ascii and send them to a remote syslog server? I'm aware of pflow but I need ascii pf logs on the remote syslog server. I remember that it was on https://www.openbsd.org/faq/pf/logging.html and that that section was removed. The old version is on https://www.dragonflybsd.org/~aggelos/pf/logging.html Is there maybe a better way to do ascii logging besides what is in the old version of logging.html? Thank you.
Re: The Book of PF: Physical copies to be available again soon
On Sat, Nov 04, 2023 at 10:52:01AM -0400, Jay Hart wrote: > > Peter, > > Any plans to update it? Questions of the type "Are you working on a new edition of your book about ?" or the more general "Are you working on a book about ?" or even "When is your next book coming out?" are never going to be answered truthfully, or at all, by any writer or publisher unless a definite publication date has been set and they are confident that all the myriad factors that determine the outcome of the project are firmly under control. If the real question is, "Would it be safe for me to start writing a PF book?" My answer is no. There is no guarantee that the effort you put in will give satisfactory-to-you returns in any form or fashion. Writing is a time sink and publishers may or may not be interested. On the other hand if you are asking, "Should I start writing a book on PF or a related subject?", my take is, please do, if you feel that it is a thing worth doing. But again, keep in mind that writing a book and getting it published will eat up several significantly more than bite-sized chunks of your time, but if you feel that your book needs to be written, please go ahead. The reason The Book of PF exists is that I had a general idea of what kind of PF book I would like to see existing, and a work in progress manuscript existed that I showed to anyone interested. Fortunately enough people relevant to getting the book actually published (and revised twice so far) agreed that this book needed to happen. When I get to the point that a new edition of The Book of PF or any other book relevant to OpenBSD that I am able to write is certain to be published at a specific time, this mailing list will be one of the first public forums that will receive notification. That much I will promise. All the best, Peter -- Peter N. M. 
Hansteen
Re: The Book of PF: Physical copies to be available again soon
Peter, Any plans to update it? R/, Jay > For those interested in physical copies of The Book of PF > (https://nostarch.com/pf3) > -- it has been out of print, only available in electronic formats for a while > -- > I just got word from No Starch Press (the publisher) that they are expecting > to > have a fresh batch arriving at their warehouse within the next few weeks. > > I will share any details with those interested when I have them. > > All the best, > Peter > > -- > Peter N. M. Hansteen, member of the first RFC 1149 implementation team > https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/ > "Remember to set the evil bit on all malicious network traffic" > delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds. > >
Re: Parallel PF
Hello Valdrin, I am also aware that attaching PF to more than one CPU will not be enough, and I think I have been misunderstood; I do not reproach about this. Just a curiosity on my part. As far as I learned from users who wrote me private messages, OpenBSD does not have a public RoadMap. Of course, I respect this. I was just asking if maybe a developer would come along and at least satisfy the curiosity of us users, even if he didn't give a date. I apologize again for being misunderstood. I never meant to be rude. As a user, I still don't think wishing multiple PF tasks is a bad thing. On the contrary, I think user experience will be given importance. On the other hand, I will upgrade my OpenBSD 7.3 firewall running in a VMware environment to 7.4. By the way, I will use ix(4) instead of vmx, thanks to SR-IOV technology, and I am sure I will get a performance increase. I would like to thank the OpenBSD team for developing this beautiful operating system. On Wed, Oct 25, 2023 at 5:18 PM Valdrin MUJA wrote: > Hello Sam, > > I don't have the answer to this question, but I can make a few comments on > my own behalf. Maybe it can give you an idea. > As far as I observed, it is not PF's turn yet. I guess what needs to be > done regarding cloned interfaces such as tun and the ethernet layer will be > done first. In fact, as far as I follow, there are some issues in the > UDP_input section. > Of course, I'm sure a lot will change when PF becomes mp-safe, but I > believe there is still time for that. > PF's performance can reach up to 10Gbps with the right CPU selection. Do > you have traffic that exceeds this? Maybe if you can provide specific > information there will be a chance for someone to help. > -- > *From:* owner-m...@openbsd.org on behalf of > Samuel Jayden > *Sent:* Tuesday, October 24, 2023 17:54 > *To:* Irreverent Monk > *Cc:* misc@openbsd.org > *Subject:* Re: Parallel PF > > I shared a naive user experience. I didn't mean to be rude. 
Anyway, thank > you for reading and responding. > > On Tue, Oct 24, 2023 at 5:46 PM Irreverent Monk > wrote: > > > The standard response is - show your code. If you sit down and think > > about it, isn't it rude to go to a project to tell them that they must > > prioritize what they are doing for what you want...? > > > > On Tue, Oct 24, 2023 at 6:40 AM Samuel Jayden < > samueljaydan1...@gmail.com> > > wrote: > > > >> Hello dear OpenBSD team, > >> > >> I'm sure that something like parallel IP forwarding and increasing > the > >> number of softnet kernel tasks to 4 is definitely being considered on > the > >> PF side too, but I would like to express my concern about timing. Do you > >> have any schedule for this? > >> > >> I think one of the common prayers of all OpenBSD users is that PF will > >> speed up. Thank you for reading and my best regards. > >> > >> -- > >> Sam > >> > > >
Re: Parallel PF
Hello Gábor, Of course, I am aware of OpenBSD's parallel forwarding implementation. The owner of this thread already mentioned this in his e-mail. I can reach 10Gbps via speedtest.net; here my gateway is a server with OpenBSD 7.3 installed. I also get similar values with Cisco TRex. I can say that OpenBSD is more successful with 1518-byte TCP packets than with 64-byte UDP packets. From: owner-m...@openbsd.org on behalf of Gábor LENCSE Sent: Wednesday, October 25, 2023 18:47 To: misc@openbsd.org Subject: Re: Parallel PF Hello Valdrin, On 10/25/2023 4:18 PM, Valdrin MUJA wrote: > Hello Sam, > > I don't have the answer to this question, but I can make a few comments on my > own behalf. Maybe it can give you an idea. > As far as I observed, it is not PF's turn yet. I guess what needs to be done > regarding cloned interfaces such as tun and the ethernet layer will be done > first. In fact, as far as I follow, there are some issues in the UDP_input > section. I have been somewhat surprised at this information. OpenBSD can use up to 4 softnet tasks for parallel IP packet forwarding since version 7.2. Please see "SMP Improvements" on the page: https://www.openbsd.org/72.html > Of course, I'm sure a lot will change when PF becomes mp-safe, but I believe > there is still time for that. > PF's performance can reach up to 10Gbps with the right CPU selection. Expressing traffic in Gbps can be rather ambiguous. What frame size did you use? 64-byte or 1518-byte? The first one needs 14,880,952 pps to saturate a 10Gbps link, whereas the second one can do it with 812,743 pps. Please refer to: https://datatracker.ietf.org/doc/html/rfc5180#appendix-A.1 Best regards, Gábor
Re: Parallel PF
Hello Valdrin, On 10/25/2023 4:18 PM, Valdrin MUJA wrote: Hello Sam, I don't have the answer to this question, but I can make a few comments on my own behalf. Maybe it can give you an idea. As far as I observed, it is not PF's turn yet. I guess what needs to be done regarding cloned interfaces such as tun and the ethernet layer will be done first. In fact, as far as I follow, there are some issues in the UDP_input section. I have been somewhat surprised at this information. OpenBSD can use up to 4 softnet tasks for parallel IP packet forwarding since version 7.2. Please see "SMP Improvements" on the page: https://www.openbsd.org/72.html Of course, I'm sure a lot will change when PF becomes mp-safe, but I believe there is still time for that. PF's performance can reach up to 10Gbps with the right CPU selection. Expressing traffic in Gbps can be rather ambiguous. What frame size did you use? 64-byte or 1518-byte? The first one needs 14,880,952 pps to saturate a 10Gbps link, whereas the second one can do it with 812,743 pps. Please refer to: https://datatracker.ietf.org/doc/html/rfc5180#appendix-A.1 Best regards, Gábor
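[Editor's note: the pps figures above follow directly from Ethernet framing overhead as used in RFC 5180 — each frame on the wire is accompanied by a 7-byte preamble, a 1-byte start-of-frame delimiter, and a 12-byte inter-frame gap (20 bytes total) besides the frame itself. A quick sketch to reproduce them:]

```python
# Packets per second needed to saturate a link at a given frame size.
# Every Ethernet frame costs 20 extra bytes on the wire:
# 7-byte preamble + 1-byte SFD + 12-byte inter-frame gap.
WIRE_OVERHEAD = 20  # bytes

def saturation_pps(link_bps: int, frame_bytes: int) -> int:
    """Frames per second that fill link_bps with frame_bytes-sized frames."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD) * 8
    return link_bps // bits_per_frame

print(saturation_pps(10_000_000_000, 64))    # 14880952 (64-byte frames)
print(saturation_pps(10_000_000_000, 1518))  # 812743 (1518-byte frames)
```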
Re: Parallel PF
Hello Sam, I don't have the answer to this question, but I can make a few comments on my own behalf. Maybe it can give you an idea. As far as I observed, it is not PF's turn yet. I guess what needs to be done regarding cloned interfaces such as tun and the ethernet layer will be done first. In fact, as far as I follow, there are some issues in the UDP_input section. Of course, I'm sure a lot will change when PF becomes mp-safe, but I believe there is still time for that. PF's performance can reach up to 10Gbps with the right CPU selection. Do you have traffic that exceeds this? Maybe if you can provide specific information there will be a chance for someone to help. From: owner-m...@openbsd.org on behalf of Samuel Jayden Sent: Tuesday, October 24, 2023 17:54 To: Irreverent Monk Cc: misc@openbsd.org Subject: Re: Parallel PF I shared a naive user experience. I didn't mean to be rude. Anyway, thank you for reading and responding. On Tue, Oct 24, 2023 at 5:46 PM Irreverent Monk wrote: > The standard response is - show your code. If you sit down and think > about it, isn't it rude to go to a project to tell them that they must > prioritize what they are doing for what you want...? > > On Tue, Oct 24, 2023 at 6:40 AM Samuel Jayden > wrote: > >> Hello dear OpenBSD team, >> >> I'm sure that something like parallel IP forwarding and increasing the >> number of softnet kernel tasks to 4 is definitely being considered on the >> PF side too, but I would like to express my concern about timing. Do you >> have any schedule for this? >> >> I think one of the common prayers of all OpenBSD users is that PF will >> speed up. Thank you for reading and my best regards. >> >> -- >> Sam >> >
Re: Parallel PF
I shared a naive user experience. I didn't mean to be rude. Anyway, thank you for reading and responding. On Tue, Oct 24, 2023 at 5:46 PM Irreverent Monk wrote: > The standard response is - show your code. If you sit down and think > about it, isn't it rude to go to a project to tell them that they must > prioritize what they are doing for what you want...? > > On Tue, Oct 24, 2023 at 6:40 AM Samuel Jayden > wrote: > >> Hello dear OpenBSD team, >> >> I'm sure that something like parallel IP forwarding and increasing the >> number of softnet kernel tasks to 4 is definitely being considered on the >> PF side too, but I would like to express my concern about timing. Do you >> have any schedule for this? >> >> I think one of the common prayers of all OpenBSD users is that PF will >> speed up. Thank you for reading and my best regards. >> >> -- >> Sam >> >
Re: Parallel PF
The standard response is - show your code. If you sit down and think about it, isn't it rude to go to a project to tell them that they must prioritize what they are doing for what you want...? On Tue, Oct 24, 2023 at 6:40 AM Samuel Jayden wrote: > Hello dear OpenBSD team, > > I'm sure that something like parallel IP forwarding and increasing the > number of softnet kernel tasks to 4 is definitely being considered on the > PF side too, but I would like to express my concern about timing. Do you > have any schedule for this? > > I think one of the common prayers of all OpenBSD users is that PF will > speed up. Thank you for reading and my best regards. > > -- > Sam >
Parallel PF
Hello dear OpenBSD team, I'm sure that something like parallel IP forwarding and increasing the number of softnet kernel tasks to 4 is definitely being considered on the PF side too, but I would like to express my concern about timing. Do you have any schedule for this? I think one of the common prayers of all OpenBSD users is that PF will speed up. Thank you for reading and my best regards. -- Sam
Feature request for pf: allow embedding IPv4 into source address of af-to IPv6 packets (like SIIT/EAMT/NAT46)
Congratulations on a successful 7.4 release! I'm writing with a gentle feature request for pf; I asked about this functionality a long time ago and have seen a few other related questions on the list since then. Now that I've played with another NAT64 implementation (Jool), I think I can articulate myself a little better. Summary: request to modify pf to support the following syntax to embed IPv4 address in IPv6 source addresses after af-to translation: pass in inet af-to inet6 from /96 to This would require modifying the "from" portion of the af-to to look for a mask of /96 (or smaller) on the from address and take the actions below. In the line above, everything you see is currently supported except for the "/96" on the "from" address. If the mask is greater or omitted (e.g. /128) then the current behavior is used, making it backwards-compatible when omitted. If a /96 on the "from" is detected, the packet is translated between families as currently, with one modification: the packet source address will have the IPv4 source address embedded in the lower 32 bits of the IPv6 source address. Example: In practice, this syntax would also typically include a matching portion for the original IPv4 destination address so traffic destined for a particular IPv4 destination can be forwarded to a specific IPv6 node. Thus, a typical invocation would be like: pass in inet to 192.0.2.10 af-to inet6 from 64:ff9b::/96 to 2001:db8:b::10 The line above would result in a packet with IPv4 source 203.0.113.42 and destined for 192.0.2.10 being translated to have IPv6 source 64:ff9b::cb00:712a and destination 2001:db8:b::10. Discussion: This would allow PF to support SIIT EAMT / SIIT-DC behavior (sometimes called "NAT46"). The use case here would be a dual-stack pf box at the network edge, and servers behind it with IPv6-only connectivity. IPv4 requests from the internet would hit the pf box, be translated and forwarded over IPv6 to the servers. 
Because the IPv6 address contains the embedded IPv4 source address, logging and analysis on the server would have access to the full source address, rather than all traffic being "squashed" to a single IPv6 source. It would be the responsibility of the network operator to ensure return traffic (to 64:ff9b::/96 in the above example) is routed back to the pf box. I feel pf is the best place to add this functionality (rather than relayd or other code) because it is already capable of performing the family translation, and only needs to have the address-embedding functionality added for source addresses (it already exists for destination addresses). Thus, the necessary concepts are already in place elsewhere in the code; they need to be replicated for source addresses. Specifically, pf can already map an IPv4 *destination* address into the lower 32 bits of an IPv6 address using af-to: pass in inet af-to inet6 from 2001:db8::1 to 2001:db8::/96 This supports CLAT-type functionality where IPv4 traffic needs to be sent to a PLAT (typically network edge). Additionally, pf can currently translate IPv4 traffic from any host to a specific destination: pass in inet to 192.0.2.10 af-to inet6 from 2001:db8:a::1 to 2001:db8:b::10 However, this relies on pf's state table much like traditional NAT44: all traffic is arbitrarily mapped to a new source address and the destination server sees only the pf box's address as the source of the traffic. This request is to enable IPv4 /96 embedding on the *source* address; nodes that come after the translation will be able to see the full IPv4 source address embedded in the IPv6 address. Because the entire IPv4 address space can be embedded in a single IPv6 /96 prefix, no information is lost and so the translation does not require state (the return traffic can be turned back into IPv4 by simply un-embedding the IPv4 address). 
However, I recognize that pf may only operate in a stateful manner due to the way af-to is implemented, and state may be desirable for other pf functionality. Even without truly being "stateless", the address embedding would support the same functionality as a true SIIT implementation. Syntax changes would be minimal; pfctl would need to recognize a /96 on the source "from" for the af-to and activate the embedding behavior. As embedding is already implemented for destinations of /96, I'm hoping there is some opportunity for reuse. If a /96 is not seen on the "from" specification, then pf's current behavior can be used. A small backwards compatibility issue exists in that the current af-to source specification allows addresses with a mask other than /128. However, so far as I can tell, any mask is ignored and only the specified address is used at the present time, so anyone specifying that in their config is using an unsupported mask option. Unfortun
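[Editor's note: the address arithmetic being requested is the standard RFC 6052 /96 embedding that NAT64 translators already use — the IPv4 address simply becomes the low 32 bits of the IPv6 address, so the reverse mapping is stateless. A small illustration (not pf code; uses the example addresses from the message above):]

```python
import ipaddress

def embed(prefix96: str, v4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of a /96 IPv6 prefix (RFC 6052)."""
    net = ipaddress.IPv6Network(prefix96)
    assert net.prefixlen == 96, "embedding prefix must be a /96"
    return ipaddress.IPv6Address(int(net.network_address) |
                                 int(ipaddress.IPv4Address(v4)))

def extract(v6: str) -> ipaddress.IPv4Address:
    """Recover the embedded IPv4 address from the low 32 bits (stateless reverse)."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(v6)) & 0xFFFF_FFFF)

print(embed("64:ff9b::/96", "203.0.113.42"))  # 64:ff9b::cb00:712a
print(extract("64:ff9b::cb00:712a"))          # 203.0.113.42
```

Because the mapping is a bijection over the whole IPv4 space, no per-flow state is needed to translate return traffic — exactly the property the request relies on.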
Re: PF queue bandwidth limited to 32bit value
> On 15 Sep 2023, at 18:54, Stuart Henderson wrote: > > On 2023/09/15 13:40, Andy Lemin wrote: >> Hi Stuart, >> >> Seeing as it seems like everyone is too busy, and my workaround >> (not queue some flows on interfaces with queue defined) seems of no >> interest, > > well, it might be, but I'm not sure if it will fit with how > queues work.. Well I can only hope some more developers sees this :) > >> and my current hack to use queuing on Vlan interfaces is >> a very incomplete and restrictive workaround; Would you please be >> so kind as to provide me with a starting point in the source code >> and variable names to concentrate on, where I can start tracing from >> beginning to end for changing the scale from bits to bytes? > > maybe try hfsc.c, but overall there are quite a few files involved > in queue definition and use from start to finish. or going from the > other side start with how pfctl defines queues and follow through > from there. > Thank you, I will try (best effort as time permits), and see how far I get.. (probably not far ;)
Re: PF queue bandwidth limited to 32bit value
On 2023/09/15 13:40, Andy Lemin wrote: > Hi Stuart, > > Seeing as it seems like everyone is too busy, and my workaround > (not queue some flows on interfaces with queue defined) seems of no > interest, well, it might be, but I'm not sure if it will fit with how queues work.. > and my current hack to use queuing on Vlan interfaces is > a very incomplete and restrictive workaround; Would you please be > so kind as to provide me with a starting point in the source code > and variable names to concentrate on, where I can start tracing from > beginning to end for changing the scale from bits to bytes? maybe try hfsc.c, but overall there are quite a few files involved in queue definition and use from start to finish. or going from the other side start with how pfctl defines queues and follow through from there.
Re: PF queue bandwidth limited to 32bit value
Hi Stuart, Seeing as it seems like everyone is too busy, and my workaround (not queueing some flows on interfaces with a queue defined) seems of no interest, and my current hack of using queueing on VLAN interfaces is a very incomplete and restrictive workaround; would you please be so kind as to provide me with a starting point in the source code, and variable names to concentrate on, where I can start tracing from beginning to end for changing the scale from bits to bytes? Thanks :) Andy

On 14 Sep 2023, at 19:34, Andrew Lemin wrote: On Thu, Sep 14, 2023 at 7:23 PM Andrew Lemin wrote: On Wed, Sep 13, 2023 at 8:35 PM Stuart Henderson wrote: On 2023-09-13, Andrew Lemin wrote: > I have noticed another issue while trying to implement a 'prio'-only > workaround (using only prio ordering for inter-VLAN traffic, and HFSC > queuing for internet traffic); > It is not possible to have internal inter-vlan traffic be solely priority > ordered with 'set prio', as the existence of 'queue' definitions on the > same internal vlan interfaces (required for internet flows), demands one > leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into > the 'default' queue despite queuing not being wanted, and so > unintentionally clamping all internal traffic to 4294M just because full > queuing is needed for internet traffic.

If you enable queueing on an interface all traffic sent via that interface goes via one queue or another.

Yes, that is indeed the very problem. Queueing is enabled on the inside interfaces, with bandwidth values set slightly below the ISP capacities (multiple ISP links as well), so that all things work well for all internal users. However this means that inter-vlan traffic from client networks to server networks is restricted to 4294Mbps for no reason. It would make a huge difference to be able to allow local traffic to flow without being queued/restricted.

(also, AIUI the correct place for queues is on the physical interface not the vlan, since that's where the bottleneck is... you can assign traffic to a queue name as it comes in on the vlan but I believe the actual queue definition should be on the physical iface).

Hehe yes I know. Thanks for sharing though. I actually have very specific reasons for doing this (queues on the VLAN ifaces rather than phy) as there are multiple ISP connections for multiple VLANs, so the VLAN queues are set to restrict for the relevant ISP link etc. Also, separate to the multiple ISPs (I won't bore you with why as it is not relevant here), the other reason for queueing on the VLANs is that it allows you to get closer to the 10Gbps figure. I.e., if you have queues on the 10Gbps PHY, you can only egress 4294Mbps to _all_ VLANs. But if you have queues per VLAN iface, you can egress multiple times 4294Mbps in aggregate. E.g., vlans 10,11,12,13 on a single mcx0 trunk: 10->11 can do 4294Mbps and 12->13 can do 4294Mbps, giving over 8Gbps egress in total on the PHY. It is dirty, but like I said, desperate for workarounds... :(

"required for internet flows" - depends on your network layout.. the upstream feed doesn't have to go via the same interface as inter-vlan traffic.

I'm not sure what you mean. All the internal networks/vlans are connected to local switches, and the switches have a trunk to the firewall, which hosts the default gateway for the VLANs and does inter-vlan routing. So all the clients go through the same VLANs/trunk/gateway for inter-vlan as they do for internet. Strict L3/4 filtering is required on inter-vlan traffic. I am honestly looking for support to recognise that this is a correct, valid and common setup, and so there is a genuine need to allow flows to not be queued on interfaces that have queues (which has many potential applications for many use cases, not just mine - so should be of interest to the developers?).

Do you know why there has to be a default queue? Yes, I know that traffic excluded from queues would take from the same interface the queueing is trying to manage, and potentially cause congestion. However with 10Gbps networking, which is beyond common now, this does not matter when the queues are stuck at 4294Mbps. Desperately trying to find workarounds that appeal.. Surely the need is a no-brainer, and it is just a case of trying to encourage interest from a developer? Thanks :)
Re: PF queue bandwidth limited to 32bit value
On Thu, Sep 14, 2023 at 7:23 PM Andrew Lemin wrote: > > > On Wed, Sep 13, 2023 at 8:35 PM Stuart Henderson < > stu.li...@spacehopper.org> wrote: > >> On 2023-09-13, Andrew Lemin wrote: >> > I have noticed another issue while trying to implement a 'prio'-only >> > workaround (using only prio ordering for inter-VLAN traffic, and HFSC >> > queuing for internet traffic); >> > It is not possible to have internal inter-vlan traffic be solely >> priority >> > ordered with 'set prio', as the existence of 'queue' definitions on the >> > same internal vlan interfaces (required for internet flows), demands one >> > leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into >> > the 'default' queue despite queuing not being wanted, and so >> > unintentionally clamping all internal traffic to 4294M just because full >> > queuing is needed for internet traffic. >> >> If you enable queueing on an interface all traffic sent via that >> interface goes via one queue or another. >> > > Yes, that is indeed the very problem. Queueing is enabled on the inside > interfaces, with bandwidth values set slightly below the ISP capacities > (multiple ISP links as well), so that all things work well for all internal > users. > However this means that inter-vlan traffic from client networks to server > networks are restricted to 4294Mbps for no reason.. It would make a huge > difference to be able to allow local traffic to flow without being > queued/restricted. > > >> >> (also, AIUI the correct place for queues is on the physical interface >> not the vlan, since that's where the bottleneck is... you can assign >> traffic to a queue name as it comes in on the vlan but I believe the >> actual queue definition should be on the physical iface). >> > > Hehe yes I know. Thanks for sharing though.
> I actually have very specific reasons for doing this (queues on the VLAN > ifaces rather than phy) as there are multiple ISP connections for multiple > VLANs, so the VLAN queues are set to restrict for the relevant ISP link etc. > Also separate to the multiple ISPs (I won't bore you with why as it is not relevant here), the other reason for queueing on the VLANs is because it allows you to get closer to the 10Gbps figure.. Ie, If you have queues on the 10Gbps PHY, you can only egress 4294Mbps to _all_ VLANs. But if you have queues per-VLAN iface, you can egress multiple times 4294Mbps on aggregate. Eg, vlans 10,11,12,13 on single mcx0 trunk. 10->11 can do 4294Mbps and 12->13 can do 4294Mbps, giving over 8Gbps egress in total on the PHY. It is dirty, but like I said, desperate for workarounds... :( > > >> >> "required for internet flows" - depends on your network layout.. the >> upstream feed doesn't have to go via the same interface as inter-vlan >> traffic. > > > I'm not sure what you mean. All the internal networks/vlans are connected > to local switches, and the switches have trunk to the firewall which hosts > the default gateway for the VLANs and does inter-vlan routing. > So all the clients go through the same VLANs/trunk/gateway for inter-vlan > as they do for internet. Strict L3/4 filtering is required on inter-vlan > traffic. > I am honestly looking for support to recognise that this is a correct, > valid and common setup, and so there is a genuine need to allow flows to > not be queued on interfaces that have queues (which has many potential > applications for many use cases, not just mine - so should be of interest > to the developers?). > > Do you know why there has to be a default queue? Yes I know that traffic > excluded from queues would take from the same interface the queueing is > trying to manage, and potentially causes congestion.
However with 10Gbps > networking which is beyond common now, this does not matter when the queues > are stuck at 4294Mbps > > Desperately trying to find workarounds that appeal.. Surely the need is a > no brainer, and it is just a case of trying to encourage interest from a > developer? > > Thanks :) >
Re: PF queue bandwidth limited to 32bit value
On Wed, Sep 13, 2023 at 8:35 PM Stuart Henderson wrote: > On 2023-09-13, Andrew Lemin wrote: > > I have noticed another issue while trying to implement a 'prio'-only > > workaround (using only prio ordering for inter-VLAN traffic, and HFSC > > queuing for internet traffic); > > It is not possible to have internal inter-vlan traffic be solely priority > > ordered with 'set prio', as the existence of 'queue' definitions on the > > same internal vlan interfaces (required for internet flows), demands one > > leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into > > the 'default' queue despite queuing not being wanted, and so > > unintentionally clamping all internal traffic to 4294M just because full > > queuing is needed for internet traffic. > > If you enable queueing on an interface all traffic sent via that > interface goes via one queue or another. > Yes, that is indeed the very problem. Queueing is enabled on the inside interfaces, with bandwidth values set slightly below the ISP capacities (multiple ISP links as well), so that all things work well for all internal users. However this means that inter-vlan traffic from client networks to server networks are restricted to 4294Mbps for no reason.. It would make a huge difference to be able to allow local traffic to flow without being queued/restricted. > > (also, AIUI the correct place for queues is on the physical interface > not the vlan, since that's where the bottleneck is... you can assign > traffic to a queue name as it comes in on the vlan but I believe the > actual queue definition should be on the physical iface). > Hehe yes I know. Thanks for sharing though. I actually have very specific reasons for doing this (queues on the VLAN ifaces rather than phy) as there are multiple ISP connections for multiple VLANs, so the VLAN queues are set to restrict for the relevant ISP link etc. > > "required for internet flows" - depends on your network layout..
the > upstream feed doesn't have to go via the same interface as inter-vlan > traffic. I'm not sure what you mean. All the internal networks/vlans are connected to local switches, and the switches have trunk to the firewall which hosts the default gateway for the VLANs and does inter-vlan routing. So all the clients go through the same VLANs/trunk/gateway for inter-vlan as they do for internet. Strict L3/4 filtering is required on inter-vlan traffic. I am honestly looking for support to recognise that this is a correct, valid and common setup, and so there is a genuine need to allow flows to not be queued on interfaces that have queues (which has many potential applications for many use cases, not just mine - so should be of interest to the developers?). Do you know why there has to be a default queue? Yes I know that traffic excluded from queues would take from the same interface the queueing is trying to manage, and potentially causes congestion. However with 10Gbps networking which is beyond common now, this does not matter when the queues are stuck at 4294Mbps. Desperately trying to find workarounds that appeal.. Surely the need is a no-brainer, and it is just a case of trying to encourage interest from a developer? Thanks :)
Re: PF queue bandwidth limited to 32bit value
On Wed, Sep 13, 2023 at 8:22 PM Stuart Henderson wrote: > On 2023-09-12, Andrew Lemin wrote: > > Ah, that's clever! Having bandwidth queues up to 34,352M would > definitely > > provide runway for the next decade :) > > > > Do you think your idea is worth circulating on tech@ for further > > discussion? Queueing at bps resolution is rather redundant nowadays, even > > on the very slowest links. > > tech@ is more for diffs or technical questions rather than not-fleshed-out > quick ideas. Doing this would solve some problems with the "just change it > to 64-bit" mooted on the freebsd-pf list (not least with 32-bit archs), > but would still need finding all the places where the bandwidth values are > used and making sure they're updated to cope. > > Yes good point :) I am not in a position to undertake this myself at the moment. If none of the generous developers feel inclined to do this despite the broad value, I might have a go myself at some point (probably not able until next year sadly). "just change it to 64-bit" mooted on the freebsd-pf list - I have been unable to find this conversation. Do you have a link? > > -- > Please keep replies on the mailing list. > >
Re: PF queue bandwidth limited to 32bit value
On 2023-09-13, Andrew Lemin wrote: > I have noticed another issue while trying to implement a 'prio'-only > workaround (using only prio ordering for inter-VLAN traffic, and HFSC > queuing for internet traffic); > It is not possible to have internal inter-vlan traffic be solely priority > ordered with 'set prio', as the existence of 'queue' definitions on the > same internal vlan interfaces (required for internet flows), demands one > leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into > the 'default' queue despite queuing not being wanted, and so > unintentionally clamping all internal traffic to 4294M just because full > queuing is needed for internet traffic. If you enable queueing on an interface all traffic sent via that interface goes via one queue or another. (also, AIUI the correct place for queues is on the physical interface not the vlan, since that's where the bottleneck is... you can assign traffic to a queue name as it comes in on the vlan but I believe the actual queue definition should be on the physical iface). "required for internet flows" - depends on your network layout.. the upstream feed doesn't have to go via the same interface as inter-vlan traffic.
Re: PF queue bandwidth limited to 32bit value
On 2023-09-12, Andrew Lemin wrote: > Ah, that's clever! Having bandwidth queues up to 34,352M would definitely > provide runway for the next decade :) > > Do you think your idea is worth circulating on tech@ for further > discussion? Queueing at bps resolution is rather redundant nowadays, even > on the very slowest links. tech@ is more for diffs or technical questions rather than not-fleshed-out quick ideas. Doing this would solve some problems with the "just change it to 64-bit" mooted on the freebsd-pf list (not least with 32-bit archs), but would still need finding all the places where the bandwidth values are used and making sure they're updated to cope. -- Please keep replies on the mailing list.
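[Editor's note: the numbers in this thread are straightforward uint32 arithmetic. A sketch, assuming the queue bandwidth field is a plain 32-bit bits-per-second counter as the thread describes, showing the current ceiling, the reported wrap at 4295M, and where the "34,352M" bytes-per-second headroom figure comes from:]

```python
# Queue bandwidth is stored in a 32-bit unsigned field, counting bits/s.
U32 = 2**32

# Highest representable rate: 4294967295 bps, i.e. the "4294M" ceiling.
print((U32 - 1) // 10**6)   # 4294 (Mbit/s)

# Configuring "4295M" overflows the field and wraps to a tiny value,
# which matches the report that traffic really is throttled, not just
# displayed wrong in systat:
print(4_295_000_000 % U32)  # 32704 (bps after wrapping)

# Stuart's suggestion: keep the uint32 but count bytes/s instead of
# bits/s, raising the ceiling 8x -- hence the 34,352M figure:
print(4294 * 8)             # 34352 (Mbit/s)
```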
Re: PF queue bandwidth limited to 32bit value
On Wed, Sep 13, 2023 at 3:43 AM Andrew Lemin wrote: > Hi Stuart. > > On Wed, Sep 13, 2023 at 12:25 AM Stuart Henderson < > stu.li...@spacehopper.org> wrote: > >> On 2023-09-12, Andrew Lemin wrote: >> > Hi all, >> > Hope this finds you well. >> > >> > I have discovered that PF's queueing is still limited to 32bit bandwidth >> > values. >> > >> > I don't know if this is a regression or not. >> >> It's not a regression, it has been capped at 32 bits afaik forever >> (certainly was like that when the separate classification via altq.conf >> was merged into PF config, in OpenBSD 3.3). >> > > Ah ok, it was talked about so much I thought it was part of it. Thanks for > clarifying. > > >> >> > I am sure one of the >> > objectives of the ALTQ rewrite into the new queuing system we have in >> > OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I >> am >> > imagining it.. >> >> I don't recall that though there were some hopes expressed by >> non-developers. >> > > Haha, it is definitely still wanted and needed. prio-only based ordering > is too limited > I have noticed another issue while trying to implement a 'prio'-only workaround (using only prio ordering for inter-VLAN traffic, and HFSC queuing for internet traffic); It is not possible to have internal inter-vlan traffic be solely priority ordered with 'set prio', as the existence of 'queue' definitions on the same internal vlan interfaces (required for internet flows), demands one leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into the 'default' queue despite queuing not being wanted, and so unintentionally clamping all internal traffic to 4294M just because full queuing is needed for internet traffic. In fact 'prio' is irrelevant, as with or without 'prio', because queues are required for internet traffic, all internal traffic becomes bound by the 'default' HFSC queue.
So I would propose that the mandate on the 'default' keyword is relaxed (or a new keyword is provided for match/pass rules to force flows to not be queued), and/or that the uint32 scale is implemented in bytes instead of bits. I personally believe both are valid and needed. > > >> >> > Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN >> routing >> > with 10Gbps trunks, and I cannot set the queue bandwidth higher than a >> > 32bit value? >> > >> > Setting the bandwidth value to 4295M results in a value overflow where >> > 'systat queues' shows it wrapped and starts from 0 again. And traffic is >> > indeed restricted to such values, so does not appear to be just a >> cosmetic >> > 'systat queues' issue. >> > >> > I am sure this must be a bug/regression, >> >> I'd say a not-implemented feature (and I have a feeling it is not >> going to be all that simple a thing to implement - though changing >> scales so the uint32 carries bytes instead of bits per second might >> not be _too_ terrible). >> > > Following the great work to SMP unlock in the VLAN interface, and recent > NIC optimisations (offloading and interrupt handling) in various drivers, > you can now push packet filtered 10Gbps with modern CPUs without breaking a > sweat.. > > Ah, that's clever! Having bandwidth queues up to 34,352M would > definitely provide runway for the next decade :) > > Do you think your idea is worth circulating on tech@ for further > discussion? Queueing at bps resolution is rather redundant nowadays, even > on the very slowest links. > > >> > 10Gbps on OpenBSD is trivial >> and >> > common nowadays.. >> >> While using interfaces with 10Gbps link speed on OpenBSD is trivial, >> actually pushing that much traffic (particularly with more complex >> processing e.g. things like bandwidth controls, and particularly with >> smaller packet sizes) not so much. >> >> >> -- >> Please keep replies on the mailing list. >> >> Thanks again, Andy.
Re: PF queue bandwidth limited to 32bit value
Hi Stuart. On Wed, Sep 13, 2023 at 12:25 AM Stuart Henderson wrote: > On 2023-09-12, Andrew Lemin wrote: > > Hi all, > > Hope this finds you well. > > > > I have discovered that PF's queueing is still limited to 32bit bandwidth > > values. > > > > I don't know if this is a regression or not. > > It's not a regression, it has been capped at 32 bits afaik forever > (certainly was like that when the separate classification via altq.conf > was merged into PF config, in OpenBSD 3.3). > Ah ok, it was talked about so much I thought it was part of it. Thanks for clarifying. > > > I am sure one of the > > objectives of the ALTQ rewrite into the new queuing system we have in > > OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I > am > > imagining it.. > > I don't recall that though there were some hopes expressed by > non-developers. > Haha, it is definitely still wanted and needed. prio-only based ordering is too limited. > > > Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN > routing > > with 10Gbps trunks, and I cannot set the queue bandwidth higher than a > > 32bit value? > > > > Setting the bandwidth value to 4295M results in a value overflow where > > 'systat queues' shows it wrapped and starts from 0 again. And traffic is > > indeed restricted to such values, so does not appear to be just a > cosmetic > > 'systat queues' issue. > > > > I am sure this must be a bug/regression, > > I'd say a not-implemented feature (and I have a feeling it is not > going to be all that simple a thing to implement - though changing > scales so the uint32 carries bytes instead of bits per second might > not be _too_ terrible). > Following the great work to SMP unlock in the VLAN interface, and recent NIC optimisations (offloading and interrupt handling) in various drivers, you can now push packet filtered 10Gbps with modern CPUs without breaking a sweat.. Ah, that's clever!
Having bandwidth queues up to 34,352M would definitely provide runway for the next decade :) Do you think your idea is worth circulating on tech@ for further discussion? Queueing at bps resolution is rather redundant nowadays, even on the very slowest links. > > 10Gbps on OpenBSD is trivial and > > common nowadays.. > > While using interfaces with 10Gbps link speed on OpenBSD is trivial, > actually pushing that much traffic (particularly with more complex > processing e.g. things like bandwidth controls, and particularly with > smaller packet sizes) not so much. > > > -- > Please keep replies on the mailing list. > >
Re: PF queue bandwidth limited to 32bit value
On 2023-09-12, Andrew Lemin wrote: > Hi all, > Hope this finds you well. > > I have discovered that PF's queueing is still limited to 32bit bandwidth > values. > > I don't know if this is a regression or not. It's not a regression, it has been capped at 32 bits afaik forever (certainly was like that when the separate classification via altq.conf was merged into PF config, in OpenBSD 3.3). > I am sure one of the > objectives of the ALTQ rewrite into the new queuing system we have in > OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I am > imagining it.. I don't recall that though there were some hopes expressed by non-developers. > Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN routing > with 10Gbps trunks, and I cannot set the queue bandwidth higher than a > 32bit value? > > Setting the bandwidth value to 4295M results in a value overflow where > 'systat queues' shows it wrapped and starts from 0 again. And traffic is > indeed restricted to such values, so does not appear to be just a cosmetic > 'systat queues' issue. > > I am sure this must be a bug/regression, I'd say a not-implemented feature (and I have a feeling it is not going to be all that simple a thing to implement - though changing scales so the uint32 carries bytes instead of bits per second might not be _too_ terrible). > 10Gbps on OpenBSD is trivial and > common nowadays.. While using interfaces with 10Gbps link speed on OpenBSD is trivial, actually pushing that much traffic (particularly with more complex processing e.g. things like bandwidth controls, and particularly with smaller packet sizes) not so much. -- Please keep replies on the mailing list.
PF queue bandwidth limited to 32bit value
Hi all, Hope this finds you well. I have discovered that PF's queueing is still limited to 32bit bandwidth values. I don't know if this is a regression or not. I am sure one of the objectives of the ALTQ rewrite into the new queuing system we have in OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I am imagining it.. Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN routing with 10Gbps trunks, and I cannot set the queue bandwidth higher than a 32bit value? Setting the bandwidth value to 4295M results in a value overflow where 'systat queues' shows it wrapped and starts from 0 again. And traffic is indeed restricted to such values, so does not appear to be just a cosmetic 'systat queues' issue. I am sure this must be a bug/regression, 10Gbps on OpenBSD is trivial and common nowadays.. Tested on OpenBSD 7.3 Thanks for checking my sanity :) Andy.
Re: pf state-table-induced instability
On Thu, Aug 31, 2023 at 04:10:06PM +0200, Gabor LENCSE wrote: > Dear David, > > Thank you very much for all the new information! > > I keep only those parts that I want to react. > > > > It is not a fundamental issue, but it seems to me that during my tests not > > > only four but five CPU cores were used by IP packet forwarding: > > the packet processing is done in kernel threads (task queues are built > > on threads), and those threads could be scheduled on any cpu. the > > pf purge processing runs in yet another thread. > > > > iirc, the scheduler scans down the list of cpus looking for an idle > > one when it needs to run stuff, except to avoid cpu0 if possible. > > this is why you see most of the system time on cpus 1 to 5. > > Yes, I can confirm that any time I observed, CPU00 was not used by the > system tasks. > > However, I remembered that PF was disabled during my stateless tests, so I > think its purge could not be the one that used CPU05. Now I repeated the > experiment, first disabling PF as follows: disabling pf means it doesn't get run for packets in the network stack. however, once the state purge processing is started it just keeps running. if you have zero states, there won't be much to process though. there will be other things running in the system that could account for the "extra" cpu utilisation. > dut# pfctl -d > pf disabled > > And I can still see FIVE CPU cores used by system tasks: the network stack runs in these threads. pf is just one part of the network stack. > > load averages: 0.69, 0.29, > 0.13 dut.cntrg > 14:41:06 > 36 processes: 35 idle, 1 on processor up 0 days 00:03:46 > CPU00 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 8.1% intr, > 91.7% idle > CPU01 states: 0.0% user, 0.0% nice, 61.1% sys, 9.5% spin, 9.5% intr, > 19.8% idle > CPU02 states: 0.0% user, 0.0% nice, 62.8% sys, 10.9% spin, 8.5% intr, > 17.8% idle > CPU03 states: 0.0% user, 0.0% nice, 54.7% sys, 9.1% spin, 10.1% intr, > 26.0% idle > CPU04 states: 0.0% user, 0.0% nice, 62.7% sys, 10.2% spin, 9.8% intr, > 17.4% idle > CPU05 states: 0.0% user, 0.0% nice, 51.7% sys, 9.1% spin, 7.6% intr, > 31.6% idle > CPU06 states: 0.2% user, 0.0% nice, 2.8% sys, 0.8% spin, 10.0% intr, > 86.1% idle > CPU07 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.2% intr, > 92.6% idle > CPU08 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 8.4% intr, > 91.6% idle > CPU09 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 9.2% intr, > 90.8% idle > CPU10 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 10.8% intr, > 89.0% idle > CPU11 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 9.2% intr, > 90.6% idle > CPU12 states: 0.0% user, 0.0% nice, 0.2% sys, 0.8% spin, 9.2% intr, > 89.8% idle > CPU13 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.2% intr, > 92.6% idle > CPU14 states: 0.0% user, 0.0% nice, 0.0% sys, 0.8% spin, 9.8% intr, > 89.4% idle > CPU15 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.8% intr, > 92.0% idle > Memory: Real: 34M/1546M act/tot Free: 122G Cache: 807M Swap: 0K/256M > > I suspect that top shows an average (in a few seconds time window) and > perhaps one of the cores from CPU01 to CPU04 are skipped (e.g. because it > was used by the "top" command?), this is why I can see system load on CPU05. > (There is even some low amount of system load on CPU06.) > > > > *Is there any way to completely delete its entire content?* > > hrm. > > so i just read the code again. "pfctl -F states" goes through the whole > state table and unlinks the states from the red-black trees used for > packet processing, and then marks them as unlinked so the purge process > can immediately claim them as soon as they're scanned. this means that > in terms of packet processing the tree is empty.
the memory (which is > > what the state limit applies to) won't be reclaimed until the purge > > processing takes them. > > > > if you just wait 10 or so seconds after "pfctl -F states" then both the > > tree and state limits should be back to 0. you can watch pfctl -si, > > "systat pf", or the pfstate row in "systat pool" to confirm. > > > > you can change the scan interval with "set timeout interval" in pf.conf > > from 10s. no one fiddles with that though, so i'd put it back between > > runs to be representative of real world performance. > > I usually wait 10s between the consecutive ste
Re: pf state-table-induced instability
Dear David, Thank you very much for all the new information! I keep only those parts that I want to react. It is not a fundamental issue, but it seems to me that during my tests not only four but five CPU cores were used by IP packet forwarding: the packet processing is done in kernel threads (task queues are built on threads), and those threads could be scheduled on any cpu. the pf purge processing runs in yet another thread. iirc, the schedule scans down the list of cpus looking for an idle one when it needs to run stuff, except to avoid cpu0 if possible. this is why you see most of the system time on cpus 1 to 5. Yes, I can confirm that any time I observed, CPU00 was not used by the system tasks. However, I remembered that PF was disabled during my stateless tests, so I think its purge could not be the one that used CPU05. Now I repeated the experiment, first disabling PF as follows: dut# pfctl -d pf disabled And I can still see FIVE CPU cores used by system tasks: load averages: 0.69, 0.29, 0.13 dut.cntrg 14:41:06 36 processes: 35 idle, 1 on processor up 0 days 00:03:46 CPU00 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 8.1% intr, 91.7% idle CPU01 states: 0.0% user, 0.0% nice, 61.1% sys, 9.5% spin, 9.5% intr, 19.8% idle CPU02 states: 0.0% user, 0.0% nice, 62.8% sys, 10.9% spin, 8.5% intr, 17.8% idle CPU03 states: 0.0% user, 0.0% nice, 54.7% sys, 9.1% spin, 10.1% intr, 26.0% idle CPU04 states: 0.0% user, 0.0% nice, 62.7% sys, 10.2% spin, 9.8% intr, 17.4% idle CPU05 states: 0.0% user, 0.0% nice, 51.7% sys, 9.1% spin, 7.6% intr, 31.6% idle CPU06 states: 0.2% user, 0.0% nice, 2.8% sys, 0.8% spin, 10.0% intr, 86.1% idle CPU07 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.2% intr, 92.6% idle CPU08 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 8.4% intr, 91.6% idle CPU09 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 9.2% intr, 90.8% idle CPU10 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 10.8% intr, 89.0% idle CPU11 states: 0.0% user, 0.0% 
nice, 0.0% sys, 0.2% spin, 9.2% intr, 90.6% idle CPU12 states: 0.0% user, 0.0% nice, 0.2% sys, 0.8% spin, 9.2% intr, 89.8% idle CPU13 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.2% intr, 92.6% idle CPU14 states: 0.0% user, 0.0% nice, 0.0% sys, 0.8% spin, 9.8% intr, 89.4% idle CPU15 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.8% intr, 92.0% idle Memory: Real: 34M/1546M act/tot Free: 122G Cache: 807M Swap: 0K/256M I suspect that top shows an average (in a few seconds time window) and perhaps one of the cores from CPU01 to CPU04 are skipped (e.g. because it was used by the "top" command?), this is why I can see system load on CPU05. (There is even some low amount of system load on CPU06.) *Is there any way to completely delete its entire content?* hrm. so i just read the code again. "pfctl -F states" goes through the whole state table and unlinks the states from the red-black trees used for packet processing, and then marks them as unlinked so the purge process can immediately claim them as soon as they're scanned. this means that in terms of packet processing the tree is empty. the memory (which is what the state limit applies to) won't be reclaimed until the purge processing takes them. if you just wait 10 or so seconds after "pfctl -F states" then both the tree and state limits should be back to 0. you can watch pfctl -si, "systat pf", or the pfstate row in "systat pool" to confirm. you can change the scan interval with "set timeout interval" in pf.conf from 10s. no one fiddles with that though, so i'd put it back between runs to be representative of real world performance. I usually wait 10s between the consecutive steps of the binary search of my measurements to give the system a chance to relax (trying to ensure that the steps are independent measurements). However, the timeout interval of PF was set to 1 hour (using "set timeout interval 3600"). You may ask, why?
To have some well-defined performance metrics, and to define repeatable and reproducible measurements, we use the following tests: - maximum connection establishment rate (during this test all test frames result in a new connection) - throughput with bidirectional traffic as required by RFC 2544 (during this test no test frames result in a new connection, nor does any connection time out -- a sufficiently high timeout value can guarantee this) - connection tear down performance (first loading N number of connections and then deleting all connections in a single step and measuring the execution time of the deletion: connection tear down rate = N / deletion time of N connections) It is a good question how well the above performance metrics can represent the "real world" performance of a stateful NAT64 implementation! If you are interes
Re: pf state-table-induced instability
On Wed, Aug 30, 2023 at 09:54:45AM +0200, Gabor LENCSE wrote: > Dear David, > > Thank you very much for your detailed answer! Now I have got the explanation > for seemingly rather strange things. :-) > > However, I have some further questions. Let me explain what I do now so that > you can more clearly see the background. > > I have recently enabled siitperf to use multiple IP addresses. (Siitperf is > an IPv4, IPv6, SIIT, and stateful NAT64/NAT44 benchmarking tool > implementing the measurements of RFC 2544, RFC 8219, and this draft: > https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-benchmarking-stateful > .) > > Currently I want to test (and demonstrate) the difference this improvement > has made. I have already covered the stateless case by measuring the IPv4 > and IPv6 packet forwarding performance of OpenBSD using > 1) the very same test frames following the test frame format defined in the > appendix of RFC 2544 > 2) using only pseudorandom port numbers required by RFC 4814 (resulted in no > performance improvement compared to case 1) > 3) using pseudorandom IP addresses from specified ranges (resulted in > significant performance improvement compared to case 1) > 4) using both pseudorandom IP addresses and port numbers (same results as in > case 3) > > Many thanks to OpenBSD developers for enabling multi-core IP packet > forwarding! > > https://www.openbsd.org/plus72.html says: "Activated parallel IP forwarding, > starting 4 softnet tasks but limiting the usage to the number of CPUs." > > It is not a fundamental issue, but it seems to me that during my tests not > only four but five CPU cores were used by IP packet forwarding: the packet processing is done in kernel threads (task queues are built on threads), and those threads could be scheduled on any cpu. the pf purge processing runs in yet another thread. iirc, the scheduler scans down the list of cpus looking for an idle one when it needs to run stuff, except to avoid cpu0 if possible.
this is why you see most of the system time on cpus 1 to 5. > > load averages: 1.34, 0.35, > 0.12 dut.cntrg > 20:10:15 > 36 processes: 35 idle, 1 on processor up 1 days 02:16:56 > CPU00 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 6.1% intr, > 93.7% idle > CPU01 states: 0.0% user, 0.0% nice, 55.8% sys, 7.2% spin, 5.2% intr, > 31.9% idle > CPU02 states: 0.0% user, 0.0% nice, 53.6% sys, 8.0% spin, 6.2% intr, > 32.1% idle > CPU03 states: 0.0% user, 0.0% nice, 48.3% sys, 7.2% spin, 6.2% intr, > 38.3% idle > CPU04 states: 0.0% user, 0.0% nice, 44.2% sys, 9.7% spin, 6.3% intr, > 39.8% idle > CPU05 states: 0.0% user, 0.0% nice, 33.5% sys, 5.8% spin, 6.4% intr, > 54.3% idle > CPU06 states: 0.0% user, 0.0% nice, 3.2% sys, 0.2% spin, 7.2% intr, > 89.4% idle > CPU07 states: 0.0% user, 0.0% nice, 0.0% sys, 0.8% spin, 6.0% intr, > 93.2% idle > CPU08 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 5.4% intr, > 94.4% idle > CPU09 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.2% intr, > 92.6% idle > CPU10 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 8.9% intr, > 90.9% idle > CPU11 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.6% intr, > 92.2% idle > CPU12 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 8.6% intr, > 91.4% idle > CPU13 states: 0.0% user, 0.0% nice, 0.0% sys, 0.4% spin, 6.1% intr, > 93.5% idle > CPU14 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 6.4% intr, > 93.4% idle > CPU15 states: 0.0% user, 0.0% nice, 0.0% sys, 0.4% spin, 4.8% intr, > 94.8% idle > Memory: Real: 34M/2041M act/tot Free: 122G Cache: 825M Swap: 0K/256M > > The above output of the "top" command shows significant system load at CPU > cores from CPU1 to CPU5. > > *Has the number of softnet tasks been increased from 4 to 5?* no :) > What is more crucial for me are the stateful NAT64 measurements with > PF.
> > My stateful NAT64 measurements are as follows. > > 1. Maximum connection establishment rate test uses a binary search to find > the highest rate, at which all connections can be established through the > stateful NAT64 gateway when all test frames create a new connection. > > 2. Throughput test also uses a binary search to find the highest rate > (called throughput) at which all test frames are forwarded by the stateful > NAT64 gateway using bidirectional traffic. (All test frames belong to an > already existing connection. This test requires loading the connections into > the connection tracking table of the stateful NAT64 gateway in a previous > step u
Re: pf state-table-induced instability
Dear David, Thank you very much for your detailed answer! Now I have got the explanation for seemingly rather strange things. :-) However, I have some further questions. Let me explain what I do now so that you can more clearly see the background. I have recently enabled siitperf to use multiple IP addresses. (Siitperf is an IPv4, IPv6, SIIT, and stateful NAT64/NAT44 benchmarking tool implementing the measurements of RFC 2544, RFC 8219, and this draft: https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-benchmarking-stateful .) Currently I want to test (and demonstrate) the difference this improvement has made. I have already covered the stateless case by measuring the IPv4 and IPv6 packet forwarding performance of OpenBSD using 1) the very same test frames following the test frame format defined in the appendix of RFC 2544 2) using only pseudorandom port numbers required by RFC 4814 (resulted in no performance improvement compared to case 1) 3) using pseudorandom IP addresses from specified ranges (resulted in significant performance improvement compared to case 1) 4) using both pseudorandom IP addresses and port numbers (same results as in case 3) Many thanks to OpenBSD developers for enabling multi-core IP packet forwarding! https://www.openbsd.org/plus72.html says: "Activated parallel IP forwarding, starting 4 softnet tasks but limiting the usage to the number of CPUs."
It is not a fundamental issue, but it seems to me that during my tests not only four but five CPU cores were used by IP packet forwarding: load averages: 1.34, 0.35, 0.12 dut.cntrg 20:10:15 36 processes: 35 idle, 1 on processor up 1 days 02:16:56 CPU00 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 6.1% intr, 93.7% idle CPU01 states: 0.0% user, 0.0% nice, 55.8% sys, 7.2% spin, 5.2% intr, 31.9% idle CPU02 states: 0.0% user, 0.0% nice, 53.6% sys, 8.0% spin, 6.2% intr, 32.1% idle CPU03 states: 0.0% user, 0.0% nice, 48.3% sys, 7.2% spin, 6.2% intr, 38.3% idle CPU04 states: 0.0% user, 0.0% nice, 44.2% sys, 9.7% spin, 6.3% intr, 39.8% idle CPU05 states: 0.0% user, 0.0% nice, 33.5% sys, 5.8% spin, 6.4% intr, 54.3% idle CPU06 states: 0.0% user, 0.0% nice, 3.2% sys, 0.2% spin, 7.2% intr, 89.4% idle CPU07 states: 0.0% user, 0.0% nice, 0.0% sys, 0.8% spin, 6.0% intr, 93.2% idle CPU08 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 5.4% intr, 94.4% idle CPU09 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.2% intr, 92.6% idle CPU10 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 8.9% intr, 90.9% idle CPU11 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 7.6% intr, 92.2% idle CPU12 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 8.6% intr, 91.4% idle CPU13 states: 0.0% user, 0.0% nice, 0.0% sys, 0.4% spin, 6.1% intr, 93.5% idle CPU14 states: 0.0% user, 0.0% nice, 0.0% sys, 0.2% spin, 6.4% intr, 93.4% idle CPU15 states: 0.0% user, 0.0% nice, 0.0% sys, 0.4% spin, 4.8% intr, 94.8% idle Memory: Real: 34M/2041M act/tot Free: 122G Cache: 825M Swap: 0K/256M The above output of the "top" command shows significant system load at CPU cores from CPU1 to CPU5. *Has the number of softnet tasks been increased from 4 to 5?* What is more crucial for me are the stateful NAT64 measurements with PF. My stateful NAT64 measurements are as follows. 1.
Maximum connection establishment rate test uses a binary search to find the highest rate, at which all connections can be established through the stateful NAT64 gateway when all test frames create a new connection. 2. Throughput test also uses a binary search to find the highest rate (called throughput) at which all test frames are forwarded by the stateful NAT64 gateway using bidirectional traffic. (All test frames belong to an already existing connection. This test requires loading the connections into the connection tracking table of the stateful NAT64 gateway in a previous step using a safely lower rate than determined by the maximum connection establishment rate test.) And both tests need to be repeated multiple times to acquire statistically reliable results. As for the explanation of the seemingly deteriorating performance of PF, now I understand from your explanation that the "pfctl -F states" command does not delete the content of the connection tracking table. *Is there any way to completely delete its entire content?* (E.g., under Linux, I can delete the connection tracking table of iptables or Jool by deleting the appropriate kernel module.) Of course, I can delete it by rebooting the server. However, currently I use a Dell PowerEdge R730 server, and its complete reboot (including stopping OpenBSD, initialization of the hardware, booting OpenBSD and some spare time) takes 5 minutes. This is way too much overhead, if I need to do it between every single elementary step (that is, t
Re: pf state-table-induced instability
On Mon, Aug 28, 2023 at 01:46:32PM +0200, Gabor LENCSE wrote: > Hi Lyndon, > > Sorry for my late reply. Please see my answers inline. > > On 8/24/2023 11:13 PM, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > > Gabor LENCSE writes: > > > > > If you are interested, you can find the results in Tables 18 - 20 of > > > this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009 > > Thanks for the pointer -- that's a very interesting paper. > > > > After giving it a quick read through, one thing immediately jumps > > out. The paper mentions (section A.4) a boost in performance after > > increasing the state table size limit. Not having looked at the > > relevant code, so I'm guessing here, but this is a classic indicator > > of a hashing algorithm falling apart when the table gets close to > > full. Could it be that simple? I need to go digging into the pf > > code for a closer look. > > Beware, I wrote it about iptables and not PF! > > As for iptables, it is really so simple. I have done a deeper analysis of > iptables performance as the function of its hash table size. It is > documented in another (open access) paper: > http://doi.org/10.36244/ICJ.2023.1.6 > > However, I am not familiar with the internals of the other two tested > stateful NAT64 implementations, Jool and OpenBSD PF. I have no idea, what > kind of data structures they use for storing the connections. openbsd uses a red-black tree to look up states. packets are parsed into a key that looks up states by address family, ips, ipproto, ports, etc, to find the relevant state. if a state isn't found, it falls through to ruleset evaluation, which is notionally a linked list, but has been optimised. > > You also describe how the performance degrades over time. This > > exactly matches the behaviour we see. Could the fix be as simple > > as cranking 'set limit states' up to, say, two million? There is > > one way to find out ...
:-) > > As you could see, the highest number of connections was 40M, and the limit > of the states was set to 1000M. It worked well for me then with the PF of > OpenBSD 7.1. > > It would be interesting to find the root cause of the phenomenon, why the > performance of PF seems to deteriorate with time. E.g., somehow the internal > data structures of PF become "polluted" if many connections are established > and then deleted? my first guess is that you're starting to fight against the pf state purge processing. pf tries to scan the entire state table every 10 seconds (by default) looking for expired states it can remove. this scan process runs every second, but it tries to cover the whole state table by 10 seconds. the more states you have the more time this takes, and this increases linearly with the number of states you have. until relatively recently (post 7.2), the scan and gc processing effectively stopped the world. at work we run with about 2 million states during business hours, and i was seeing the gc processing take up approx 70ms a second, during which packet processing didn't really happen. now the scan can happen without blocking pf packet processing. it still takes cpu time, so there is a point that processing packets and scanning for states will fight each other for time, but at least they're not fighting each other for locks now. > However, I have deleted the content of the state table after each elementary > measurement step using the "pfctl -F states" command. (I am sorry, this > command is missing from the paper, but it is there in my saved "del-pf" > file!) > > Perhaps PF developers could advise us, if the deletion of the states > generates a fresh state table or not. it marks the states as expired, and then the purge scan is able to take them and actually free them. > Could anyone help us in this question? > > Best regards, > > Gábor > > > > > I use binary search to find the highest lossless rate (throughput). > Especially w > > > > > > --lyndon >
Re: pf state-table-induced instability
Hi Lyndon, Sorry for my late reply. Please see my answers inline. On 8/24/2023 11:13 PM, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: Gabor LENCSE writes: If you are interested, you can find the results in Tables 18 - 20 of this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009 Thanks for the pointer -- that's a very interesting paper. After giving it a quick read through, one thing immediately jumps out. The paper mentions (section A.4) a boost in performance after increasing the state table size limit. Not having looked at the relevant code, so I'm guessing here, but this is a classic indicator of a hashing algorithm falling apart when the table gets close to full. Could it be that simple? I need to go digging into the pf code for a closer look. Beware, I wrote it about iptables and not PF! As for iptables, it is really so simple. I have done a deeper analysis of iptables performance as the function of its hash table size. It is documented in another (open access) paper: http://doi.org/10.36244/ICJ.2023.1.6 However, I am not familiar with the internals of the other two tested stateful NAT64 implementations, Jool and OpenBSD PF. I have no idea, what kind of data structures they use for storing the connections. You also describe how the performance degrades over time. This exactly matches the behaviour we see. Could the fix be as simple as cranking 'set limit states' up to, say, two million? There is one way to find out ... :-) As you could see, the highest number of connections was 40M, and the limit of the states was set to 1000M. It worked well for me then with the PF of OpenBSD 7.1. It would be interesting to find the root cause of the phenomenon, why the performance of PF seems to deteriorate with time. E.g., somehow the internal data structures of PF become "polluted" if many connections are established and then deleted?
However, I have deleted the content of the state table after each elementary measurement step using the "pfctl -F states" command. (I am sorry, this command is missing from the paper, but it is there in my saved "del-pf" file!) Perhaps PF developers could advise us whether the deletion of the states generates a fresh state table or not. Could anyone help us in this question? Best regards, Gábor I use binary search to find the highest lossless rate (throughput). Especially w --lyndon
Re: pf state-table-induced instability
On Thu, Aug 24, 2023 at 12:31 PM Lyndon Nerenberg (VE7TFX/VE6BBM) wrote: > For over a year now we have been seeing instability on our firewalls > that seems to kick in when our state tables approach 200K entries. > The number varies, but it's a safe bet that once we cross the 180K > threshold, the machines start getting cranky. At 200K+ performance > visibly degrades, often leading to a complete lockup of the network > stack, or a spontaneous reboot. ... > Our pf settings are pretty simple: > > set optimization normal > set ruleset-optimization basic > set limit states 40 > set limit src-nodes 10 > set loginterface none > set skip on lo > set reassemble yes > > # Reduce the number of state table entries in FIN_WAIT_2 state. > set timeout tcp.finwait 4 I don't know if there is any relation, but, with 40 states defined, adaptive scaling should start to kick in at around 24 states.
Re: pf state-table-induced instability
On Thu, Aug 24, 2023 at 2:57 PM Gabor LENCSE wrote: > I used OpenBSD 7.1 PF during stateful NAT64 benchmarking measurements > from 400,000 to 40,000,000 states. (Of course, its connection setup and > packet forwarding performance degraded with the number of states, but > the degradation was not very drastic.) > > If you are interested, you can find the results in Tables 18 - 20 of > this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009 Seriously awesome paper with volumes of detail--thank you!
Re: pf state-table-induced instability
Gabor LENCSE writes: > If you are interested, you can find the results in Tables 18 - 20 of > this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009 Thanks for the pointer -- that's a very interesting paper. After giving it a quick read through, one thing immediately jumps out. The paper mentions (section A.4) a boost in performance after increasing the state table size limit. Not having looked at the relevant code, so I'm guessing here, but this is a classic indicator of a hashing algorithm falling apart when the table gets close to full. Could it be that simple? I need to go digging into the pf code for a closer look. You also describe how the performance degrades over time. This exactly matches the behaviour we see. Could the fix be as simple as cranking 'set limit states' up to, say, two million? There is one way to find out ... :-) --lyndon
Re: pf state-table-induced instability
Hi, But my immediate (and only -- please do NOT start a bikeshed on ruleset design!) question is: Is there a practical limit on the number of states pf can handle? I used OpenBSD 7.1 PF during stateful NAT64 benchmarking measurements from 400,000 to 40,000,000 states. (Of course, its connection setup and packet forwarding performance degraded with the number of states, but the degradation was not very drastic.) If you are interested, you can find the results in Tables 18 - 20 of this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009 Best regards, Gábor
pf state-table-induced instability
For over a year now we have been seeing instability on our firewalls that seems to kick in when our state tables approach 200K entries. The number varies, but it's a safe bet that once we cross the 180K threshold, the machines start getting cranky. At 200K+ performance visibly degrades, often leading to a complete lockup of the network stack, or a spontaneous reboot. The symptoms are varied, but the early onset indication is that interactive response at the shell prompt gets stuttery. As it progresses, network traffic stops flowing and the network stack eventually just locks up. We also see the occasional:

pmap_unwire: wiring for pmap 0xfd8e8a946528 va 0xc000d4d000 didn't change!

logged on the console. The machines are not hurting for resources:

load averages: 1.06, 1.12, 1.12   xxx 17:53:08
48 processes: 47 idle, 1 on processor   up 6:06
CPU0: 0.0% user, 0.0% nice, 22.0% sys, 0.8% spin, 5.8% intr, 71.5% idle
CPU1: 0.0% user, 0.0% nice, 27.7% sys, 1.2% spin, 5.2% intr, 65.9% idle
CPU2: 0.0% user, 0.0% nice, 40.5% sys, 0.6% spin, 4.4% intr, 54.5% idle
CPU3: 0.0% user, 0.0% nice, 1.4% sys, 0.0% spin, 6.8% intr, 91.8% idle
Memory: Real: 110M/1722M act/tot Free: 60G Cache: 851M Swap: 0K/21G

Our pf settings are pretty simple:

set optimization normal
set ruleset-optimization basic
set limit states 400000
set limit src-nodes 100000
set loginterface none
set skip on lo
set reassemble yes

# Reduce the number of state table entries in FIN_WAIT_2 state.
set timeout tcp.finwait 4

(Note that the limit states 400000 is a holdover from the 6.x days, where the default value was too small to handle our load.)

vmstat reports this for pf state table memory usage:

pfstate  320 584171770 202558 135845 117730 18115 25210 0 80
pfstkey  112 584171770 179214  35152  29744  5408  7208 0 80
pfstitem  24 584171220 179214   6952   5811  1141  1520 0 80

At this moment we're running with 210K state table entries.
There seems to be an awful lot (>40%) of those in FIN_WAIT_2:FIN_WAIT_2 state -- I'm still trying to puzzle that one out. But my immediate (and only -- please do NOT start a bikeshed on ruleset design!) question is: Is there a practical limit on the number of states pf can handle? Our experience says there is, and the number is around 180K. Prior to release 7.1 we didn't see anything like this at all. This started happening with the 7.1 release, and we noticed a real escalation in instability in 7.2. Enough so that we rolled the affected firewalls back to 7.1. That worked around the problem, until last night, when the firewall rebooted itself (at the time of least traffic load?!). Because of all this we have been avoiding upgrading any of the firewalls beyond 7.1 as we cannot afford the resulting downtime. Even carp didn't save us. We've had a couple of incidents where one firewall panics, carp fails over, then the 2nd firewall locks up. And this points out another issue. When the network stack freezes, the carp interfaces do not flip. I haven't figured that one out yet, either. Okay, so what's the point of all this blathering? I guess there are two things I'm wondering: 1) are there known limitations in the pf code that would explain this? 2) has anyone else seen this sort of behaviour on their firewalls? Thanks! --lyndon
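For the FIN_WAIT_2 puzzle above, the stuck states can be counted directly from `pfctl -ss` output; a sketch (the sample lines used in testing are fabricated for illustration):

```shell
# Count states sitting in FIN_WAIT_2 on both sides, reading
# pfctl -ss formatted lines from stdin.
finwait_count() {
  grep -c 'FIN_WAIT_2:FIN_WAIT_2'
}

# Live use on the firewall: pfctl -ss | finwait_count
```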
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
(Sorry, I just realized I replied to just your email address, replying again to the mailing list this time.) On 2023年08月16日 10:05, Stuart Henderson wrote: > wireguard-tools is not required, everything you need for wg(4) is in > the base OS. Oh, I didn't know that. In that case, valid point. > After some OS upgrades, some packages (especially those interfacing > with the kernel for things like networking) will be broken until > packages are updated. > This is a problem if you rely on wg(4) to access the machine. Not sure how frequent this is, but this only happened for me once on a ThinkPad T43, and it was just a matter of running pkg_add -ui both before and after an OS upgrade. > chatgpt often makes the answer sound good but the answer is not > necessarily reliable, so still needs vetting by someone who understands > the area. better leave it to someone who understands in the first place. Yes, but in my case it was more about how to phrase it, not a matter of "what the fuck am I even talking about". I understood why, I just didn't know how to explain in a way that sounds reasonable. I still stand by that the answer itself is more important than the person (or thing) answering. I would have expected the OpenBSD userbase to be much more merit-based rather than leftist-leaning as seen in most other BSD's and Linux distro's nowadays. -- lain.
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
Hi, I appreciate the valuable advice you provided about pf rules in OpenBSD. I am currently away on a trip, but once I return, I will thoroughly test those rules and provide you with feedback. On Wed, Aug 16, 2023 at 3:50 PM Stuart Henderson wrote: > > On 2023-08-14, SOUBHEEK NATH wrote: > > 2. Please have a look at the configuration I have implemented. > > > > pass in quick on wg0 proto tcp from 10.0.8.3/32 to any port {22 80} > > block in on wg0 proto tcp from any to any port {22 80} > > block in quick on bwfm0 proto tcp from any to any port {22 80} > > > >This configuration is functioning well and your suggestions have > >greatly assisted me in achieving it. > > > >I would like to discuss my insights on this configuration and would > >appreciate your feedback on it. > > > >I. I use the word "quick" in the first line to prevent the "block" > >rules in the second line from taking precedence over it. > > That's one way to do it. Personally I don't like writing "quick" on all > these lines so I normally order them for "last match wins" rather than > "first match wins". This is mostly down to personal preference. > > >II. The second line effectively prevents any devices in the wireguard > >network from accessing ports 22 and 80. However, because the 'quick' > >command is used in the first line, the rule in the first line takes > >precedence and allows access to ports 22 and 80 for the machine with > >IP address 10.0.8.3. > > This also blocks forwarded traffic from machines on wg0 (other than > 10.0.8.3) to port 22/80 on the internet, not just to the machine running > PF. If this is what you want, that's ok, if not then you may want "self" > instead of "any". > > > On Mon, Aug 14, 2023 at 7:35 AM lain. wrote: > >> > >> On 2023年08月13日 12:17, Stuart Henderson wrote: > >> > > > >> > > https://www.vultr.com/docs/install-wireguard-vpn-server-on-openbsd-7-0/ > >> > > >> > what a mess of things from the base OS and unneeded third-party tools. 
> >> > > >> List of tools: > >> wireguard-tools (required), nano (vim would have been enough), and the > >> rest is everything OpenBSD ships with. > > wireguard-tools is not required, everything you need for wg(4) is in > the base OS. > > >> Oh the horror, that's far too much, the sky is falling! > > After some OS upgrades, some packages (especially those interfacing > with the kernel for things like networking) will be broken until > packages are updated. > This is a problem if you rely on wg(4) to access the machine. > > I suggest replacing use of wireguard-tools with the native configuration > direct in hostname.wg0, see the wg(4) and ifconfig(8) manuals. > > >> > > On Sun, Aug 13, 2023 at 7:04 AM lain. wrote: > >> > >> > >> > >> I failed to come up with reasons for using a preshared key, so I've > >> > >> let > >> > >> ChatGPT generate reasons for me: > >> > > >> > oh $deitt please do not. > >> > > >> What matters is not who or what answered, what matters is the answer, > >> and the answer it provided is good, but I guess autists gonna autist. > > chatgpt often makes the answer sound good but the answer is not > necessarily reliable, so still needs vetting by someone who understands > the area. better leave it to someone who understands in the first place. > > if you want to quote something, there's a perfectly good explanation > in the wg(4) manual. > > -- > Please keep replies on the mailing list. >
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
On 2023-08-14, SOUBHEEK NATH wrote: > 2. Please have a look at the configuration I have implemented. > > pass in quick on wg0 proto tcp from 10.0.8.3/32 to any port {22 80} > block in on wg0 proto tcp from any to any port {22 80} > block in quick on bwfm0 proto tcp from any to any port {22 80} > >This configuration is functioning well and your suggestions have >greatly assisted me in achieving it. > >I would like to discuss my insights on this configuration and would >appreciate your feedback on it. > >I. I use the word "quick" in the first line to prevent the "block" >rules in the second line from taking precedence over it. That's one way to do it. Personally I don't like writing "quick" on all these lines so I normally order them for "last match wins" rather than "first match wins". This is mostly down to personal preference. >II. The second line effectively prevents any devices in the wireguard >network from accessing ports 22 and 80. However, because the 'quick' >command is used in the first line, the rule in the first line takes >precedence and allows access to ports 22 and 80 for the machine with >IP address 10.0.8.3. This also blocks forwarded traffic from machines on wg0 (other than 10.0.8.3) to port 22/80 on the internet, not just to the machine running PF. If this is what you want, that's ok, if not then you may want "self" instead of "any". > On Mon, Aug 14, 2023 at 7:35 AM lain. wrote: >> >> On 2023年08月13日 12:17, Stuart Henderson wrote: >> > > >> > > https://www.vultr.com/docs/install-wireguard-vpn-server-on-openbsd-7-0/ >> > >> > what a mess of things from the base OS and unneeded third-party tools. >> > >> List of tools: >> wireguard-tools (required), nano (vim would have been enough), and the >> rest is everything OpenBSD ships with. wireguard-tools is not required, everything you need for wg(4) is in the base OS. >> Oh the horror, that's far too much, the sky is falling! 
After some OS upgrades, some packages (especially those interfacing with the kernel for things like networking) will be broken until packages are updated. This is a problem if you rely on wg(4) to access the machine. I suggest replacing use of wireguard-tools with the native configuration direct in hostname.wg0, see the wg(4) and ifconfig(8) manuals. >> > > On Sun, Aug 13, 2023 at 7:04 AM lain. wrote: >> > >> >> > >> I failed to come up with reasons for using a preshared key, so I've let >> > >> ChatGPT generate reasons for me: >> > >> > oh $deitt please do not. >> > >> What matters is not who or what answered, what matters is the answer, >> and the answer it provided is good, but I guess autists gonna autist. chatgpt often makes the answer sound good but the answer is not necessarily reliable, so still needs vetting by someone who understands the area. better leave it to someone who understands in the first place. if you want to quote something, there's a perfectly good explanation in the wg(4) manual. -- Please keep replies on the mailing list.
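Stuart's suggestion of configuring wg(4) natively can be sketched as a hostname.wg0 along these lines (keys, addresses, and port are placeholders, not taken from this thread; syntax per wg(4) and ifconfig(8)):

```
# /etc/hostname.wg0 -- illustrative values only
inet 10.0.8.1 255.255.255.0
wgkey SERVER_PRIVATE_KEY_BASE64
wgport 51820
wgpeer PEER_PUBLIC_KEY_BASE64 wgaip 10.0.8.2/32
up
```

`sh /etc/netstart wg0` then configures the interface entirely from the base system, with no package that can lag behind an OS upgrade.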
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
On Mon, Aug 14, 2023 at 05:54:55PM +0530, SOUBHEEK NATH said: 2. Please have a look at the configuration I have implemented. pass in quick on wg0 proto tcp from 10.0.8.3/32 to any port {22 80} block in on wg0 proto tcp from any to any port {22 80} block in quick on bwfm0 proto tcp from any to any port {22 80} [ snip ] I. I use the word "quick" in the first line to prevent the "block" rules in the second line from taking precedence over it. In general I prefer in my pf ruleset to block first and then explicitly allow things through. I find this causes far fewer mistakes. The very first rule in my ruleset is: ``block log all label "Default block"'' I try to avoid ``quick'' rules unless there is a really good reason to use them. They can introduce some unintended side-effects if you aren't careful and if you find yourself using many of them you probably should re-think your rules. For example, directly after the default block I also block bogon IP addresses from my WAN interface and I do it with quick so I don't accidentally unblock them later: ``block drop in quick log on egress inet from <bogons> to any'' (I have a table populated with bogon addresses) You may wish to review the PF handbook, the filter section seems a good place to start. https://www.openbsd.org/faq/pf/filter.html -- Please direct replies to the list.
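A minimal skeleton of the block-first layout described above might look like this (the table name and file path are illustrative, not quoted from the message):

```
table <bogons> persist file "/etc/bogons"

block log all label "Default block"
block drop in quick log on egress inet from <bogons> to any

# Explicit allowances come after the default block; with no "quick",
# the last matching rule wins.
pass out on egress
pass in on egress proto tcp to port 22
```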
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
Hello, The solution you both provided, worked well. 1. I do not use nano! I use the vi editor for my tasks. 2. Please have a look at the configuration I have implemented. pass in quick on wg0 proto tcp from 10.0.8.3/32 to any port {22 80} block in on wg0 proto tcp from any to any port {22 80} block in quick on bwfm0 proto tcp from any to any port {22 80} This configuration is functioning well and your suggestions have greatly assisted me in achieving it. I would like to discuss my insights on this configuration and would appreciate your feedback on it. I. I use the word "quick" in the first line to prevent the "block" rules in the second line from taking precedence over it. II. The second line effectively prevents any devices in the wireguard network from accessing ports 22 and 80. However, because the 'quick' command is used in the first line, the rule in the first line takes precedence and allows access to ports 22 and 80 for the machine with IP address 10.0.8.3. III. The third line is used to prevent any devices outside of the wireguard network from accessing ports 22 and 80. I appreciate the time and effort you dedicated to this. Thank you so much. -- Soubheek Nath Fifth Estate Kolkata, India soubheekn...@gmail.com On Mon, Aug 14, 2023 at 7:35 AM lain. wrote: > > On 2023年08月13日 12:17, Stuart Henderson wrote: > > >https://www.vultr.com/docs/install-wireguard-vpn-server-on-openbsd-7-0/ > > > > what a mess of things from the base OS and unneeded third-party tools. > > > List of tools: > wireguard-tools (required), nano (vim would have been enough), and the > rest is everything OpenBSD ships with. > Oh the horror, that's far too much, the sky is falling! > > > > On Sun, Aug 13, 2023 at 7:04 AM lain. wrote: > > >> > > >> I failed to come up with reasons for using a preshared key, so I've let > > >> ChatGPT generate reasons for me: > > > > oh $deitt please do not. 
> > > What matters is not who or what answered, what matters is the answer, > and the answer it provided is good, but I guess autists gonna autist.
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
On 2023年08月13日 12:17, Stuart Henderson wrote: > >https://www.vultr.com/docs/install-wireguard-vpn-server-on-openbsd-7-0/ > > what a mess of things from the base OS and unneeded third-party tools. > List of tools: wireguard-tools (required), nano (vim would have been enough), and the rest is everything OpenBSD ships with. Oh the horror, that's far too much, the sky is falling! > > On Sun, Aug 13, 2023 at 7:04 AM lain. wrote: > >> > >> I failed to come up with reasons for using a preshared key, so I've let > >> ChatGPT generate reasons for me: > > oh $deitt please do not. > What matters is not who or what answered, what matters is the answer, and the answer it provided is good, but I guess autists gonna autist.
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
6. In that case, simply change "10.0.8.0/24" to "10.0.8.4/32". For explanation sake, .0/24 means "everything connected to this network", whereas ".4/32" means "only this specific machine", so does ".3/32", ".2/32", ".5/32", and so on. 7. If you've followed Vultr's post, you might consider changing the /etc/hostname.wg0 file to just this one liner: !/usr/local/bin/wg-quick up wg0 On 2023年08月13日 16:57, SOUBHEEK NATH wrote: > Hello Lain, > > I appreciate your feedback and the time you took to provide it. > > 1. I set up OpenBSD 7.3 on a Raspberry Pi 4B with 4GB of RAM, which is >running from a USB drive. > 2. This is not a production environment, it is solely for educational >purposes. > 3. The router is currently using its default settings and three other >devices are connected to it. > 4. The wireless router is currently using its default settings to > assign IP addresses to three other devices that are connected to it. >You are correct, with this setup and pf rule, the wireguard VPN >server is accessible from within the local area network. However, I >believe that in the future, I can use the same setup and pf rule to >remotely access the server's ports exclusively through the wireguard >VPN from outside the network. > 5. Your configuration is functioning correctly, allowing only devices >within the wireguard network to access ports 22 and 80, while >blocking access for others. > 6. However, I cannot allow only one device with the IP address 10.0.8.4. >All devices in the wireguard network are able to access ports 22 and >80. >I have attempted to use the following pf rule: > >set skip on lo > >block return# block stateless traffic >pass# establish keep-state > ># By default, do not permit remote connections to X11 >block return in on ! 
lo0 proto tcp to port 6000:6010 > ># Port build user does not need network > >pass in quick on wg0 proto tcp from 10.0.8.4 to any port {22, 80} >block in quick on egress proto tcp from any to any port {22, 80} > >block return out log proto {tcp udp} user _pbuild > >pass in on egress proto tcp from any to any port 22 > >pass out on egress inet from (wg0:network) nat-to (bwfm0:0) > >Based on my understanding of the OpenBSD PF-Packet filtering document >(https://www.openbsd.org/faq/pf/filter.html), the intention of this >pf rule is to allow only the IP address 10.0.8.4 to access ports 22 >and 80. However, currently both machines with IP addresses 10.0.8.2 >and 10.0.8.3 are able to access ports 22 and 80. > > 7. I have already falsified the private and public keys when submitting >this question. >I attempted to include 'Address = 10.0.8.1/32' in the wireguard >[Interface] block earlier as you suggested, but encountered an error. > >$ doas sh /etc/netstart wg0 >Line unrecognized: `Address=10.0.8.1/24' >Configuration parsing error > >I've gone through this link while setting up wireguard: >https://www.vultr.com/docs/install-wireguard-vpn-server-on-openbsd-7-0/ >Despite its absence, wireguard is functioning properly. > > 8. I greatly appreciate your suggestion regarding the PreShareKey in >wireguard configuration. It would be a valuable addition to my >knowledge and will benefit me in the future. > > Thanks again. > -- > Soubheek Nath > Fifth Estate > Kolkata, India > soubheekn...@gmail.com > > On Sun, Aug 13, 2023 at 7:04 AM lain. wrote: > > > > I failed to come up with reasons for using a preshared key, so I've let > > ChatGPT generate reasons for me: > > > > Certainly! WireGuard's use of a preshared key (PSK) adds an additional > > layer of symmetric encryption to the standard asymmetric encryption. Here's > > a brief explanation of the advantage: > > > > 1. **Symmetric vs. 
Asymmetric Encryption**: WireGuard primarily uses > > asymmetric encryption, where each party has a pair of keys (public and > > private). Symmetric encryption, on the other hand, utilizes the same key > > for both encryption and decryption. By adding a PSK, WireGuard incorporates > > both types of encryption. > > > > 2. **Additional Security Layer**: The PSK is mixed into the encryption > > process along with the standard public and private keys. Even if an > > attacker could somehow compromise the asymmetric part (though practically > > very difficult), they would still need the PSK to decrypt the communication. > > > > 3. **Protection Against Quantum Attacks**: Though still theore
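The /24 vs /32 distinction from point 6 earlier in this message is just mask arithmetic; a quick illustration (not from the thread):

```shell
# Number of IPv4 addresses covered by a given prefix length.
addrs_in_prefix() {
  echo $(( 1 << (32 - $1) ))
}

addrs_in_prefix 24   # the whole 10.0.8.0/24 network: 256 addresses
addrs_in_prefix 32   # a single host such as 10.0.8.4/32: 1 address
```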
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
>Based on my understanding of the OpenBSD PF-Packet filtering document >(https://www.openbsd.org/faq/pf/filter.html), the intention of this >pf rule is to allow only the IP address 10.0.8.4 to access ports 22 >and 80. However, currently both machines with IP addresses 10.0.8.2 >and 10.0.8.3 are able to access ports 22 and 80. Maybe try something like

set skip on lo
block log
match out on bwfm0 inet received-on wg0 nat-to (bwfm0)
pass out
pass in on wg0
block log in to (self)
pass proto tcp from 10.0.8.4 to port {22 80}

I recommend ignoring the pf faq and using https://man.openbsd.org/pf.conf instead. >https://www.vultr.com/docs/install-wireguard-vpn-server-on-openbsd-7-0/ what a mess of things from the base OS and unneeded third-party tools. > On Sun, Aug 13, 2023 at 7:04 AM lain. wrote: >> >> I failed to come up with reasons for using a preshared key, so I've let >> ChatGPT generate reasons for me: oh $deitt please do not.
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
Hello Lain, I appreciate your feedback and the time you took to provide it. 1. I set up OpenBSD 7.3 on a Raspberry Pi 4B with 4GB of RAM, which is running from a USB drive. 2. This is not a production environment, it is solely for educational purposes. 3. The router is currently using its default settings and three other devices are connected to it. 4. The wireless router is currently using its default settings to assign IP addresses to three other devices that are connected to it. You are correct, with this setup and pf rule, the wireguard VPN server is accessible from within the local area network. However, I believe that in the future, I can use the same setup and pf rule to remotely access the server's ports exclusively through the wireguard VPN from outside the network. 5. Your configuration is functioning correctly, allowing only devices within the wireguard network to access ports 22 and 80, while blocking access for others. 6. However, I cannot allow only one device with the IP address 10.0.8.4. All devices in the wireguard network are able to access ports 22 and 80. I have attempted to use the following pf rule: set skip on lo block return# block stateless traffic pass# establish keep-state # By default, do not permit remote connections to X11 block return in on ! lo0 proto tcp to port 6000:6010 # Port build user does not need network pass in quick on wg0 proto tcp from 10.0.8.4 to any port {22, 80} block in quick on egress proto tcp from any to any port {22, 80} block return out log proto {tcp udp} user _pbuild pass in on egress proto tcp from any to any port 22 pass out on egress inet from (wg0:network) nat-to (bwfm0:0) Based on my understanding of the OpenBSD PF-Packet filtering document (https://www.openbsd.org/faq/pf/filter.html), the intention of this pf rule is to allow only the IP address 10.0.8.4 to access ports 22 and 80. However, currently both machines with IP addresses 10.0.8.2 and 10.0.8.3 are able to access ports 22 and 80. 7. 
I have already falsified the private and public keys when submitting this question. I attempted to include 'Address = 10.0.8.1/32' in the wireguard [Interface] block earlier as you suggested, but encountered an error. $ doas sh /etc/netstart wg0 Line unrecognized: `Address=10.0.8.1/24' Configuration parsing error I've gone through this link while setting up wireguard: https://www.vultr.com/docs/install-wireguard-vpn-server-on-openbsd-7-0/ Despite its absence, wireguard is functioning properly. 8. I greatly appreciate your suggestion regarding the PreShareKey in wireguard configuration. It would be a valuable addition to my knowledge and will benefit me in the future. Thanks again. -- Soubheek Nath Fifth Estate Kolkata, India soubheekn...@gmail.com On Sun, Aug 13, 2023 at 7:04 AM lain. wrote: > > I failed to come up with reasons for using a preshared key, so I've let > ChatGPT generate reasons for me: > > Certainly! WireGuard's use of a preshared key (PSK) adds an additional layer > of symmetric encryption to the standard asymmetric encryption. Here's a brief > explanation of the advantage: > > 1. **Symmetric vs. Asymmetric Encryption**: WireGuard primarily uses > asymmetric encryption, where each party has a pair of keys (public and > private). Symmetric encryption, on the other hand, utilizes the same key for > both encryption and decryption. By adding a PSK, WireGuard incorporates both > types of encryption. > > 2. **Additional Security Layer**: The PSK is mixed into the encryption > process along with the standard public and private keys. Even if an attacker > could somehow compromise the asymmetric part (though practically very > difficult), they would still need the PSK to decrypt the communication. > > 3. **Protection Against Quantum Attacks**: Though still theoretical at this > point, quantum computers could eventually break the Diffie-Hellman key > exchange used in many encryption protocols. 
By using a PSK, WireGuard adds > protection against this potential future vulnerability. > > 4. **Simplicity**: WireGuard's design is intended to be simple and easy to > implement. The use of a PSK aligns with this philosophy by providing a > straightforward way to bolster security. > > Here's an example of how you would generate and implement a preshared key in > WireGuard: > > Generate the PSK: > ```bash > wg genpsk > ``` > > You would then add the generated key to both the client and server > configurations: > > Server's `wg0.conf`: > ```ini > [Peer] > PublicKey = CLIENT_PUBLIC_KEY > PresharedKey = GENERATED_PRESHARED_KEY > AllowedIPs = CLIENT_IP/32 > ``` > > Client's `wg0.conf`: > ```ini > [Peer] > PublicKey = SERVER_PUBLIC_KEY > Presha
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
I failed to come up with reasons for using a preshared key, so I've let ChatGPT generate reasons for me: Certainly! WireGuard's use of a preshared key (PSK) adds an additional layer of symmetric encryption to the standard asymmetric encryption. Here's a brief explanation of the advantage: 1. **Symmetric vs. Asymmetric Encryption**: WireGuard primarily uses asymmetric encryption, where each party has a pair of keys (public and private). Symmetric encryption, on the other hand, utilizes the same key for both encryption and decryption. By adding a PSK, WireGuard incorporates both types of encryption. 2. **Additional Security Layer**: The PSK is mixed into the encryption process along with the standard public and private keys. Even if an attacker could somehow compromise the asymmetric part (though practically very difficult), they would still need the PSK to decrypt the communication. 3. **Protection Against Quantum Attacks**: Though still theoretical at this point, quantum computers could eventually break the Diffie-Hellman key exchange used in many encryption protocols. By using a PSK, WireGuard adds protection against this potential future vulnerability. 4. **Simplicity**: WireGuard's design is intended to be simple and easy to implement. The use of a PSK aligns with this philosophy by providing a straightforward way to bolster security. 
Here's an example of how you would generate and implement a preshared key in WireGuard: Generate the PSK: ```bash wg genpsk ``` You would then add the generated key to both the client and server configurations: Server's `wg0.conf`: ```ini [Peer] PublicKey = CLIENT_PUBLIC_KEY PresharedKey = GENERATED_PRESHARED_KEY AllowedIPs = CLIENT_IP/32 ``` Client's `wg0.conf`: ```ini [Peer] PublicKey = SERVER_PUBLIC_KEY PresharedKey = GENERATED_PRESHARED_KEY AllowedIPs = 0.0.0.0/0 Endpoint = SERVER_IP:PORT ``` In summary, adding a PSK provides an extra layer of security that complements the existing asymmetric encryption, protects against potential quantum attacks, and adheres to WireGuard's principles of simplicity and effectiveness. On 2023年08月13日 10:22, lain. wrote: > First off, unless you faked your private and public keys, please change > them as soon as possible. > You've just made yourself volunerable to cyber attacks! > > If I understand you correctly, you want to be able to SSH and HTTP only > over WireGuard, right? > In that case, on your WireGuard server: > > # Block access to SSH and HTTP from everyone except for your WireGuard network > pass in quick on wg0 proto tcp from 10.0.8.0/24 to any port {22, 80} > block in quick on egress proto tcp from any to any port {22, 80} > > From your specifications, it's not quite clear whether your network is > accessible from the outside or not, whether you're using a dynamic IP or > static IP, how your router is configured, and all else, because > requirements change depending on these details. > If you're using a dynamic IP, and both your server and clienbts are > within the same network, there's a good chance that this setup is > unnecessary, given that using a WireGuard VPN makes sense if the server > is remote and normally accessible from the outside, and you want to make > it only accessible from the inside. 
> > As for your WireGuard config, you might want to add the Address to your > "[Interface]" block like this for example: > Address = 10.0.8.1/24 > > Not necessarily required to get it working, but would still add an extra > layer of security if you generate a preshared key on each peer, then on > both your server and peers: > [Peer] > ... > PreSharedKey = (output) > ... > > To generate the preshared key (only do this on your peers): > wg genpsk > preshared.key > > On 2023年08月12日 20:30, SOUBHEEK NATH wrote: > > Dear OpenBSD Mailing List Community, > > > > I hope this email finds you well. I am writing to seek your expertise > > and guidance regarding a Wireguard VPN configuration and pf rules on my > > OpenBSD 7.3 system. I have successfully set up a Wireguard VPN using > > the provided interface configuration, and the VPN is operational as > > intended. However, I have encountered a challenge while attempting to > > implement pf rules to restrict access to SSH login and port number 80 > > based on specific IP addresses. > > > > Below is the pf rule settings I have applied: > > > > set skip on lo > > block return# block stateless traffic > > pass# establish keep-state > > > > # By default, do not permit remote connections to X11 > > block return in on ! lo0 proto tcp to port 6000:6010 > > > > # Port build user does not need network > > block return in quick on bwfm0 proto tcp from ! 192.168.0.229 to bwfm0 > > port s
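`wg genpsk` comes from wireguard-tools; since a WireGuard preshared key is simply 32 random bytes in base64, an equivalent without the package (an illustrative sketch, not from the thread) is:

```shell
# Generate a WireGuard-compatible preshared key: 32 random bytes, base64.
genpsk() {
  openssl rand -base64 32
}
```

The resulting 44-character string goes into the PresharedKey lines (or the wgpsk option to ifconfig) on both peers.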
Re: Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
First off, unless you faked your private and public keys, please change them as soon as possible. You've just made yourself vulnerable to cyber attacks! If I understand you correctly, you want to be able to SSH and HTTP only over WireGuard, right? In that case, on your WireGuard server: # Block access to SSH and HTTP from everyone except for your WireGuard network pass in quick on wg0 proto tcp from 10.0.8.0/24 to any port {22, 80} block in quick on egress proto tcp from any to any port {22, 80} From your specifications, it's not quite clear whether your network is accessible from the outside or not, whether you're using a dynamic IP or static IP, how your router is configured, and all else, because requirements change depending on these details. If you're using a dynamic IP, and both your server and clients are within the same network, there's a good chance that this setup is unnecessary, given that using a WireGuard VPN makes sense if the server is remote and normally accessible from the outside, and you want to make it only accessible from the inside. As for your WireGuard config, you might want to add the Address to your "[Interface]" block like this for example: Address = 10.0.8.1/24 Not necessarily required to get it working, but would still add an extra layer of security if you generate a preshared key on each peer, then on both your server and peers: [Peer] ... PreSharedKey = (output) ... To generate the preshared key (only do this on your peers): wg genpsk > preshared.key On 2023年08月12日 20:30, SOUBHEEK NATH wrote: > Dear OpenBSD Mailing List Community, > > I hope this email finds you well. I am writing to seek your expertise > and guidance regarding a Wireguard VPN configuration and pf rules on my > OpenBSD 7.3 system. I have successfully set up a Wireguard VPN using > the provided interface configuration, and the VPN is operational as > intended. 
However, I have encountered a challenge while attempting to > implement pf rules to restrict access to SSH login and port number 80 > based on specific IP addresses. > > Below is the pf rule settings I have applied: > > set skip on lo > block return# block stateless traffic > pass# establish keep-state > > # By default, do not permit remote connections to X11 > block return in on ! lo0 proto tcp to port 6000:6010 > > # Port build user does not need network > block return in quick on bwfm0 proto tcp from ! 192.168.0.229 to bwfm0 > port ssh > block return in quick on wg0 proto udp from ! 10.0.8.2 to wg0 port 80 > block return in quick on bwfm0 proto tcp from ! 192.168.0.229 to bwfm0 > port 80 > block return out log proto {tcp udp} user _pbuild > > pass in on egress proto tcp from any to any port 22 > > pass out on egress inet from (wg0:network) nat-to (bwfm0:0) > > The objective of these rules is to restrict SSH login and access to port > 80 exclusively for the machine with the IP address 192.168.0.229 when > the OpenBSD system is connected to the bwfm0 network interface. While > the rule for SSH login and IP address 192.168.0.229 is functioning as > expected, I have encountered an issue with the rule pertaining to port > 80 and IP address 10.0.8.2, which is allocated by Wireguard (wg0) > during active Wireguard connections. > > The problem arises when attempting to enforce the restriction on port 80 > with IP address 10.0.8.2. Despite the pf rule in place, it seems that > Wireguard is overriding the restriction. For instance, devices with > assigned IP addresses such as 10.0.8.3 or 10.0.8.4, which are within > the Wireguard network, can access both SSH login and port 80, contrary > to the intended restriction. 
> > I am providing the Wireguard configuration below for your reference: > > [Interface] > ListenPort = 51820 > PrivateKey = oPernzzF+Kl499z2TMU6wDdrDpnDN6/e630Q= > > [Peer] > PublicKey = yyhY5Blx+PxCHu/wK7QgrXHQ34RmTi//zynVA= > AllowedIPs = 10.0.8.2/32 > PersistentKeepalive = 25 > > [Peer] > PublicKey = dQO6ACctkgepDtWxGrHuGFdvaO9qfrL4mmjA= > AllowedIPs = 10.0.8.3/32 > PersistentKeepalive = 25 > > I would greatly appreciate your insights, suggestions, and expertise in > resolving this issue. Your assistance will be invaluable in helping me > achieve the desired access restrictions while maintaining the > functionality of the Wireguard VPN. > > Thank you for your time and consideration. > -- > Soubheek Nath > Fifth Estate > Kolkata, India > soubheekn...@gmail.com > -- lain. Did you know that? 90% of all emails sent on a daily basis are being sent in plain text, and it's super easy to intercept emails as they flow over the internet? Never send passwords, tokens, personal information, or other vulnerable information without proper PGP encryption!
Assistance Needed with Wireguard VPN Configuration and pf Rules on OpenBSD 7.3
Dear OpenBSD Mailing List Community, I hope this email finds you well. I am writing to seek your expertise and guidance regarding a Wireguard VPN configuration and pf rules on my OpenBSD 7.3 system. I have successfully set up a Wireguard VPN using the provided interface configuration, and the VPN is operational as intended. However, I have encountered a challenge while attempting to implement pf rules to restrict access to SSH login and port number 80 based on specific IP addresses. Below is the pf rule settings I have applied: set skip on lo block return# block stateless traffic pass# establish keep-state # By default, do not permit remote connections to X11 block return in on ! lo0 proto tcp to port 6000:6010 # Port build user does not need network block return in quick on bwfm0 proto tcp from ! 192.168.0.229 to bwfm0 port ssh block return in quick on wg0 proto udp from ! 10.0.8.2 to wg0 port 80 block return in quick on bwfm0 proto tcp from ! 192.168.0.229 to bwfm0 port 80 block return out log proto {tcp udp} user _pbuild pass in on egress proto tcp from any to any port 22 pass out on egress inet from (wg0:network) nat-to (bwfm0:0) The objective of these rules is to restrict SSH login and access to port 80 exclusively for the machine with the IP address 192.168.0.229 when the OpenBSD system is connected to the bwfm0 network interface. While the rule for SSH login and IP address 192.168.0.229 is functioning as expected, I have encountered an issue with the rule pertaining to port 80 and IP address 10.0.8.2, which is allocated by Wireguard (wg0) during active Wireguard connections. The problem arises when attempting to enforce the restriction on port 80 with IP address 10.0.8.2. Despite the pf rule in place, it seems that Wireguard is overriding the restriction. For instance, devices with assigned IP addresses such as 10.0.8.3 or 10.0.8.4, which are within the Wireguard network, can access both SSH login and port 80, contrary to the intended restriction. 
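One detail worth flagging in the ruleset above, as a hedged observation: the wg0 rule restricting port 80 filters "proto udp", but HTTP runs over TCP, so that rule can never match the traffic it is meant to restrict; and nothing on wg0 restricts SSH at all, so the catch-all "pass" lets other peers through. A corrected sketch:

```
# sketch only, assuming the goal is to allow only peer 10.0.8.2;
# SSH and HTTP are TCP services, so filter proto tcp rather than udp
block return in quick on wg0 proto tcp from ! 10.0.8.2 to wg0 port { 22, 80 }
```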
I am providing the Wireguard configuration below for your reference: [Interface] ListenPort = 51820 PrivateKey = oPernzzF+Kl499z2TMU6wDdrDpnDN6/e630Q= [Peer] PublicKey = yyhY5Blx+PxCHu/wK7QgrXHQ34RmTi//zynVA= AllowedIPs = 10.0.8.2/32 PersistentKeepalive = 25 [Peer] PublicKey = dQO6ACctkgepDtWxGrHuGFdvaO9qfrL4mmjA= AllowedIPs = 10.0.8.3/32 PersistentKeepalive = 25 I would greatly appreciate your insights, suggestions, and expertise in resolving this issue. Your assistance will be invaluable in helping me achieve the desired access restrictions while maintaining the functionality of the Wireguard VPN. Thank you for your time and consideration. -- Soubheek Nath Fifth Estate Kolkata, India soubheekn...@gmail.com
Re: PF rate limiting options valid for UDP?
On Thu, Jul 20, 2023 at 05:52:07PM +, mabi wrote: > --- Original Message --- > On Wednesday, July 19th, 2023 at 10:58 PM, Stuart Henderson > wrote: > > > For rules that pass traffic to your authoritative DNS servers, > > I don't think you need much longer than the time taken to answer a > > query. So could be quite a bit less. > > Right good point, I will add custom state timeouts for this specific UDP pass > rule on port 53. > > > Usually carp/ospf will enter the state table before the machines start > > seeing large amounts of packets and stay there, which is what you would > > normally want. If the state table is full, you have more problem > > opening new connections that require state to be added than you do > > maintaining existing ones. > > > > fwiw I typically use this on ospf+carp machines, "pass quick proto > > {carp, ospf} keep state (no-sync) set prio 7" > > That's very interesting, I never realized there was a simple priority system > ready to use in PF without the need of setting up any queues. Probably the > "set prio 7" option on OSPF+CARP pass rules will juts do the trick and I will > definitely also implement this. > > > DNS server software is written with this type of traffic in mind, and > > has more information available (from inside the DNS request packet) > > to make a decision about what to do with it, than is available in a > > general-purpose packet filter like PF. > > > > Also it stores the tracking information in data structures that have > > been chosen to make sense for this use (and common DNS servers default > > to masking on common subnet sizes, reducing the amount they have to > > store compared to tracking the full IP address). 
> > > > http://man.openbsd.org/nsd.conf#rrl > > https://bind9.readthedocs.io/en/latest/reference.html#response-rate-limiting > > https://www.knot-dns.cz/docs/2.4/html/reference.html#module-rrl > > Too bad I use PowerDNS, it does not seem to offer many parameters related to > rate-limiting for UDP, but for TCP I found at least max-tcp-connections. Maybe > it's time for a change as Gabor mentions his tests in his reply (thanks > btw!)... > In a typical PowerDNS setup the task of rate limiting is done by dnsdist. -Otto
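For the archives: dnsdist is configured in Lua, and per-client-IP rate limiting is a one-liner there. A minimal sketch (the backend address and the 50 qps threshold are illustrative assumptions, not values from the thread):

```lua
-- dnsdist.conf sketch: drop clients that exceed roughly 50 queries/second
addAction(MaxQPSIPRule(50), DropAction())
-- forward the remaining queries to the real PowerDNS backend (address assumed)
newServer({address = "127.0.0.1:5300"})
```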
Re: PF rate limiting options valid for UDP?
--- Original Message --- On Wednesday, July 19th, 2023 at 10:58 PM, Stuart Henderson wrote: > For rules that pass traffic to your authoritative DNS servers, > I don't think you need much longer than the time taken to answer a > query. So could be quite a bit less. Right, good point, I will add custom state timeouts for this specific UDP pass rule on port 53. > Usually carp/ospf will enter the state table before the machines start > seeing large amounts of packets and stay there, which is what you would > normally want. If the state table is full, you have more problem > opening new connections that require state to be added than you do > maintaining existing ones. > > fwiw I typically use this on ospf+carp machines, "pass quick proto > {carp, ospf} keep state (no-sync) set prio 7" That's very interesting, I never realized there was a simple priority system ready to use in PF without the need of setting up any queues. Probably the "set prio 7" option on OSPF+CARP pass rules will just do the trick and I will definitely also implement this. > DNS server software is written with this type of traffic in mind, and > has more information available (from inside the DNS request packet) > to make a decision about what to do with it, than is available in a > general-purpose packet filter like PF. > > Also it stores the tracking information in data structures that have > been chosen to make sense for this use (and common DNS servers default > to masking on common subnet sizes, reducing the amount they have to > store compared to tracking the full IP address). > > http://man.openbsd.org/nsd.conf#rrl > https://bind9.readthedocs.io/en/latest/reference.html#response-rate-limiting > https://www.knot-dns.cz/docs/2.4/html/reference.html#module-rrl Too bad I use PowerDNS, it does not seem to offer many parameters related to rate-limiting for UDP, but for TCP I found at least max-tcp-connections. Maybe it's time for a change as Gabor mentions his tests in his reply (thanks btw!)...
Re: PF rate limiting options valid for UDP?
Hi, Are you already using your DNS server's response rate limiting features? Not yet, as I still believe I should stop as much as possible such traffic at the firewall before it even reaches the network behind my firewall. So at the software/daemon/service level it would be my last line of defense. If your hardware is powerful enough (e.g. at least 10Gbps Ethernet and the authoritative DNS server has, say, 32 CPU cores) you could also try fending off the DoS attack simply by using NSD or Knot DNS instead of BIND. According to my measurements, they both outperformed BIND by a factor of 10. If you are interested, you can find all the details in my open access paper: G. Lencse, "Benchmarking Authoritative DNS Servers", /IEEE Access/, vol. 8, pp. 130224-130238, July 2020. https://doi.org/10.1109/ACCESS.2020.3009141 Best regards, Gábor
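As a pointer for anyone comparing servers: NSD's response rate limiting is controlled from the server: section of nsd.conf. A sketch with illustrative values (an assumption on my part; see nsd.conf(5) for the actual defaults):

```
server:
    # responses per second allowed per source prefix (illustrative value)
    rrl-ratelimit: 200
    # answer every 2nd dropped query with a truncated reply, forcing TCP retry
    rrl-slip: 2
    # track IPv4 sources per /24, the kind of prefix masking discussed above
    rrl-ipv4-prefix-length: 24
```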
Re: PF rate limiting options valid for UDP?
On 2023/07/19 19:54, mabi wrote: > --- Original Message --- > On Wednesday, July 19th, 2023 at 9:32 PM, Stuart Henderson > wrote: > > > If PF is struggling as it is, there's a good chance it will buckle > > completely if it has to do source tracking too > > That is also something I thought might be the case :| > > > Did you already tweak timeouts for the rule passing UDP DNS traffic? > > Defaults are 60s/30s/60s for udp.first, udp.single and udp.multiple > > respectively, that is much too high for a very busy DNS server - > > you can set them on the specific rule itself rather than changing > > defaults for all rules. For an auth server which is expected to > > respond quickly they can be cranked way down. > > Yes, this at least I did since quite some time now and use the following > timeout settings: > > set timeout udp.first 20 > set timeout udp.multiple 20 > set timeout udp.single 10 > > Do you think I could go even lower? When I check the PF state entries during > such a DDoS I see mostly states with the "SINGLE" state. For rules that pass traffic to your authoritative DNS servers, I don't think you need much longer than the time taken to answer a query. So could be quite a bit less. > > (If that is still too many states, I wonder if your network might > > actually be happier if you "pass quick proto udp to $server port 53 no > > state" and "pass quick proto udp from $server port 53 no state" right at > > the top of the ruleset). > > That's actually an excellent idea to bypass PF states and hence consume less > resources... Next thing to try out. I was also thinking I should use "no > state" with CARP and OSPF rules in pf.conf so that in case the PF state table > entries is full it does not prevent such important protocols to function. > What do you think, would that also work? Usually carp/ospf will enter the state table before the machines start seeing large amounts of packets and stay there, which is what you would normally want. 
If the state table is full, you have more problem opening new connections that require state to be added than you do maintaining existing ones. fwiw I typically use this on ospf+carp machines, "pass quick proto {carp, ospf} keep state (no-sync) set prio 7" > > Are you already using your DNS server's response rate limiting features? > > Not yet, as I still believe I should stop as much as possible such traffic at > the firewall before it even reaches the network behind my firewall. So at the > software/daemon/service level it would be my last line of defense. DNS server software is written with this type of traffic in mind, and has more information available (from inside the DNS request packet) to make a decision about what to do with it, than is available in a general-purpose packet filter like PF. Also it stores the tracking information in data structures that have been chosen to make sense for this use (and common DNS servers default to masking on common subnet sizes, reducing the amount they have to store compared to tracking the full IP address). http://man.openbsd.org/nsd.conf#rrl https://bind9.readthedocs.io/en/latest/reference.html#response-rate-limiting https://www.knot-dns.cz/docs/2.4/html/reference.html#module-rrl
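Combining the two suggestions above, the carp/ospf rule and the stateless DNS bypass might be written as follows (the $server macro and the exact rule placement are assumptions, not a tested ruleset):

```
# keep routing protocols alive even when the state table is under pressure
pass quick proto { carp, ospf } keep state (no-sync) set prio 7

# stateless fast path for the authoritative DNS servers, placed near the
# top of the ruleset so it is evaluated before the stateful rules
pass quick proto udp to $server port 53 no state
pass quick proto udp from $server port 53 no state
```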
Re: PF rate limiting options valid for UDP?
--- Original Message --- On Wednesday, July 19th, 2023 at 9:32 PM, Stuart Henderson wrote: > If PF is struggling as it is, there's a good chance it will buckle > completely if it has to do source tracking too That is also something I thought might be the case :| > Did you already tweak timeouts for the rule passing UDP DNS traffic? > Defaults are 60s/30s/60s for udp.first, udp.single and udp.multiple > respectively, that is much too high for a very busy DNS server - > you can set them on the specific rule itself rather than changing > defaults for all rules. For an auth server which is expected to > respond quickly they can be cranked way down. Yes, this at least I did since quite some time now and use the following timeout settings: set timeout udp.first 20 set timeout udp.multiple 20 set timeout udp.single 10 Do you think I could go even lower? When I check the PF state entries during such a DDoS I see mostly states with the "SINGLE" state. > (If that is still too many states, I wonder if your network might > actually be happier if you "pass quick proto udp to $server port 53 no > state" and "pass quick proto udp from $server port 53 no state" right at > the top of the ruleset). That's actually an excellent idea to bypass PF states and hence consume less resources... Next thing to try out. I was also thinking I should use "no state" with CARP and OSPF rules in pf.conf so that in case the PF state table entries is full it does not prevent such important protocols to function. What do you think, would that also work? > Are you already using your DNS server's response rate limiting features? Not yet, as I still believe I should stop as much as possible such traffic at the firewall before it even reaches the network behind my firewall. So at the software/daemon/service level it would be my last line of defense.
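Scoping those timeouts to the DNS rule alone, as suggested, would look roughly like this sketch (values are illustrative; the global "set timeout" defaults stay untouched for all other traffic):

```
# per-rule UDP timeouts for the authoritative DNS pass rule only;
# $dns_servers is an assumed macro for the servers behind the firewall
pass in quick proto udp to $dns_servers port 53 \
    keep state (udp.first 6, udp.single 3, udp.multiple 6)
```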
Re: PF rate limiting options valid for UDP?
On 2023/07/19 19:13, mabi wrote: > --- Original Message --- > On Wednesday, July 19th, 2023 at 12:40 PM, Stuart Henderson > wrote: > > > I don't think you understood what I wrote then - they are the > > opposite of helpful here. > > No, I do understand what you wrote but I should have explained my case > in more details. Behind my OpenBSD firewall I have two authoritative DNS > servers and because of recent DDoS originating from >12k IPs against UDP > port 53 on these two servers the whole network behind the firewall gets > unresponsive or has a high packet loss because there is over 2 million > states in the PF states table during the attack. So in my specific case > I don't care that cloudflare or other external DNS servers can not query > my DNS authoritative servers for a few seconds or minutes but I do care > a lot that my whole rest of my network and servers behind the OpenBSD > firewall stays responsive. It's a trade-off I can totally accept and > welcome. Furthermore when I have so many state entries due to a DDoS on > UDP port 53, CARP breaks as well as the OSPF sessions with my border > routers because it can not communicate properly within the defined > timeouts. If PF is struggling as it is, there's a good chance it will buckle completely if it has to do source tracking too Did you already tweak timeouts for the rule passing UDP DNS traffic? Defaults are 60s/30s/60s for udp.first, udp.single and udp.multiple respectively, that is much too high for a very busy DNS server - you can set them on the specific rule itself rather than changing defaults for all rules. For an auth server which is expected to respond quickly they can be cranked way down. (If that is still too many states, I wonder if your network might actually be happier if you "pass quick proto udp to $server port 53 no state" and "pass quick proto udp from $server port 53 no state" right at the top of the ruleset). Are you already using your DNS server's response rate limiting features?
Re: PF rate limiting options valid for UDP?
--- Original Message --- On Wednesday, July 19th, 2023 at 12:40 PM, Stuart Henderson wrote: > I don't think you understood what I wrote then - they are the > opposite of helpful here. No, I do understand what you wrote, but I should have explained my case in more detail. Behind my OpenBSD firewall I have two authoritative DNS servers, and because of recent DDoS attacks originating from >12k IPs against UDP port 53 on these two servers, the whole network behind the firewall gets unresponsive or has high packet loss because there are over 2 million states in the PF state table during the attack. So in my specific case I don't care that cloudflare or other external DNS servers cannot query my authoritative DNS servers for a few seconds or minutes, but I do care a lot that the whole rest of my network and the servers behind the OpenBSD firewall stay responsive. It's a trade-off I can totally accept and welcome. Furthermore, when I have so many state entries due to a DDoS on UDP port 53, CARP breaks, as well as the OSPF sessions with my border routers, because they cannot communicate properly within the defined timeouts.
Re: PF rate limiting options valid for UDP?
On 19/07/2023 13:31, Stuart Henderson wrote: > On 2023-07-19, Kapetanakis Giannis wrote: >> Maybe even better, can it run under relayd (redirect) on top of carp? > That's just rdr-to behind the scenes, no problem with that, though if > you want to do per IP rate limiting alongside load-balancing you might > want "mode source-hash" rather than the default round-robin or one of > the random options. > > (I wouldn't recommend sticky-address, because then you get into more > complex paths inside PF because it has to maintain source-tracking > information). I don't think source tracking is that important in this scenario. relayd will only have one host, which will be the dnsdist listening on localhost (on each load balancer). dnsdist will have whatever it can support with stickiness/source-tracking. pf rdr-to could also be an option, but then you lose the carp demotion which relayd provides. thanks G
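A relayd redirect along those lines might look like the sketch below (addresses, ports, and the source-hash choice are assumptions from the thread; relayd demotes the carp group when its host checks fail, which is the advantage over plain rdr-to mentioned above):

```
# relayd.conf sketch: carp-aware redirect of DNS to a local dnsdist
table <dnsdist> { 127.0.0.1 }
redirect "dns" {
    listen on $carp_ip port 53
    forward to <dnsdist> port 5353 mode source-hash check icmp
}
```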
Re: PF rate limiting options valid for UDP?
On 2023-07-19, mabi wrote: > --- Original Message --- > On Tuesday, July 18th, 2023 at 10:59 PM, Stuart Henderson > wrote: > > >> PF's state-tracking options are only for TCP. (Blocking an IP >> based on number of connections from easily spoofed UDP is a good >> way to let third parties prevent your machine from communicating >> with IPs that may well get in the way i.e. trigger a "self DoS"). > > What a pity, these kind of rate limiting options for UDP would have been > quite useful. I don't think you understood what I wrote then - they are the opposite of helpful here. Say you are running a DNS recursive resolver with such protection; if someone were to send you spoofed high rate packets from the IPs of the root servers, some big gTLD/ccTLD servers, or big DNS hosters (cloudflare or someone), your lookups will be quite broken. Likewise for an authoritative server: send packets with source IPs of some large DNS recursive resolvers and you then won't be sending replies to legitimate requests from those resolvers. The difference with TCP is that someone sending packets needs to be able to see the response to those packets in order to carry out the handshake. That's not needed for UDP where a single packet in one direction is all that's needed.