Re: pf nat64 rule not matching

2024-03-15 Thread Evan Sherwood
> I don't think there is at present. There are no "only use v4" or "only
> use v6" addresses modifiers, and pf isn't figuring out for itself that
> it only makes sense to use addresses from the relevant family for
> af-to translation addresses (although it _does_ do this for nat-to).

Good to know.

I was able to get this working by using ($wan) instead of ($wan:0),
fwiw.
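
For anyone finding this later, the full rule in the form that ended up
working for me looks roughly like this (a sketch; igc3 as the LAN-side
interface comes from earlier in the thread):

# nat64: translate v6 traffic for the well-known prefix, sourcing
# from whatever v4 address $wan currently carries
pass in on igc3 inet6 from any to 64:ff9b::/96 af-to inet from ($wan)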

> Ah I meant that the router should not use the local unbound dns64
> resolver for its own traffic - otherwise it won't be able to reach v4
> hosts because there won't be anything to handle the translation.
> Either point it off-machine (ISP or public resolver) or run another
> local resolver for its own traffic.

Ah, that makes sense. I was totally doing this. *facepalm*

I've changed it to use Quad9. Thanks for the follow-up!

> Please keep replies on the mailing list.

My bad! Still getting used to the `mail` client and how this mailing
list operates in general, and I see now the default behavior is to do a
reply-all that includes your personal email in addition to the mailing
list. Apologies!



Re: pf nat64 rule not matching

2024-03-15 Thread Stuart Henderson
On 2024-03-15, Evan Sherwood  wrote:
>
> Is there a way to configure this without hard-coding my IPv4 address?
> I do not think my IPv4 address from my ISP is static, thus my original
> interest in the ($wan:0) form.

I don't think there is at present. There are no "only use v4" or "only
use v6" addresses modifiers, and pf isn't figuring out for itself that
it only makes sense to use addresses from the relevant family for af-to
translation addresses (although it _does_ do this for nat-to).
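
(For contrast, the usual nat-to form where pf does pick the right family
by itself - a sketch based on the stock example config:)

# egress may carry both v4 and v6 addresses, but with "inet" pf
# selects a v4 translation address on its own:
match out on egress inet from !(egress:network) to any nat-to (egress:0)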

>> Regarding the other rules and tests, the ::1 rule is wrong, packets
>> outgoing on the network won't have a ::1 address, try "!received-on
>> any", and packets sourced from the router itself won't hit the af-to
>> rule so tests need to be from another machine (and probably best use
>> different DNS servers not doing dns64 on the router).
>
> Thanks for this follow-up. You're right that I was trying to only target
> traffic that originated from the router itself with this rule. I had
> figured out that the tests needed to be from another machine, though
> that did take me a while.
>
> What are the reasons for doing dns64 on a different machine?

Ah I meant that the router should not use the local unbound dns64
resolver for its own traffic - otherwise it won't be able to reach v4
hosts because there won't be anything to handle the translation.
Either point it off-machine (ISP or public resolver) or run another
local resolver for its own traffic.
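
e.g. something like this on the router itself (a sketch; Quad9 picked as
an arbitrary public resolver):

# /etc/resolv.conf on the router: its own lookups bypass the local
# dns64 unbound, which keeps serving the LAN clients
nameserver 9.9.9.9
lookup file bind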

-- 
Please keep replies on the mailing list.



Re: pf nat64 rule not matching

2024-03-15 Thread Evan Sherwood
> Try changing ($wan:0) to ($wan) and see what happens.

Huh, that worked! Thanks!



Re: pf nat64 rule not matching

2024-03-15 Thread Lyndon Nerenberg (VE7TFX/VE6BBM)
Try changing ($wan:0) to ($wan) and see what happens.



Re: pf nat64 rule not matching

2024-03-15 Thread Evan Sherwood
> Can you try if the same happens with a more specific rule (for
> testing)?
>
> i.e.:
>
> pass in on igc3 inet6 from "put actual v6 prefix here" to 64:ff9b::/96
> af-to inet from "actual IP on igc0"/32

This worked! Specifically, I think the ($wan:0) was the problem. I
could've sworn I tried this with the actual IP and it wasn't working
before, but I might've deleted the inet6 at that point, so maybe I
created a new problem then... which you also pointed out:

> I am suspecting that the missing inet6 may lead to some confusion.

Is there a way to configure this without hard-coding my IPv4 address?
I do not think my IPv4 address from my ISP is static, thus my original
interest in the ($wan:0) form.

> Alternatively, remove the block rules; URPF may be an issue here, if
> you lack a route for the /96.

I had tried commenting out all of the block rules and saw no change.
Tcpdump also showed no blocks, fwiw.

> Regarding the other rules and tests, the ::1 rule is wrong, packets
> outgoing on the network won't have a ::1 address, try "!received-on
> any", and packets sourced from the router itself won't hit the af-to
> rule so tests need to be from another machine (and probably best use
> different DNS servers not doing dns64 on the router).

Thanks for this follow-up. You're right that I was trying to only target
traffic that originated from the router itself with this rule. I had
figured out that the tests needed to be from another machine, though
that did take me a while.

What are the reasons for doing dns64 on a different machine?



Re: pf nat64 rule not matching

2024-03-15 Thread Stuart Henderson via misc
On 2024-03-15, Tobias Fiebig via misc  wrote:
>
> Moin,
>>     # perform nat64 (NOT WORKING)
>>     pass in to 64:ff9b::/96 af-to inet from ($wan:0)
>
> Can you try if the same happens with a more specific rule (for
> testing)?
>
> i.e.:
>
> pass in on igc3 inet6 from "put actual v6 prefix here" to 64:ff9b::/96
> af-to inet from "actual IP on igc0"/32

"actual IP on igc0" is a good idea. If I try a similar rule without ()
using an interface with v4+v6 addresses, pfctl rejects it due to af
mismatch.
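
i.e. a sketch of the two forms, with made-up macro names:

# rejected at ruleset load if $wan_if has both v4 and v6 addresses
# (pfctl complains about the address family mismatch):
pass in inet6 to 64:ff9b::/96 af-to inet from $wan_if
# the parenthesised form defers expansion to runtime and loads:
pass in inet6 to 64:ff9b::/96 af-to inet from ($wan_if)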

> I am suspecting that the missing inet6 may lead to some confusion.
> Alternatively, remove the block rules; URPF may be an issue here, if
> you lack a route for the /96.

"match log(matches)" and "tcpdump -neipflog0" is your friend for
figuring out which rules are used. I suspect the urpf too.
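
i.e. something like this while debugging (a sketch):

# temporary debug rule at the top of the ruleset:
match log (matches)
# then watch which rules each packet hits:
#   tcpdump -n -e -i pflog0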

Regarding the other rules and tests, the ::1 rule is wrong, packets
outgoing on the network won't have a ::1 address, try "!received-on
any", and packets sourced from the router itself won't hit the af-to
rule so tests need to be from another machine (and probably best use
different DNS servers not doing dns64 on the router).




Re: pf nat64 rule not matching

2024-03-15 Thread Tobias Fiebig via misc


Moin,
>     # perform nat64 (NOT WORKING)
>     pass in to 64:ff9b::/96 af-to inet from ($wan:0)

Can you try if the same happens with a more specific rule (for
testing)?

i.e.:

pass in on igc3 inet6 from "put actual v6 prefix here" to 64:ff9b::/96
af-to inet from "actual IP on igc0"/32

I am suspecting that the missing inet6 may lead to some confusion.
Alternatively, remove the block rules; URPF may be an issue here, if
you lack a route for the /96.
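
A quick way to check whether such a route exists (diagnostic only):

# any route covering the nat64 prefix?
route -n get -inet6 64:ff9b::1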

A minimal config (== based on the default pf.conf) working for me:

```
#   $OpenBSD: pf.conf,v 1.55 2017/12/03 20:40:04 sthen Exp $
#
# See pf.conf(5) and /etc/examples/pf.conf

set skip on lo

block return	# block stateless traffic
pass		# establish keep-state

# By default, do not permit remote connections to X11
block return in on ! lo0 proto tcp to port 6000:6010

# Port build user does not need network
block return out log proto {tcp udp} user _pbuild

pass in on vio0 inet6 from 2a06:d1c0:deac:1:d5:64:a115:1 to 2a06:d1c7:a:4764::/96 af-to inet from 193.104.168.184/29 random
```

With best regards,
Tobias



Re: pf queues

2023-12-01 Thread 4
> On Thu, Nov 30, 2023 at 03:55:49PM +0300, 4 wrote:
>> 
>> "cbq can entirely be expressed in it" ok. so how do i set priorities for 
>> queues in hfsc for my local(not for a router above that knows nothing about 
>> my existence. tos is an absolutely unviable concept in the real world) 
>> pf-router? i don't see a word about it in man pf.conf
>> 

> In my reply to the initial message in this thread, I gave you the references
> that spell this out fairly clearly.

> And you're dead wrong about the pf.conf man page. Unless of course you
> are trying to look this up on a system that still runs something that
> is by now roughly a decade out of date.

i don't understand what you're pointing at, because "prio" and "hfsc" are 
different, independent mechanisms, not two parts of one whole. in cbq these 
were two parts of the same mechanism: cbq could simultaneously slice and 
prioritize traffic



Re: pf queues

2023-12-01 Thread Stuart Henderson
On 2023/12/01 15:57, 4 wrote:
> >But CBQ doesn't help anyway, you still have this same problem.
> the problem where both those below and those above can tell you "go and 
> fuck yourself" can't be solved, but cbq gives us the two mechanisms we 
> need- priorities and traffic restriction. nothing more can be done, but 
> anything less will not suit us

If you still don't see why priorities in CBQ can't help, there's no
point in me replying any more.



Re: pf queues

2023-12-01 Thread 4
> On 2023-12-01, 4  wrote:

>I don't know why you are going on about SMT here.
i'm talking about not sacrificing functionality for the sake of hypothetical 
performance. the slides say that using queues degrades performance by 10%. and 
you're saying there won't be anything in the queues until an overload event 
occurs. as i understand it, these are interrelated things ;)

>And there is no way to tell when the upstream router has forwarded the packets.
and we don't need to know that. the only way to find out when an overload 
"occurred" is to set some threshold value lower than the theoretical bandwidth 
of the interface and watch for the moment the actual speed on the interface 
exceeds this threshold. and then we will put packets in queues, but not 
earlier (so that our slaves don't get too tired, right?). but this has nothing 
to do with when overload actually happens, as opposed to when we imagine it 
does. in most cases there is no link between what we have assumed and what is 
actually happening (because there is no feedback. yes, there is ecn, but it 
doesn't work). 
i don't like this algorithm because it's a non-working algorithm. but an 
algorithm with priorities, where we ALWAYS (and not only when an imaginary 
overload has occurred) put packets in the queues, where we ALWAYS send packets 
with a higher priority first, and all the others only when there are no 
packets with a higher priority in the queue - that algorithm works. i.e. we 
always use queues, despite the loss of 10% performance. what happens on the 
overloaded upstream router is not our problem. our area of responsibility is 
to put the packets that are more important to us into our network card first. 
but this requires a constantly working (and not only when an imaginary 
overload has occurred) priority mechanism. that's why i say that "prio" is 
much more useful than "hfsc". 
but it is also possible that traffic as important to us as ssh can take our 
entire channel, and we don't want that. and that's exactly where we need to 
limit the maximum queue speed. there may also be a situation where at least 
some number of packets should be guaranteed to go through some queue, for icmp 
as an example, and here we need hfsc, since priorities alone cannot solve this 
problem. or we need cbq, which could do it all at once. and it is i who exist 
for all this to work well; it is i who must plan all this competently and 
prudently - this is my area of responsibility. and look, i need priorities and 
speed limits for this, but i don't need to know how the upstream router is 
doing. if it has problems, it will send me confirmations of receipt less often 
or it will simply discard my packets. but that's its business, not mine. and 
in the same way my router will deal with the clients on my local network.

>BTW, HFSC with bandwidth and max set to the same values should be the same 
>thing as CBQ.
except that the hfsc does not have a priority mechanism.

ps:
>But CBQ doesn't help anyway, you still have this same problem.
the problem where both those below and those above can tell you "go and fuck 
yourself" can't be solved, but cbq gives us the two mechanisms we need- 
priorities and traffic restriction. nothing more can be done, but anything 
less will not suit us



Re: pf queues

2023-12-01 Thread Marko Cupać
On Fri, 1 Dec 2023 04:56:40 +0300
4  wrote:

> match proto icmp set prio(6 7) queue(6-fly 7-ack)
> how is this supposed to work at all? i.e. packets are placed both in
> prio's queues 6/7(in theory priorities and queues are the same
> thing), and in hfsc's queues 6-fly/7-ack at once?

I am not sure I understand what you don't understand here.

Straight from manpage:
https://man.openbsd.org/pf.conf#set~2

If two priorities are given, TCP ACKs with no data payload and packets
which have a TOS of lowdelay will be assigned to the second one.

https://man.openbsd.org/pf.conf#set~3

If two queues are given, packets which have a TOS of lowdelay and TCP
ACKs with no data payload will be assigned to the second one.

ICMP is not the best example, but syntax works. I guess the rule you
quoted results in behaviour where all the ICMP packets get priority of
6 and get assigned to queue 6-fly, even though the idea was to have
requests with priority of 6 assigned to queue 6-fly, and replies with
priority of 7 to queue 7-ack. But then again perhaps it works the
latter way, if icmp replies have TOS of lowdelay.

If this was TCP, payload would get priority of 6 and assigned to queue
6-fly, while ACKs would get priority of 7 and assigned to queue 7-ack.

Anyway, after years of usage, and a lot of frustration in the beginning, I
find current approach more flexible, because in HFSC queue and priority
have to be the same, while in current pf we can set it to be exactly
like HFSC, but also to have different priorities within the same queue,
or different queue for same priority. At this point I only miss the
ability to see prio values somewhere in monitoring tools like systat.
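
For example (sketch rules, not from my ruleset):

# same queue, different priorities: both land in q_web, but the
# higher-prio packets are dequeued first when the link is busy
match proto tcp to port 443 set prio 5 queue q_web
match proto tcp to port 80  set prio 3 queue q_web
# same priority, different queues:
match proto tcp to port ssh set prio 6 queue q_int
match proto icmp            set prio 6 queue q_icmp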

The only way to get the answers is to test, write the ruleset wisely, and
observe systat. If someone knows of some others please let me know, I
am by no means "an expert on pf queueing", just a guy who tries to tame
his employer's network for quite some time now.

Regards,

-- 
Before enlightenment - chop wood, draw water.
After  enlightenment - chop wood, draw water.

Marko Cupać
https://www.mimar.rs/



Re: pf queues

2023-12-01 Thread Stuart Henderson
On 2023-12-01, 4  wrote:
>> On 2023-11-30, 4  wrote:
>>> we can simply calculate such a basic thing as the flow rate by dividing the 
>>> number of bytes in the past packets by the time. we can control the speed 
>>> through delays in sending packets. this is one side of the question. as for 
>>> the sequence, priorities work here. yes, we will send packets with a higher 
>>> priority until there are no such packets left in a queue, and then we will 
>>> send packets from queues with a lower priority. priorities are a sequence, 
>>> not a share of the total piece of the pie, and we don't need to know 
>>> anything about the pie. 
>
>> But unless you are sending more traffic than the *interface* speed,
>> you will be sending it out on receipt, there won't be any delays in
>> sending packets to the next-hop modem/router.
>
>> There won't *be* any packets in the queue on the PF machine to send in
>> priority order.
>
> ok. that is, for the sake of some 10% performance (not so long ago Theo 
> turned off smt, and wanted to remove its support altogether, but smt is 
> significantly more than 10% of performance) you use queues only when the 
> channel is overloaded, which you are not able to reliably detect, but can 
> only assume has occurred? there's nothing easier! just put packets in the 
> queue at all times :D

I don't know why you are going on about SMT here. But some workloads
are demonstrably *slower* if SMT is used (the scheduler just treats
them as full cores, when it would probably be better to only permit
threads of the same process to share SMTs on the same core). And of
course there are the known problems that became very apparent with the
CPU vulnerabilities that became widely known *after* OpenBSD disabled
SMT by default. But anyway back to packets.

The only constraint on transmitting packets from the OpenBSD machine
is the network interface facing the next-hop router. Say that is a
1Gbps interface. Say you have 200Mbps of traffic to forward from other
interfaces. And that the upstream connection can handle something
between 100Mbps and 200Mbps but you don't know how much. And there is no
way to tell when the upstream router has forwarded the packets.

BTW, HFSC with bandwidth and max set to the same values should be the
same thing as CBQ. But CBQ doesn't help anyway, you still have this same
problem. The only thing I can think of that might possibly help is to
delay all packets ("set delay") and use prio. I haven't tested to see
if that actually works but maybe. If you want real controls on the PF
box you need to cap to the *minimum* bandwidth and lose anything above
that. Or cap somewhere between the two picked as a trade-off between
lost capacity and not always doing anything useful.
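
Something like this, if anyone wants to experiment (untested, as said;
em0 and the numbers are placeholders):

# hold all outbound packets briefly so a queue exists for prio
# to reorder even when the interface itself is not saturated
match out on em0 set delay 25
pass out on em0 set prio 2                          # bulk default
pass out on em0 proto tcp to port ssh set prio 6    # jumps ahead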



Re: pf queues

2023-12-01 Thread 4
> On 2023-11-30, 4  wrote:
>> we can simply calculate such a basic thing as the flow rate by dividing the 
>> number of bytes in the past packets by the time. we can control the speed 
>> through delays in sending packets. this is one side of the question. as for 
>> the sequence, priorities work here. yes, we will send packets with a higher 
>> priority until there are no such packets left in a queue, and then we will 
>> send packets from queues with a lower priority. priorities are a sequence, 
>> not a share of the total piece of the pie, and we don't need to know 
>> anything about the pie. 

> But unless you are sending more traffic than the *interface* speed,
> you will be sending it out on receipt, there won't be any delays in
> sending packets to the next-hop modem/router.

> There won't *be* any packets in the queue on the PF machine to send in
> priority order.

ok. that is, for the sake of some 10% performance (not so long ago Theo turned 
off smt, and wanted to remove its support altogether, but smt is significantly 
more than 10% of performance) you use queues only when the channel is 
overloaded, which you are not able to reliably detect, but can only assume has 
occurred? there's nothing easier! just put packets in the queue at all 
times :D



Re: pf queues

2023-12-01 Thread Stuart Henderson
On 2023-11-30, 4  wrote:
> we can simply calculate such a basic thing as the flow rate by dividing the 
> number of bytes in the past packets by the time. we can control the speed 
> through delays in sending packets. this is one side of the question. as for 
> the sequence, priorities work here. yes, we will send packets with a higher 
> priority until there are no such packets left in a queue, and then we will 
> send packets from queues with a lower priority. priorities are a sequence, 
> not a share of the total piece of the pie, and we don't need to know anything 
> about the pie. 

But unless you are sending more traffic than the *interface* speed,
you will be sending it out on receipt, there won't be any delays in
sending packets to the next-hop modem/router.

There won't *be* any packets in the queue on the PF machine to send in
priority order.




Re: pf queues

2023-11-30 Thread 4
> On Wed, 29 Nov 2023 00:12:02 +0300
> 4  wrote:

>> i haven't used queues for a long time, but now there is a need.
>> previously, queues had not only a hierarchy, but also a priority. now
>> there is no priority, only the hierarchy exists.

> It took me quite some time to wrap my head around this, having been
> accustomed to HFSC up until 5.5. One can probably find a lot of my
> emails in misc@ archives from that time.

> Nowadays I am matching traffic to prio and queue by protocol and/or
> destination port only. Anything not explicitly matched goes to the lowest
> prio queue and is logged even when passed, so I can inspect if there are any
> types of traffic which should be put into the appropriate prio / queue. All
> the ACKs except those in the lowest prio queue get the highest (7) priority;
> stuff in the lowest prio queue has the lowest prio for ACKs as well.

> # QUEUE MATCHES
> match proto icmp   set prio ( 6 7 ) queue ( 6-fly 7-ack ) tag queued


match proto icmp set prio(6 7) queue(6-fly 7-ack)
how is this supposed to work at all? i.e. packets are placed both in prio's 
queues 6/7 (in theory priorities and queues are the same thing), and in hfsc's 
queues 6-fly/7-ack at once? i am surprised that this rule does not cause a 
syntax error. it looks interesting (i didn't know it was possible. is this 
definitely not a bug? :D), but still i don't understand the intent %\ i need 
to think about it and experiment. thank you, it was very valuable information!



Re: pf queues

2023-11-30 Thread 4
> On 11/29/23 6:47 PM, Stuart Henderson wrote:
>> On 2023-11-29, Daniel Ouellet  wrote:
>>>> yes, all this can be made without hierarchy, only with priorities 
>>>> (because hierarchy is priorities), but who and why decided that eight 
>>>> would be enough? the one who created cbq- he created it for practical 
>>>> tasks. but this "hateful eight" and this "flat-earth"- i don't understand 
>>>> what use they are, they can't even solve such a simplified task :\
>>>> so what am i missing?
>>>
>>> man pf.conf
>>>
>>> Look for set tos. Just a few lines below set prio in the man page.
>>>
>>> You can have more than 8 if you need/have to.
>> Only useful if devices upstream of the PF router know their available
>> bandwidth and can do some QoS themselves.
>> 
> Same can be said for CoS as well. You can only control what's going out of 
> your own network. After that as soon as it reaches your ISP or what not, you 
> have no clue if they reset everything or not.

> At a minimum ToS can cross routers, CoS not so much unless it is built for it.

> Either way, your QoS will kick in when bandwidth is starving, so if you don't 
> know that, what's the point...

i do not understand how qos and all its components relate to my question: 
first we need a working mechanism that is able to restrict and prioritize 
traffic (i.e. cbq is needed), and only then can we feed something into this 
mechanism based on qos values. i.e. qos here, in principle, cannot be the 
solution to the problem.
we have the separate independent mechanism "prio", which can prioritize 
traffic with limited options (only eight priorities), but does not know how to 
restrict it, and we have the separate independent mechanism "hfsc", which can 
restrict traffic, but does not know how to prioritize it (although it is 
claimed that it can, i do not see how to do it). what happens on the 
provider's hardware is out of scope and generally matters no more than the 
weather on Mars.
so how the hell can we make cbq out of hfsc? let's answer this question, 
because the slides claim that the answer exists



Re: pf queues

2023-11-30 Thread Marko Cupać
On Wed, 29 Nov 2023 00:12:02 +0300
4  wrote:

> i haven't used queues for a long time, but now there is a need.
> previously, queues had not only a hierarchy, but also a priority. now
> there is no priority, only the hierarchy exists.

It took me quite some time to wrap my head around this, having been
accustomed to HFSC up until 5.5. One can probably find a lot of my
emails in misc@ archives from that time.

Nowadays I am matching traffic to prio and queue by protocol and/or
destination port only. Anything not explicitly matched goes to the lowest
prio queue and is logged even when passed, so I can inspect if there are any
types of traffic which should be put into the appropriate prio / queue. All
the ACKs except those in the lowest prio queue get the highest (7) priority;
stuff in the lowest prio queue has the lowest prio for ACKs as well.

# QUEUE MATCHES
match proto icmp                   set prio ( 6 7 ) queue ( 6-fly    7-ack ) tag queued
match proto ospf                   set prio ( 6 7 ) queue ( 6-fly    7-ack ) tag queued
match proto tcp  to port $q_fly    set prio ( 6 7 ) queue ( 6-fly    7-ack ) tag queued
match proto udp  to port $q_fly    set prio ( 6 7 ) queue ( 6-fly    7-ack ) tag queued
match proto tcp  to port $q_sprint set prio ( 5 7 ) queue ( 5-sprint 7-ack ) tag queued
match proto udp  to port $q_sprint set prio ( 5 7 ) queue ( 5-sprint 7-ack ) tag queued
match proto tcp  to port $q_run    set prio ( 4 7 ) queue ( 4-run    7-ack ) tag queued
match proto udp  to port $q_run    set prio ( 4 7 ) queue ( 4-run    7-ack ) tag queued
match proto tcp  to port $q_walk   set prio ( 2 7 ) queue ( 2-walk   7-ack ) tag queued
match proto udp  to port $q_walk   set prio ( 2 7 ) queue ( 2-walk   7-ack ) tag queued
match proto tcp  to port $q_crawl  set prio ( 1 7 ) queue ( 1-crawl  7-ack ) tag queued
match proto udp  to port $q_crawl  set prio ( 1 7 ) queue ( 1-crawl  7-ack ) tag queued
match log ! tagged queued          set prio 0 queue ( 0-other )

I used to hear a lot of "Don't queue inbound (to the interface)". I
heard a few "you can't queue it inbound (to the interface) - you
already received it. But the response to that inbound request will be
assigned to the appropriate queue." In my observation the latter appears to
be the correct approach. Hence, no 'match out proto...' but 'match
proto...'.

I used to hear a lot of "Don't queue (traffic received on Internet
interface) outbound on LAN interface. You already received packets,
why drop it?". But this is necessary if one wants to give full
bandwidth to e.g. p2p when no other traffic is on the wire, but
throttle it in lieu of DNS or web or IM traffic, once it kicks in.

So here's how my queues look for a home gateway with a 300/30Mbit
link:

# QUEUES
queue wan on $if_ext      bandwidth  24M min   24M max  24M
queue 7-ack    parent wan bandwidth   6M min 1500K max  18M qlimit  512
queue 6-fly    parent wan bandwidth   2M min  500K max  18M qlimit  512
queue 5-sprint parent wan bandwidth   2M min  500K max  18M qlimit  512
queue 4-run    parent wan bandwidth   2M min  500K max  18M qlimit  512
queue 2-walk   parent wan bandwidth   2M min  500K max  18M qlimit 2048
queue 1-crawl  parent wan bandwidth   1M min  250K max  18M qlimit 2048
queue 0-other  parent wan bandwidth   1M min  250K max  18M qlimit 4096 default

queue lan on $if_int      bandwidth 240M min  240M max 240M
queue 7-ack    parent lan bandwidth  60M min   15M max 180M qlimit  512
queue 6-fly    parent lan bandwidth  20M min    5M max 180M qlimit  512
queue 5-sprint parent lan bandwidth  20M min    5M max 180M qlimit  512
queue 4-run    parent lan bandwidth  20M min    5M max 180M qlimit  512
queue 2-walk   parent lan bandwidth  20M min    5M max 180M qlimit 2048
queue 1-crawl  parent lan bandwidth  10M min 2500K max 180M qlimit 2048
queue 0-other  parent lan bandwidth  10M min 2500K max 180M qlimit 4096 default

Make sure to measure your real download and upload bandwidths, and allocate
no more than 90% of them (at their worst times if they vary).

I get satisfactory results when:
- I set bandwidth, min and max to the same value on parent queue
- Sum of child queue bandwidths does not exceed parent queue bandwidth
- I allocate child queue min values to 1/4 of child queue bandwidth
  values
- I allocate child queue max values to (parent queue - sum of child
  queue min values) or less
- I raise qlimit somewhat for high priority queues, and quite a bit for
  low priority queues
- I flush all the states with 'pfctl -F states' after I do changes to
  ruleset at initial stage of early testing
- I keep an eye on 'systat queues / states / rules' to understand
  exactly which rule triggers assignment to which queue.

Now all of the above is fine for a home gateway with just "internet" and
"lan". Things get much more complicated if there are multiple VLANs on the
internal interface, GRE / GIF or wireguard tunnels on external
interfaces etc.

I once had 

Re: pf queues

2023-11-30 Thread David Dahlberg
On Thu, 2023-11-30 at 15:55 +0300, 4 wrote:
> "cbq can entirely be expressed in it" ok. so how do i set priorities
> for queues in hfsc

You stack HFSC with link-share service curves with linkshare criterion
1:0 - or in pf.conf(5) terms: "bandwidth 1" and "bandwidth 0".
Or you do not configure queuing at all, as the default one supports the
"prio" argument.

>  for my local(not for a router above that knows nothing about my
> existence.

Your local interface will be at 1G or something similar. There is little
chance that there will be any queuing at all.



Re: pf queues

2023-11-30 Thread Daniel Ouellet




On 11/29/23 6:47 PM, Stuart Henderson wrote:
> On 2023-11-29, Daniel Ouellet  wrote:
>>> yes, all this can be made without hierarchy, only with priorities (because
>>> hierarchy is priorities), but who and why decided that eight would be
>>> enough? the one who created cbq- he created it for practical tasks. but
>>> this "hateful eight" and this "flat-earth"- i don't understand what use
>>> they are, they can't even solve such a simplified task :\
>>> so what am i missing?
>>
>> man pf.conf
>>
>> Look for set tos. Just a few lines below set prio in the man page.
>>
>> You can have more than 8 if you need/have to.
>
> Only useful if devices upstream of the PF router know their available
> bandwidth and can do some QoS themselves.

Same can be said for CoS as well. You can only control what's going out
of your own network. After that as soon as it reaches your ISP or what
not, you have no clue if they reset everything or not.

At a minimum ToS can cross routers, CoS not so much unless it is built
for it.

Either way, your QoS will kick in when bandwidth is starving, so if you
don't know that, what's the point...




Re: pf queues

2023-11-30 Thread 4
> On Thu, Nov 30, 2023 at 02:57:23PM +0300, 4 wrote:
>> so what happened to cbq? why was such a powerful and useful thing removed? 
>> or did Theo delete it precisely because it was too good for obsd? %D

> Actually, the new queueing system was done by Henning, planned as far back
> as (at least) 2012 (https://quigon.bsws.de/papers/2012/bsdcan/), finally 
> available to the general public in OpenBSD 5.5 two years later. 

> ALTQ support was removed from OpenBSD in time for the OpenBSD 5.6 release
> (November 2014).

> So, it's been a while and whatever you were running most certainly needed
> an upgrade anyway. 

"cbq can entirely be expressed in it" ok. so how do i set priorities for queues 
in hfsc for my local(not for a router above that knows nothing about my 
existence. tos is an absolutely unviable concept in the real world) pf-router? 
i don't see a word about it in man pf.conf



Re: pf queues

2023-11-30 Thread Peter N. M. Hansteen
On Thu, Nov 30, 2023 at 03:55:49PM +0300, 4 wrote:
> 
> "cbq can entirely be expressed in it" ok. so how do i set priorities for 
> queues in hfsc for my local(not for a router above that knows nothing about 
> my existence. tos is an absolutely unviable concept in the real world) 
> pf-router? i don't see a word about it in man pf.conf
> 

In my reply to the initial message in this thread, I gave you the references
that spell this out fairly clearly.

And you're dead wrong about the pf.conf man page. Unless of course you
are trying to look this up on a system that still runs something that
is by now roughly a decade out of date.

-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



Re: pf queues

2023-11-30 Thread 4
> On 2023-11-29, 4  wrote:
>> here is a simple task, there are millions of such tasks. there is an
>> internet connection, and although it is declared as symmetrical 100mbit
>> it's 100 for download, but for upload it depends on the time of day, so
>> we can forget about the channel width and focus on the only thing that
>> matters- priorities.

> But wait. If you don't know how much bandwidth is available, everything
> else goes out the window.

> If you don't know how much bw is available in total, you can't decide
> how much to allocate to each connection, so even the basic bandwidth
> control can't really work, let alone prioritising access to the
> available capacity.

> Priorities work when you are trying to transmit more out of an interface
> than the bandwidth available on that interface.

> Say you have a box running PF with a 1Gb interface to a
> (router/modem/whatever) with an uplink of somewhere between 100-200Mb.

> If you use only priorities in PF, in that case they can only take effect
> when you have >1Gb of traffic to send out.

> If you queue with a max bw 200Mb, but only 100Mb is available on the
> line at times, during those times all that happens is you defer any
> queueing decisions to the (router/modem).

> The only way to get bandwidth control out of PF in that case is to
> either limit to _less than the guaranteed minimum_ (say 100Mb in that
> example), losing capacity at other times. Or if you have some way to
> fetch the real line speed at various times and adjust the queue speed
> in the ruleset.

>> --|
>> --normal
>> ---|
>> ---low
>> but hierarchy is not enough, we need priorities, since each of these three 
>> queues can contain other queues. for example, the "high" queue may contain, 
>> in addition to the "normal" queue, "icmp" and "ssh" queues, which are more 
>> important than the "normal" queue, in which, for example, we will have http, 
>> ftp and other non-critical traffic. therefore, we assign priority 0 to the 
>> "normal" queue, priority 1 to the "ssh" queue and limit its maximum 
>> bandwidth to 10mb(so that ssh does not eat up the entire channel when 
>> copying files), and assign priority 2 to the "icmp" queue(icmp is more 
>> important than ssh). i.e. icmp packets will leave first, then ssh packets, 
>> and then packets from the "normal" queue and its subqueues(or they won't 
>> leave if we don't restrict ssh and it eats up the entire channel)

> if PF doesn't know the real bandwidth, it _can't_ defer sending lower-
> priority traffic until after higher-prio has been sent, because it doesn't
> know if it won't make it over the line...

you're saying that it's impossible to manage traffic if we don't know the real 
bandwidth of the channel (in 99% of cases we don't know it, because it changes 
over time. tariffs with guaranteed speed are rare even in russia, and here 
things are much better with the availability and quality of the inet than the 
world average (speedtest has the statistics)), but in the end you describe a 
way to do it. are you kidding me? :D
we can simply calculate such a basic thing as the flow rate by dividing the 
number of bytes in the past packets by the time. we can control the speed 
through delays in sending packets. this is one side of the question. as for 
the sequence, priorities work here. yes, we will send packets with a higher 
priority until there are no such packets left in the queue, and then we will 
send packets from queues with a lower priority. priorities are a sequence, not 
a share of the total piece of the pie, and we don't need to know anything 
about the pie. 
as for the minimum guaranteed bandwidth, if it is set, then just send packets 
as they appear, assuming that they have the highest priority. send until the 
speed of such packets exceeds the guaranteed rate; all packets above that 
should be sent based on the given priorities. this is not socialism, where 
everyone will be fed, this is capitalism, where you will starve and die if you 
do not belong to the priority elite %D (yes, yes, i know that socialism and 
capitalism are not about that, but in practice these are their distinctive 
features). but this is how it should be in the matter of packet traffic. 
so, where am i wrong, and why do we need to know the current bandwidth of the 
channel?



Re: pf queues

2023-11-30 Thread Peter N. M. Hansteen
On Thu, Nov 30, 2023 at 02:57:23PM +0300, 4 wrote:
> so what happened to cbq? why was such a powerful and useful thing removed? 
> or did Theo delete it precisely because it was too good for obsd? %D

Actually, the new queueing system was done by Henning, planned as far back
as (at least) 2012 (https://quigon.bsws.de/papers/2012/bsdcan/), finally 
available to the general public in OpenBSD 5.5 two years later. 

ALTQ support was removed from OpenBSD in time for the OpenBSD 5.6 release
(November 2014).

So, it's been a while and whatever you were running most certainly needed
an upgrade anyway. 

-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



Re: pf queues

2023-11-30 Thread 4
so what happened to cbq? why was such a powerful and useful thing removed? or 
did Theo delete it precisely because it was too good for obsd? %D



Re: pf queues

2023-11-29 Thread Stuart Henderson
On 2023-11-29, Daniel Ouellet  wrote:
>> yes, all this can be made without hierarchy, only with priorities (because 
>> hierarchy is priorities), but who and why decided that eight would be 
>> enough? the one who created cbq- he created it for practical tasks. but 
>> this "hateful eight" and this "flat-earth"- i don't understand what use 
>> they are, they can't even solve such a simplified task :\
>> so what am i missing?
>
> man pf.conf
>
> Look for set tos. Just a few lines below set prio in the man page.
>
> You can have more than 8 if you need/have to.

Only useful if devices upstream of the PF router know their available
bandwidth and can do some QoS themselves.



Re: pf queues

2023-11-29 Thread Stuart Henderson
On 2023-11-29, 4  wrote:
> here is a simple task, there are millions of such tasks. there is an
> internet connection, and although it is declared as symmetrical 100mbit
> it's 100 for download, but for upload it depends on the time of day, so
> we can forget about the channel width and focus on the only thing that
> matters- priorities.

But wait. If you don't know how much bandwidth is available, everything
else goes out the window.

If you don't know how much bw is available in total, you can't decide
how much to allocate to each connection, so even the basic bandwidth
control can't really work, let alone prioritising access to the
available capacity.

Priorities work when you are trying to transmit more out of an interface
than the bandwidth available on that interface.

Say you have a box running PF with a 1Gb interface to a
(router/modem/whatever) with an uplink of somewhere between 100-200Mb.

If you use only priorities in PF, in that case they can only take effect
when you have >1Gb of traffic to send out.

If you queue with a max bw 200Mb, but only 100Mb is available on the
line at times, during those times all that happens is you defer any
queueing decisions to the (router/modem).

The only way to get bandwidth control out of PF in that case is to
either limit to _less than the guaranteed minimum_ (say 100Mb in that
example), losing capacity at other times. Or if you have some way to
fetch the real line speed at various times and adjust the queue speed
in the ruleset.

> --|
> --normal
> ---|
> ---low
> but hierarchy is not enough, we need priorities, since each of these three 
> queues can contain other queues. for example, the "high" queue may contain, 
> in addition to the "normal" queue, "icmp" and "ssh" queues, which are more 
> important than the "normal" queue, in which, for example, we will have http, 
> ftp and other non-critical traffic. therefore, we assign priority 0 to the 
> "normal" queue, priority 1 to the "ssh" queue and limit its maximum bandwidth 
> to 10mb(so that ssh does not eat up the entire channel when copying files), 
> and assign priority 2 to the "icmp" queue(icmp is more important than ssh). 
> i.e. icmp packets will leave first, then ssh packets, and then packets from 
> the "normal" queue and its subqueues(or they won't leave if we don't restrict 
> ssh and it eats up the entire channel)

if PF doesn't know the real bandwidth, it _can't_ defer sending lower-
priority traffic until after higher-prio has been sent, because it doesn't
know if it won't make it over the line...




Re: pf queues

2023-11-29 Thread Daniel Ouellet

> yes, all this can be made without hierarchy, only with priorities (because 
> hierarchy is priorities), but who and why decided that eight would be 
> enough? the one who created cbq- he created it for practical tasks. but this 
> "hateful eight" and this "flat-earth"- i don't understand what use they are, 
> they can't even solve such a simplified task :\
> so what am i missing?

man pf.conf

Look for set tos. Just a few lines below set prio in the man page.

You can have more than 8 if you need/have to.



Re: pf queues

2023-11-29 Thread 4


> On Wed, Nov 29, 2023 at 12:12:02AM +0300, 4 wrote:
>> i haven't used queues for a long time, but now there is a need. previously, 
>> queues had not only a hierarchy, but also a priority. now there is no 
>> priority, only the hierarchy exists. i was surprised, but i thought that 
>> this is quite in the way of Theo, and that it is possible to simplify the 
>> queue mechanism to only the hierarchy, meaning that the higher a queue 
>> stands in the hierarchy, the higher its priority. but in order for it to 
>> work this way, it is necessary to allow assigning packets to any queue, 
>> and not just to the last one, because when you can assign only to the last 
>> queue in the hierarchy, in practice it means that you have no hierarchy 
>> and no queues. and although a rule with an assignment to a queue above the 
>> last one is not syntactically incorrect, in practice the assignment is not 
>> performed, and the packets fall into the default (last) queue. am i missing 
>> something, or is it really an idiocy that humanity has not seen yet?
>> 
> How long ago is it that you did anything with queues?

> the older ALTQ system was replaced by a whole new system back in OpenBSD 5.5
> (or actually, altq lived on as oldqueue through 5.6), and the syntax is both
> very different and in most things much simpler to deal with.

> The most extensive treatment available is in The Book of PF, 3rd edition
> (actually the introduction of the new queues was the reason for doing that
> revision). If for some reason the book is out of reach, you can likely
> glean most of the useful information from the relevant slides in the
> PF tutorial https://home.nuug.no/~peter/pftutorial/ with the traffic
> shaping part starting at https://home.nuug.no/~peter/pftutorial/#68

looks like i'm phenomenally dumb :(

queue rootq on $ext_if bandwidth 20M
queue main parent rootq bandwidth 20479K min 1M max 20479K qlimit 100
 queue qdef parent main bandwidth 9600K min 6000K max 18M default
 queue qweb parent main bandwidth 9600K min 6000K max 18M
 queue qpri parent main bandwidth 700K min 100K max 1200K
 queue qdns parent main bandwidth 200K min 12K burst 600K for 3000ms
queue spamd parent rootq bandwidth 1K min 0K max 1K qlimit 300
--
this is a flat model. no hierarchy here, because no priorities. it looks as if 
a hierarchy exists, but this is "fake news" :\ i can't immediately come up 
with even one task where such a thing would be needed.. probably no such task 
exists.

pass proto tcp to port ssh set prio 6
--
hard-coded eight queues/priorities and no bandwidth controls. but this case at 
least is useful, because priorities are much more important than bandwidth 
limits.

i have a feeling that the person who came up with this is the Mad Hatter from 
Wonderland :\ what was wrong with the cbq engine where it was all in one?
here is a simple task, there are millions of such tasks. there is an internet 
connection, and although it is declared as symmetrical 100mbit it's 100 for 
download, but for upload it depends on the time of day, so we can forget about 
the channel width and focus on the only thing that matters- priorities. we make 
three queues, hierarchically connect them to one another:
root
-|
-high
--|
--normal
---|
---low
but hierarchy is not enough, we need priorities, since each of these three 
queues can contain other queues. for example, the "high" queue may contain, in 
addition to the "normal" queue, "icmp" and "ssh" queues, which are more 
important than the "normal" queue, in which, for example, we will have http, 
ftp and other non-critical traffic. therefore, we assign priority 0 to the 
"normal" queue, priority 1 to the "ssh" queue and limit its maximum bandwidth 
to 10mb(so that ssh does not eat up the entire channel when copying files), and 
assign priority 2 to the "icmp" queue(icmp is more important than ssh). i.e. 
icmp packets will leave first, then ssh packets, and then packets from the 
"normal" queue and its subqueues(or they won't leave if we don't restrict ssh 
and it eats up the entire channel). now:
root
-|
-high[normal(0),ssh(1),icmp(2)]
--|
--normal[low(0),default(1),http(2),ftp(2)]
---|
---low[bittorrent(0),putin(0),vodka(0)]
yes, all this can be made without hierarchy, only with priorities (because 
hierarchy is priorities), but who and why decided that eight would be enough? 
the one who created cbq- he created it for practical tasks. but this "hateful 
eight" and this "flat-earth"- i don't understand what use they are, they can't 
even solve such a simplified task :\
so what am i missing?



Re: pf queues

2023-11-28 Thread Peter N. M. Hansteen
On Wed, Nov 29, 2023 at 12:12:02AM +0300, 4 wrote:
> i haven't used queues for a long time, but now there is a need. previously, 
> queues had not only a hierarchy, but also a priority. now there is no 
> priority, only the hierarchy exists. i was surprised, but i thought that 
> this is quite in the way of Theo, and that it is possible to simplify the 
> queue mechanism to only the hierarchy, meaning that the higher a queue 
> stands in the hierarchy, the higher its priority. but in order for it to 
> work this way, it is necessary to allow assigning packets to any queue, and 
> not just to the last one, because when you can assign only to the last 
> queue in the hierarchy, in practice it means that you have no hierarchy and 
> no queues. and although a rule with an assignment to a queue above the last 
> one is not syntactically incorrect, in practice the assignment is not 
> performed, and the packets fall into the default (last) queue. am i missing 
> something, or is it really an idiocy that humanity has not seen yet?
> 
How long ago is it that you did anything with queues?

the older ALTQ system was replaced by a whole new system back in OpenBSD 5.5
(or actually, altq lived on as oldqueue through 5.6), and the syntax is both
very different and in most things much simpler to deal with.

The most extensive treatment available is in The Book of PF, 3rd edition
(actually the introduction of the new queues was the reason for doing that
revision). If for some reason the book is out of reach, you can likely
glean most of the useful information from the relevant slides in the
PF tutorial https://home.nuug.no/~peter/pftutorial/ with the traffic
shaping part starting at https://home.nuug.no/~peter/pftutorial/#68


-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



Re: PF Rules for Dual Upstream Gateways

2023-11-23 Thread Stuart Henderson
On 2023-11-22, Ian Timothy  wrote:
> Hello,
>
> I have two ISPs where one connection is primary and the other is 
> low-bandwidth for temporary failover only. ifstated handles the failover by 
> simply changing the default gateway. But under normal conditions I want to be 
> able to connect via either connection at any time without changing the 
> default gateway.
>
> A long time ago under the old pf syntax I had this in /etc/pf.conf which 
> worked fine, and as far as I can remember was the only thing needed to enable 
> this desired behavior:
>
> pass in on $wan1_if reply-to ( $wan1_if $wan1_gw )
> pass in on $wan2_if reply-to ( $wan2_if $wan2_gw )
>
> But I’ve not been able to find the right way to do this under the new pf 
> syntax. From what I’ve been able to find, this supposedly does the same 
> thing, but no success so far:
>
> pass in on $wan1_if reply-to ($wan1_if:peer)
> pass in on $wan2_if reply-to ($wan2_if:peer)

The :peer syntax is for point-to-point interfaces (e.g. pppoe, maybe umb).

> What am I missing? Or this there a better way to do this?

As long as the gateway is at a known address (not a changing address from
DHCP) this should do:

pass in on $wan1_if reply-to $wan1_gw
pass in on $wan2_if reply-to $wan2_gw

You can also have a setup with multiple rtables, but in the simple case,
reply-to is often easier.
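
Putting it together (a sketch; interface names and gateway addresses are
documentation placeholders):

wan1_if = "em1"
wan2_if = "em2"
wan1_gw = "192.0.2.1"
wan2_gw = "198.51.100.1"
pass in on $wan1_if reply-to $wan1_gw
pass in on $wan2_if reply-to $wan2_gw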

-- 
Please keep replies on the mailing list.



Re: pf logging in ascii and send to remote syslog

2023-11-11 Thread Daniele B.
Thnx, this seems to work better..




Re: pf logging in ascii and send to remote syslog

2023-11-11 Thread Zé Loff
On Sat, Nov 11, 2023 at 06:32:26PM +0100, Daniele B. wrote:
> 
> "Peter N. M. Hansteen" wrote:
> 
> > something like the good old
> > https://home.nuug.no/~peter/pf/newest/log2syslog.html should still
> > work, I think.
> > 
> > - Peter
> 
> 
> To disable pflogd completely, what do you consider best:
> 
> ifconfig pflog0 down
> 
> or 
> 
> pflogd_flags="-f /dev/null"
> 
> 
> = Daniele Bonini
> 

rcctl disable pflogd ?

-- 
 



Re: pf logging in ascii and send to remote syslog

2023-11-11 Thread Daniele B.


"Peter N. M. Hansteen" wrote:

> something like the good old
> https://home.nuug.no/~peter/pf/newest/log2syslog.html should still
> work, I think.
> 
> - Peter


To disable pflogd completely, what do you consider best:

ifconfig pflog0 down

or 

pflogd_flags="-f /dev/null"


= Daniele Bonini



Re: pf logging in ascii and send to remote syslog

2023-11-11 Thread Hrvoje Popovski
On 11.11.2023. 12:13, Stuart Henderson wrote:
> On 2023-11-11, Peter N. M. Hansteen  wrote:
>> On Fri, Nov 10, 2023 at 08:23:54PM +0100, Hrvoje Popovski wrote:
>>> what would be the best way to log pf logs in ascii and send them to a
>>> remote syslog server? I'm aware of pflow but I need ascii pf logs on the
>>> remote syslog server.
>>
>> something like the good old 
>> https://home.nuug.no/~peter/pf/newest/log2syslog.html
>> should still work, I think.
> 
> Or 
> https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/www/faq/pf/logging.html?rev=1.68#syslog
> 
> If you don't need _all_ pf logs converting to syslog, you can create a
> separate interface "echo up | doas tee /etc/hostname.pflog1" and use
> "log to pflog1" on selected rules.
> 


Thank you Peter and Stuart that's exactly what I need ...



Re: pf logging in ascii and send to remote syslog

2023-11-11 Thread Stuart Henderson
On 2023-11-11, Peter N. M. Hansteen  wrote:
> On Fri, Nov 10, 2023 at 08:23:54PM +0100, Hrvoje Popovski wrote:
>> what would be the best way to log pf logs in ascii and send them to a
>> remote syslog server? I'm aware of pflow but I need ascii pf logs on the
>> remote syslog server.
>
> something like the good old 
> https://home.nuug.no/~peter/pf/newest/log2syslog.html
> should still work, I think.

Or 
https://cvsweb.openbsd.org/cgi-bin/cvsweb/~checkout~/www/faq/pf/logging.html?rev=1.68#syslog

If you don't need _all_ pf logs converting to syslog, you can create a
separate interface "echo up | doas tee /etc/hostname.pflog1" and use
"log to pflog1" on selected rules.



Re: pf logging in ascii and send to remote syslog

2023-11-11 Thread Peter N. M. Hansteen
On Fri, Nov 10, 2023 at 08:23:54PM +0100, Hrvoje Popovski wrote:
> what would be the best way to log pf logs in ascii and send them to a
> remote syslog server? I'm aware of pflow but I need ascii pf logs on the
> remote syslog server.

something like the good old 
https://home.nuug.no/~peter/pf/newest/log2syslog.html
should still work, I think.

- Peter


-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



Re: PF queue bandwidth limited to 32bit value

2023-09-17 Thread Andy Lemin



> On 15 Sep 2023, at 18:54, Stuart Henderson  wrote:
> 
> On 2023/09/15 13:40, Andy Lemin wrote:
>> Hi Stuart,
>> 
>> Seeing as it seems like everyone is too busy, and my workaround
>> (not queue some flows on interfaces with queue defined) seems of no
>> interest,
> 
> well, it might be, but I'm not sure if it will fit with how
> queues work..

Well I can only hope some more developers see this :)

> 
>> and my current hack to use queuing on Vlan interfaces is
>> a very incomplete and restrictive workaround; Would you please be
>> so kind as to provide me with a starting point in the source code
>> and variable names to concentrate on, where I can start tracing from
>> beginning to end for changing the scale from bits to bytes?
> 
> maybe try hfsc.c, but overall there are quite a few files involved
> in queue definition and use from start to finish. or going from the
> other side start with how pfctl defines queues and follow through
> from there.
> 

Thank you, I will try (best effort as time permits), and see how far I get.. 
(probably not far ;)




Re: PF queue bandwidth limited to 32bit value

2023-09-15 Thread Stuart Henderson
On 2023/09/15 13:40, Andy Lemin wrote:
> Hi Stuart,
> 
> Seeing as it seems like everyone is too busy, and my workaround
> (not queue some flows on interfaces with queue defined) seems of no
> interest,

well, it might be, but I'm not sure if it will fit with how
queues work..

> and my current hack to use queuing on Vlan interfaces is
> a very incomplete and restrictive workaround; Would you please be
> so kind as to provide me with a starting point in the source code
> and variable names to concentrate on, where I can start tracing from
> beginning to end for changing the scale from bits to bytes?

maybe try hfsc.c, but overall there are quite a few files involved
in queue definition and use from start to finish. or going from the
other side start with how pfctl defines queues and follow through
from there.



Re: PF queue bandwidth limited to 32bit value

2023-09-14 Thread Andy Lemin
Hi Stuart,

Seeing as it seems like everyone is too busy, and my workaround (not queue
some flows on interfaces with queue defined) seems of no interest, and my
current hack to use queuing on Vlan interfaces is a very incomplete and
restrictive workaround; would you please be so kind as to provide me with a
starting point in the source code and variable names to concentrate on,
where I can start tracing from beginning to end for changing the scale from
bits to bytes?

Thanks :)
Andy

On 14 Sep 2023, at 19:34, Andrew Lemin  wrote:
> On Thu, Sep 14, 2023 at 7:23 PM Andrew Lemin  wrote:
>> On Wed, Sep 13, 2023 at 8:35 PM Stuart Henderson  wrote:
>>> On 2023-09-13, Andrew Lemin  wrote:
>>>> I have noticed another issue while trying to implement a 'prio'-only
>>>> workaround (using only prio ordering for inter-VLAN traffic, and HSFC
>>>> queuing for internet traffic);
>>>> It is not possible to have internal inter-vlan traffic be solely
>>>> priority ordered with 'set prio', as the existence of 'queue'
>>>> definitions on the same internal vlan interfaces (required for internet
>>>> flows), demands one leaf queue be set as 'default'. Thus forcing all
>>>> inter-vlan traffic into the 'default' queue despite queuing not being
>>>> wanted, and so unintentionally clamping all internal traffic to 4294M
>>>> just because full queuing is needed for internet traffic.
>>>
>>> If you enable queueing on an interface all traffic sent via that
>>> interface goes via one queue or another.
>>
>> Yes, that is indeed the very problem. Queueing is enabled on the inside
>> interfaces, with bandwidth values set slightly below the ISP capacities
>> (multiple ISP links as well), so that all things work well for all
>> internal users. However this means that inter-vlan traffic from client
>> networks to server networks is restricted to 4294Mbps for no reason.. It
>> would make a huge difference to be able to allow local traffic to flow
>> without being queued/restricted.
>>
>>> (also, AIUI the correct place for queues is on the physical interface
>>> not the vlan, since that's where the bottleneck is... you can assign
>>> traffic to a queue name as it comes in on the vlan but I believe the
>>> actual queue definition should be on the physical iface).
>>
>> Hehe yes I know. Thanks for sharing though. I actually have very specific
>> reasons for doing this (queues on the VLAN ifaces rather than phy) as
>> there are multiple ISP connections for multiple VLANs, so the VLAN queues
>> are set to restrict for the relevant ISP link etc.
>
> Also separate to the multiple ISPs (I wont bore you with why as it is not
> relevant here), the other reason for queueing on the VLANs is because it
> allows you to get closer to the 10Gbps figure.. Ie, if you have queues on
> the 10Gbps PHY, you can only egress 4294Mbps to _all_ VLANs. But if you
> have queues per-VLAN iface, you can egress multiple times 4294Mbps on
> aggregate. Eg, vlans 10,11,12,13 on single mcx0 trunk: 10->11 can do
> 4294Mbps and 12->13 can do 4294Mbps, giving over 8Gbps egress in total on
> the PHY. It is dirty, but like I said, desperate for workarounds... :(
>
>>> "required for internet flows" - depends on your network layout.. the
>>> upstream feed doesn't have to go via the same interface as inter-vlan
>>> traffic.
>>
>> I'm not sure what you mean. All the internal networks/vlans are connected
>> to local switches, and the switches have a trunk to the firewall, which
>> hosts the default gateway for the VLANs and does inter-vlan routing. So
>> all the clients go through the same VLANs/trunk/gateway for inter-vlan as
>> they do for internet. Strict L3/4 filtering is required on inter-vlan
>> traffic.
>>
>> I am honestly looking for support to recognise that this is a correct,
>> valid and common setup, and so there is a genuine need to allow flows to
>> not be queued on interfaces that have queues (which has many potential
>> applications for many use cases, not just mine - so should be of interest
>> to the developers?).
>>
>> Do you know why there has to be a default queue? Yes, I know that traffic
>> excluded from queues would take from the same interface the queueing is
>> trying to manage, and potentially causes congestion. However with 10Gbps
>> networking, which is beyond common now, this does not matter when the
>> queues are stuck at 4294Mbps. Desperately trying to find workarounds that
>> appeal.. Surely the need is a no-brainer, and it is just a case of trying
>> to encourage interest from a developer?
>>
>> Thanks :)



Re: PF queue bandwidth limited to 32bit value

2023-09-14 Thread Andrew Lemin
On Thu, Sep 14, 2023 at 7:23 PM Andrew Lemin  wrote:

>
>
> On Wed, Sep 13, 2023 at 8:35 PM Stuart Henderson <
> stu.li...@spacehopper.org> wrote:
>
>> On 2023-09-13, Andrew Lemin  wrote:
>> > I have noticed another issue while trying to implement a 'prio'-only
> > workaround (using only prio ordering for inter-VLAN traffic, and HFSC
>> > queuing for internet traffic);
>> > It is not possible to have internal inter-vlan traffic be solely
>> priority
>> > ordered with 'set prio', as the existence of 'queue' definitions on the
>> > same internal vlan interfaces (required for internet flows), demands one
>> > leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into
>> > the 'default' queue despite queuing not being wanted, and so
>> > unintentionally clamping all internal traffic to 4294M just because full
>> > queuing is needed for internet traffic.
>>
>> If you enable queueing on an interface all traffic sent via that
>> interface goes via one queue or another.
>>
>
> Yes, that is indeed the very problem. Queueing is enabled on the inside
> interfaces, with bandwidth values set slightly below the ISP capacities
> (multiple ISP links as well), so that all things work well for all internal
> users.
> However this means that inter-vlan traffic from client networks to server
> networks is restricted to 4294Mbps for no reason. It would make a huge
> difference to be able to allow local traffic to flow without being
> queued/restricted.
>
>
>>
>> (also, AIUI the correct place for queues is on the physical interface
>> not the vlan, since that's where the bottleneck is... you can assign
>> traffic to a queue name as it comes in on the vlan but I believe the
>> actual queue definition should be on the physical iface).
>>
>
> Hehe yes I know. Thanks for sharing though.
> I actually have very specific reasons for doing this (queues on the VLAN
> ifaces rather than phy) as there are multiple ISP connections for multiple
> VLANs, so the VLAN queues are set to restrict for the relevant ISP link etc.
>

Also separate to the multiple ISPs (I won't bore you with why, as it is not
relevant here), the other reason for queueing on the VLANs is that it
allows you to get closer to the 10Gbps figure.
I.e., if you have queues on the 10Gbps PHY, you can only egress 4294Mbps to
_all_ VLANs. But if you have queues per-VLAN iface, you can egress multiple
times 4294Mbps in aggregate.
E.g., vlans 10,11,12,13 on a single mcx0 trunk: 10->11 can do 4294Mbps and
12->13 can do 4294Mbps, giving over 8Gbps egress in total on the PHY. It is
dirty, but like I said, desperate for workarounds... :(
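
A rough sketch of that layout (vlan numbers and bandwidths illustrative,
not my real config):

    # one 4294M-capped tree per vlan, so the mcx0 trunk can carry
    # more than 4294M in aggregate
    queue v10 on vlan10 bandwidth 4294M
        queue v10_def parent v10 bandwidth 4000M default
    queue v12 on vlan12 bandwidth 4294M
        queue v12_def parent v12 bandwidth 4000M default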


>
>
>>
>> "required for internet flows" - depends on your network layout.. the
>> upstream feed doesn't have to go via the same interface as inter-vlan
>> traffic.
>
>
> I'm not sure what you mean. All the internal networks/vlans are connected
> to local switches, and the switches have trunk to the firewall which hosts
> the default gateway for the VLANs and does inter-vlan routing.
> So all the clients go through the same VLANs/trunk/gateway for inter-vlan
> as they do for internet. Strict L3/4 filtering is required on inter-vlan
> traffic.
> I am honestly looking for support to recognise that this is a correct,
> valid and common setup, and so there is a genuine need to allow flows to
> not be queued on interfaces that have queues (which has many potential
> applications for many use cases, not just mine - so should be of interest
> to the developers?).
>
> Do you know why there has to be a default queue? Yes I know that traffic
> excluded from queues would take from the same interface the queueing is
> trying to manage, and potentially causes congestion. However with 10Gbps
> networking which is beyond common now, this does not matter when the queues
> are stuck at 4294Mbps
>
> Desperately trying to find workarounds that appeal.. Surely the need is a
> no brainer, and it is just a case of trying to encourage interest from a
> developer?
>
> Thanks :)
>


Re: PF queue bandwidth limited to 32bit value

2023-09-14 Thread Andrew Lemin
On Wed, Sep 13, 2023 at 8:35 PM Stuart Henderson 
wrote:

> On 2023-09-13, Andrew Lemin  wrote:
> > I have noticed another issue while trying to implement a 'prio'-only
> > workaround (using only prio ordering for inter-VLAN traffic, and HFSC
> > queuing for internet traffic);
> > It is not possible to have internal inter-vlan traffic be solely priority
> > ordered with 'set prio', as the existence of 'queue' definitions on the
> > same internal vlan interfaces (required for internet flows), demands one
> > leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into
> > the 'default' queue despite queuing not being wanted, and so
> > unintentionally clamping all internal traffic to 4294M just because full
> > queuing is needed for internet traffic.
>
> If you enable queueing on an interface all traffic sent via that
> interface goes via one queue or another.
>

Yes, that is indeed the very problem. Queueing is enabled on the inside
interfaces, with bandwidth values set slightly below the ISP capacities
(multiple ISP links as well), so that all things work well for all internal
users.
However this means that inter-vlan traffic from client networks to server
networks is restricted to 4294Mbps for no reason. It would make a huge
difference to be able to allow local traffic to flow without being
queued/restricted.


>
> (also, AIUI the correct place for queues is on the physical interface
> not the vlan, since that's where the bottleneck is... you can assign
> traffic to a queue name as it comes in on the vlan but I believe the
> actual queue definition should be on the physical iface).
>

Hehe yes I know. Thanks for sharing though.
I actually have very specific reasons for doing this (queues on the VLAN
ifaces rather than phy) as there are multiple ISP connections for multiple
VLANs, so the VLAN queues are set to restrict for the relevant ISP link etc.


>
> "required for internet flows" - depends on your network layout.. the
> upstream feed doesn't have to go via the same interface as inter-vlan
> traffic.


I'm not sure what you mean. All the internal networks/vlans are connected
to local switches, and the switches have trunk to the firewall which hosts
the default gateway for the VLANs and does inter-vlan routing.
So all the clients go through the same VLANs/trunk/gateway for inter-vlan
as they do for internet. Strict L3/4 filtering is required on inter-vlan
traffic.
I am honestly looking for support to recognise that this is a correct,
valid and common setup, and so there is a genuine need to allow flows to
not be queued on interfaces that have queues (which has many potential
applications for many use cases, not just mine - so should be of interest
to the developers?).

Do you know why there has to be a default queue? Yes I know that traffic
excluded from queues would take from the same interface the queueing is
trying to manage, and potentially causes congestion. However with 10Gbps
networking which is beyond common now, this does not matter when the queues
are stuck at 4294Mbps

Desperately trying to find workarounds that appeal.. Surely the need is a
no brainer, and it is just a case of trying to encourage interest from a
developer?

Thanks :)


Re: PF queue bandwidth limited to 32bit value

2023-09-14 Thread Andrew Lemin
On Wed, Sep 13, 2023 at 8:22 PM Stuart Henderson 
wrote:

> On 2023-09-12, Andrew Lemin  wrote:
> > Ah, that's clever! Having bandwidth queues up to 34,352M would definitely
> > provide runway for the next decade :)
> >
> > Do you think your idea is worth circulating on tech@ for further
> > discussion? Queueing at bps resolution is rather redundant nowadays, even
> > on the very slowest links.
>
> tech@ is more for diffs or technical questions rather than not-fleshed-out
> quick ideas. Doing this would solve some problems with the "just change it
> to 64-bit" mooted on the freebsd-pf list (not least with 32-bit archs),
> but would still need finding all the places where the bandwidth values are
> used and making sure they're updated to cope.
>
>
Yes good point :) I am not in a position to undertake this myself at the
moment.
If none of the generous developers feel inclined to do this despite the
broad value, I might have a go myself at some point (probably not able
until next year, sadly).

"just change it to 64-bit" mooted on the freebsd-pf list - I have been
unable to find this conversation. Do you have a link?


>
> --
> Please keep replies on the mailing list.
>
>


Re: PF queue bandwidth limited to 32bit value

2023-09-13 Thread Stuart Henderson
On 2023-09-13, Andrew Lemin  wrote:
> I have noticed another issue while trying to implement a 'prio'-only
> workaround (using only prio ordering for inter-VLAN traffic, and HFSC
> queuing for internet traffic);
> It is not possible to have internal inter-vlan traffic be solely priority
> ordered with 'set prio', as the existence of 'queue' definitions on the
> same internal vlan interfaces (required for internet flows), demands one
> leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into
> the 'default' queue despite queuing not being wanted, and so
> unintentionally clamping all internal traffic to 4294M just because full
> queuing is needed for internet traffic.

If you enable queueing on an interface all traffic sent via that
interface goes via one queue or another.

(also, AIUI the correct place for queues is on the physical interface
not the vlan, since that's where the bottleneck is... you can assign
traffic to a queue name as it comes in on the vlan but I believe the
actual queue definition should be on the physical iface).

"required for internet flows" - depends on your network layout.. the
upstream feed doesn't have to go via the same interface as inter-vlan
traffic.
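
i.e. roughly this shape (a sketch; names and numbers made up):

    # queues defined on the physical interface, where the bottleneck is
    queue main on mcx0 bandwidth 4294M
        queue inet parent main bandwidth 900M
        queue rest parent main bandwidth 3300M default
    # traffic can still be assigned as it arrives on the vlan
    match in on vlan10 set queue inet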
 



Re: PF queue bandwidth limited to 32bit value

2023-09-13 Thread Stuart Henderson
On 2023-09-12, Andrew Lemin  wrote:
> Ah, that's clever! Having bandwidth queues up to 34,352M would definitely
> provide runway for the next decade :)
>
> Do you think your idea is worth circulating on tech@ for further
> discussion? Queueing at bps resolution is rather redundant nowadays, even
> on the very slowest links.

tech@ is more for diffs or technical questions rather than not-fleshed-out
quick ideas. Doing this would solve some problems with the "just change it
to 64-bit" mooted on the freebsd-pf list (not least with 32-bit archs),
but would still need finding all the places where the bandwidth values are
used and making sure they're updated to cope.


-- 
Please keep replies on the mailing list.



Re: PF queue bandwidth limited to 32bit value

2023-09-12 Thread Andrew Lemin
On Wed, Sep 13, 2023 at 3:43 AM Andrew Lemin  wrote:

> Hi Stuart.
>
> On Wed, Sep 13, 2023 at 12:25 AM Stuart Henderson <
> stu.li...@spacehopper.org> wrote:
>
>> On 2023-09-12, Andrew Lemin  wrote:
>> > Hi all,
>> > Hope this finds you well.
>> >
>> > I have discovered that PF's queueing is still limited to 32bit bandwidth
>> > values.
>> >
>> > I don't know if this is a regression or not.
>>
>> It's not a regression, it has been capped at 32 bits afaik forever
>> (certainly was like that when the separate classification via altq.conf
>> was merged into PF config, in OpenBSD 3.3).
>>
>
> Ah ok, it was talked about so much I thought it was part of it. Thanks for
> clarifying.
>
>
>>
>> >  I am sure one of the
>> > objectives of the ALTQ rewrite into the new queuing system we have in
>> > OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I
>> am
>> > imagining it..
>>
>> I don't recall that though there were some hopes expressed by
>> non-developers.
>>
>
> Haha, it is definitely still wanted and needed. prio-only based ordering
> is too limited
>

I have noticed another issue while trying to implement a 'prio'-only
workaround (using only prio ordering for inter-VLAN traffic, and HFSC
queuing for internet traffic);
It is not possible to have internal inter-vlan traffic be solely priority
ordered with 'set prio', as the existence of 'queue' definitions on the
same internal vlan interfaces (required for internet flows), demands one
leaf queue be set as 'default'. Thus forcing all inter-vlan traffic into
the 'default' queue despite queuing not being wanted, and so
unintentionally clamping all internal traffic to 4294M just because full
queuing is needed for internet traffic.
In fact 'prio' is irrelevant here: with or without 'prio', because queues
are required for internet traffic, all internal traffic becomes bound by
the 'default' HFSC queue.

So I would propose that the mandate on the 'default' keyword be relaxed
(or a new keyword be provided for match/pass rules to force flows to not
be queued), and/or that the uint32 scale be implemented in bytes instead
of bits?

I personally believe both are valid and needed?
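
To illustrate the conflict with a cut-down sketch (names, addresses and
bandwidths made up, not my production ruleset):

    table <internal> const { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 }

    queue root_v10 on vlan10 bandwidth 4294M
        queue isp parent root_v10 bandwidth 900M
        queue deflt parent root_v10 bandwidth 4000M default

    # internet flows get the queue they need
    match out on vlan10 to ! <internal> set queue isp
    # inter-vlan flows only want prio ordering...
    match out on vlan10 to <internal> set prio 6
    # ...but because a default leaf must exist, they still land in
    # 'deflt' and are clamped by the 4294M-limited tree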


>
>
>>
>> > Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN
>> routing
>> > with 10Gbps trunks, and I cannot set the queue bandwidth higher than a
>> > 32bit value?
>> >
>> > Setting the bandwidth value to 4295M results in a value overflow where
>> > 'systat queues' shows it wrapped and starts from 0 again. And traffic is
>> > indeed restricted to such values, so does not appear to be just a
>> cosmetic
>> > 'systat queues' issue.
>> >
>> > I am sure this must be a bug/regression,
>>
>> I'd say a not-implemented feature (and I have a feeling it is not
>> going to be all that simple a thing to implement - though changing
>> scales so the uint32 carries bytes instead of bits per second might
>> not be _too_ terrible).
>>
>
> Following the great work to SMP unlock in the VLAN interface, and recent
> NIC optimisations (offloading and interrupt handling) in various drivers,
> you can now push packet filtered 10Gbps with modern CPUs without breaking a
> sweat..
>
> Ah, that's clever! Having bandwidth queues up to 34,352M would
> definitely provide runway for the next decade :)
>
> Do you think your idea is worth circulating on tech@ for further
> discussion? Queueing at bps resolution is rather redundant nowadays, even
> on the very slowest links.
>
>
>> >  10Gbps on OpenBSD is trivial
>> and
>> > common nowadays..
>>
>> While using interfaces with 10Gbps link speed on OpenBSD is trivial,
>> actually pushing that much traffic (particularly with more complex
>> processing e.g. things like bandwidth controls, and particularly with
>> smaller packet sizes) not so much.
>>
>>
>> --
>> Please keep replies on the mailing list.
>>
>>
Thanks again, Andy.


Re: PF queue bandwidth limited to 32bit value

2023-09-12 Thread Andrew Lemin
Hi Stuart.

On Wed, Sep 13, 2023 at 12:25 AM Stuart Henderson 
wrote:

> On 2023-09-12, Andrew Lemin  wrote:
> > Hi all,
> > Hope this finds you well.
> >
> > I have discovered that PF's queueing is still limited to 32bit bandwidth
> > values.
> >
> > I don't know if this is a regression or not.
>
> It's not a regression, it has been capped at 32 bits afaik forever
> (certainly was like that when the separate classification via altq.conf
> was merged into PF config, in OpenBSD 3.3).
>

Ah ok, it was talked about so much I thought it was part of it. Thanks for
clarifying.


>
> >  I am sure one of the
> > objectives of the ALTQ rewrite into the new queuing system we have in
> > OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I
> am
> > imagining it..
>
> I don't recall that though there were some hopes expressed by
> non-developers.
>

Haha, it is definitely still wanted and needed. prio-only based ordering is
too limited


>
> > Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN
> routing
> > with 10Gbps trunks, and I cannot set the queue bandwidth higher than a
> > 32bit value?
> >
> > Setting the bandwidth value to 4295M results in a value overflow where
> > 'systat queues' shows it wrapped and starts from 0 again. And traffic is
> > indeed restricted to such values, so does not appear to be just a
> cosmetic
> > 'systat queues' issue.
> >
> > I am sure this must be a bug/regression,
>
> I'd say a not-implemented feature (and I have a feeling it is not
> going to be all that simple a thing to implement - though changing
> scales so the uint32 carries bytes instead of bits per second might
> not be _too_ terrible).
>

Following the great work to SMP unlock in the VLAN interface, and recent
NIC optimisations (offloading and interrupt handling) in various drivers,
you can now push packet filtered 10Gbps with modern CPUs without breaking a
sweat..

Ah, that's clever! Having bandwidth queues up to 34,352M would definitely
provide runway for the next decade :)

Do you think your idea is worth circulating on tech@ for further
discussion? Queueing at bps resolution is rather redundant nowadays, even
on the very slowest links.


> >  10Gbps on OpenBSD is trivial and
> > common nowadays..
>
> While using interfaces with 10Gbps link speed on OpenBSD is trivial,
> actually pushing that much traffic (particularly with more complex
> processing e.g. things like bandwidth controls, and particularly with
> smaller packet sizes) not so much.
>
>
> --
> Please keep replies on the mailing list.
>
>


Re: PF queue bandwidth limited to 32bit value

2023-09-12 Thread Stuart Henderson
On 2023-09-12, Andrew Lemin  wrote:
> Hi all,
> Hope this finds you well.
>
> I have discovered that PF's queueing is still limited to 32bit bandwidth
> values.
>
> I don't know if this is a regression or not.

It's not a regression, it has been capped at 32 bits afaik forever
(certainly was like that when the separate classification via altq.conf
was merged into PF config, in OpenBSD 3.3).

>  I am sure one of the
> objectives of the ALTQ rewrite into the new queuing system we have in
> OpenBSD today, was to allow bandwidth values larger than 4294M. Maybe I am
> imagining it..

I don't recall that though there were some hopes expressed by
non-developers.

> Anyway, I am trying to use OpenBSD PF to perform/filter Inter-VLAN routing
> with 10Gbps trunks, and I cannot set the queue bandwidth higher than a
> 32bit value?
>
> Setting the bandwidth value to 4295M results in a value overflow where
> 'systat queues' shows it wrapped and starts from 0 again. And traffic is
> indeed restricted to such values, so does not appear to be just a cosmetic
> 'systat queues' issue.
>
> I am sure this must be a bug/regression,

I'd say a not-implemented feature (and I have a feeling it is not
going to be all that simple a thing to implement - though changing
scales so the uint32 carries bytes instead of bits per second might
not be _too_ terrible).
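
Back-of-envelope for the bytes-instead-of-bits idea (plain bc(1)
arithmetic):

    $ echo '2^32' | bc
    4294967296          # bits/s in a uint32: the current ~4294M ceiling
    $ echo '2^32 * 8' | bc
    34359738368         # bits/s if the same uint32 carried bytes, ~34359M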

>  10Gbps on OpenBSD is trivial and
> common nowadays..

While using interfaces with 10Gbps link speed on OpenBSD is trivial,
actually pushing that much traffic (particularly with more complex
processing e.g. things like bandwidth controls, and particularly with
smaller packet sizes) not so much.


-- 
Please keep replies on the mailing list.



Re: pf state-table-induced instability

2023-08-31 Thread David Gwynne
On Thu, Aug 31, 2023 at 04:10:06PM +0200, Gabor LENCSE wrote:
> Dear David,
> 
> Thank you very much for all the new information!
> 
> I keep only those parts that I want to react to.
> 
> > > It is not a fundamental issue, but it seems to me that during my tests not
> > > only four but five CPU cores were used by IP packet forwarding:
> > the packet processing is done in kernel threads (task queues are built
> > on threads), and those threads could be scheduled on any cpu. the
> > pf purge processing runs in yet another thread.
> > 
> > iirc, the scheduler scans down the list of cpus looking for an idle
> > one when it needs to run stuff, except to avoid cpu0 if possible.
> > this is why you see most of the system time on cpus 1 to 5.
> 
> Yes, I can confirm that any time I observed, CPU00 was not used by the
> system tasks.
> 
> However, I remembered that PF was disabled during my stateless tests, so I
> think its purge could not be the one that used CPU05. Now I repeated the
> experiment, first disabling PF as follows:

disabling pf means it doesnt get run for packets in the network stack.
however, once the state purge processing is started it just keeps
running. if you have zero states, there wont be much to process though.

there will be other things running in the system that could account for
the "extra" cpu utilisation.

> dut# pfctl -d
> pf disabled
> 
> And I can still see FIVE CPU cores used by system tasks:

the network stack runs in these threads. pf is just one part of the
network stack.

> 
> load averages:  0.69,  0.29,  0.13    dut.cntrg 14:41:06
> 36 processes: 35 idle, 1 on processor    up 0 days 00:03:46
> CPU00 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  8.1% intr, 91.7% idle
> CPU01 states:  0.0% user,  0.0% nice, 61.1% sys,  9.5% spin,  9.5% intr, 19.8% idle
> CPU02 states:  0.0% user,  0.0% nice, 62.8% sys, 10.9% spin,  8.5% intr, 17.8% idle
> CPU03 states:  0.0% user,  0.0% nice, 54.7% sys,  9.1% spin, 10.1% intr, 26.0% idle
> CPU04 states:  0.0% user,  0.0% nice, 62.7% sys, 10.2% spin,  9.8% intr, 17.4% idle
> CPU05 states:  0.0% user,  0.0% nice, 51.7% sys,  9.1% spin,  7.6% intr, 31.6% idle
> CPU06 states:  0.2% user,  0.0% nice,  2.8% sys,  0.8% spin, 10.0% intr, 86.1% idle
> CPU07 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  7.2% intr, 92.6% idle
> CPU08 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  8.4% intr, 91.6% idle
> CPU09 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  9.2% intr, 90.8% idle
> CPU10 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 10.8% intr, 89.0% idle
> CPU11 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  9.2% intr, 90.6% idle
> CPU12 states:  0.0% user,  0.0% nice,  0.2% sys,  0.8% spin,  9.2% intr, 89.8% idle
> CPU13 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  7.2% intr, 92.6% idle
> CPU14 states:  0.0% user,  0.0% nice,  0.0% sys,  0.8% spin,  9.8% intr, 89.4% idle
> CPU15 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  7.8% intr, 92.0% idle
> Memory: Real: 34M/1546M act/tot Free: 122G Cache: 807M Swap: 0K/256M
> 
> I suspect that top shows an average (in a few seconds' time window) and
> perhaps one of the cores from CPU01 to CPU04 is skipped (e.g. because it
> was used by the "top" command?), which is why I can see system load on CPU05.
> (There is even some low amount of system load on CPU06.)
> 
> 
> > > *Is there any way to completely delete its entire content?*
> > hrm.
> > 
> > so i just read the code again. "pfctl -F states" goes through the whole
> > state table and unlinks the states from the red-black trees used for
> > packet processing, and then marks them as unlinked so the purge process
> > can immediately claim them as soon as they're scanned. this means that
> > in terms of packet processing the tree is empty. the memory (which is
> > what the state limit applies to) won't be reclaimed until the purge
> > processing takes them.
> > 
> > if you just wait 10 or so seconds after "pfctl -F states" then both the
> > tree and state limits should be back to 0. you can watch pfctl -si,
> > "systat pf", or the pfstate row in "systat pool" to confirm.
> > 
> > you can change the scan interval with "set timeout interval" in pf.conf
> > from 10s. no one fiddles with that though, so i'd put it back between
> > runs to be representative of real world performance.
> 
> I usually wait 10s between the consecutive steps of the binary search of my
> measurements to give the system a chance to relax (trying to ensure that the
> steps are independent measurements). However, the timeout interval of PF was
> set to 1 hour (using "set timeout interval 3600"). You may ask, why?
> 
> To have some well defined performance metrics, and to define repeatable and
> reproducible measurements, we use the following tests:
> - maximum connection establishment 

Re: pf state-table-induced instability

2023-08-31 Thread Gabor LENCSE

Dear David,

Thank you very much for all the new information!

I keep only those parts that I want to react to.


It is not a fundamental issue, but it seems to me that during my tests not
only four but five CPU cores were used by IP packet forwarding:

the packet processing is done in kernel threads (task queues are built
on threads), and those threads could be scheduled on any cpu. the
pf purge processing runs in yet another thread.

iirc, the scheduler scans down the list of cpus looking for an idle
one when it needs to run stuff, except to avoid cpu0 if possible.
this is why you see most of the system time on cpus 1 to 5.


Yes, I can confirm that any time I observed, CPU00 was not used by the 
system tasks.


However, I remembered that PF was disabled during my stateless tests, so 
I think its purge could not be the one that used CPU05. Now I repeated 
the experiment, first disabling PF as follows:


dut# pfctl -d
pf disabled

And I can still see FIVE CPU cores used by system tasks:

load averages:  0.69,  0.29, 0.13   
dut.cntrg 14:41:06

36 processes: 35 idle, 1 on processor up 0 days 00:03:46
CPU00 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 8.1% intr, 
91.7% idle
CPU01 states:  0.0% user,  0.0% nice, 61.1% sys,  9.5% spin, 9.5% intr, 
19.8% idle
CPU02 states:  0.0% user,  0.0% nice, 62.8% sys, 10.9% spin, 8.5% intr, 
17.8% idle
CPU03 states:  0.0% user,  0.0% nice, 54.7% sys,  9.1% spin, 10.1% intr, 
26.0% idle
CPU04 states:  0.0% user,  0.0% nice, 62.7% sys, 10.2% spin, 9.8% intr, 
17.4% idle
CPU05 states:  0.0% user,  0.0% nice, 51.7% sys,  9.1% spin, 7.6% intr, 
31.6% idle
CPU06 states:  0.2% user,  0.0% nice,  2.8% sys,  0.8% spin, 10.0% intr, 
86.1% idle
CPU07 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 7.2% intr, 
92.6% idle
CPU08 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin, 8.4% intr, 
91.6% idle
CPU09 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin, 9.2% intr, 
90.8% idle
CPU10 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 10.8% intr, 
89.0% idle
CPU11 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 9.2% intr, 
90.6% idle
CPU12 states:  0.0% user,  0.0% nice,  0.2% sys,  0.8% spin, 9.2% intr, 
89.8% idle
CPU13 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 7.2% intr, 
92.6% idle
CPU14 states:  0.0% user,  0.0% nice,  0.0% sys,  0.8% spin, 9.8% intr, 
89.4% idle
CPU15 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 7.8% intr, 
92.0% idle

Memory: Real: 34M/1546M act/tot Free: 122G Cache: 807M Swap: 0K/256M

I suspect that top shows an average (in a few seconds' time window) and 
perhaps one of the cores from CPU01 to CPU04 is skipped (e.g. because 
it was used by the "top" command?), which is why I can see system load on 
CPU05. (There is even some low amount of system load on CPU06.)




*Is there any way to completely delete its entire content?*

hrm.

so i just read the code again. "pfctl -F states" goes through the whole
state table and unlinks the states from the red-black trees used for
packet processing, and then marks them as unlinked so the purge process
can immediately claim them as soon as they're scanned. this means that
in terms of packet processing the tree is empty. the memory (which is
what the state limit applies to) won't be reclaimed until the purge
processing takes them.

if you just wait 10 or so seconds after "pfctl -F states" then both the
tree and state limits should be back to 0. you can watch pfctl -si,
"systat pf", or the pfstate row in "systat pool" to confirm.

you can change the scan interval with "set timeout interval" in pf.conf
from 10s. no one fiddles with that though, so i'd put it back between
runs to be representative of real world performance.
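
As a concrete sequence between benchmark runs (a sketch; exact counter
names vary by version):

    pfctl -F states
    sleep 10                                 # let the purge scan reclaim the memory
    pfctl -si | grep -A 3 -i 'state table'   # 'current entries' should be back to 0
    # and in pf.conf, restore the default scan interval between runs:
    #   set timeout interval 10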


I usually wait 10s between the consecutive steps of the binary search of 
my measurements to give the system a chance to relax (trying to ensure 
that the steps are independent measurements). However, the timeout 
interval of PF was set to 1 hour (using "set timeout interval 3600"). 
You may ask, why?


To have some well defined performance metrics, and to define repeatable 
and reproducible measurements, we use the following tests:
- maximum connection establishment rate (during this test all test 
frames result in a new connection)
- throughput with bidirectional traffic as required by RFC 2544 (during 
this test no test frames result in a new connection, neither connection 
time out happens -- a sufficiently high timeout could guarantee it)
- connection tear down performance (first loading N number of 
connections and then deleting all connections in a single step and 
measuring the execution time of the deletion: connection tear down rate 
= N / deletion time of N connections)


It is a good question how well the above performance metrics can 
represent the "real world" performance of a stateful NAT64 implementation!


If you are interested (and have time) I would be happy to work together 
with you in this area. We 

Re: pf state-table-induced instability

2023-08-30 Thread David Gwynne
On Wed, Aug 30, 2023 at 09:54:45AM +0200, Gabor LENCSE wrote:
> Dear David,
> 
> Thank you very much for your detailed answer! Now I have got the explanation
> for seemingly rather strange things. :-)
> 
> However, I have some further questions. Let me explain what I do now so that
> you can more clearly see the background.
> 
> I have recently enabled siitperf to use multiple IP addresses. (Siitperf is
> an IPv4, IPv6, SIIT, and stateful NAT64/NAT44 benchmarking tool
> implementing the measurements of RFC 2544, RFC 8219, and this draft:
> https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-benchmarking-stateful
> .)
> 
> Currently I want to test (and demonstrate) the difference this improvement
> has made. I have already covered the stateless case by measuring the IPv4
> and IPv6 packet forwarding performance of OpenBSD using
> 1) the very same test frames following the test frame format defined in the
> appendix of RFC 2544
> 2) using only pseudorandom port numbers required by RFC 4814 (resulted in no
> performance improvement compared to case 1)
> 3) using pseudorandom IP addresses from specified ranges (resulted in
> significant performance improvement compared to case 1)
> 4) using both pseudorandom IP addresses and port numbers (same results as in
> case 3)
> 
> Many thanks to OpenBSD developers for enabling multi-core IP packet
> forwarding!
> 
> https://www.openbsd.org/plus72.html says: "Activated parallel IP forwarding,
> starting 4 softnet tasks but limiting the usage to the number of CPUs."
> 
> It is not a fundamental issue, but it seems to me that during my tests not
> only four but five CPU cores were used by IP packet forwarding:

the packet processing is done in kernel threads (task queues are built
on threads), and those threads could be scheduled on any cpu. the
pf purge processing runs in yet another thread.

> iirc, the scheduler scans down the list of cpus looking for an idle
one when it needs to run stuff, except to avoid cpu0 if possible.
this is why you see most of the system time on cpus 1 to 5.

> 
> load averages:  1.34,  0.35,  0.12    dut.cntrg 20:10:15
> 36 processes: 35 idle, 1 on processor    up 1 days 02:16:56
> CPU00 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  6.1% intr, 93.7% idle
> CPU01 states:  0.0% user,  0.0% nice, 55.8% sys,  7.2% spin,  5.2% intr, 31.9% idle
> CPU02 states:  0.0% user,  0.0% nice, 53.6% sys,  8.0% spin,  6.2% intr, 32.1% idle
> CPU03 states:  0.0% user,  0.0% nice, 48.3% sys,  7.2% spin,  6.2% intr, 38.3% idle
> CPU04 states:  0.0% user,  0.0% nice, 44.2% sys,  9.7% spin,  6.3% intr, 39.8% idle
> CPU05 states:  0.0% user,  0.0% nice, 33.5% sys,  5.8% spin,  6.4% intr, 54.3% idle
> CPU06 states:  0.0% user,  0.0% nice,  3.2% sys,  0.2% spin,  7.2% intr, 89.4% idle
> CPU07 states:  0.0% user,  0.0% nice,  0.0% sys,  0.8% spin,  6.0% intr, 93.2% idle
> CPU08 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  5.4% intr, 94.4% idle
> CPU09 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  7.2% intr, 92.6% idle
> CPU10 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  8.9% intr, 90.9% idle
> CPU11 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  7.6% intr, 92.2% idle
> CPU12 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin,  8.6% intr, 91.4% idle
> CPU13 states:  0.0% user,  0.0% nice,  0.0% sys,  0.4% spin,  6.1% intr, 93.5% idle
> CPU14 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin,  6.4% intr, 93.4% idle
> CPU15 states:  0.0% user,  0.0% nice,  0.0% sys,  0.4% spin,  4.8% intr, 94.8% idle
> Memory: Real: 34M/2041M act/tot Free: 122G Cache: 825M Swap: 0K/256M
> 
> The above output of the "top" command shows significant system load at CPU
> cores from CPU1 to CPU5.
> 
> *Has the number of softnet tasks been increased from 4 to 5?*

no :)

> What is more crucial for me are the stateful NAT64 measurements with
> PF.
> 
> My stateful NAT64 measurements are as follows.
> 
> 1. Maximum connection establishment rate test uses a binary search to find
> the highest rate, at which all connections can be established through the
> stateful NAT64 gateway when all test frames create a new connection.
> 
> 2. Throughput test also uses a binary search to find the highest rate
> (called throughput) at which all test frames are forwarded by the stateful
> NAT64 gateway using bidirectional traffic. (All test frames belong to an
> already existing connection. This test requires to load the connections into
> the connection tracking table of the stateful NAT64 gateway in a previous
> step using a safely lower rate than determined by the maximum connection
> establishment rate test.)
> 
> And both tests need to repeat multiple times to acquire statistically
> reliable results.
> 
> As for the explanation of the seemingly deteriorating performance of PF, now
> I understand from your 

Re: pf state-table-induced instability

2023-08-30 Thread Gabor LENCSE

Dear David,

Thank you very much for your detailed answer! Now I have got the 
explanation for seemingly rather strange things. :-)


However, I have some further questions. Let me explain what I do now so 
that you can more clearly see the background.


I have recently enabled siitperf to use multiple IP addresses. (Siitperf 
is an IPv4, IPv6, SIIT, and stateful NAT64/NAT44 benchmarking tool 
implementing the measurements of RFC 2544, RFC 8219, and this draft: 
https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-benchmarking-stateful 
.)


Currently I want to test (and demonstrate) the difference this 
improvement has made. I have already covered the stateless case by 
measuring the IPv4 and IPv6 packet forwarding performance of OpenBSD using
1) the very same test frames following the test frame format defined in 
the appendix of RFC 2544
2) using only pseudorandom port numbers required by RFC 4814 (resulted 
in no performance improvement compared to case 1)
3) using pseudorandom IP addresses from specified ranges (resulted in 
significant performance improvement compared to case 1)
4) using both pseudorandom IP addresses and port numbers (same results 
as in case 3)


Many thanks to OpenBSD developers for enabling multi-core IP packet 
forwarding!


https://www.openbsd.org/plus72.html says: "Activated parallel IP 
forwarding, starting 4 softnet tasks but limiting the usage to the 
number of CPUs."


It is not a fundamental issue, but it seems to me that during my tests 
not only four but five CPU cores were used by IP packet forwarding:


load averages:  1.34,  0.35, 0.12   
dut.cntrg 20:10:15

36 processes: 35 idle, 1 on processor up 1 days 02:16:56
CPU00 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 6.1% intr, 
93.7% idle
CPU01 states:  0.0% user,  0.0% nice, 55.8% sys,  7.2% spin, 5.2% intr, 
31.9% idle
CPU02 states:  0.0% user,  0.0% nice, 53.6% sys,  8.0% spin, 6.2% intr, 
32.1% idle
CPU03 states:  0.0% user,  0.0% nice, 48.3% sys,  7.2% spin, 6.2% intr, 
38.3% idle
CPU04 states:  0.0% user,  0.0% nice, 44.2% sys,  9.7% spin, 6.3% intr, 
39.8% idle
CPU05 states:  0.0% user,  0.0% nice, 33.5% sys,  5.8% spin, 6.4% intr, 
54.3% idle
CPU06 states:  0.0% user,  0.0% nice,  3.2% sys,  0.2% spin, 7.2% intr, 
89.4% idle
CPU07 states:  0.0% user,  0.0% nice,  0.0% sys,  0.8% spin, 6.0% intr, 
93.2% idle
CPU08 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 5.4% intr, 
94.4% idle
CPU09 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 7.2% intr, 
92.6% idle
CPU10 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 8.9% intr, 
90.9% idle
CPU11 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 7.6% intr, 
92.2% idle
CPU12 states:  0.0% user,  0.0% nice,  0.0% sys,  0.0% spin, 8.6% intr, 
91.4% idle
CPU13 states:  0.0% user,  0.0% nice,  0.0% sys,  0.4% spin, 6.1% intr, 
93.5% idle
CPU14 states:  0.0% user,  0.0% nice,  0.0% sys,  0.2% spin, 6.4% intr, 
93.4% idle
CPU15 states:  0.0% user,  0.0% nice,  0.0% sys,  0.4% spin, 4.8% intr, 
94.8% idle

Memory: Real: 34M/2041M act/tot Free: 122G Cache: 825M Swap: 0K/256M

The above output of the "top" command shows significant system load at 
CPU cores from CPU1 to CPU5.


*Has the number of softnet tasks been increased from 4 to 5?*

What is more crucial for me are the stateful NAT64 measurements 
with PF.


My stateful NAT64 measurements are as follows.

1. Maximum connection establishment rate test uses a binary search to 
find the highest rate, at which all connections can be established 
through the stateful NAT64 gateway when all test frames create a new 
connection.


2. Throughput test also uses a binary search to find the highest rate 
(called throughput) at which all test frames are forwarded by the 
stateful NAT64 gateway using bidirectional traffic. (All test frames 
belong to an already existing connection. This test requires to load the 
connections into the connection tracking table of the stateful NAT64 
gateway in a previous step using a safely lower rate than determined by 
the maximum connection establishment rate test.)


And both tests need to repeat multiple times to acquire statistically 
reliable results.


As for the seemingly deteriorating performance of PF, I now understand 
from your explanation that the "pfctl -F states" command does not 
immediately delete the content of the connection tracking table.


*Is there any way to completely delete its entire content?*

(E.g., under Linux, I can delete the connection tracking table of 
iptables or Jool by deleting the appropriate kernel module.)


Of course, I can delete it by rebooting the server. However, currently I 
use a Dell PowerEdge R730 server, and its complete reboot (including 
stopping OpenBSD, initialization of the hardware, booting OpenBSD and 
some spare time) takes 5 minutes. This is way too long an overhead if I 
need to do it between every single elementary step (that is, the steps 
of the binary search) 

Re: pf state-table-induced instability

2023-08-28 Thread David Gwynne
On Mon, Aug 28, 2023 at 01:46:32PM +0200, Gabor LENCSE wrote:
> Hi Lyndon,
> 
> Sorry for my late reply. Please see my answers inline.
> 
> On 8/24/2023 11:13 PM, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote:
> > Gabor LENCSE writes:
> > 
> > > If you are interested, you can find the results in Tables 18 - 20 of
> > > this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009
> > Thanks for the pointer -- that's a very interesting paper.
> > 
> > After giving it a quick read through, one thing immediately jumps
> > out.  The paper mentions (section A.4) a boost in performance after
> > increasing the state table size limit.  Not having looked at the
> > relevant code, so I'm guessing here, but this is a classic indicator
> > of a hashing algorithm falling apart when the table gets close to
> > full.  Could it be that simple?  I need to go digging into the pf
> > code for a closer look.
> 
> Beware, I wrote it about iptables and not PF!
> 
> As for iptables, it is really so simple. I have done a deeper analysis of
> iptables performance as the function of its hash table size. It is
> documented in another (open access) paper:
> http://doi.org/10.36244/ICJ.2023.1.6
> 
> However, I am not familiar with the internals of the other two tested
> stateful NAT64 implementations, Jool and OpenBSD PF. I have no idea, what
> kind of data structures they use for storing the connections.

openbsd uses a red-black tree to look up states. packets are parsed into
a key that looks up states by address family, ips, ipproto, ports, etc,
to find the relevant state. if a state isnt found, it falls through to
ruleset evaluation, which is notionally a linked list, but has been
optimised.
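
you can see those keys and the lookup activity from userland, e.g.
(a sketch; output format varies by version):

    # individual states, printed with the keys they are looked up by
    pfctl -ss -vv | head
    # aggregate counters: searches, inserts, removals
    pfctl -si | grep -A 4 -i 'state table'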

> > You also describe how the performance degrades over time.  This
> > exactly matches the behaviour we see.  Could the fix be as simple
> > as cranking 'set limit states' up to, say, two million?  There is
> > one way to find out ... :-)
> 
> As you could see, the highest number of connections was 40M, and the limit
> of the states was set to 1000M. It worked well for me then with the PF of
> OpenBSD 7.1.
> 
> It would be interesting to find the root cause of the phenomenon, why the
> performance of PF seems to deteriorate with time. E.g., somehow the internal
> data structures of PF become "polluted" if many connections are established
> and then deleted?

my first guess is that you're starting to fight against the pf state
purge processing. pf tries to scan the entire state table every 10
seconds (by default) looking for expired states it can remove. this scan
process runs every second, but it tries to cover the whole state table
by 10 seconds. the more states you have the more time this takes, and
this increases linearly with the number of states you have.

until relatively recently (post 7.2), the scan and gc processing
effectively stopped the world. at work we run with about 2 million
states during business hours, and i was seeing the gc processing take up
approx 70ms a second, during which packet processing didnt really
happen.

now the scan can happen without blocking pf packet processing. it still
takes cpu time, so there is a point that processing packets and scanning
for states will fight each other for time, but at least they're not
fighting each other for locks now.

> However, I have deleted the content of the state table after each elementary
> measurement step using the "pfctl -F states" command. (I am sorry, this
> command is missing from the paper, but it is there in my saved "del-pf"
> file!)
> 
> Perhaps PF developers could advise us, if the deletion of the states
> generate a fresh state table or not.

it marks the states as expired, and then the purge scan is able to take
them and actually free them.

> Could anyone help us in this question?
> 
> Best regards,
> 
> Gábor
> 
> 
> 
> 
> I use binary search to find the highest lossless rate (throughput).
> Especially w
> 
> 
> > 
> > --lyndon
> 



Re: pf state-table-induced instability

2023-08-28 Thread Gabor LENCSE

Hi Lyndon,

Sorry for my late reply. Please see my answers inline.

On 8/24/2023 11:13 PM, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote:

Gabor LENCSE writes:


If you are interested, you can find the results in Tables 18 - 20 of
this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009

Thanks for the pointer -- that's a very interesting paper.

After giving it a quick read through, one thing immediately jumps
out.  The paper mentions (section A.4) a boost in performance after
increasing the state table size limit.  Not having looked at the
relevant code, so I'm guessing here, but this is a classic indicator
of a hashing algorithm falling apart when the table gets close to
full.  Could it be that simple?  I need to go digging into the pf
code for a closer look.


Beware, I wrote it about iptables and not PF!

As for iptables, it is really so simple. I have done a deeper analysis 
of iptables performance as the function of its hash table size. It is 
documented in another (open access) paper: 
http://doi.org/10.36244/ICJ.2023.1.6


However, I am not familiar with the internals of the other two tested 
stateful NAT64 implementations, Jool and OpenBSD PF. I have no idea, 
what kind of data structures they use for storing the connections.



You also describe how the performance degrades over time.  This
exactly matches the behaviour we see.  Could the fix be as simple
as cranking 'set limit states' up to, say, two million?  There is
one way to find out ... :-)


As you could see, the highest number of connections was 40M, and the 
limit of the states was set to 1000M. It worked well for me then with 
the PF of OpenBSD 7.1.


It would be interesting to find the root cause of the phenomenon, why 
the performance of PF seems to deteriorate with time. E.g., somehow the 
internal data structures of PF become "polluted" if many connections are 
established and then deleted?


However, I have deleted the content of the state table after each 
elementary measurement step using the "pfctl -F states" command. (I am 
sorry, this command is missing from the paper, but it is there in my 
saved "del-pf" file!)


Perhaps PF developers could advise us, if the deletion of the states 
generate a fresh state table or not.


Could anyone help us in this question?

Best regards,

Gábor




I use binary search to find the highest lossless rate (throughput). 
Especially w





--lyndon




Re: pf state-table-induced instability

2023-08-24 Thread Daniel Melameth
On Thu, Aug 24, 2023 at 12:31 PM Lyndon Nerenberg (VE7TFX/VE6BBM)
 wrote:
> For over a year now we have been seeing instability on our firewalls
> that seems to kick in when our state tables approach 200K entries.
> The number varies, but it's a safe bet that once we cross the 180K
> threshold, the machines start getting cranky.  At 200K+ performance
> visibly degrades, often leading to a complete lockup of the network
> stack, or a spontaneous reboot.

...

> Our pf settings are pretty simple:
>
>   set optimization normal
>   set ruleset-optimization basic
>   set limit states 400000
>   set limit src-nodes 100000
>   set loginterface none
>   set skip on lo
>   set reassemble yes
>
>   # Reduce the number of state table entries in FIN_WAIT_2 state.
>   set timeout tcp.finwait 4

I don't know if there is any relation, but, with 400,000 states
defined, adaptive scaling should start to kick in at around 240,000
states.
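
That is, per pf.conf(5) the adaptive timeouts default to 60%/120% of the
state limit, so the configuration above behaves as if it had been written
out explicitly (values illustrative):

    set limit states 400000
    set timeout { adaptive.start 240000, adaptive.end 480000 }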



Re: pf state-table-induced instability

2023-08-24 Thread Daniel Melameth
On Thu, Aug 24, 2023 at 2:57 PM Gabor LENCSE  wrote:
> I used OpenBSD 7.1 PF during stateful NAT64 benchmarking measurements
> from 400,000 to 40,000,000 states. (Of course, its connection setup and
> packet forwarding performance degraded with the number of states, but
> the degradation was not very drastic.)
>
> If you are interested, you can find the results in Tables 18 - 20 of
> this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009

Seriously awesome paper with volumes of detail--thank you!



Re: pf state-table-induced instability

2023-08-24 Thread Lyndon Nerenberg (VE7TFX/VE6BBM)
Gabor LENCSE writes:

> If you are interested, you can find the results in Tables 18 - 20 of 
> this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009

Thanks for the pointer -- that's a very interesting paper.

After giving it a quick read through, one thing immediately jumps
out.  The paper mentions (section A.4) a boost in performance after
increasing the state table size limit.  Not having looked at the
relevant code, so I'm guessing here, but this is a classic indicator
of a hashing algorithm falling apart when the table gets close to
full.  Could it be that simple?  I need to go digging into the pf
code for a closer look.

You also describe how the performance degrades over time.  This
exactly matches the behaviour we see.  Could the fix be as simple
as cranking 'set limit states' up to, say, two million?  There is
one way to find out ... :-)

--lyndon



Re: pf state-table-induced instability

2023-08-24 Thread Gabor LENCSE

Hi,


But my immediate (and only -- please do NOT start a bikeshed on
ruleset design!) question is:

Is there a practical limit on the number of states pf can handle?


I used OpenBSD 7.1 PF during stateful NAT64 benchmarking measurements 
from 400,000 to 40,000,000 states. (Of course, its connection setup and 
packet forwarding performance degraded with the number of states, but 
the degradation was not very drastic.)


If you are interested, you can find the results in Tables 18 - 20 of 
this (open access) paper: https://doi.org/10.1016/j.comcom.2023.08.009


Best regards,

Gábor



Re: PF rate limiting options valid for UDP?

2023-07-20 Thread Otto Moerbeek
On Thu, Jul 20, 2023 at 05:52:07PM +, mabi wrote:

> --- Original Message ---
> On Wednesday, July 19th, 2023 at 10:58 PM, Stuart Henderson 
>  wrote:
> 
> > For rules that pass traffic to your authoritative DNS servers,
> > I don't think you need much longer than the time taken to answer a
> > query. So could be quite a bit less.
> 
> Right good point, I will add custom state timeouts for this specific UDP pass 
> rule on port 53.
> 
> > Usually carp/ospf will enter the state table before the machines start
> > seeing large amounts of packets and stay there, which is what you would
> > normally want. If the state table is full, you have more problem
> > opening new connections that require state to be added than you do
> > maintaining existing ones.
> > 
> > fwiw I typically use this on ospf+carp machines, "pass quick proto
> > {carp, ospf} keep state (no-sync) set prio 7"
> 
> That's very interesting, I never realized there was a simple priority system 
> ready to use in PF without the need of setting up any queues. Probably the 
> "set prio 7" option on OSPF+CARP pass rules will juts do the trick and I will 
> definitely also implement this. 
> 
> > DNS server software is written with this type of traffic in mind, and
> > has more information available (from inside the DNS request packet)
> > to make a decision about what to do with it, than is available in a
> > general-purpose packet filter like PF.
> > 
> > Also it stores the tracking information in data structures that have
> > been chosen to make sense for this use (and common DNS servers default
> > to masking on common subnet sizes, reducing the amount they have to
> > store compared to tracking the full IP address).
> > 
> > http://man.openbsd.org/nsd.conf#rrl
> > https://bind9.readthedocs.io/en/latest/reference.html#response-rate-limiting
> > https://www.knot-dns.cz/docs/2.4/html/reference.html#module-rrl
> 
> Too bad I use PowerDNS, it does not seem to offer many parameters related to 
> rate-limiting for UDP but for TCP I found at least max-tcp-connections. Maybe 
> it's time for a change as Gabor mentions his tests in his reply (thanks 
> btw!)...
> 

In a typical PowerDNS setup the task of rate limiting is done by dnsdist.

-Otto



Re: PF rate limiting options valid for UDP?

2023-07-20 Thread mabi
--- Original Message ---
On Wednesday, July 19th, 2023 at 10:58 PM, Stuart Henderson 
 wrote:

> For rules that pass traffic to your authoritative DNS servers,
> I don't think you need much longer than the time taken to answer a
> query. So could be quite a bit less.

Right good point, I will add custom state timeouts for this specific UDP pass 
rule on port 53.

> Usually carp/ospf will enter the state table before the machines start
> seeing large amounts of packets and stay there, which is what you would
> normally want. If the state table is full, you have more problem
> opening new connections that require state to be added than you do
> maintaining existing ones.
> 
> fwiw I typically use this on ospf+carp machines, "pass quick proto
> {carp, ospf} keep state (no-sync) set prio 7"

That's very interesting, I never realized there was a simple priority system 
ready to use in PF without the need of setting up any queues. Probably the "set 
prio 7" option on OSPF+CARP pass rules will juts do the trick and I will 
definitely also implement this. 

> DNS server software is written with this type of traffic in mind, and
> has more information available (from inside the DNS request packet)
> to make a decision about what to do with it, than is available in a
> general-purpose packet filter like PF.
> 
> Also it stores the tracking information in data structures that have
> been chosen to make sense for this use (and common DNS servers default
> to masking on common subnet sizes, reducing the amount they have to
> store compared to tracking the full IP address).
> 
> http://man.openbsd.org/nsd.conf#rrl
> https://bind9.readthedocs.io/en/latest/reference.html#response-rate-limiting
> https://www.knot-dns.cz/docs/2.4/html/reference.html#module-rrl

Too bad I use PowerDNS, it does not seem to offer many parameters related to 
rate-limiting for UDP but for TCP I found at least max-tcp-connections. Maybe 
it's time for a change as Gabor mentions his tests in his reply (thanks btw!)...



Re: PF rate limiting options valid for UDP?

2023-07-19 Thread Gabor LENCSE

Hi,


Are you already using your DNS server's response rate limiting features?

Not yet, as I still believe I should stop as much as possible such traffic at 
the firewall before it even reaches the network behind my firewall. So at the 
software/daemon/service level it would be my last line of defense.


If your hardware is powerful enough (e.g. at least 10Gbps Ethernet and 
the authoritative DNS server has let us say 32 CPU cores) you could also 
try fending off the DoS attack simply by using NSD or Knot DNS instead 
of BIND. According to my measurements, they both outperformed BIND by a 
factor or 10.


If you are interested, you can find all the details in my open access 
paper: G. Lencse, "Benchmarking Authoritative DNS Servers", /IEEE 
Access/, vol. 8. pp. 130224-130238, July 2020. 
https://doi.org/10.1109/ACCESS.2020.3009141


Best regards,

Gábor


Re: PF rate limiting options valid for UDP?

2023-07-19 Thread Stuart Henderson
On 2023/07/19 19:54, mabi wrote:
> --- Original Message ---
> On Wednesday, July 19th, 2023 at 9:32 PM, Stuart Henderson 
>  wrote:
> 
> > If PF is struggling as it is, there's a good chance it will buckle
> > completely if it has to do source tracking too
> 
> That is also something I thought might be the case :|
> 
> > Did you already tweak timeouts for the rule passing UDP DNS traffic?
> > Defaults are 60s/30s/60s for udp.first, udp.single and udp.multiple
> > respectively, that is much too high for a very busy DNS server -
> > you can set them on the specific rule itself rather than changing
> > defaults for all rules. For an auth server which is expected to
> > respond quickly they can be cranked way down.
> 
> Yes, this at least I did since quite some time now and use the following 
> timeout settings:
> 
> set timeout udp.first 20
> set timeout udp.multiple 20
> set timeout udp.single 10
> 
> Do you think I could go even lower? When I check the PF state entries during 
> such a DDoS I see mostly states with the "SINGLE" state.

For rules that pass traffic to your authoritative DNS servers,
I don't think you need much longer than the time taken to answer a
query. So could be quite a bit less.
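
e.g. something like this on the rule itself ($auth_dns being whatever
macro or table you use for those servers; the numbers are illustrative):

    pass in quick proto udp to $auth_dns port 53 \
        keep state (udp.first 6, udp.single 3, udp.multiple 6)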

> > (If that is still too many states, I wonder if your network might
> > actually be happier if you "pass quick proto udp to $server port 53 no
> > state" and "pass quick proto udp from $server port 53 no state" right at
> > the top of the ruleset).
> 
> That's actually an excellent idea to bypass PF states and hence consume less 
> resources... Next thing to try out. I was also thinking I should use "no 
> state" with CARP and OSPF rules in pf.conf so that in case the PF state 
> table is full it does not prevent such important protocols from functioning. 
> What do you think, would that also work?

Usually carp/ospf will enter the state table before the machines start
seeing large amounts of packets and stay there, which is what you would
normally want. If the state table is full, you have more of a problem
opening new connections that require state to be added than you do
maintaining existing ones.

fwiw I typically use this on ospf+carp machines, "pass quick proto
{carp, ospf} keep state (no-sync) set prio 7"

> > Are you already using your DNS server's response rate limiting features?
> 
> Not yet, as I still believe I should stop as much of such traffic as possible at 
> the firewall before it even reaches the network behind my firewall. So at the 
> software/daemon/service level it would be my last line of defense.

DNS server software is written with this type of traffic in mind, and
has more information available (from inside the DNS request packet)
to make a decision about what to do with it, than is available in a
general-purpose packet filter like PF.

Also it stores the tracking information in data structures that have
been chosen to make sense for this use (and common DNS servers default
to masking on common subnet sizes, reducing the amount they have to
store compared to tracking the full IP address).

http://man.openbsd.org/nsd.conf#rrl
https://bind9.readthedocs.io/en/latest/reference.html#response-rate-limiting
https://www.knot-dns.cz/docs/2.4/html/reference.html#module-rrl



Re: PF rate limiting options valid for UDP?

2023-07-19 Thread mabi
--- Original Message ---
On Wednesday, July 19th, 2023 at 9:32 PM, Stuart Henderson 
 wrote:

> If PF is struggling as it is, there's a good chance it will buckle
> completely if it has to do source tracking too

That is also something I thought might be the case :|

> Did you already tweak timeouts for the rule passing UDP DNS traffic?
> Defaults are 60s/30s/60s for udp.first, udp.single and udp.multiple
> respectively, that is much too high for a very busy DNS server -
> you can set them on the specific rule itself rather than changing
> defaults for all rules. For an auth server which is expected to
> respond quickly they can be cranked way down.

Yes, I already did this quite some time ago and use the following 
timeout settings:

set timeout udp.first 20
set timeout udp.multiple 20
set timeout udp.single 10

Do you think I could go even lower? When I check the PF state entries during 
such a DDoS I see mostly states with the "SINGLE" state.
 
> (If that is still too many states, I wonder if your network might
> actually be happier if you "pass quick proto udp to $server port 53 no
> state" and "pass quick proto udp from $server port 53 no state" right at
> the top of the ruleset).

That's actually an excellent idea to bypass PF states and hence consume less 
resources... Next thing to try out. I was also thinking I should use "no state" 
with CARP and OSPF rules in pf.conf so that in case the PF state table is full 
it does not prevent such important protocols from functioning. What do you 
think, would that also work?

> Are you already using your DNS server's response rate limiting features?

Not yet, as I still believe I should stop as much of such traffic as possible at 
the firewall before it even reaches the network behind my firewall. So at the 
software/daemon/service level it would be my last line of defense.



Re: PF rate limiting options valid for UDP?

2023-07-19 Thread Stuart Henderson
On 2023/07/19 19:13, mabi wrote:
> --- Original Message ---
> On Wednesday, July 19th, 2023 at 12:40 PM, Stuart Henderson 
>  wrote:
> 
> > I don't think you understood what I wrote then - they are the
> > opposite of helpful here.
> 
> No, I do understand what you wrote, but I should have explained my case
> in more detail. Behind my OpenBSD firewall I have two authoritative DNS
> servers, and because of a recent DDoS originating from >12k IPs against UDP
> port 53 on these two servers, the whole network behind the firewall gets
> unresponsive or has high packet loss because there are over 2 million
> states in the PF state table during the attack. So in my specific case
> I don't care that cloudflare or other external DNS servers can not query
> my authoritative DNS servers for a few seconds or minutes, but I do care
> a lot that the whole rest of my network and the servers behind the OpenBSD
> firewall stay responsive. It's a trade-off I can totally accept and
> welcome. Furthermore, when I have so many state entries due to a DDoS on
> UDP port 53, CARP breaks, as well as the OSPF sessions with my border
> routers, because they can not communicate properly within the defined
> timeouts.

If PF is struggling as it is, there's a good chance it will buckle
completely if it has to do source tracking too

Did you already tweak timeouts for the rule passing UDP DNS traffic?
Defaults are 60s/30s/60s for udp.first, udp.single and udp.multiple
respectively, that is much too high for a very busy DNS server -
you can set them on the specific rule itself rather than changing
defaults for all rules. For an auth server which is expected to
respond quickly they can be cranked way down.

(If that is still too many states, I wonder if your network might
actually be happier if you "pass quick proto udp to $server port 53 no
state" and "pass quick proto udp from $server port 53 no state" right at
the top of the ruleset).

Are you already using your DNS server's response rate limiting features?



Re: PF rate limiting options valid for UDP?

2023-07-19 Thread mabi
--- Original Message ---
On Wednesday, July 19th, 2023 at 12:40 PM, Stuart Henderson 
 wrote:

> I don't think you understood what I wrote then - they are the
> opposite of helpful here.

No, I do understand what you wrote, but I should have explained my case in more 
detail. Behind my OpenBSD firewall I have two authoritative DNS servers, and 
because of a recent DDoS originating from >12k IPs against UDP port 53 on these 
two servers, the whole network behind the firewall gets unresponsive or has high 
packet loss because there are over 2 million states in the PF state table 
during the attack. So in my specific case I don't care that cloudflare or other 
external DNS servers can not query my authoritative DNS servers for a few 
seconds or minutes, but I do care a lot that the whole rest of my network and 
the servers behind the OpenBSD firewall stay responsive. It's a trade-off I can 
totally accept and welcome. Furthermore, when I have so many state entries due 
to a DDoS on UDP port 53, CARP breaks, as well as the OSPF sessions with my 
border routers, because they can not communicate properly within the defined 
timeouts.



Re: PF rate limiting options valid for UDP?

2023-07-19 Thread Kapetanakis Giannis


On 19/07/2023 13:31, Stuart Henderson wrote:
> On 2023-07-19, Kapetanakis Giannis  wrote:
>> Maybe even better, can it run under relayd (redirect) on top of carp?
> That's just rdr-to behind the scenes, no problem with that, though if
> you want to do per IP rate limiting alongside load-balancing you might
> want "mode source-hash" rather than the default round-robin or one of
> the random options.
>
> (I wouldn't recommend sticky-address, because then you get into more
> complex paths inside PF because it has to maintain source-tracking
> information).


I don't think source tracking is that important in this scenario.

relayd will only have one host, which will be the dnsdist listening on 
localhost (on each load balancer).

dnsdist will have whatever it can support with stickiness/source-tracking.

pf rdr-to could also be an option, but then you lose the carp demotion which 
relayd provides.

thanks

G



Re: PF rate limiting options valid for UDP?

2023-07-19 Thread Stuart Henderson
On 2023-07-19, mabi  wrote:
> --- Original Message ---
> On Tuesday, July 18th, 2023 at 10:59 PM, Stuart Henderson 
>  wrote:
>
>
>> PF's state-tracking options are only for TCP. (Blocking an IP
>> based on number of connections from easily spoofed UDP is a good
>> way to let third parties prevent your machine from communicating
>> with IPs that may well get in the way i.e. trigger a "self DoS").
>
> What a pity, these kinds of rate limiting options for UDP would have been 
> quite useful.

I don't think you understood what I wrote then - they are the
opposite of helpful here.

Say you are running a DNS recursive resolver with such protection;
if someone were to send you spoofed high rate packets from the IPs
of the root servers, some big gTLD/ccTLD servers, or big DNS hosters
(cloudflare or someone), your lookups will be quite broken.

Likewise for an authoritative server: send packets with source IPs
of some large DNS recursive resolvers and you then won't be sending
replies to legitimate requests from those resolvers.

The difference with TCP is that someone sending packets needs to
be able to see the response to those packets in order to carry out
the handshake. That's not needed for UDP where a single packet in
one direction is all that's needed.




Re: PF rate limiting options valid for UDP?

2023-07-19 Thread Stuart Henderson
On 2023-07-19, Kapetanakis Giannis  wrote:
> On 18/07/2023 23:59, Stuart Henderson wrote:
>> PF's state-tracking options are only for TCP. (Blocking an IP
>> based on number of connections from easily spoofed UDP is a good
>> way to let third parties prevent your machine from communicating
>> with IPs that may well get in the way i.e. trigger a "self DoS").
>>
>> You may be interested in looking into L7 methods of mitigating
>> problems from high rates of DNS queries - for example dnsdist
>> allows a lot of flexibility in this area.
>
>
> dnsdist looks interesting.
>
> Can it run on top of carp interfaces?

Don't think I tried it, but I don't see why not.

> Maybe even better, can it run under relayd (redirect) on top of carp?

That's just rdr-to behind the scenes, no problem with that, though if
you want to do per IP rate limiting alongside load-balancing you might
want "mode source-hash" rather than the default round-robin or one of
the random options.

(I wouldn't recommend sticky-address, because then you get into more
complex paths inside PF because it has to maintain source-tracking
information).
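
If you stay with plain rdr-to instead of relayd, the same idea can be
written directly in pf.conf (a sketch; the addresses are assumptions):

    pass in on egress inet proto udp to 192.0.2.53 port 53 \
        rdr-to { 10.0.0.1, 10.0.0.2 } port 53 source-hash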
 



Re: PF rate limiting options valid for UDP?

2023-07-19 Thread mabi
--- Original Message ---
On Tuesday, July 18th, 2023 at 10:59 PM, Stuart Henderson 
 wrote:


> PF's state-tracking options are only for TCP. (Blocking an IP
> based on number of connections from easily spoofed UDP is a good
> way to let third parties prevent your machine from communicating
> with IPs that may well get in the way i.e. trigger a "self DoS").

What a pity, these kinds of rate limiting options for UDP would have been 
quite useful.

> You may be interested in looking into L7 methods of mitigating
> problems from high rates of DNS queries - for example dnsdist
> allows a lot of flexibility in this area.

Thanks for the hint about dnsdist, it looks powerful. Still, whenever possible I 
would rather avoid having an extra piece of software and instead have that 
traffic controlled further upstream, ideally on the firewall directly.



Re: PF rate limiting options valid for UDP?

2023-07-19 Thread Kapetanakis Giannis
On 18/07/2023 23:59, Stuart Henderson wrote:
> PF's state-tracking options are only for TCP. (Blocking an IP
> based on number of connections from easily spoofed UDP is a good
> way to let third parties prevent your machine from communicating
> with IPs that may well get in the way i.e. trigger a "self DoS").
>
> You may be interested in looking into L7 methods of mitigating
> problems from high rates of DNS queries - for example dnsdist
> allows a lot of flexibility in this area.


dnsdist looks interesting.

Can it run on top of carp interfaces?

Maybe even better, can it run under relayd (redirect) on top of carp?

G



Re: PF rate limiting options valid for UDP?

2023-07-18 Thread Stuart Henderson
On 2023-07-18, mabi  wrote:
> Hello,
>
> From the following documentation, I am trying to figure out which PF tracking 
> options are also valid for UDP but unfortunately it is not quite clear to me: 
>
> https://man.openbsd.org/pf.conf.5#Stateful_Tracking_Options
>
> My goal would be to add rate limiting options to a PF UDP pass rule in 
> order to limit DDoS/DoS attacks on port 53.
>
> Especially interesting would be the "max-src-states" option. Is this option 
> also valid for UDP?
> 
> Is it also possible to use the "overload" option with UDP in order to add 
> source IPs into a table of attackers which I will then block?

PF's state-tracking options are only for TCP. (Blocking an IP
based on number of connections from easily spoofed UDP is a good
way to let third parties prevent your machine from communicating
with IPs that may well get in the way i.e. trigger a "self DoS").

You may be interested in looking into L7 methods of mitigating
problems from high rates of DNS queries - for example dnsdist
allows a lot of flexibility in this area.
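
For instance, dnsdist's configuration is Lua; a minimal sketch of
per-client-IP rate limiting (the backend address and the qps value
are assumptions):

    newServer({address="127.0.0.1:5300"})
    -- drop clients sending more than ~50 queries/second
    addAction(MaxQPSIPRule(50), DropAction())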




Re: pf state-policy floating to if-bound

2023-06-15 Thread Kapetanakis Giannis
On 15/06/2023 19:07, Peter Nicolai Mathias Hansteen wrote:
>> On 15 Jun 2023, at 16:26, Kapetanakis Giannis  
>> wrote:
>> After applying some keep state (if-bound) on major rules, I've already 
>> found a problem.
>>
>> pfsync.
>>
>> It copies the interface. The interfaces are different on the backup firewall 
>> so the states will not match if I demote master.
>>
>> Any way to overcome this? Maybe filtering with a group name that is the 
>> same on both firewalls?
> Yes, I was going to suggest creating interface groups and referencing those 
> in your rules instead of interfaces.
>
> - P


I believe that will only work for rule copying between the firewalls and not 
state copying with pfsync.

State has an interface (or "all" for floating states) and that is copied 
between pfsync hosts.

For example when filtering with egress group, pfsync copies the egress state's 
interface from primary firewall to backup (different interface names).

It would be nice to add some kind of translation/mapping on the pfsync 
interface, to translate incoming remote states to local interface names.
Don't know how difficult that would be.

G



Re: pf state-policy floating to if-bound

2023-06-15 Thread Peter Nicolai Mathias Hansteen


> On 15 Jun 2023, at 16:26, Kapetanakis Giannis  
> wrote:
> After applying some keep state (if-bound) on major rules, I've already found 
> a problem.
> 
> pfsync.
> 
> It copies the interface. The interfaces are different on the backup firewall 
> so the states will not match if I demote master.
> 
> Any way to overcome this? Maybe filtering with a group name that is the 
> same on both firewalls?

Yes, I was going to suggest creating interface groups and referencing those in 
your rules instead of interfaces.
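
A minimal sketch (group and interface names are assumptions): put the
corresponding interface on each box into the same group, e.g. the line

    group extdmz

in /etc/hostname.em0 on one firewall and /etc/hostname.ix1 on the
other, then filter on the group name in pf.conf:

    pass in on extdmz inet proto tcp to port 443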

- P

--
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
http://bsdly.blogspot.com/ http://www.bsdly.net/ http://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.




Re: pf state-policy floating to if-bound

2023-06-15 Thread Kapetanakis Giannis
On 15/06/2023 17:17, Kapetanakis Giannis wrote:
> Hello,
>
> I'd like to make a change to my firewall/router from the default state-policy 
> floating to if-bound
>
> I believe the way my pf.conf is configured it will not do any harm but I'm 
> being cautious here and I'd like some info.
>
> The way I see it, I have two states for each packet traveling either 
> direction of the firewall.
> One on the incoming interface and one on the outgoing interface for each 
> packet.
> Each state is floating (pfctl -ss gives all)
>
> I filter always on the incoming interface, apply a tag and pass on the 
> outgoing interface everything that matches the tag.
> One tag for packets coming from internet and a different tag for packets 
> coming from my internal network to the internet.
>
> I believe that if all my filtering is like above then changing the default 
> policy will work without any further changes in pf.conf
>
> I don't understand why floating is the default.
> I mean, even with floating states, each state has a direction in/out, thus 
> the same state cannot be applied to multiple interfaces (incoming/outgoing) 
> and a different (floating) state is created on each interface.
>
> There must be a case I'm missing here. Maybe multipath routing?
>
> regards,
>
> Giannis


After applying some keep state (if-bound) on major rules, I've already found a 
problem.

pfsync.

It copies the interface. The interfaces are different on the backup firewall so 
the states will not match if I demote master.

Any way to overcome this? Maybe filtering with a group name that is the same 
on both firewalls?

G
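
For reference, a minimal sketch of the tag-and-pass pattern described
in the quoted message above (macros and tag names are assumptions):

    # filter on the incoming interface only, tagging what is allowed
    pass in on $ext_if inet proto tcp to $web_srv port 443 tag FROM_WAN
    pass in on $int_if from $lan_net tag FROM_LAN
    # pass anything already tagged on the way out
    pass out on $int_if tagged FROM_WAN
    pass out on $ext_if tagged FROM_LAN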



Re: pf - traffic flow through 2 routers

2023-04-30 Thread Roman Samoilenko

Hi.

Check your PF rules and also confirm you have set 
net.inet.ip.forwarding=1 via sysctl.


Regards,
Roman

On 30.04.23 11:23, Gurra wrote:

Hi list,

I’m stuck setting up this configuration - 2 OpenBSD 7.3 boxes
connected via a private network 192.168.2.0/24.
The clients connected to box 1 on 192.168.1.0/24 should be able to reach the 
server
on 192.168.2.0/24 with ip 192.168.2.2 on port 1234 tcp
The communication between  clients and server needs to go through the 
192.168.2.0/24 network
Box 1 can communicate with the server but the clients can not reach the server.


                  internet
                     ^
                     | em0
                     v
               +-----------+  em1
               |  OpenBSD  |<--------------> clients
               |     1     |  192.168.1.0/24
               +-----------+
                     ^ em2  192.168.2.10/24
                     |
                     v
                       em1  192.168.2.1/24
               +-----------+
               |  OpenBSD  |<--------------> server
               |     2     |                 192.168.2.2 port 1234
               +-----------+
                     ^
                     | em0
                     v
                  internet

Any pointers?

Cheers,
Gurra





Re: pf - traffic flow through 2 routers

2023-04-30 Thread Janne Johansson
> I’m stuck setting up this configuration - 2 OpenBSD 7.3 boxes
> connected via a private network 192.168.2.0/24.
> The clients connected to box 1 on 192.168.1.0/24 should be able to reach the 
> server
> on 192.168.2.0/24 with ip 192.168.2.2 on port 1234 tcp
> The communication between  clients and server needs to go through the 
> 192.168.2.0/24 network
> Box 1 can communicate with the server but the clients can not reach the 
> server.
> Any pointers?

Use tcpdump to figure out where those packets go and where they stop
going, so you know on which machine to look for the issue.
If you use PF, enable logging on rules (man pflog) and see which rule
those packets hit.
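
A minimal sketch of both steps (the tcpdump command is run on each
box in turn):

    block log all               # in pf.conf: make dropped packets visible
    tcpdump -nettt -i pflog0    # watch which rule the packets hit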

-- 
May the most significant bit of your life be positive.



Re: PF: Redirect SOCKS connections to another server on a different net

2023-04-24 Thread Charlie
Below is the solution to this problem. For an explanation of why it works,
you may refer to the original answer [1].

# sysctl net.inet.ip.forwarding=1
# cat /etc/pf.conf
  ...
  pass in on re0 proto tcp from any to (re0) port 1080 rdr-to 10.64.0.1 tag nat
  pass out on wg0 proto tcp nat-to (wg0) tagged nat
  ...

[1]
https://marc.info/?l=openbsd-pf&m=168215778109013&w=2

Cheers,
Charlie



Re: pf tcpdump rule def ?

2022-12-28 Thread Shadrock Uhuru
Hi 
many thanks Otto and Stuart
forgot to move my default block rule 
back to the top after adding some ipv6 stuff at the beginning.

have a happy and successful new year.
shadrock



Re: pf tcpdump rule def ?

2022-12-27 Thread Stuart Henderson
On 2022-12-27, Otto Moerbeek  wrote:
> On Tue, Dec 27, 2022 at 04:23:13AM +, Shadrock Uhuru wrote:
>
>> hi everyone
>> viewing my pf logs with
>> tcpdump -nettt -i pflog0 there are lines with no rule numbers
>> just rule def on the line instead,
>> i've tried googling without success,
>> need to know if they are wolf,sheep or misconfigurations causing them,
>> and against which rule do i match them up with.
>> 
>> the following is a snippet showing the rules
>> thanks shadrock
>> 
>> Dec 27 03:00:40.557716 rule 7/(match) block in on em0: 192.168.1.1 > 
>> 224.0.0.1: igmp query [tos 0xc0] [ttl 1]
>> Dec 27 03:00:59.495834 rule 35/(match) block in on pppoe0: 
>> 167.248.133.160.60037 > 88.97.5.79.12473: S 904362479:904362479(0) win 1024
>> Dec 27 03:00:59.813362 rule def/(match) pass in on pppoe0: 
>> 198.252.206.25.443 > 10.2.1.79.13522: P 3251931305:3251931366(61) ack 2708026055 win 63
>> Dec 27 03:00:59.820893 rule def/(match) pass out on pppoe0: 
>> 88.97.5.79.14256 > 198.252.206.25.443: P 4273536371:4273536410(39) ack 3345204755 win 256 (DF)
>> Dec 27 03:00:59.823015 rule def/(match) pass out on pppoe0: 
>> 88.97.5.79.14256 > 198.252.206.25.443: P 39:78(39) ack 1 win 256 <nop,nop,timestamp 380012019 1163664932> (DF)
>> Dec 27 03:00:59.825388 rule def/(match) pass out on pppoe0: 
>> 88.97.5.79.14256 > 198.252.206.25.443: P 78:117(39) ack 1 win 256 <nop,nop,timestamp ... 1163664932> (DF)
>> Dec 27 03:00:59.900318 rule def/(match) pass in on pppoe0: 
>> 198.252.206.25.443 > 10.2.1.79.13522: . ack 40 win 63 <nop,nop,timestamp 1163665020 380012019>
>> Dec 27 03:00:59.902502 rule def/(match) pass in on pppoe0: 
>> 198.252.206.25.443 > 10.2.1.79.13522: . ack 79 win 63 <nop,nop,timestamp 1163665022 380012019>
>> Dec 27 03:00:59.904998 rule def/(match) pass in on pppoe0: 
>> 198.252.206.25.443 > 10.2.1.79.13522: . ack 118 win 63 <nop,nop,timestamp 1163665024 380012019>
>> Dec 27 03:01:03.661072 rule 35/(match) block in on pppoe0: 
>> 45.64.84.24.27789 > 88.97.5.79.23: S 1482753359:1482753359(0) win 30613 <mss 1440>
>> Dec 27 03:01:11.480942 rule 35/(match) block in on pppoe0: 
>> 205.185.127.238.40598 > 88.97.5.79.60001: S 1843251311:1843251311(0) win 65535
>> Dec 27 03:01:11.935746 rule 7/(match) block in on bge0: 0.0.0.0 > 
>> 224.0.0.1: igmp query [len 12] [tos 0xc0] [ttl 1]
>> Dec 27 03:01:25.422772 rule 38/(match) pass in on pppoe0: 
>> 145.131.132.84.443 > 10.2.1.79.42434: P 5666:5697(31) ack 1264 win 244 <nop,nop,timestamp 3399431690 2022623608>
>> Dec 27 03:01:25.422795 rule 38/(match) pass in on pppoe0: 
>> 145.131.132.84.443 > 10.2.1.79.42434: F 5697:5697(0) ack 1264 win 244 <nop,nop,timestamp 3399431690 2022623608>
>> Dec 27 03:01:25.424055 rule 38/(match) pass out on pppoe0: 
>> 88.97.5.79.8748 > 145.131.132.84.443: . ack 5698 win 255 <nop,nop,timestamp ... 3399431690> (DF)
>> Dec 27 03:01:28.600657 rule 37/(match) pass in on pppoe0: 
>> 93.184.220.29.80 > 10.2.1.79.12939: . ack 481 win 131
>> Dec 27 03:01:28.601419 rule 37/(match) pass out on pppoe0: 
>> 88.97.5.79.31263 > 93.184.220.29.80: . ack 575 win 256 <nop,nop,timestamp ... 235524325> (DF)
>> 
>
> def is the default rule. From pf.conf(5):
>
>  Each time a packet processed by the packet filter comes in on or goes out
>  through an interface, the filter rules are evaluated in sequential order,
>  from first to last.  For block and pass, the last matching rule decides
>  what action is taken; if no rule matches the packet, the default action
>  is to pass the packet without creating a state.  For match, rules are
>  evaluated every time they match; the pass/block state of a packet remains
>  unchanged.

...and: you don't normally want to see this, it means you pass some traffic
without keeping state, which can result in problems with TCP window scaling.

I'd recommend using "block all" or "block log all" at the start of your
ruleset, then allow whichever traffic you want afterwards, _even if that
is "pass all"_, so you can be sure that all "pass"ed traffic has state
table entries created.




Re: pf tcpdump rule def ?

2022-12-26 Thread Otto Moerbeek
On Tue, Dec 27, 2022 at 04:23:13AM +, Shadrock Uhuru wrote:

> hi everyone
> viewing my pf logs with
> tcpdump -nettt -i pflog0 there are lines with no rule numbers
> just rule def on the line instead,
> i've tried googling without success,
> need to know if they are wolf,sheep or misconfigurations causing them,
> and against which rule do i match them up with.
> 
> the following is a snippet showing the rules
> thanks shadrock
> 
> Dec 27 03:00:40.557716 rule 7/(match) block in on em0: 192.168.1.1 > 
> 224.0.0.1: igmp query [tos 0xc0] [ttl 1]
> Dec 27 03:00:59.495834 rule 35/(match) block in on pppoe0: 
> 167.248.133.160.60037 > 88.97.5.79.12473: S 904362479:904362479(0) win 1024
> Dec 27 03:00:59.813362 rule def/(match) pass in on pppoe0: 
> 198.252.206.25.443 > 10.2.1.79.13522: P 3251931305:3251931366(61) ack 2708026055 win 63
> Dec 27 03:00:59.820893 rule def/(match) pass out on pppoe0: 
> 88.97.5.79.14256 > 198.252.206.25.443: P 4273536371:4273536410(39) ack 3345204755 win 256 (DF)
> Dec 27 03:00:59.823015 rule def/(match) pass out on pppoe0: 
> 88.97.5.79.14256 > 198.252.206.25.443: P 39:78(39) ack 1 win 256 <nop,nop,timestamp 380012019 1163664932> (DF)
> Dec 27 03:00:59.825388 rule def/(match) pass out on pppoe0: 
> 88.97.5.79.14256 > 198.252.206.25.443: P 78:117(39) ack 1 win 256 <nop,nop,timestamp ... 1163664932> (DF)
> Dec 27 03:00:59.900318 rule def/(match) pass in on pppoe0: 
> 198.252.206.25.443 > 10.2.1.79.13522: . ack 40 win 63 <nop,nop,timestamp 1163665020 380012019>
> Dec 27 03:00:59.902502 rule def/(match) pass in on pppoe0: 
> 198.252.206.25.443 > 10.2.1.79.13522: . ack 79 win 63 <nop,nop,timestamp 1163665022 380012019>
> Dec 27 03:00:59.904998 rule def/(match) pass in on pppoe0: 
> 198.252.206.25.443 > 10.2.1.79.13522: . ack 118 win 63 <nop,nop,timestamp 1163665024 380012019>
> Dec 27 03:01:03.661072 rule 35/(match) block in on pppoe0: 
> 45.64.84.24.27789 > 88.97.5.79.23: S 1482753359:1482753359(0) win 30613 <mss 1440>
> Dec 27 03:01:11.480942 rule 35/(match) block in on pppoe0: 
> 205.185.127.238.40598 > 88.97.5.79.60001: S 1843251311:1843251311(0) win 65535
> Dec 27 03:01:11.935746 rule 7/(match) block in on bge0: 0.0.0.0 > 
> 224.0.0.1: igmp query [len 12] [tos 0xc0] [ttl 1]
> Dec 27 03:01:25.422772 rule 38/(match) pass in on pppoe0: 
> 145.131.132.84.443 > 10.2.1.79.42434: P 5666:5697(31) ack 1264 win 244 <nop,nop,timestamp 3399431690 2022623608>
> Dec 27 03:01:25.422795 rule 38/(match) pass in on pppoe0: 
> 145.131.132.84.443 > 10.2.1.79.42434: F 5697:5697(0) ack 1264 win 244 <nop,nop,timestamp 3399431690 2022623608>
> Dec 27 03:01:25.424055 rule 38/(match) pass out on pppoe0: 
> 88.97.5.79.8748 > 145.131.132.84.443: . ack 5698 win 255 <nop,nop,timestamp ... 3399431690> (DF)
> Dec 27 03:01:28.600657 rule 37/(match) pass in on pppoe0: 
> 93.184.220.29.80 > 10.2.1.79.12939: . ack 481 win 131
> Dec 27 03:01:28.601419 rule 37/(match) pass out on pppoe0: 
> 88.97.5.79.31263 > 93.184.220.29.80: . ack 575 win 256 <nop,nop,timestamp ... 235524325> (DF)
> 

def is the default rule. From pf.conf(5):

 Each time a packet processed by the packet filter comes in on or goes out
 through an interface, the filter rules are evaluated in sequential order,
 from first to last.  For block and pass, the last matching rule decides
 what action is taken; if no rule matches the packet, the default action
 is to pass the packet without creating a state.  For match, rules are
 evaluated every time they match; the pass/block state of a packet remains
 unchanged.

-Otto



Re: pf question - antispoof and loopback

2022-12-24 Thread J Doe



On 2022-12-24 02:32, Philipp Buehler wrote:
> On 22.12.2022 21:37, J Doe wrote:
>>     set skip on lo0
>> . . .
>>     antispoof quick for $ext_if
>
> This one will be faster (a tad) if you do not plan for more
> detailed filtering (and who does so on lo0 besides the
> esoteric ones).
>
> ciao


Hi Philipp,

Thank you for your reply.  Ok, I've gone with:


set skip on lo0
antispoof quick for $ext_if

... as you have recommended.  I was thinking this was correct, but I 
figured a message to misc@ was worth it to double-check and learn 
something new!


- J



Re: pf question - antispoof and loopback

2022-12-23 Thread Philipp Buehler

On 22.12.2022 21:37, J Doe wrote:
> set skip on lo0
> . . .
> antispoof quick for $ext_if

This one will be faster (a tad) if you do not plan for more
detailed filtering (and who does so on lo0 besides the
esoteric ones).

ciao
--
pb



Re: pf question - set skip on wildcards ?

2022-12-13 Thread Philipp Buehler

On 13.12.2022 22:11, J Doe wrote:
> set skip on !$ext_if
>
> ... with the idea that this skips all interfaces (virtual or
> otherwise) _EXCEPT_ em0, which is the real Ethernet NIC that I want to
> perform filtering on ?


Yes, but likely to need a space between ! and $.

ciao
--
pb



Re: pf question - set skip on wildcards ?

2022-12-13 Thread J Doe



On 2022-12-13 01:23, Philipp Buehler wrote:
> On 13.12.2022 06:02, J Doe wrote:
>>     set skip on { lo0, vif* }
>
> in pf.conf(5) the GRAMMAR shows:
>   ifspec = ( [ "!" ] ( interface-name | interface-group ) ) |
>    "{" interface-list "}"
>
> So you could do "set skip on { lo0 vif0 vif1 }" for explicit, or you
> use interface-group, alas "set skip on vif". If that "one" interface
> is e.g. vif7 within vif(4) this MIGHT go: "set skip on { vif !vif7 }".


Hi Philipp,

Ok, so the "!" is a NOT operation ?

If that is the case, could I use:

ext_if = "em0"

set skip on !$ext_if

... with the idea that this skips all interfaces (virtual or otherwise) 
_EXCEPT_ em0, which is the real Ethernet NIC that I want to perform 
filtering on ?


Thanks,

- J



Re: pf question - set skip on wildcards ?

2022-12-12 Thread Philipp Buehler

On 13.12.2022 06:02, J Doe wrote:
> set skip on { lo0, vif* }


in pf.conf(5) the GRAMMAR shows:
 ifspec = ( [ "!" ] ( interface-name | interface-group ) ) |
  "{" interface-list "}"

So you could do "set skip on { lo0 vif0 vif1 }" for explicit, or you
use interface-group, alas "set skip on vif". If that "one" interface
is e.g. vif7 within vif(4) this MIGHT go: "set skip on { vif !vif7 }".

HTH,
--
pb



Re: PF rules to block out every IP from a given country

2022-12-07 Thread Frank Habicht

Hi,

On 07/12/2022 18:36, Peter N. M. Hansteen wrote:
...> and can now be found at 
https://nxdomain.no/~peter/ripe2cidr_country.sh.txt --

as it says in the script itself, a trivial hack.

And I might add, it comes with *NO* warranties of any kind.


I think instead of:

    grep allocated

in the two important lines, it should be:

    egrep '(allocated)|(assigned)'

because both can go to countries.

Frank



Re: PF rules to block out every IP from a given country

2022-12-07 Thread Stuart Henderson
On 2022-12-07, Peter N. M. Hansteen  wrote:
> On Wed, Dec 07, 2022 at 10:28:27AM +1100, Damian McGuckin wrote:
>> 
>> Has anybody created rules such as this and if so, do you have an example?
>
> As others have already indicated, the PF way to do anything like this would be
> to generate a list of addresses and networks you want to address (block in 
> this case),
> feed that list into a table and make the table the criteria for a blocking 
> rule.
>
> I remembered that a few years back I was asked to do something along those 
> lines,
> I forget the exact reason why, but anyway I decided that the most reasonable 
> way
> to determine which IP addresses or ranges belong to a certain country would be
> to fetch the most up to date data from the things RIPE publish. 
>
> My tiny writeup which in fact contains the entire script for massaging RIPE's
> data into something you can feed into a PF table survived a couple of job 
> changes
> and can now be found at https://nxdomain.no/~peter/ripe2cidr_country.sh.txt --
> as it says in the script itself, a trivial hack. 

# 16777216 -> /8 (Not actually found in RIPE data but with ARIN who knows)

btw there are /8's in the RIPE file now. Also prefix lengths smaller than
/26, even down to single addresses, so the subst will need some tweaks to
cover those.

> It is for example quite conceivable that an organization with premises in more
> than one country might want to split their allocations not strictly according
> to national borders.

And other specialities like anycast addresses, and as it's user-supplied
data it can't be completely relied upon. It changes often too; people using
this will want to arrange to keep it updated; allocations do change and
can move between countries (and, these days, even between regions).

It's likely that the output can be shrunk further by passing it through
aggregate6 (in ports).

-- 
Please keep replies on the mailing list.



Re: PF rules to block out every IP from a given country

2022-12-07 Thread Peter N. M. Hansteen
On Wed, Dec 07, 2022 at 10:28:27AM +1100, Damian McGuckin wrote:
> 
> Has anybody created rules such as this and if so, do you have an example?

As others have already indicated, the PF way to do anything like this would be
to generate a list of addresses and networks you want to address (block in this 
case),
feed that list into a table and make the table the criteria for a blocking rule.
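
For example (a sketch; the table and file names are assumptions):

    table <countryblock> persist file "/etc/pf.countryblock"
    block in quick log on egress from <countryblock>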

I remembered that a few years back I was asked to do something along those 
lines,
I forget the exact reason why, but anyway I decided that the most reasonable way
to determine which IP addresses or ranges belong to a certain country would be
to fetch the most up to date data from the things RIPE publish. 

My tiny writeup which in fact contains the entire script for massaging RIPE's
data into something you can feed into a PF table survived a couple of job 
changes
and can now be found at https://nxdomain.no/~peter/ripe2cidr_country.sh.txt --
as it says in the script itself, a trivial hack. 

And I might add, it comes with *NO* warranties of any kind. 

It is for example quite conceivable that an organization with premises in more
than one country might want to split their allocations not strictly according
to national borders.

- Peter

-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



Re: PF rules to block out every IP from a given country

2022-12-07 Thread Muhammad Muntaza
On Wed, 7 Dec 2022 at 08.55 Damian McGuckin  wrote:

>
> Has anybody created rules such as this and if so, do you have an example?
>
> Stay safe - Damian
>

Check this example:

https://www.muntaza.id/pf/2020/02/03/pf-firewall-bagian-kedua.html

I wrote it in Indonesian; you can use Google Translate to read it.


Thanks,

Muhammad Muntaza bin Hatta





Re: PF rules to block out every IP from a given country

2022-12-06 Thread Craig Schulz
Take a look at PF-Badhost.

Here is a decent write-up:

https://undeadly.org/cgi?action=article;sid=20210119113425

Craig

> On Dec 6, 2022, at 18:28, Damian McGuckin  wrote:
> 
> 
> Has anybody created rules such as this and if so, do you have an example?
> 
> Stay safe - Damian
> 
> Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer
> 





Re: PF rules to block out every IP from a given country

2022-12-06 Thread All
Considering you solved the issue with getting all IPs
for a given country correctly (and perhaps updating it sometimes):
1. Dump all IP addresses/ranges into a file (eg. blocked.ips)
2. add: table <blocked> file "/path/to/blocked.ips"
   (add "persist" if you want)
3. create a rule to block all incoming connections from <blocked>

Alternatively, you can just create a file with IPs you allow, 
create table and write rules to allow connections from IPs
in that file. 

On Wednesday, December 7, 2022 at 09:44:34 a.m. GMT+9, Damian McGuckin 
 wrote: 






Has anybody created rules such as this and if so, do you have an example?

Stay safe - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer



Re: pf rdr-to (localhost ntpd) not always works

2022-09-15 Thread Kapetanakis Giannis
The problem/limitation probably comes from the client binding its local port 
to 123, which is then used for both connections.

I see other clients that use high ports for ntp queries that create multiple 
states without any problem.

all udp 127.0.0.1:123 (remote_ntp1:123) <- y.y.y.y:54401   SINGLE:MULTIPLE
all udp 127.0.0.1:123 (remote_ntp2:123) <- y.y.y.y:52525   SINGLE:MULTIPLE

:(
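
One client-side workaround sketch, assuming the clients can be changed:
ntpdate's -u flag makes it send from an unprivileged (high) source port
instead of 123, e.g.:

    ntpdate -u 1.2.3.4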

G

On 15/09/2022 11:12, Kapetanakis Giannis wrote:
> Hi,
>
> I'm trying to enforce a local ntpd server (which is also our external 
> firewall/router) for all connections and I have a very strange problem.
> Only one (dst) IP is allowed to create a state. After state expires a new dst 
> IP can be used.
>
> fw# pfctl -sr -R 154
> pass in log quick on $int_if inet proto udp from x.x.x.x to any port = 123 
> rdr-to 127.0.0.1
>
> client-x-x-x-x# ntpdate 1.2.3.4
> 15 Sep 10:34:15 ntpdate[620]: adjust time server 1.2.3.4 offset -0.96 sec
>
> On fw (ntpd server) I see:
>
> 10:34:09.366370 x.x.x.x.123 > 1.2.3.4.123: v4 alarm client strat 0 poll 3 
> prec -6 (DF)
> 10:34:09.366460 1.2.3.4.123 > x.x.x.x.123: v4 server strat 4 poll 3 prec -29 
> [tos 0x10]
> 10:34:11.366247 x.x.x.x.123 > 1.2.3.4.123: v4 alarm client strat 0 poll 3 
> prec -6 (DF)
> 10:34:11.366281 1.2.3.4.123 > x.x.x.x.123: v4 server strat 4 poll 3 prec -29 
> [tos 0x10]
> 10:34:13.366275 x.x.x.x.123 > 1.2.3.4.123: v4 alarm client strat 0 poll 3 
> prec -6 (DF)
> 10:34:13.366324 1.2.3.4.123 > x.x.x.x.123: v4 server strat 4 poll 3 prec -29 
> [tos 0x10]
>
> Sep 15 10:34:09.366383 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
> 1.2.3.4.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
>
> # pfctl -ss -vv -R 154
>
> all udp 127.0.0.1:123 (1.2.3.4:123) <- x.x.x.x:123   MULTIPLE:MULTIPLE
>    age 00:00:19, expires in 00:00:47, 4:4 pkts, 304:304 bytes, rule 154
>    id: 628ba534a943cb3c creatorid: 0001
>
> Subsequent ntp queries to same IP 1.2.3.4 work fine. Same state is used. pkts 
> advance, expire time resets to 60 seconds
>
> However if I try a different dst IP from the same client it does not work 
> until state to 1.2.3.4 above expires.
>
> I see the incoming packet on the internal interface but I see no reply going 
> back to client (as before with 1.2.3.4).
>
> 10:34:26.812675 x.x.x.x.123 > 2.3.4.5.123: v4 alarm client strat 0 poll 3 
> prec -6 (DF)
> 10:34:28.812571 x.x.x.x.123 > 2.3.4.5.123: v4 alarm client strat 0 poll 3 
> prec -6 (DF)
> 10:34:30.812587 x.x.x.x.123 > 2.3.4.5.123: v4 alarm client strat 0 poll 3 
> prec -6 (DF)
> 10:34:32.812554 x.x.x.x.123 > 2.3.4.5.123: v4 alarm client strat 0 poll 3 
> prec -6 (DF)
>
> I also see the pf log (4 times now and not 1 as before)
> Sep 15 10:34:26.812688 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
> 2.3.4.5.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
> Sep 15 10:34:28.812583 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
> 2.3.4.5.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
> Sep 15 10:34:30.812598 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
> 2.3.4.5.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
> Sep 15 10:34:32.812566 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
> 2.3.4.5.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
>
> No new state is created. pfctl -ss -R 154 only lists the one state of 1.2.3.4
>
> After state expiration, a different IP can be used and works.
>
> Initial pf rule included keep state (max-src-states 10, source-track rule)
> which also behaves the same. 10 src-IP connections are allowed and then no 
> more until one is expired.
> However, again, if another destination IP is used I see no replies (bellow 
> the src limit of 10).
>
> system is 7.1 amd64 with syspatches.
>
> ntpd only lists servers, constraints and
> listen on 127.0.0.1
> listen on ::1
>
> ideas?
>
> G
>
>
>
>



Re: pf rdr-to (localhost ntpd) not always works

2022-09-15 Thread Kapetanakis Giannis


On 15/09/2022 15:06, Kapetanakis Giannis wrote:
> The problem/limitation probably comes from the client binding its local 
> port to 123, which is then used for both connections.
>
> I see other clients that use high ports for ntp queries that create multiple 
> states without any problem.
>
> all udp 127.0.0.1:123 (remote_ntp1:123) <- y.y.y.y:54401   SINGLE:MULTIPLE
> all udp 127.0.0.1:123 (remote_ntp2:123) <- y.y.y.y:52525   SINGLE:MULTIPLE
>
> :(
>
> G

Yes indeed. With debug level set to info I get:

Sep 15 15:48:02 fw /bsd: pf: stack key attach failed on all: UDP in wire: (0) 
x.x.x.x:123 1.1.1.1:123 stack: (0) x.x.x.x:123 127.0.0.1:123 1:0 @154, 
existing: UDP in wire: (0) x.x.x.x:123 2.2.2.2:123 stack: (0) x.x.x.x:123 
127.0.0.1:123 2:2 @154

Apparently
src_ip:port <-> rdr_ip:port is used for state mapping and not
src_ip:port <-> dst_ip:port

G


>
> On 15/09/2022 11:12, Kapetanakis Giannis wrote:
>> Hi,
>>
>> I'm trying to enforce a local ntpd server (which is also our external 
>> firewall/router) for all connections and I have a very strange problem.
>> Only one (dst) IP is allowed to create a state. After state expires a new 
>> dst IP can be used.
>>
>> fw# pfctl -sr -R 154
>> pass in log quick on $int_if inet proto udp from x.x.x.x to any port = 123 
>> rdr-to 127.0.0.1
>>
>> client-x-x-x-x# ntpdate 1.2.3.4
>> 15 Sep 10:34:15 ntpdate[620]: adjust time server 1.2.3.4 offset -0.96 sec
>>
>> On fw (ntpd server) I see:
>>
>> 10:34:09.366370 x.x.x.x.123 > 1.2.3.4.123: v4 alarm client strat 0 poll 3 
>> prec -6 (DF)
>> 10:34:09.366460 1.2.3.4.123 > x.x.x.x.123: v4 server strat 4 poll 3 prec -29 
>> [tos 0x10]
>> 10:34:11.366247 x.x.x.x.123 > 1.2.3.4.123: v4 alarm client strat 0 poll 3 
>> prec -6 (DF)
>> 10:34:11.366281 1.2.3.4.123 > x.x.x.x.123: v4 server strat 4 poll 3 prec -29 
>> [tos 0x10]
>> 10:34:13.366275 x.x.x.x.123 > 1.2.3.4.123: v4 alarm client strat 0 poll 3 
>> prec -6 (DF)
>> 10:34:13.366324 1.2.3.4.123 > x.x.x.x.123: v4 server strat 4 poll 3 prec -29 
>> [tos 0x10]
>>
>> Sep 15 10:34:09.366383 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
>> 1.2.3.4.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
>>
>> # pfctl -ss -vv -R 154
>>
>> all udp 127.0.0.1:123 (1.2.3.4:123) <- x.x.x.x:123   MULTIPLE:MULTIPLE
>>    age 00:00:19, expires in 00:00:47, 4:4 pkts, 304:304 bytes, rule 154
>>    id: 628ba534a943cb3c creatorid: 0001
>>
>> Subsequent ntp queries to same IP 1.2.3.4 work fine. Same state is used. 
>> pkts advance, expire time resets to 60 seconds
>>
>> However if I try a different dst IP from the same client it does not work 
>> until state to 1.2.3.4 above expires.
>>
>> I see the incoming packet on the internal interface but I see no reply going 
>> back to client (as before with 1.2.3.4).
>>
>> 10:34:26.812675 x.x.x.x.123 > 2.3.4.5.123: v4 alarm client strat 0 poll 3 
>> prec -6 (DF)
>> 10:34:28.812571 x.x.x.x.123 > 2.3.4.5.123: v4 alarm client strat 0 poll 3 
>> prec -6 (DF)
>> 10:34:30.812587 x.x.x.x.123 > 2.3.4.5.123: v4 alarm client strat 0 poll 3 
>> prec -6 (DF)
>> 10:34:32.812554 x.x.x.x.123 > 2.3.4.5.123: v4 alarm client strat 0 poll 3 
>> prec -6 (DF)
>>
>> I also see the pf log (4 times now and not 1 as before)
>> Sep 15 10:34:26.812688 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
>> 2.3.4.5.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
>> Sep 15 10:34:28.812583 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
>> 2.3.4.5.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
>> Sep 15 10:34:30.812598 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
>> 2.3.4.5.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
>> Sep 15 10:34:32.812566 rule 154/(match) pass in on int_if: x.x.x.x.123 > 
>> 2.3.4.5.123: v4 alarm client strat 0 poll 3 prec -6 (DF)
>>
>> No new state is created. pfctl -ss -R 154 only lists the one state of 1.2.3.4
>>
>> After state expiration, a different IP can be used and works.
>>
>> Initial pf rule included keep state (max-src-states 10, source-track rule)
>> which also behaves the same. 10 src-IP connections are allowed and then no 
>> more until one is expired.
>> However, again, if another destination IP is used I see no replies (bellow 
>> the src limit of 10).
>>
>> system is 7.1 amd64 with syspatches.
>>
>> ntpd only lists servers, constraints and
>> listen on 127.0.0.1
>> listen on ::1
>>
>> ideas?
>>
>> G
>>
>>
>>
>>



Re: PF table issue on 7.1-Current

2022-06-07 Thread Sven F.
On Tue, Jun 7, 2022 at 11:34 AM Zé Loff  wrote:
>
> On Tue, Jun 07, 2022 at 04:26:11PM +0300, Barbaros Bilek wrote:
> > Hello Misc,
> >
> > I think there is an issue with PF tables in -current.
> > Here is my working PF config sample from before 7.1-Current.
> > block log quick inet from <Malicious>
> > pfctl -f /etc/pf.conf
> > Another piece of software fills this Malicious table with this command:
> > # pfctl -t Malicious -T add 1.2.3.4
> >  1 table created.
> >  1/1 addresses added.
> > # pfctl -t Malicious -T show 1.2.3.4
> >  1.2.3.4
> >
> > But with my newly upgraded OpenBSD version it doesn't.
> > OpenBSD 7.1-current (GENERIC.MP) #575: Mon Jun 6 10:11:31 MDT 2022
> > #pfctl -t Malicious -T add 1.2.3.4
> > 1 table created.
> > pfctl: Table does not exist
> >
> > #pfctl -t Malicious -T show
> > pfctl: Table does not exist
> >
> >
> > Thanks for your time.
> >
> > --
> > Barbaros
>
> You now need to explicitly create the table with
>
> table <Malicious>
>
> in your pf.conf.  This was not enforced in 7.1, so you got away with it,
> but it is now.
>

that's a 'feature'???


-- 
Knowing is not enough; we must apply. Willing is not enough; we must do.



Re: PF table issue on 7.1-Current

2022-06-07 Thread Zé Loff
On Tue, Jun 07, 2022 at 04:26:11PM +0300, Barbaros Bilek wrote:
> Hello Misc,
> 
> I think there is an issue with PF tables in -current.
> Here is my working PF config sample from before 7.1-Current.
> block log quick inet from <Malicious>
> pfctl -f /etc/pf.conf
> Another piece of software fills this Malicious table with this command:
> # pfctl -t Malicious -T add 1.2.3.4
>  1 table created.
>  1/1 addresses added.
> # pfctl -t Malicious -T show 1.2.3.4
>  1.2.3.4
> 
> But with my newly upgraded OpenBSD version it doesn't.
> OpenBSD 7.1-current (GENERIC.MP) #575: Mon Jun 6 10:11:31 MDT 2022
> #pfctl -t Malicious -T add 1.2.3.4
> 1 table created.
> pfctl: Table does not exist
> 
> #pfctl -t Malicious -T show
> pfctl: Table does not exist
> 
> 
> Thanks for your time.
> 
> --
> Barbaros

You now need to explicitly create the table with

table <Malicious>

in your pf.conf.  This was not enforced in 7.1, so you got away with it,
but it is now.
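
A minimal sketch of the result (persist is optional here since the rule
references the table, but it documents the intent):

    table <Malicious> persist
    block log quick inet from <Malicious>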






Re: pf documentation

2022-04-07 Thread Stuart Henderson
On 2022-04-07, Steve Litt  wrote:
> I need some easy beginner's pf documentation as well as some
> intermediate pf documentation. I plan to make an OpenBSD/pf firewall. I
> haven't done this in ten years, and imagine pf and the process of
> turning OpenBSD into a firewall have changed in that time.

The pf.conf(5) manual is the primary reference, if you prefer to have a
nicely formatted printable version you can get one with

$ man -T pdf pf.conf > pf.conf.pdf

There are many many online guides about configuring PF; some are
helpful, many less so. If you do use these, cross-referring to
pf.conf(5) is a good idea.

IMHO the "building a router" example in the FAQ complicates things a
bit too much (it is actually "how to set up dhcp, wifi hostap [which few
people actually use and which doesn't work on many adapters], and a DNS
resolver"), and it uses some PF features which I think it's really better
to understand before using them.

My main tips would be:

- start the ruleset with a "block" or "block log" rule so that no
packets match the implicit default "rule 0", which is effectively
"pass all no state". This avoids one of the main hard-to-diagnose
cases where some packets are accepted without creating firewall state.

- tags and received-on can be pretty helpful and most guides don't
use them.

- if you can't figure out which rules are matching a packet, put
a "match log(matches)" rule at the top of the ruleset (maybe
with a from/to or port restriction if it's on a busy machine),
and watch "tcpdump -nevvi pflog0" - when a packet traverses the
PF ruleset, you'll get some output for every rule matching that
packet, with a final line showing the overall pass/drop outcome.
The rule numbers shown can be looked up with "pfctl -sr -R XX -v".
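
A minimal sketch of that debugging flow (the address and the rule
number are placeholders):

    # temporary rule at the top of pf.conf
    match log (matches) inet proto tcp from 192.0.2.7

    # then, in another shell:
    # tcpdump -nevvi pflog0
    # pfctl -sr -R 23 -v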




Re: pf documentation

2022-04-07 Thread Tom Smyth
Steve,

if you like books ...
Peter Hansteen has written a book, The Book of PF,
which I have read and would recommend

https://nostarch.com/pf3

and if you are interested in firewalls in general and comparing features ...



On Thu, 7 Apr 2022 at 10:40, Tom Smyth  wrote:
>
> Hi Steve,
> I'm going to give my usual answer here
>
>
> Peter Hansteen and Max Stucchi have an amazing tutorial on PF
> https://home.nuug.no/~peter/pftutorial/#1
>
> but they explain the concepts really well;
> I recommend the class that they do in person ..
>
> for the latest features about PF in the version of OpenBSD you are running ...
>
> man pfctl or man pf.conf will help you ...
>
> if you need a intro to the intro ...
> https://openbsdjumpstart.org by Wesley is pretty cool and gets you
> started on OpenBSD and PF
>
>
>
> Hope this helps,
>
> Tom Smyth
>
> On Thu, 7 Apr 2022 at 10:28, Brodey Dover  wrote:
> >
> > To be honest, I just used the handbook/FAQ.
> >
> > https://www.openbsd.org/faq/pf/example1.html
> >
> > Note that some grammar and syntax from Google search results will not work 
> > in newer versions of pf.
> >
> > Sent from my iPhone
> >
> > > On Apr 7, 2022, at 05:13, Steve Litt  wrote:
> > >
> > > Hi all,
> > >
> > > I need some easy beginner's pf documentation as well as some
> > > intermediate pf documentation. I plan to make an OpenBSD/pf firewall. I
> > > haven't done this in ten years, and imagine pf and the process of
> > > turning OpenBSD into a firewall have changed in that time.
> > >
> > > Thanks,
> > >
> > > SteveT
> > >
> > > Steve Litt
> > > March 2022 featured book: Making Mental Models: Advanced Edition
> > > http://www.troubleshooters.com/mmm
> > >
>
>
>
> --
> Kindest regards,
> Tom Smyth.



--
Kindest regards,
Tom Smyth.



Re: pf documentation

2022-04-07 Thread Tom Smyth
Hi Steve,
I'm going to give my usual answer here


Peter Hansteen and Max Stucchi have an amazing tutorial on PF
https://home.nuug.no/~peter/pftutorial/#1

but they explain the concepts really well;
I recommend the class that they do in person ..

for the latest features about PF in the version of OpenBSD you are running ...

man pfctl or man pf.conf will help you ...

if you need a intro to the intro ...
https://openbsdjumpstart.org by Wesley is pretty cool and gets you
started on OpenBSD and PF



Hope this helps,

Tom Smyth

On Thu, 7 Apr 2022 at 10:28, Brodey Dover  wrote:
>
> To be honest, I just used the handbook/FAQ.
>
> https://www.openbsd.org/faq/pf/example1.html
>
> Note that some grammar and syntax from Google search results will not work in 
> newer versions of pf.
>
> Sent from my iPhone
>
> > On Apr 7, 2022, at 05:13, Steve Litt  wrote:
> >
> > Hi all,
> >
> > I need some easy beginner's pf documentation as well as some
> > intermediate pf documentation. I plan to make an OpenBSD/pf firewall. I
> > haven't done this in ten years, and imagine pf and the process of
> > turning OpenBSD into a firewall have changed in that time.
> >
> > Thanks,
> >
> > SteveT
> >
> > Steve Litt
> > March 2022 featured book: Making Mental Models: Advanced Edition
> > http://www.troubleshooters.com/mmm
> >



-- 
Kindest regards,
Tom Smyth.



Re: pf documentation

2022-04-07 Thread Brodey Dover
To be honest, I just used the handbook/FAQ. 

https://www.openbsd.org/faq/pf/example1.html

Note that some grammar and syntax from Google search results will not work in 
newer versions of pf.

Sent from my iPhone

> On Apr 7, 2022, at 05:13, Steve Litt  wrote:
> 
> Hi all,
> 
> I need some easy beginner's pf documentation as well as some
> intermediate pf documentation. I plan to make an OpenBSD/pf firewall. I
> haven't done this in ten years, and imagine pf and the process of
> turning OpenBSD into a firewall have changed in that time.
> 
> Thanks,
> 
> SteveT
> 
> Steve Litt 
> March 2022 featured book: Making Mental Models: Advanced Edition
> http://www.troubleshooters.com/mmm
> 


Re: pf documentation

2022-04-07 Thread Janne Johansson
On Thu, 7 Apr 2022 at 11:12, Steve Litt wrote:
>
> Hi all,
>
> I need some easy beginner's pf documentation as well as some
> intermediate pf documentation. I plan to make an OpenBSD/pf firewall. I
> haven't done this in ten years, and imagine pf and the process of
> turning OpenBSD into a firewall have changed in that time.

Might be worth looking around the OpenBSD webpage, perhaps it has a
section with Frequently Asked Questions that contain PF information
one might learn from?


-- 
May the most significant bit of your life be positive.



Re: PF pass not working (on complex "firewall")

2022-03-06 Thread Szél Gábor

Dear @misc

We found the error!
This is not a PF problem.

I found this:
http://undeadly.org/cgi?action=article&sid=20090127205841

If I modify the ipsec config *from:*
ike active esp from 172.20.123.0/24 to 172.20.122.0/24 \

*to:*
ike active esp from 172.20.123.0/24 *(192.168.123.0/24)* to 
172.20.122.0/24 \


PF rules working correctly.


--
Regards
Gábor Szél

email:gabor.s...@wantax.hu

On 2022-03-05 23:08, Szél Gábor wrote:

Dear @misc

We have a stupid problem.
On a complex firewall (currently about 1200 lines of PF rules), one PASS 
rule is not working.

I do not know why.

There are many VLANs, WAN, LAN interfaces, many ipsec VPNs, CARP 
(master-backup), pfsync, etc ...


PF main rules:
# set
#.
set block-policy drop
set loginterface $ext_wan1_if
set skip on { lo $pfsync_if }
set reassemble no
set timeout { tcp.established 600, tcp.closing 60 }
set optimization aggressive
set ruleset-optimization none
set limit { states 10, src-nodes 10, tables 10, 
table-entries 10 }


# scrub
# -
match on $ext_wan1_if all scrub ( no-df max-mss 1440 random-id )

#. antispoof
#. 
antispoof quick for { $ext_wan1_if } inet

# anchors
# -
anchor "ftp-proxy/*"

# Block(s)
#.
block quick proto udp to port { 1985 8116 }    # neighbours HSRP & ...
block quick log on $ext_wan1_if from { <IPBlackList> } label IPBlackList

block log inet6 all
block log all

So all interface traffic is basically forbidden (block).
Each kind of traffic is allowed separately.

We have one ipsec VPN where there is NAT on both sides (both sides 
have 192.168.x.x subnets, so there is a subnet collision).

we want to solve a simple thing:

  * the packet comes in on the VPN tunnel to a "virtual" IP address,
    172.20.123.54 (bound to an OpenBSD vlan interface)
  * from this address PF redirects the packet to the destination server,
    192.168.123.54
  * the destination server creates the return packet and sends it back
  * the response packet comes in on the OpenBSD VLAN interface (vlan141)
  * PF NATs this packet to 172.20.123.54
  * the NATed packet returns to the source address over the VPN


rules:
    match in log on enc0 proto tcp from 172.20.122.0/24 to 172.20.123.54 port 5240 rdr-to 192.168.123.54 port 5240
    pass in log on enc0 proto tcp from 172.20.122.0/24 to 192.168.123.54

    pass out log on vlan141 from 172.20.122.0/24 to 192.168.123.54

    match in log on vlan141 from 192.168.123.54 to 172.20.122.0/24 nat-to 172.20.123.54

    pass in log on vlan141 from 172.20.123.54 to 172.20.122.0/24
    pass in log on vlan141 from 192.168.123.54 to 172.20.122.0/24   (not needed, but ... :)


return packet tcpdump:

nat-to, okay:
Mar 05 23:01:09.418806 rule 410/(match) [uid 0, pid 32543] match in on 
vlan141: [orig src 192.168.123.54:5240, dst 172.20.122.10:39322] 
172.20.123.54.51958 > 172.20.122.10.39322: S [bad tcp cksum 5166! -> af7b] 
966412712:966412712(0) ack 437277320 win 65160 
<mss 1460,sackOK,timestamp 452766647 201794907,nop,wscale 7> (DF) 
(ttl 64, id 0, len 60, bad ip cksum d8be! -> ed52)


and, PF blocks this packet:
Mar 05 23:01:09.418820 rule 9/(match) [uid 0, pid 32543] *block in on 
vlan141:* [orig src 192.168.123.54:5240, dst 172.20.122.10:39322] 
172.20.123.54.51958 > 172.20.122.10.39322: S [bad tcp cksum 5166! -> af7b] 
966412712:966412712(0) ack 437277320 win 65160 
<mss 1460,sackOK,timestamp 452766647 201794907,nop,wscale 7> (DF) 
(ttl 64, id 0, len 60, bad ip cksum d8be! -> ed52)


If I change the pass rule to a match rule:
   match in log on vlan141 from 172.20.123.54

I see the match works, but the pass rule does not!

I've tried a lot of things already: without match rules, without nat 
(okay, then there is no route, but ...), it is always blocked.


Why can't I override the block rule?
Everywhere else it works ...



--
Regards
Gábor Szél

email:gabor.s...@wantax.hu


