Hi Andy,

sorry for the delay, but a lot of more important work came up between your mail and this answer ;).

You can set a simple prio on a rule like;
pass proto tcp from $left to $right set prio (1,4)

With PRIQ I mean the scheduler priq instead of cbq.

Relevant lines of my current pf.conf rule set.

<pf.conf>
...
altq on em0 priq bandwidth 1000Mb queue { std_em0, tcp_ack_em0 }
queue std_em0     priq(default)
queue tcp_ack_em0 priority 6

altq on em1 priq bandwidth 1000Mb queue { std_em1, tcp_ack_em1 }
queue std_em1     priq(default)
queue tcp_ack_em1 priority 6

match on em0 inet proto tcp from any to any queue (std_em0, tcp_ack_em0)
match on em1 inet proto tcp from any to any queue (std_em1, tcp_ack_em1)
...
</pf.conf>
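
To check that the ACKs actually end up in the tcp_ack queues, I watch the queue counters while traffic is flowing (just a sketch, the output format differs between releases):

$ pfctl -vsq        # per-queue packet/byte counters and drops
$ systat queues 2   # live view, refreshed every 2 seconds

If the tcp_ack_em* counters stay at zero, the match rules are not being hit.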

I have read The Book of PF, 2nd edition, but there is nothing about troubleshooting. What should I do to find the problem?

I have made some notes for troubleshooting purposes (the commands themselves are sketched below the list):

top -> high interrupt load on CPU or network interfaces => Hardware limit
systat -> interrupts on CPU and network cards => Hardware limit
bwm-ng -> bandwidth near the theoretical limit => Hardware limit
pfctl -si -> look at the current states (default limit is 10000); the memory counter shows failed memory allocations for states; if this number is high and keeps increasing => set a higher limit for states (pfctl -sm shows the states limit)
sysctl kern.netlivelocks -> a high number means something like two processes blocking each other => Hardware limit
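
Roughly the commands behind these notes (a sketch, not exact output):

$ top                     # watch the interrupt percentage in the CPU states line
$ systat vmstat 5         # per-device interrupt counts, refreshed every 5 seconds
$ vmstat -i               # cumulative interrupt statistics
$ bwm-ng                  # bandwidth and packets/s per interface
$ pfctl -si               # state counters, including the "memory" failure counter
$ pfctl -sm               # current hard limits, including the states limit
$ sysctl kern.netlivelocks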

If no problem can be found with the above steps:
- prioritize TCP ACKs for TCP traffic

Best Regards,
Patrick


On Thu, 9 Oct 2014, Andy wrote:

Hi,

Just so I understand what you have done, PRIQ is not the same as queuing.

You can set a simple prio on a rule like;
pass proto tcp from $left to $right set prio (1,4)

But this doesn't manage the situations where you have lots of different types/profiles of traffic on your network. For example, you might have some big file transfers going on which can be delayed and can tolerate high latency but need high throughput, alongside your control/real-time protocols which need low latency, etc. Generally, in this situation just using prio won't always be enough and your file transfers will still swamp your interactive SSH or VNC connections.

So we do something like this;

altq on $if_trunk1 bandwidth 4294Mb hfsc queue { _wan }
oldqueue _wan on $if_trunk1 bandwidth 4290Mb priority 15 hfsc(linkshare 4290Mb, upperlimit 4290Mb) { _wan_rt, _wan_int, _wan_pri, _wan_vpn, _wan_web, _wan_dflt, _wan_bulk }
oldqueue _wan_rt on $if_trunk1 bandwidth 20% priority 7 qlimit 50 hfsc(realtime(20%, 5000, 10%), linkshare 20%)
oldqueue _wan_int on $if_trunk1 bandwidth 10% priority 5 qlimit 100 hfsc(realtime 5%, linkshare 10%)
oldqueue _wan_pri on $if_trunk1 bandwidth 10% priority 4 qlimit 100 hfsc(realtime(15%, 2000, 5%), linkshare 10%)
oldqueue _wan_vpn on $if_trunk1 bandwidth 30% priority 3 qlimit 300 hfsc(realtime(15%, 2000, 5%), linkshare 30%)
oldqueue _wan_web on $if_trunk1 bandwidth 10% priority 2 qlimit 300 hfsc(realtime(10%, 3000, 5%), linkshare 10%)
oldqueue _wan_dflt on $if_trunk1 bandwidth 15% priority 1 qlimit 100 hfsc(realtime(10%, 5000, 5%), linkshare 15%, ecn, default)
oldqueue _wan_bulk on $if_trunk1 bandwidth 5% priority 0 qlimit 100 hfsc(linkshare 5%, upperlimit 30%, ecn, red)

altq on $if_trunk2 bandwidth 4294Mb hfsc queue { _wan }
oldqueue _wan on $if_trunk2 bandwidth 4290Mb priority 15 hfsc(linkshare 4290Mb, upperlimit 4290Mb) { _wan_rt, _wan_int, _wan_pri, _wan_vpn, _wan_web, _wan_dflt, _wan_bulk }
oldqueue _wan_rt on $if_trunk2 bandwidth 20% priority 7 qlimit 50 hfsc(realtime(20%, 5000, 10%), linkshare 20%)
oldqueue _wan_int on $if_trunk2 bandwidth 10% priority 5 qlimit 100 hfsc(realtime 5%, linkshare 10%)
oldqueue _wan_pri on $if_trunk2 bandwidth 10% priority 4 qlimit 100 hfsc(realtime(15%, 2000, 5%), linkshare 10%)
oldqueue _wan_vpn on $if_trunk2 bandwidth 30% priority 3 qlimit 300 hfsc(realtime(15%, 2000, 5%), linkshare 30%)
oldqueue _wan_web on $if_trunk2 bandwidth 10% priority 2 qlimit 300 hfsc(realtime(10%, 3000, 5%), linkshare 10%)
oldqueue _wan_dflt on $if_trunk2 bandwidth 15% priority 1 qlimit 100 hfsc(realtime(10%, 5000, 5%), linkshare 15%, ecn, default)
oldqueue _wan_bulk on $if_trunk2 bandwidth 5% priority 0 qlimit 100 hfsc(linkshare 5%, upperlimit 30%, ecn, red)

pass quick proto { tcp, udp } from { (vlan1:network) } to { (vlan234:network) } port { 4569, 5060, 10000:20000 } queue _wan_rt set prio 7
pass quick proto { tcp, udp } from { (vlan1:network) } to { (vlan234:network) } port { 53, 123, 5900 } queue _wan_pri set prio 4
pass quick proto { tcp } from { (vlan1:network) } to { (vlan234:network) } port { 80, 443 } queue (_wan_web, _wan_pri) set prio (2,4)
pass quick proto { tcp } from { (vlan1:network) } to { (vlan234:network) } port { ssh } queue (_wan_bulk, _wan_int) set prio (0,5)
.
. All the other rules needing higher priority than the rest
.
pass quick proto { tcp, udp, icmp } from { (vlan1:network) } to { (vlan234:network) } queue (_wan_bulk,_wan_pri) set prio (0,4)


NB: this is the old syntax for queues, and I strongly recommend reading the 3rd edition of "The Book of PF" (a must-read for *anyone* new or old to OpenBSD and PF) :) and using the new syntax.
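
From memory, the same idea in the new syntax looks roughly like this; the numbers are made up and only a few of the queues are shown, so check pf.conf(5) for the exact grammar before using it:

queue _wan on $if_trunk1 bandwidth 4290M max 4290M
queue _wan_rt   parent _wan bandwidth 860M min 430M qlimit 50
queue _wan_pri  parent _wan bandwidth 430M qlimit 100
queue _wan_dflt parent _wan bandwidth 640M qlimit 100 default
queue _wan_bulk parent _wan bandwidth 215M max 1290M qlimit 100

pass quick proto tcp from (vlan1:network) to (vlan234:network) port ssh set queue _wan_bulk set prio 0

As I understand it, "bandwidth" here maps to the HFSC linkshare, "min" to realtime and "max" to upperlimit.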

The rule I use is that whenever one queue starts to get used too much and there is more than one type of traffic in that queue (in this example I have DNS, NTP and VNC in the same queue), and they start to affect each other, it's time to split the traffic out into further separate queues. So here you would split VNC into its own queue to stop VNC swamping the DNS queries :)
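
Purely as an illustration (the _wan_vnc queue and its numbers are made up; it would also have to be added to _wan's child list, and the other percentages adjusted so everything still fits under the parent), splitting VNC out might look like:

oldqueue _wan_vnc on $if_trunk1 bandwidth 10% priority 4 qlimit 200 hfsc(realtime(10%, 3000, 5%), linkshare 10%)

pass quick proto { tcp } from { (vlan1:network) } to { (vlan234:network) } port { 5900 } queue _wan_vnc set prio 4
pass quick proto { tcp, udp } from { (vlan1:network) } to { (vlan234:network) } port { 53, 123 } queue _wan_pri set prio 4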

The priority in these queues is not the same as PRIO. These "priority" values don't have much impact *apparently* compared to the queues themselves (I just understand these to be CPU or bucket scheduling or something), but I've never understood how true that is, so I just set them to the same number as the desired relative PRIO as that seems sensible.


Last but NOT least; the PRIO value gets copied into the VLAN's CoS header! :) So if you use VLANs like we do here on our trunks, the different packets will end up as frames with the prio copied in meaning your switches can then also maintain the layer 3 QoS in the layer 2 CoS... Amazing stuff :)
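
For example (an illustrative rule, not from our config), traffic matching this should leave the trunk as frames carrying 802.1p priority 6 in the VLAN tag:

pass out on vlan234 proto tcp from any to any port 22 set prio 6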


Good luck

Andrew Lemin

*** looking forward to 64bit queues! :) ***



On 08/10/14 20:49, jum...@yahoo.de wrote:
Hi Andy,

This morning I have added Priority Queueing (PRIQ) to the ruleset and prefer TCP ACK packets over everything else. I can see the queues with systat queues, but the change has no effect on either the user experience or the throughput.

I have read something about adjusting the TCP send and receive window size settings, but OpenBSD has done this automatically since 2010 [1]. What else can I set?
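
(For reference, the static defaults can still be inspected, even though the automatic scaling makes them less important; just an example of what to look at:)

$ sysctl net.inet.tcp.sendspace net.inet.tcp.recvspace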

Best Regards,
Patrick

[1] http://marc.info/?l=openbsd-misc&m=128905075911814

On Thu, 2 Oct 2014, jum...@yahoo.de wrote:

Hi Andy,

Setup some queues and prioritise your ACK's ;)
Good idea, I will try to implement Priority Queueing with the old ALTQ.

Best Regards,
Patrick

On Thu, 2 Oct 2014, Andy wrote:

Setup some queues and prioritise your ACK's ;)

The box is fine under the load I'm sure, but you'll still need to prioritise those TCP acknowledgments to make things snappy when lots of traffic is going on.


On 02/10/14 17:13, Ville Valkonen wrote:
Hello Patrick,

On 2 October 2014 17:32, Patrick <jum...@yahoo.de> wrote:
Hi,

I use an OpenBSD-based firewall (version 5.2, I know I should upgrade but ...) between an 8-host cluster of Linux servers and 300 clients which access this cluster via VNC. Each server is connected with one gigabit port to a dedicated switch, and the firewall has one gigabit port on each side (dedicated switch and campus network).

The users complain about slow VNC response times (if I connect a client system to the dedicated switch, the access is faster, even during peak hours), and the admins of the cluster blame my firewall :(.

I use MRTG for traffic monitoring (data retrieved from OpenBSD at a one-minute interval) and can see average traffic of 160 Mbit/s during office hours, with peaks of 280 Mbit/s. With bwm-ng and a five-second interval I can see peaks of 580 Mbit/s. The peak packet rate is around 80000 packets per second (also measured with bwm-ng). The interrupt load of CPU0 is at peak 25%. So with this data I don't think the firewall is at the limit, am I right?
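
A rough sanity check of those numbers, the average packet size at the bwm-ng peak (back-of-the-envelope only):

$ echo "580000000 / 8 / 80000" | bc
906

So roughly 900 bytes per packet on average, i.e. mostly large packets rather than a small-packet flood.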

The server is a standard Intel Xeon (E3-1220V2, 4 cores, 3.10 GHz) with 4 GByte of memory and four 1 Gbit/s copper Ethernet Intel NICs (driver em).

Where is the problem? Can't the NICs handle more packets per second? How can I check for this?
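
The only checks I know of so far are the basic counters, e.g.:

$ netstat -i    # Ierrs/Oerrs per interface
$ vmstat -i     # interrupt counts per device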

If I connect a client system directly to the dedicated switch, the response times are better.

Thanks for your help,
Patrick
In addition to dmesg, could you please provide the following information:
$ pfctl -si
$ sysctl kern.netlivelocks
and interrupt statistics (by systat for example) would be helpful.

Thanks!

--
Regards,
Ville
