On Sat, Nov 21, 2015 at 6:21 AM, Martin Hlavatý <[email protected]> wrote:
> I have issues with firewall lags while there is peak in match
> rule counter in pf. Normally it has match ratio of about
> 1500/sec, but several times a day it jumps to somewhere
> around 6k/sec and firewall lags, some traffic gets dropped.
> This takes a few seconds.
>
> Lag causes system to delay sending carp packets and
> sometimes backup box promotes itself to master and
> immediately back to backup. Sadly, after sending inverse ARP.
> I workarounded this issue by setting advbase to 10.
>
> Another problem is obviously with normal forwarding traffic,
> like lags in online games or iptv streams.
>
> There is no visible raise in cpu utilization, but cpu load goes
> from about 0.7 to 1.5 and there are packets getting dropped
> on wan interface.
>
> Box is Core i3 530 on Supermicro X8SIL with 2x1GB RAM,
> intel 40GB SSD, two 82574 and two 82571 NICs. In afternoon
> hours it is loaded on 40k/25k tx/rx pps on wan interface.
>
> Looking to systat vmstat, LAN and WAN nics are getting
> around 7.5k interrupts, while pfsync about 2.5-3k
> and interrupts in top take about 60-70%.
>
> I tried to switch NICs for i350, but it had no effect, same
> thing with openBSD versions, 5.6 5.7 and 5.8 have same
> behavior. I also tried to replacing other hardware like CPU
> for Xeon X3430 or motherboard S5500BC with Xeon E5620,
> but without effect. Happens also on backup box when it
> runs as master (same hw config).
>
> System is running GENERIC.MP stable amd64 kernel.
>
> I read in some discussions, that raising interrupt limit and
> rx/tx queue in em(4) driver or using broadcoms instead
> of intels might help, but didnt try it yet.
>
> Is there any way to determine what is causing the peaks
> and how to prevent them or getting system powerful
> enough to handle them?
>
> pfctl -si
> Status: Enabled for 0 days 22:12:20           Debug: err
>
> State Table                          Total             Rate
>   current entries                    66901
>   searches                      5003330275        62588.6/s
>   inserts                         47704143          596.7/s
>   removals                        47637242          595.9/s
> Counters
>   match                           96819915         1211.2/s
>   bad-offset                             0            0.0/s
>   fragment                            1850            0.0/s
>   short                                 86            0.0/s
>   normalize                             48            0.0/s
>   memory                            786228            9.8/s
>   bad-timestamp                          0            0.0/s
>   congestion                       3948624           49.4/s
>   ip-option                          24341            0.3/s
>   proto-cksum                            0            0.0/s
>   state-mismatch                   1644853           20.6/s
>   state-insert                         464            0.0/s
>   state-limit                            0            0.0/s
>   src-limit                              0            0.0/s
>   synproxy                            3948            0.0/s
>   translate                              0            0.0/s
>   no-route                               0            0.0/s
>
> kern.netlivelocks=1534
>
> netstat -si
> em0     1500  <Link>  1533962428  266567   955232172  0  0
> em1     1500  <Link>   979515291    8697  1526507571  0  0
> em2     1500  <Link>     6970941       0   140093911  0  0
> em3*    1500  <Link>           0       0           0  0  0

Are you doing packet queuing with pf? What are the values of
net.inet.ip.ifq.maxlen and net.inet.ip.ifq.drops?

You might also want to try disabling any power-saving features on
that hardware.
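
One thing to keep in mind when reading the pfctl -si output above: as far as I can tell, the Rate column is simply each Total divided by the time pf has been enabled, so it is a long-run average and will not show a peak that lasts a few seconds. Below is a small Python sketch that recomputes some of the quoted rates from the quoted totals under that assumption. The function and variable names are mine and purely illustrative.

```python
# Totals copied from the quoted `pfctl -si` output.
# Assumption: Rate = Total / seconds since pf was enabled.
UPTIME = "0 days 22:12:20"

counters = {
    "searches": 5003330275,
    "inserts": 47704143,
    "removals": 47637242,
    "match": 96819915,
    "memory": 786228,
    "congestion": 3948624,
    "state-mismatch": 1644853,
}


def uptime_seconds(text):
    """Convert a string like '0 days 22:12:20' into seconds."""
    days, _, clock = text.split()
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days) * 86400 + hours * 3600 + minutes * 60 + seconds


def average_rates(totals, seconds):
    """Return the per-second average of each counter over `seconds`."""
    return {name: total / seconds for name, total in totals.items()}


seconds = uptime_seconds(UPTIME)
rates = average_rates(counters, seconds)

for name, rate in rates.items():
    print(f"{name:15s} {rate:10.1f}/s")
```

The recomputed values round to the same figures pfctl printed (for example 1211.2/s for match and 49.4/s for congestion), which is consistent with the averaging assumption. To see what happens during a peak, it would probably be more useful to take two snapshots of the Total column a few seconds apart and divide the difference by the interval.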

