Re: Trying to find what leaks in 6.4 after a recent network topology change

2019-04-15 Thread LÉVAI Dániel
Stuart Henderson @ 2019-04-15T15:39:30 +0200:
> On 2019-04-15, LÉVAI Dániel wrote:
> > Hi!
> >
> >
> > After a recent network configuration change (added re(4), vether(4)) I'm
> > experiencing this memory leak from somewhere.
> >
> > How can I check/query how much memory the kernel (or parts of it) is
> > using over time, besides running top(1) with system processes shown --
> > I'm also staring at systat(1)'s `malloc' and `pool' views but I'm not
> > really sure what I'm (or rather what I should be) looking at.
> >
> 
> Sorry I didn't read the whole lot, but from skimming through it's likely
> to be kernel not userland (which is why you don't see much detail in top).
> There are likely some clues in output from the following:
> 
> netstat -m
> vmstat -m
> systat -b mbuf

No worries, this is perfect, thank you for the tips.
I'm already wondering about the `mbufs' output from systat(1)
(especially the ALIVE column):

After a reboot + 20 minutes:
   1 users Load 0.46 0.38 0.33 firefly.ecentrum.hu 16:40:43
IFACE   LIVELOCKS  SIZE  ALIVE   LWM   HWM   CWM
System          0   256  4100 260
                   2048  3717 468
                   2112  53   7
                   4096  128  20
                   9216  17  12


After two days:
   1 users Load 0.14 0.20 0.22 firefly.ecentrum.hu 15:00:00
IFACE   LIVELOCKS  SIZE  ALIVE   LWM   HWM   CWM
System          0   256  215K   13780
                   2048  *   26954
                   2112  66   9
                   4096  128  25
                   9216  11  11

Does ALIVE at 215K look normal after two days of uptime with light
internet traffic? Could this be the culprit, or am I reading too much
into it?

And I'm just guessing, but that star either means infinite :) or that
the number didn't even fit into the column.
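
To watch that number over time without staring at systat(1), the in-use
mbuf count could be pulled out of `netstat -m' and logged. This is only
a rough sketch: the function name is made up, and it assumes the summary
line has the form "<N> mbufs in use:", so compare it with the real
output before trusting it.

```sh
#!/bin/sh
# Hypothetical helper: print the in-use mbuf count from `netstat -m`
# output read on stdin.
# Assumption: the summary line looks like "<N> mbufs in use:".
mbufs_in_use() {
	awk '/^[0-9]+ mbufs? in use/ { print $1; exit }'
}

# Example (commented out so sourcing this file has no side effects):
# printf '%s %s\n' "$(date +%s)" "$(netstat -m | mbufs_in_use)" >> mbufs.log
```

Run from cron every few minutes, that gives one "epoch count" line per
sample, which should make steady growth easy to spot.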


Dani

-- 
LÉVAI Dániel
PGP key ID = 0x83B63A8F
Key fingerprint = DBEC C66B A47A DFA2 792D  650C C69B BE4C 83B6 3A8F



Re: Trying to find what leaks in 6.4 after a recent network topology change

2019-04-15 Thread Stuart Henderson
On 2019-04-15, LÉVAI Dániel wrote:
> Hi!
>
>
> After a recent network configuration change (added re(4), vether(4)) I'm
> experiencing this memory leak from somewhere.
>
> How can I check/query how much memory the kernel (or parts of it) is
> using over time, besides running top(1) with system processes shown --
> I'm also staring at systat(1)'s `malloc' and `pool' views but I'm not
> really sure what I'm (or rather what I should be) looking at.
>

Sorry I didn't read the whole lot, but from skimming through it's likely
to be kernel not userland (which is why you don't see much detail in top).
There are likely some clues in output from the following:

netstat -m
vmstat -m
systat -b mbuf
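
Since a leak only shows up over time, it probably helps to save the
output of those commands at intervals and compare later. A minimal
sketch, assuming a plain append-only log is good enough; the function
name, log path and interval are all illustrative:

```sh
#!/bin/sh
# Hypothetical helper: append one timestamped snapshot of each given
# command to a log file. Usage: snapshot LOGFILE 'cmd1' 'cmd2' ...
snapshot() {
	_log=$1
	shift
	for _cmd in "$@"; do
		{
			printf '=== %s  %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$_cmd"
			sh -c "$_cmd" 2>&1
			echo
		} >> "$_log"
	done
}

# Example: sample every 10 minutes until interrupted (commented out so
# that sourcing this file does not start a loop).
# while :; do
#	snapshot /var/log/memwatch.log 'netstat -m' 'vmstat -m' 'systat -b mbuf'
#	sleep 600
# done
```

Each snapshot starts with a "===" line carrying the UTC time and the
command, so two points in time can be cut out of the log and diffed.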




Trying to find what leaks in 6.4 after a recent network topology change

2019-04-15 Thread LÉVAI Dániel
Hi!


After a recent network configuration change (added re(4), vether(4)) I'm
experiencing this memory leak from somewhere.

How can I check/query how much memory the kernel (or parts of it) is
using over time, besides running top(1) with system processes shown --
I'm also staring at systat(1)'s `malloc' and `pool' views but I'm not
really sure what I'm (or rather what I should be) looking at.
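
For reference, the kind of comparison I have in mind is roughly the
following: save `vmstat -m' output twice and list the pools whose in-use
count grew. This is a sketch under assumptions, not a tested tool: it
assumes pool lines have 12 whitespace-separated fields with the pool
name in column 1 and InUse in column 5, and the function name is made
up.

```sh
#!/bin/sh
# Hypothetical helper: pool_growth OLD NEW
# Compare two saved `vmstat -m` outputs and print "name old new delta"
# for every pool whose InUse count grew, largest growth first.
# Assumption: pool lines have 12 fields, name in $1 and InUse in $5;
# check the header line of your own output before relying on this.
pool_growth() {
	awk '
		NF == 12 && $5 ~ /^[0-9]+$/ {
			if (FNR == NR)
				old[$1] = $5          # first file: remember InUse
			else if (($1 in old) && $5 + 0 > old[$1] + 0)
				printf "%s %d %d %d\n", $1, old[$1], $5, $5 - old[$1]
		}
	' "$1" "$2" | sort -k4,4nr
}

# Example:
# vmstat -m > /tmp/vmstat.old; sleep 3600; vmstat -m > /tmp/vmstat.new
# pool_growth /tmp/vmstat.old /tmp/vmstat.new
```

Lines that don't match the assumed layout (headers, other sections of
the output) are skipped, so only pool rows end up in the comparison.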



I have an on-board Realtek NIC and an add-on Intel PCIe 4-port NIC.

re0 at pci3 dev 0 function 0 "Realtek 8168" rev 0x0c: RTL8168G/8111G (0x4c00), msi

em0 at pci1 dev 0 function 0 "Intel I350" rev 0x01: msi
em1 at pci1 dev 0 function 1 "Intel I350" rev 0x01: msi
em2 at pci1 dev 0 function 2 "Intel I350" rev 0x01: msi
em3 at pci1 dev 0 function 3 "Intel I350" rev 0x01: msi

Before I started to experience the leak I had the on-board NIC disabled
in BIOS and only used two ports of the Intel NIC. My setup looked like
this:


== Network

ISP --- pppoe0 <-> em0 -- OpenBSD 6.4 -- em1(dhcpd) --- [switch] LAN
                                      `- athn0 --- LAN
                                      `- athn1 --- LAN
                                      `- bridge0(em1, athn[01])

Internal IP configured on em1.
athn? devices don't have IPs, just bridged to the internal network.
dhcpd(8) is running on em1.


Now I have started using re0 and one more port on the Intel card:

ISP --- pppoe0 <-> re0 -- OpenBSD 6.4 -- em0 --- LAN
                                      `- em1 --- LAN
                                      `- em2 --- LAN
                                      `- athn0 --- LAN
                                      `- athn1 --- LAN
                                      `- vether0 (dhcpd)
                                      `- bridge0(em[123], athn[01], vether0)

Now the 4-port Intel NIC basically acts as a switch: neither the em?
nor the athn? interfaces have IP addresses configured.
I've added a vether(4) interface with an internal IP (the same one em1
had before) to the bridge, and dhcpd(8) (and everything else that needs
it) now uses that interface.


# ifconfig bridge0
bridge0: flags=41
index 10 llprio 3
groups: bridge
priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 proto rstp
designated: id 00:00:00:00:00:00 priority 0

athn0 flags=3
port 6 ifpriority 0 ifcost 0
[bunch of `pass/block in/out on athn0 src/dst ' rules]

athn1 flags=3
port 7 ifpriority 0 ifcost 0
[bunch of `pass/block in/out on athn1 src/dst ' rules]

em0 flags=3
port 1 ifpriority 0 ifcost 0
em1 flags=3
port 2 ifpriority 0 ifcost 0
em2 flags=3
port 3 ifpriority 0 ifcost 0
em3 flags=3
port 4 ifpriority 0 ifcost 0
vether0 flags=3
port 13 ifpriority 0 ifcost 0
Addresses (max cache: 100, timeout: 240):
[...]

# ifconfig vether0
vether0: flags=8943 mtu 1500
lladdr fe:e1:ba:d0:58:44
description: Internal LAN
index 13 priority 0 llprio 3
groups: vether
media: Ethernet autoselect
status: active
inet 192.168.0.1 netmask 0x broadcast 192.168.255.255
inet6 <> prefixlen 64 scopeid 0xd
inet6 <> prefixlen 64 pltime 573642 vltime 573642

# ifconfig em
em0: flags=8b43 mtu 1500
lladdr 
description: Internal LAN - Room1
index 1 priority 0 llprio 3
media: Ethernet autoselect (100baseTX full-duplex,rxpause,txpause)
status: active
em1: flags=8b43 mtu 1500
lladdr 
description: Internal LAN - Room2
index 2 priority 0 llprio 3
media: Ethernet autoselect (1000baseT full-duplex,master,rxpause,txpause)
status: active
em2: flags=8b43 mtu 1500
lladdr 
description: Internal LAN - Room3
index 3 priority 0 llprio 3
media: Ethernet autoselect (100baseTX full-duplex,rxpause,txpause)
status: active
em3: flags=8902 mtu 1500
lladdr 
index 4 priority 0 llprio 3
media: Ethernet autoselect (none)
status: no carrier


Nothing fancy, just basically a "software switch" for connecting
separate rooms in the house and adding a vether(4) interface for
convenience.

Also, everything else remains unchanged, i.e.: no additional services
running, no new software has been installed, no introduction of new pf
rules, other than replacing the interface names of course.

I'm using /etc/malloc.conf@ -> S on all my machines and have never had
any problems with it (and still don't anywhere else).


== pf(4)

pf(4) is used on the box, and the rules that have been changed are the
ones related to the interfaces mentioned above:

set skip on lo0
set skip on bridge0
pass on vether0 all flags S/SA allow-opts
pass on em0 all flags S/SA allow-opts
pass on em1 all flags S/SA allow-opts
pass on em2 all flags S/SA allow-opts
pass on em3 all