Hi Jack, so you would say the increasing number of "mbufs in use" and mbuf clusters in use is normal? The jumbo frames are 9k. I know they come from different pools; I checked those pools as well. The nmb limits are:
#cat loader.conf
# tuning network
hw.intr_storm_threshold=9000
kern.ipc.nmbclusters=262144
kern.ipc.nmbjumbop=262144
kern.ipc.nmbjumbo9=65536
kern.ipc.nmbjumbo16=32768

14-05-2013-14-09.txt:9246/4918/14164/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-15-09.txt:9256/4856/14112/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-16-09.txt:9266/4846/14112/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-17-09.txt:9276/4836/14112/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-18-09.txt:9286/4826/14112/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-19-09.txt:9296/4734/14030/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-20-09.txt:9306/4724/14030/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-21-09.txt:9316/4714/14030/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-22-09.txt:9326/4704/14030/262144 mbuf clusters in use (current/cache/total/max)
14-05-2013-23-09.txt:9336/4694/14030/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-00-09.txt:9346/4684/14030/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-01-09.txt:9356/4674/14030/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-02-09.txt:9366/4664/14030/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-03-09.txt:9379/4279/13658/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-04-09.txt:9384/4086/13470/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-05-09.txt:9394/4076/13470/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-06-09.txt:9404/4066/13470/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-07-09.txt:9414/5040/14454/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-08-09.txt:9424/5030/14454/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-09-09.txt:9434/4898/14332/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-10-09.txt:9444/4850/14294/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-11-09.txt:9454/5000/14454/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-12-09.txt:9464/4874/14338/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-13-09.txt:9474/4856/14330/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-14-09.txt:17674/4460/22134/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-15-09.txt:17684/4450/22134/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-16-09.txt:17694/4696/22390/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-17-09.txt:17704/4686/22390/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-18-09.txt:17714/4658/22372/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-19-09.txt:17724/4648/22372/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-20-09.txt:17734/4638/22372/262144 mbuf clusters in use (current/cache/total/max)
15-05-2013-21-09.txt:17744/4628/22372/262144 mbuf clusters in use (current/cache/total/max)

Please see the link to http://knownhosts.org/reports-14-15.tgz in my original post; it contains the full information, including the 9k jumbo frame pools.
It's driver version 2.4.8, which should be straight from 9.1-RELEASE.
Yes, Twinax is correct. I'll replace the driver with the latest one.

best regards and thanks,
dennis

On 15.05.2013 at 19:00, Jack Vogel wrote:
> So, you stop getting 10G transmission and so you are looking at mbuf leaks? I don't see
> anything in your data that makes it look like you've run out of available mbufs. You said
> you're running jumbos, what size? You do realize that if you do this the clusters are coming
> from different pools and you are not displaying those. What are all your nmb limits set to?
>
> So, this is 9.1 RELEASE, or stable? If you are using the driver from release I would first
> off suggest you test the code from HEAD.
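For anyone reading along with the same pool question: the per-zone jumbo usage is already in the hourly vmstat -z reports and just needs to be pulled out per zone. A minimal sketch against a fabricated sample report (the zone names and column order match the mbuf_packet lines quoted further down; the paths and numbers here are invented stand-ins, not the real /root/memory-mon data):

```shell
#!/bin/sh
# Display every mbuf-related zone (including the 9k/16k jumbo pools) as
# "zone used/limit" from a saved `vmstat -z` report. The sample numbers
# below are invented stand-ins for a real hourly report file.
cat > /tmp/sample-report.txt <<'EOF'
mbuf_packet:            256,      0,   9246,   2786, 201700429,   0,   0
mbuf_cluster:          2048, 262144,  14164,   4918,    123456,   0,   0
mbuf_jumbo_page:       4096, 262144,    512,     64,      9999,   0,   0
mbuf_jumbo_9k:         9216,  65536,   8192,    128,     88888,   0,   0
mbuf_jumbo_16k:       16384,  32768,      0,      0,         0,   0,   0
EOF

# Fields after the zone name are size, limit, used, free, requests, ...
# so current usage is field 4 and the configured ceiling is field 3
# (a limit of 0 means unlimited).
awk -F'[:,]' '/^mbuf/ {
    gsub(/ /, "", $3); gsub(/ /, "", $4)
    printf "%s %s/%s\n", $1, $4, $3
}' /tmp/sample-report.txt > /tmp/mbuf-pools.out

cat /tmp/mbuf-pools.out
```

Run against the real reports, this makes it obvious at a glance whether the 9k pool (nmbjumbo9) is anywhere near its 65536 ceiling.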
> What is the 10G device? I see it's using Twinax, and I have been told there is at times a
> problem with those that is corrected in recent shared code; this is why you should try the
> latest code.
>
> Cheers,
>
> Jack
>
> On Wed, May 15, 2013 at 2:00 AM, dennis berger <d...@nipsi.de> wrote:
> Hi list,
> since we activated 10GbE on ixgbe cards + jumbo frames (9k) on 9.0, and now on 9.1, we have
> noticed that after a random period of time (sometimes a week, sometimes only a day) the
> system stops sending any packets out. The symptom is that you can't log in via ssh, and nfs
> and istgt are not operative. Yet you can log in on the console and execute commands.
> A clean shutdown isn't possible, though. It hangs after the vnode cleaning, where you would
> normally see USB and other devices detaching.
> I've read the other post on this ML about an mbuf leak in the ARP handling code in
> if_ether.c, line 558. We don't see any of those notices in dmesg, so I don't think that
> glebius' fix applies to us.
> I'm collecting system and memory information every hour.
>
> The script looks like this:
> less /etc/periodic/hourly/100.report-memory.sh
> #!/bin/sh
>
> reporttimestamp=`date +%d-%m-%Y-%H-%M`
> reportname=${reporttimestamp}.txt
>
> cd /root/memory-mon
>
> top -b > $reportname
> echo "" >> $reportname
> vmstat -m >> $reportname
> echo "" >> $reportname
> vmstat -z >> $reportname
> echo "" >> $reportname
> netstat -Q >> $reportname
> echo "" >> $reportname
> netstat -n -x >> $reportname
> echo "" >> $reportname
> netstat -m >> $reportname
> /usr/bin/perl /usr/local/bin/zfs-stats -a >> $reportname
>
> When you grep the reports for mbuf usage you will see this, for example:
>
> root@freenas:/root/memory-mon # grep mbuf_packet: *
> 14-05-2013-14-09.txt:mbuf_packet: 256, 0, 9246, 2786, 201700429, 0, 0
> 14-05-2013-15-09.txt:mbuf_packet: 256, 0, 9256, 2776, 201773122, 0, 0
> 14-05-2013-16-09.txt:mbuf_packet: 256, 0, 9266, 2766, 201871553, 0, 0
> 14-05-2013-17-09.txt:mbuf_packet: 256, 0, 9276, 2756, 201915405, 0, 0
> 14-05-2013-18-09.txt:mbuf_packet: 256, 0, 9286, 2746, 201927956, 0, 0
> 14-05-2013-19-09.txt:mbuf_packet: 256, 0, 9296, 2352, 201935681, 0, 0
> 14-05-2013-20-09.txt:mbuf_packet: 256, 0, 9306, 2342, 201943754, 0, 0
> 14-05-2013-21-09.txt:mbuf_packet: 256, 0, 9316, 2332, 201950961, 0, 0
> 14-05-2013-22-09.txt:mbuf_packet: 256, 0, 9326, 2450, 201958150, 0, 0
> 14-05-2013-23-09.txt:mbuf_packet: 256, 0, 9336, 2440, 201967178, 0, 0
> 15-05-2013-00-09.txt:mbuf_packet: 256, 0, 9346, 2430, 201974561, 0, 0
> 15-05-2013-01-09.txt:mbuf_packet: 256, 0, 9356, 2420, 201982105, 0, 0
> 15-05-2013-02-09.txt:mbuf_packet: 256, 0, 9366, 2410, 201989463, 0, 0
> 15-05-2013-03-09.txt:mbuf_packet: 256, 0, 9378, 1502, 203019168, 0, 0
> 15-05-2013-04-09.txt:mbuf_packet: 256, 0, 9384, 1624, 205953601, 0, 0
> 15-05-2013-05-09.txt:mbuf_packet: 256, 0, 9394, 1870, 205959258, 0, 0
> 15-05-2013-06-09.txt:mbuf_packet: 256, 0, 9404, 2500, 205969396, 0, 0
> 15-05-2013-07-09.txt:mbuf_packet: 256, 0, 9414, 3386, 207945161, 0, 0
> 15-05-2013-08-09.txt:mbuf_packet: 256, 0, 9424, 3376, 208094689, 0, 0
> 15-05-2013-09-09.txt:mbuf_packet: 256, 0, 9434, 2982, 208172465, 0, 0
> 15-05-2013-10-09.txt:mbuf_packet: 256, 0, 9444, 3100, 208270369, 0, 0
>
> and
>
> root@freenas:/root/memory-mon # grep "mbufs in use" *
> 14-05-2013-14-09.txt:58444/5816/64260 mbufs in use (current/cache/total)
> 14-05-2013-15-09.txt:58455/5805/64260 mbufs in use (current/cache/total)
> 14-05-2013-16-09.txt:58464/5796/64260 mbufs in use (current/cache/total)
> 14-05-2013-17-09.txt:58475/5785/64260 mbufs in use (current/cache/total)
> 14-05-2013-18-09.txt:58484/5776/64260 mbufs in use (current/cache/total)
> 14-05-2013-19-09.txt:58493/5767/64260 mbufs in use (current/cache/total)
> 14-05-2013-20-09.txt:58503/5757/64260 mbufs in use (current/cache/total)
> 14-05-2013-21-09.txt:58513/5747/64260 mbufs in use (current/cache/total)
> 14-05-2013-22-09.txt:58523/5737/64260 mbufs in use (current/cache/total)
> 14-05-2013-23-09.txt:58533/5727/64260 mbufs in use (current/cache/total)
> 15-05-2013-00-09.txt:58543/5717/64260 mbufs in use (current/cache/total)
> 15-05-2013-01-09.txt:58554/5706/64260 mbufs in use (current/cache/total)
> 15-05-2013-02-09.txt:58563/5697/64260 mbufs in use (current/cache/total)
> 15-05-2013-03-09.txt:58639/5621/64260 mbufs in use (current/cache/total)
> 15-05-2013-04-09.txt:58581/5679/64260 mbufs in use (current/cache/total)
> 15-05-2013-05-09.txt:58591/5669/64260 mbufs in use (current/cache/total)
> 15-05-2013-06-09.txt:58602/5658/64260 mbufs in use (current/cache/total)
> 15-05-2013-07-09.txt:58613/5647/64260 mbufs in use (current/cache/total)
> 15-05-2013-08-09.txt:58623/6027/64650 mbufs in use (current/cache/total)
> 15-05-2013-09-09.txt:58634/6016/64650 mbufs in use (current/cache/total)
> 15-05-2013-10-09.txt:58645/6005/64650 mbufs in use (current/cache/total)
>
> This increasing number of used mbuf_packets and mbufs in use makes me nervous.
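The trend is easier to judge than by eyeballing the grep output if the hour-over-hour growth of the "current" column is computed directly from those lines. A minimal sketch using the first few report lines quoted above (the field split is an assumption based on the netstat -m line format shown there):

```shell
#!/bin/sh
# Compute the hour-over-hour growth of the "current" mbufs-in-use counter
# from the grep output above, to show a steady climb rather than a spike.
# Sample lines are copied verbatim from the quoted report.
cat > /tmp/mbufs-in-use.txt <<'EOF'
14-05-2013-14-09.txt:58444/5816/64260 mbufs in use (current/cache/total)
14-05-2013-15-09.txt:58455/5805/64260 mbufs in use (current/cache/total)
14-05-2013-16-09.txt:58464/5796/64260 mbufs in use (current/cache/total)
14-05-2013-17-09.txt:58475/5785/64260 mbufs in use (current/cache/total)
EOF

# "current" sits between the ':' and the first '/'; print each hour's delta.
awk -F'[:/]' 'NR > 1 { printf "%s +%d\n", $1, $2 - prev } { prev = $2 }' \
    /tmp/mbufs-in-use.txt > /tmp/mbuf-deltas.out

cat /tmp/mbuf-deltas.out
```

On these four samples it prints deltas of +11, +9, and +11 mbufs per hour, matching the roughly ten-per-hour climb visible across the whole series.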
> See the complete reports at http://knownhosts.org/reports-14-15.tgz
>
> Thanks for help,
>
> -dennis
>
> --------------BEGIN System information---------------
> It's a stock FreeBSD 9.1, yet the hostname is freenas. Don't be confused.
>
> igb0: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
> options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
> ether 00:25:90:34:c1:12
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> media: Ethernet autoselect (1000baseT <full-duplex>)
> status: active
> igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
> ether 00:25:90:34:c1:13
> inet 172.16.1.6 netmask 0xfffff000 broadcast 172.16.15.255
> inet6 fe80::225:90ff:fe34:c113%igb1 prefixlen 64 scopeid 0x2
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> media: Ethernet autoselect (1000baseT <full-duplex>)
> status: active
> ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
> options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
> ether 00:1b:21:cc:12:8b
> inet 10.254.254.242 netmask 0xfffffffc broadcast 10.254.254.243
> inet6 fe80::21b:21ff:fecc:128b%ix0 prefixlen 64 scopeid 0xb
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
> status: active
> ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
> options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
> ether 00:1b:21:cc:12:8a
> inet 10.254.254.254 netmask 0xfffffffc broadcast 10.254.254.255
> inet6 fe80::21b:21ff:fecc:128a%ix1 prefixlen 64 scopeid 0xc
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
> status: active
> ix2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
> options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
> ether 00:1b:21:cc:12:b3
> inet 10.254.254.246 netmask 0xfffffffc broadcast 10.254.254.247
> inet6 fe80::21b:21ff:fecc:12b3%ix2 prefixlen 64 scopeid 0xd
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> media: Ethernet autoselect
> status: no carrier
> ix3: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
> options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
> ether 00:1b:21:cc:12:b2
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> media: Ethernet autoselect
> status: no carrier
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
> options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
> inet6 ::1 prefixlen 128
> inet6 fe80::1%lo0 prefixlen 64 scopeid 0xf
> inet 127.0.0.1 netmask 0xff000000
> nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>
> #dmesg
> [...]
> mfi0: 21294 (421879975s/0x0008/info) - Battery started charging
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> ix1: link state changed to DOWN
> ix1: link state changed to UP
> [... the same DOWN/UP pair repeated many more times ...]
>
> I should add that the servers that are directly connected to this FreeBSD server reboot
> every night. This is why you see the ix1 UP/DOWN messages in dmesg.
>
> ------------- END System information------------
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Dipl.-Inform. (FH) Dennis Berger
email: d...@bsdsystems.de
mobile: +491791231509
fon: +494054001817