Hi list,
since we enabled 10GbE on ixgbe cards with jumbo frames (9k), first on 9.0 and
now on 9.1, we have noticed that after a random period of time, sometimes a
week, sometimes only a day, the system stops sending any packets. The symptom
is that you can't log in via ssh, and nfs and istgt are not operational. You
can still log in on the console and execute commands, though.
A clean shutdown isn't possible either. It hangs after vnode cleaning, where
you would normally see USB devices (or maybe other devices?) being detached.
I've read the other post on this ML about an mbuf leak in the ARP handling
code in if_ether.c, line 558. We don't see any of those notices in dmesg, so
I don't think that glebius' fix would apply to us.
I'm collecting system and memory information every hour.


The script looks like this:
less /etc/periodic/hourly/100.report-memory.sh 
#!/bin/sh

# One report file per run, named after the current date and time.
reporttimestamp=$(date +%d-%m-%Y-%H-%M)
reportname="${reporttimestamp}.txt"

# Abort if the report directory is missing.
cd /root/memory-mon || exit 1

top -b > "$reportname"
echo "" >> "$reportname"
vmstat -m >> "$reportname"
echo "" >> "$reportname"
vmstat -z >> "$reportname"
echo "" >> "$reportname"
netstat -Q >> "$reportname"
echo "" >> "$reportname"
netstat -n -x >> "$reportname"
echo "" >> "$reportname"
netstat -m >> "$reportname"
/usr/bin/perl /usr/local/bin/zfs-stats -a >> "$reportname"

When you grep the reports for mbuf_packet or mbuf usage, you see, for example:

root@freenas:/root/memory-mon # grep mbuf_packet: *
14-05-2013-14-09.txt:mbuf_packet:            256,      0,    9246,    2786,201700429,   0,   0
14-05-2013-15-09.txt:mbuf_packet:            256,      0,    9256,    2776,201773122,   0,   0
14-05-2013-16-09.txt:mbuf_packet:            256,      0,    9266,    2766,201871553,   0,   0
14-05-2013-17-09.txt:mbuf_packet:            256,      0,    9276,    2756,201915405,   0,   0
14-05-2013-18-09.txt:mbuf_packet:            256,      0,    9286,    2746,201927956,   0,   0
14-05-2013-19-09.txt:mbuf_packet:            256,      0,    9296,    2352,201935681,   0,   0
14-05-2013-20-09.txt:mbuf_packet:            256,      0,    9306,    2342,201943754,   0,   0
14-05-2013-21-09.txt:mbuf_packet:            256,      0,    9316,    2332,201950961,   0,   0
14-05-2013-22-09.txt:mbuf_packet:            256,      0,    9326,    2450,201958150,   0,   0
14-05-2013-23-09.txt:mbuf_packet:            256,      0,    9336,    2440,201967178,   0,   0
15-05-2013-00-09.txt:mbuf_packet:            256,      0,    9346,    2430,201974561,   0,   0
15-05-2013-01-09.txt:mbuf_packet:            256,      0,    9356,    2420,201982105,   0,   0
15-05-2013-02-09.txt:mbuf_packet:            256,      0,    9366,    2410,201989463,   0,   0
15-05-2013-03-09.txt:mbuf_packet:            256,      0,    9378,    1502,203019168,   0,   0
15-05-2013-04-09.txt:mbuf_packet:            256,      0,    9384,    1624,205953601,   0,   0
15-05-2013-05-09.txt:mbuf_packet:            256,      0,    9394,    1870,205959258,   0,   0
15-05-2013-06-09.txt:mbuf_packet:            256,      0,    9404,    2500,205969396,   0,   0
15-05-2013-07-09.txt:mbuf_packet:            256,      0,    9414,    3386,207945161,   0,   0
15-05-2013-08-09.txt:mbuf_packet:            256,      0,    9424,    3376,208094689,   0,   0
15-05-2013-09-09.txt:mbuf_packet:            256,      0,    9434,    2982,208172465,   0,   0
15-05-2013-10-09.txt:mbuf_packet:            256,      0,    9444,    3100,208270369,   0,   0

and

root@freenas:/root/memory-mon # grep "mbufs in use" *
14-05-2013-14-09.txt:58444/5816/64260 mbufs in use (current/cache/total)
14-05-2013-15-09.txt:58455/5805/64260 mbufs in use (current/cache/total)
14-05-2013-16-09.txt:58464/5796/64260 mbufs in use (current/cache/total)
14-05-2013-17-09.txt:58475/5785/64260 mbufs in use (current/cache/total)
14-05-2013-18-09.txt:58484/5776/64260 mbufs in use (current/cache/total)
14-05-2013-19-09.txt:58493/5767/64260 mbufs in use (current/cache/total)
14-05-2013-20-09.txt:58503/5757/64260 mbufs in use (current/cache/total)
14-05-2013-21-09.txt:58513/5747/64260 mbufs in use (current/cache/total)
14-05-2013-22-09.txt:58523/5737/64260 mbufs in use (current/cache/total)
14-05-2013-23-09.txt:58533/5727/64260 mbufs in use (current/cache/total)
15-05-2013-00-09.txt:58543/5717/64260 mbufs in use (current/cache/total)
15-05-2013-01-09.txt:58554/5706/64260 mbufs in use (current/cache/total)
15-05-2013-02-09.txt:58563/5697/64260 mbufs in use (current/cache/total)
15-05-2013-03-09.txt:58639/5621/64260 mbufs in use (current/cache/total)
15-05-2013-04-09.txt:58581/5679/64260 mbufs in use (current/cache/total)
15-05-2013-05-09.txt:58591/5669/64260 mbufs in use (current/cache/total)
15-05-2013-06-09.txt:58602/5658/64260 mbufs in use (current/cache/total)
15-05-2013-07-09.txt:58613/5647/64260 mbufs in use (current/cache/total)
15-05-2013-08-09.txt:58623/6027/64650 mbufs in use (current/cache/total)
15-05-2013-09-09.txt:58634/6016/64650 mbufs in use (current/cache/total)
15-05-2013-10-09.txt:58645/6005/64650 mbufs in use (current/cache/total)


The steadily increasing number of used mbuf_packets and mbufs in use (roughly
10 per hour in both cases) makes me nervous.
The complete reports are at http://knownhosts.org:/reports-14-15.tgz
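
To put a number on that growth, here is a minimal sketch (not part of the
hourly script above) that reads the output of grep "mbufs in use" on stdin
and prints the average increase of the "current" value per report. The
function name mbuf_growth is made up for illustration. It assumes one report
per hour and that the lines arrive in chronological order, which the
dd-mm-yyyy file names only guarantee within a single month.

```shell
#!/bin/sh
# Hypothetical helper: average growth of "current" mbufs between reports.
# Input lines look like:
#   14-05-2013-14-09.txt:58444/5816/64260 mbufs in use (current/cache/total)
mbuf_growth() {
    awk -F'[:/ ]' '
        # Splitting on ":", "/" and " " makes field 2 the "current" count.
        NR == 1 { first = $2 }
                { last = $2 }
        END {
            if (NR < 2) { print "need at least two reports"; exit 1 }
            printf "%d mbufs over %d intervals (%.1f per interval)\n",
                   last - first, NR - 1, (last - first) / (NR - 1)
        }'
}
```

Usage, from the report directory: grep "mbufs in use" * | mbuf_growth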

Thanks for help,

-dennis



--------------BEGIN System information---------------
It's a stock FreeBSD 9.1, although the hostname is freenas. Don't let that
confuse you.


igb0: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:25:90:34:c1:12
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
igb1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:25:90:34:c1:13
        inet 172.16.1.6 netmask 0xfffff000 broadcast 172.16.15.255
        inet6 fe80::225:90ff:fe34:c113%igb1 prefixlen 64 scopeid 0x2 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:1b:21:cc:12:8b
        inet 10.254.254.242 netmask 0xfffffffc broadcast 10.254.254.243
        inet6 fe80::21b:21ff:fecc:128b%ix0 prefixlen 64 scopeid 0xb 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
        status: active
ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:1b:21:cc:12:8a
        inet 10.254.254.254 netmask 0xfffffffc broadcast 10.254.254.255
        inet6 fe80::21b:21ff:fecc:128a%ix1 prefixlen 64 scopeid 0xc 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-Twinax <full-duplex>)
        status: active
ix2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:1b:21:cc:12:b3
        inet 10.254.254.246 netmask 0xfffffffc broadcast 10.254.254.247
        inet6 fe80::21b:21ff:fecc:12b3%ix2 prefixlen 64 scopeid 0xd 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
ix3: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO>
        ether 00:1b:21:cc:12:b2
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128 
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0xf 
        inet 127.0.0.1 netmask 0xff000000 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

#dmesg
…..
mfi0: 21294 (421879975s/0x0008/info) - Battery started charging
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP
ix1: link state changed to DOWN
ix1: link state changed to UP


I should add that the servers that are directly connected to this FreeBSD
server reboot every night. This is why you see the repeated link state
UP/DOWN messages in dmesg.

Attachment: dmesg.boot
Description: Binary data




------------- END System information------------

