Re: mbuf leak in OpenBSD 3.6?

Brad Thu, 09 Jun 2005 18:56:31 -0700

You could try actually looking at the errata list first...

006: RELIABILITY FIX: November 21, 2004   All architectures
Fix for transmit side breakage on macppc and mbuf leaks with xl(4).


I wonder what that is for..

On Thu, Jun 09, 2005 at 09:10:17PM -0500, R Ginn wrote:
> Hi,
> OpenBSD 3.6 (I'm running i386) seems to have a memory leak with regard
> to its use of mbufs for network traffic.  The default number of
> mbuf clusters (kern.maxclusters) is fine until I run a series of
> dump commands to a tape drive on a remote system.  After the dump
> completes, the number of mbufs in use remains high.  Each time I
> run another dump, the number climbs.  Soon I run out of them and
> the system locks all ethernet traffic (which hangs all the other
> systems depending on this one).  Increasing the kern.maxclusters
> at this point unlocks the system (although the dump terminates at
> that point).
> 
> Fortunately, when it hangs, it spits out a message to indicate
> that it ran out of mbuf clusters and to increase kern.maxclusters.
> BTW, kudos to whoever put that message and suggestion in; it is
> a great/necessary feature that is so often missing in products.
> 
> Note that after the dump completes, there are no extra processes
> left (the # of processes before I run the rdump = the # of processes
> after the rdump completes).
> 
> I checked with ipcs to see if dump was using any shared memory
> but, as expected, it doesn't and there weren't any segments in use.
> 
> Here is the dump command being used:
> 
>   dump 0udbsf 54000 64 96000 [EMAIL PROTECTED]:/dev/nrst0 /
> 
> Before the dump, 40 mbufs and 33 mbuf clusters are in use.
> After the dump, 437 mbufs and 146 mbuf clusters are in use.
> Before a 2nd dump, 438 mbufs and 148 mbuf clusters are in use.
> After a 2nd dump, 4329 mbufs and 1197 mbuf clusters are in use.
> Before a 3rd dump, 4330 mbufs and 1199 mbuf clusters are in use.
> After a 3rd dump, 8545 mbufs and 2325 mbuf clusters are in use.
> 
> BTW, the first dump here is for "/" and the 2nd dump is
> for "/usr" ("/usr" is about 10x bigger than "/").  To
> eliminate the case where the issue is just the highwater
> mark, the 3rd dump above is an identical dump of "/usr".
> 
> So, since neither dump nor anything else extra is running after the
> dump completes, I don't know why the system is "using" more mbufs
> after it completes its dump.
> 
> I noticed that a wireless driver had an mbuf leak.  So, in case
> it's relevant, I am using the xl(4) ethernet driver.
> 
> So, is this a memory/mbuf leak in the kernel?  Am I doing something
> wrong?  Is there anything I can do to "clean up" after each dump?
> My current work-around is to set a very large (40,000) maxclusters
> value and reboot the system after each set of dumps but that really
> rubs me the wrong way -- this is a UNIX(y 8-) system after all ...
> 
> I've provided some traces below.
> 
> Thanks,
> Rob Ginn
> [EMAIL PROTECTED]
> 
> BEFORE I run a remote dump (but after a reboot)
> ===============================================
> 
> Script started on Thu Jun  9 16:14:17 2005
> demo# netstat -m
> 40 mbufs in use:
>       35 mbufs allocated to data
>       1 mbuf allocated to packet headers
>       4 mbufs allocated to socket names and addresses
> 33/46/40000 mbuf clusters in use (current/peak/max)
> 112 Kbytes allocated to network (67% in use)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
> demo# ps xa
>   PID TT   STAT      TIME COMMAND
>     1 ??  Is      0:00.04 /sbin/init 
> 21191 ??  Is      0:00.03 syslogd: [priv] (syslogd)
> 26414 ??  I       0:00.09 syslogd -a /var/empty/dev/log 
> 30515 ??  Is      0:00.01 pflogd: [priv] (pflogd)
> 16850 ??  Is      0:00.01 portmap 
> 28016 ??  I       0:00.32 pflogd: [running] -s 116 -f /var/log/pflog (pflogd)
> 10782 ??  I       0:00.05 ypserv 
>  4580 ??  Is      0:00.30 ypbind 
> 26954 ??  Is      0:00.01 mountd 
> 20553 ??  Is      0:00.01 nfsd: master (nfsd)
> 11934 ??  IL      0:00.00 nfsd: server (nfsd)
> 14637 ??  IL      0:00.40 nfsd: server (nfsd)
>  6754 ??  IL      0:00.00 nfsd: server (nfsd)
> 15064 ??  IL      0:00.00 nfsd: server (nfsd)
> 16771 ??  Is      0:00.00 rpc.lockd 
> 20629 ??  Is      0:00.07 /usr/sbin/dhcpd xl0 
>  3712 ??  Is      0:00.01 lpd 
> 26612 ??  Is      0:00.02 inetd 
> 21469 ??  Is      0:00.42 sendmail: accepting connections (sendmail)
> 24532 ??  Is      0:00.17 /usr/sbin/sshd 
> 14769 ??  I       0:00.01 rarpd -a 
> 25583 ??  Is      0:00.01 rpc.bootparamd 
> 13440 ??  Is      0:00.01 mopd -a 
> 10486 ??  Is      0:00.00 /usr/local/adm/bin/rpc.statd 
> 23664 ??  Is      0:00.04 cron 
> 27922 p0  Is      0:00.02 -bin/csh -i 
> 17109 p0  ?+      0:00.00 ps -xa 
> 12055 C0  Is      0:00.07 -csh (csh)
> 31440 C0  I+      0:00.01 script BEFORE 
> 20480 C0  I+      0:00.01 script BEFORE 
> 29807 C1  Is+     0:00.01 /usr/libexec/getty Pc ttyC1 
>  5065 C2  Is+     0:00.01 /usr/libexec/getty Pc ttyC2 
>  6641 C3  Is+     0:00.01 /usr/libexec/getty Pc ttyC3 
> 23297 C5  Is+     0:00.01 /usr/libexec/getty Pc ttyC5 
> 
> 
> AFTER I run a remote dump
> =========================
> 
> Script started on Thu Jun  9 16:16:12 2005
> demo# netstat -m
> 437 mbufs in use:
>       232 mbufs allocated to data
>       201 mbufs allocated to packet headers
>       4 mbufs allocated to socket names and addresses
> 146/188/40000 mbuf clusters in use (current/peak/max)
> 516 Kbytes allocated to network (77% in use)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
> demo# ps xa
>   PID TT   STAT      TIME COMMAND
>     1 ??  Is      0:00.04 /sbin/init 
> 21191 ??  Is      0:00.03 syslogd: [priv] (syslogd)
> 26414 ??  I       0:00.10 syslogd -a /var/empty/dev/log 
> 30515 ??  Is      0:00.01 pflogd: [priv] (pflogd)
> 16850 ??  Is      0:00.01 portmap 
> 28016 ??  I       0:00.32 pflogd: [running] -s 116 -f /var/log/pflog (pflogd)
> 10782 ??  I       0:00.05 ypserv 
>  4580 ??  Is      0:00.31 ypbind 
> 26954 ??  Is      0:00.01 mountd 
> 20553 ??  Is      0:00.01 nfsd: master (nfsd)
> 11934 ??  IL      0:00.00 nfsd: server (nfsd)
> 14637 ??  IL      0:00.41 nfsd: server (nfsd)
>  6754 ??  IL      0:00.00 nfsd: server (nfsd)
> 15064 ??  IL      0:00.00 nfsd: server (nfsd)
> 16771 ??  Is      0:00.00 rpc.lockd 
> 20629 ??  Is      0:00.07 /usr/sbin/dhcpd xl0 
>  3712 ??  Is      0:00.01 lpd 
> 26612 ??  Is      0:00.02 inetd 
> 21469 ??  Is      0:00.42 sendmail: accepting connections (sendmail)
> 24532 ??  Is      0:00.17 /usr/sbin/sshd 
> 14769 ??  I       0:00.01 rarpd -a 
> 25583 ??  Is      0:00.01 rpc.bootparamd 
> 13440 ??  Is      0:00.01 mopd -a 
> 10486 ??  Is      0:00.00 /usr/local/adm/bin/rpc.statd 
> 23664 ??  Is      0:00.04 cron 
> 24790 p0  Is      0:00.02 -bin/csh -i 
>  6134 p0  ?+      0:00.00 ps -xa 
> 12055 C0  Is      0:00.08 -csh (csh)
> 13116 C0  I+      0:00.01 script AFTER 
> 27577 C0  I+      0:00.01 script AFTER 
> 29807 C1  Is+     0:00.01 /usr/libexec/getty Pc ttyC1 
>  5065 C2  Is+     0:00.01 /usr/libexec/getty Pc ttyC2 
>  6641 C3  Is+     0:00.01 /usr/libexec/getty Pc ttyC3 
> 23297 C5  Is+     0:00.01 /usr/libexec/getty Pc ttyC5
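
The before/after comparison in the traces above can be scripted rather than read off by eye. What follows is only a minimal sketch in Python, assuming the `netstat -m` layout shown in the quoted output; the helper names `parse_netstat_m` and `growth` are made up for illustration, and the sample strings are trimmed copies of the BEFORE and AFTER traces.

```python
import re

def parse_netstat_m(text):
    """Pull the mbuf and cluster counters out of `netstat -m` output.

    Assumes the layout shown in the quoted traces:
      "<n> mbufs in use:" and "<cur>/<peak>/<max> mbuf clusters in use".
    """
    mbufs = re.search(r"^\s*(\d+) mbufs? in use", text, re.MULTILINE)
    clusters = re.search(
        r"^\s*(\d+)/(\d+)/(\d+) mbuf clusters in use", text, re.MULTILINE
    )
    if not mbufs or not clusters:
        raise ValueError("unrecognized netstat -m output")
    return {
        "mbufs": int(mbufs.group(1)),
        "clusters": int(clusters.group(1)),
        "peak": int(clusters.group(2)),
        "max": int(clusters.group(3)),
    }

def growth(before, after):
    """Difference in in-use mbufs and clusters between two snapshots."""
    return {key: after[key] - before[key] for key in ("mbufs", "clusters")}

# Trimmed copies of the two traces quoted above.
BEFORE = """40 mbufs in use:
      35 mbufs allocated to data
33/46/40000 mbuf clusters in use (current/peak/max)
"""
AFTER = """437 mbufs in use:
      232 mbufs allocated to data
146/188/40000 mbuf clusters in use (current/peak/max)
"""

delta = growth(parse_netstat_m(BEFORE), parse_netstat_m(AFTER))
print(delta)
```

Taking a snapshot before and after each dump and comparing the deltas shows whether the in-use counts fall back once the transfer is done or keep climbing, which is the pattern reported in the quoted message.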
