Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-20 Thread Terry Lambert
M. Warner Losh wrote:
 :  : + if (sops)
 :  : + free(sops, M_SEM);
 : 
 :  The kernel free() groks free(NULL, M_FOO), so the if isn't needed.
 :
 : Wow.  That's bogus.  It should panic.
 
 It isn't bogus.  free(NULL) is defined to be OK in ansi-c.  The kernel
 just mirrors that.

The free(NULL) in ANSI C is to permit invocation of the garbage
collector; there are very specific semantics involved.  Specifically,
if you do not call free(NULL), you are *guaranteed* that a malloc()
followed by a free() followed by a subsequent malloc(), if the size
of the area allocated by the subsequent malloc() is less than or
equal to the size of the area freed, *will not fail*.

Of course, FreeBSD is a memory overcommit system, and fails to
maintain this guarantee, as required by the standard (e.g. only
do garbage collection when it is signalled that it is OK for a
subsequent re-malloc() to fail, because the GC'ed memory has
been released to the system).

This is OK; we all realize that the standard, which permits a NULL
argument to free(), allows this value for reasons of compatibility
with historical source code.


But that begs the question: does the kernel interface also allow
it for the purposes of compatibility with legacy code?  This seems
unlikely in the extreme.

Does the kernel interface use this as a trigger, as the user space
interface historically did, to perform garbage collection?  This
also seems unlikely.

Does it do it so that people can write code that doesn't check
return values, and get away with it when they shouldn't?  This
seems highly likely.


 : Or we should fix all of libc to take NULL arguments for strings,
 : and treat them as if they were actually .
 
 That's bogus.

I agree that it's bogus, but it's the same argument in user space
as in kernel space.  Actually, it's not the same: the kernel
argument is much poorer, not having legacy code it needs to support.

In user space, there is plenty of legacy code that acts this way;
in fact, one could trap a zero dereference (one does; one just
faults on it, currently), map a page full of zeros at page zero,
and then a dereference would in fact be giving a pointer to a NULL
string.

SVR4 does this, as a kernel option for compatibility with legacy
software.  It is tunable to be able to turn it off, and you can
not only run legacy software which will not run in FreeBSD ABI
compatibility (hardly compatible, that...), but you can know
from the memory map of the process, as examined through /proc,
that a NULL dereference has occurred.

So it should arguably be controllable via sysctl, minimally for
IBCS2 and similar ABI modules, for user space.

But it's still unjustified in kernel space.

And panicking on attempts to free NULL pointers would be a nice
way of avoiding cascade failures later on, and would keep the problem
from being hidden a long way away from its effect.
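
As a rough illustration of the fail-fast behaviour argued for here, a
user-space sketch follows.  It is not kernel code: strict_free(),
strict_free_fatal and strict_free_nulls are hypothetical names invented
for the example, and abort() merely stands in for panic().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical knobs, invented for this sketch. */
static int strict_free_fatal = 1;        /* 1: abort on NULL, 0: only count */
static unsigned long strict_free_nulls;  /* NULL frees seen so far */

/*
 * Free p, treating a NULL argument as a caller bug instead of a no-op.
 * "where" names the call site so the report points at the cause.
 */
static void
strict_free(void *p, const char *where)
{
    if (p == NULL) {
        strict_free_nulls++;
        fprintf(stderr, "strict_free: NULL freed at %s\n", where);
        if (strict_free_fatal)
            abort();    /* stand-in for panic() */
        return;
    }
    free(p);
}
```

With strict_free_fatal left at 1, the first NULL free stops the program
at the call site rather than somewhere downstream.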

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-20 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Terry Lambert [EMAIL PROTECTED] writes:
: M. Warner Losh wrote:
:  :  : + if (sops)
:  :  : + free(sops, M_SEM);
:  : 
:  :  The kernel free() groks free(NULL, M_FOO), so the if isn't needed.
:  :
:  : Wow.  That's bogus.  It should panic.
:  
:  It isn't bogus.  free(NULL) is defined to be OK in ansi-c.  The kernel
:  just mirrors that.
: 
: The free(NULL) in ANSI C is to permit invocation of the garbage
: collector; there are very specific semantics involved.  Specifically,
: if you do not call free(NULL), you are *guaranteed* that a malloc()
: followed by a free() followed by a subsequent malloc(), if the size
: of the area allocated by the subsequent malloc() is less than or
: equal to the size of the area freed, *will not fail*.

C99 just says:

section 7.20.3.2:
   [#2] The free function causes the space pointed to by ptr to
   be  deallocated,  that  is,  made  available   for   further
   allocation.   If  ptr  is  a null pointer, no action occurs.
   Otherwise, if the argument does not match a pointer  earlier
   returned  by  the calloc, malloc, or realloc function, or if
   the space has been deallocated by a call to free or realloc,
   the behavior is undefined.

"If ptr is a null pointer, no action occurs" doesn't sound like GC to
me.  Like I said, free(NULL) is well defined, and unambiguous, in ansi
c 99.  In fact, I see nothing in the final c99 spec that even comes
close to what you are talking about.  Maybe c89 did that (I'm too lazy
to walk down stairs and find it), but it too is irrelevant as the base
system moves towards c99 compliance.

The rest is bogus too.  free(NULL, M_FOO) is well defined in the
kernel and does the right thing and likely isn't going to change.
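
For what it's worth, the C99 wording quoted above can be relied on
directly in user space.  A minimal sketch, where release_optional() is a
made-up helper name and not anything from the tree:

```c
#include <stdlib.h>

/*
 * Release a buffer that may or may not have been allocated.
 * Relies on free(NULL) being a no-op, as C99 7.20.3.2 says.
 */
static void
release_optional(char **bufp)
{
    free(*bufp);    /* fine even when *bufp is NULL */
    *bufp = NULL;   /* a second release then hits the no-op case */
}
```

No NULL check is needed before the free() call, which is the same
reasoning that makes the "if (sops)" test in the patch redundant.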

Warner




Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-19 Thread Ben Stuyts
At 04:15 19/10/2002, Alfred Perlstein wrote:

* Jake Burkholder [EMAIL PROTECTED] [021018 18:26] wrote:
 semop() leaks memory.  An important free() was removed by alfred in
 rev 1.55.  Try this.

Oh c'mon, isn't MP-safeness a bit more important than some
little memory leak? ram is cheap! processors aren't!

Seriously, I just checked in a slightly different fix (based on jake's
sleuthing)...  please let me know if it works for you guys.


Thanks Alfred, I am building the kernel right now with your fix. Later 
today I will let you know if this fixes the problems I've been seeing.

Kind regards,
Ben




Re: [Ugly PATCH] Again: panic kmem_malloc(): SOLVED

2002-10-19 Thread Ben Stuyts
At 13:34 19/10/2002, Ben Stuyts wrote:

At 04:15 19/10/2002, Alfred Perlstein wrote:

* Jake Burkholder [EMAIL PROTECTED] [021018 18:26] wrote:
 semop() leaks memory.  An important free() was removed by alfred in
 rev 1.55.  Try this.

Seriously, I just checked in a slightly different fix (based on jake's
sleuthing)...  please let me know if it works for you guys.


Thanks Alfred, I am building the kernel right now with your fix. Later 
today I will let you know if this fixes the problems I've been seeing.

Looks good, the machine has been running for a couple of hours now, and 
vmstat -m says:

  sem 4 7K  8K 3556  16,1024,4096

Before, it used to be 2 - 5 MB allocated, and qpopper/smbd no longer eat
memory each time they are invoked.

Many thanks!
Ben




Re: [Ugly PATCH] Again: panic kmem_malloc(): SOLVED

2002-10-19 Thread Alfred Perlstein
* Ben Stuyts [EMAIL PROTECTED] [021019 07:16] wrote:
 At 13:34 19/10/2002, Ben Stuyts wrote:
 At 04:15 19/10/2002, Alfred Perlstein wrote:
 * Jake Burkholder [EMAIL PROTECTED] [021018 18:26] wrote:
  semop() leaks memory.  An important free() was removed by alfred in
  rev 1.55.  Try this.
 
 Seriously, I just checked in a slightly different fix (based on jake's
 sleuthing)...  please let me know if it works for you guys.
 
 Thanks Alfred, I am building the kernel right now with your fix. Later 
 today I will let you know if this fixes the problems I've been seeing.
 
 Looks good, the machine has been running for a couple of hours now, and 
 vmstat -m says:
 
   sem 4 7K  8K 3556  16,1024,4096
 
 Before, it used to be 2 - 5 MB allocated, and qpopper/smbd no longer eat
 memory each time they are invoked.
 
 Many thanks!

Great!  Thanks for the bug report and my apologies for jumping down
your throat initially.  Best of luck to you.

-Alfred




Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-18 Thread Ben Stuyts
This is a repost. Forgive me if you see it twice, but it didn't turn up in 
the -current list.

Hi,

Just had another panic, same kmem_malloc(). I did a trace but forgot to 
write the traceback down. In any case, there was a semop() call in the 
traceback. Furthermore, this might be interesting: the last vmstat -m log 
before the panic. Maybe someone can check if these values are reasonable? 
The system has 64 MB memory and has been up for about 24 hrs with almost no 
load.

[terminus.stuyts.nl ben/bin]4: cat vmstatlog.4

Type  InUse MemUse HighUse Requests  Size(s)
 atkbddev 2 1K  1K2  32
   pfs_fileno 132K 32K1  32768
 nexusdev 2 1K  1K2  16
  memdesc 1 4K  4K1  4096
legacydrv 3 1K  1K3  16
VM pgdata 1 4K  4K1  4096
pfs_nodes20 3K  3K   20  128
MSDOSFS mount 1 8K  8K1  8192
UFS mount1223K 39K   14  256,2048,4096,16384
UFS ihash 116K 16K1  16384
  UFS dirhash   18936K 52K 1695  16,32,64,128,256,512
 FFS node 11086  2079K   2096K  1000908  128,256
newdirblk 0 0K  1K5  16
   dirrem 5 1K 34K18098  32
mkdir 0 0K  3K  524  32
   diradd 0 0K  9K18045  32
 freefile 5 1K 31K12022  32
 freeblks 7 2K247K10187  256
 freefrag 2 1K  2K36405  32
   allocindir 4 1K204K   257072  64
 indirdep 2 1K876K 2172  32,8192
  allocdirect 1 1K 33K51562  128
bmsafemap 8 1K  3K 6606  32
   newblk 1 1K  1K   308635  64,256
 inodedep1218K168K31012  128,16384
  pagedep10 3K  7K 6114  64,2048
 p1003.1b 1 1K  1K1  16
   NFS daemon 5 3K  3K5  256,512
  NFS srvsock 2 1K  1K2  128
 ip6_moptions 1 1K  1K1  16
in6_multi10 1K  1K   10  16,64
 syncache 1 8K  8K1  8192
  IpFw/IpAcct30 4K  4K   30  64,128
 in_multi 2 1K  1K2  32
 routetbl41 6K  6K   78  16,32,64,128,256
   lo 1 1K  1K1  512
clone 312K 12K3  4096
  ether_multi35 2K  2K   35  16,32,64
   ifaddr22 7K  7K   22  32,256,512,2048
  BPF 6 9K  9K6  128,256,4096
mount20 4K  4K   24  16,32,128,512
   vnodes23 6K  6K  137  16,32,64,128,256
cluster_save buffer 0 0K  1K 9793  32,64
 vfscache  5226   381K436K   534189  64,128,256,32768
   BIO buffer 810K317K 4611  512,1024,2048
DEVFS   12122K 22K  121  16,32,128,8192
  pcb38 5K  6K 1913  16,32,64,2048
   soname 4 1K  1K39624  16,32,128
 ptys 2 1K  1K2  512
 ttys   48865K 85K 6431  128,512
  shm 318K 19K9  16,1024,16384
  sem344456  5390K   5390K   344456  16,1024,4096
  msg 425K 25K4  512,4096,16384
 ioctlops 0 0K  1K   22  512,1024
   USBdev 1 1K  2K4  128,512
  USB1521K 22K   701353  16,32,128,256,4096
taskqueue 1 1K  1K1  128
 sbuf 0 0K  5K   34  32,64,4096
 rman99 7K  7K  496  16,64,128
  mbufmgr   11616K 16K  116  32,64,128,2048,8192
 kobj   127   508K508K  127  4096
 eventhandler22 2K  2K   22  32,128
  bus   47039K 40K 1363 
16,32,64,128,256,512,2048,4096,8192
 SWAP 273K 73K2  64
sysctltmp 0 0K  4K   802428  16,32,64,128,256,512,1024,4096
   sysctl 0 0K  1K19359  16,32,64
  uidinfo 7 1K  1K 7121  32,128
 cred38 5K  9K   142297  128
  subproc   10511K 15K52100  64,256
 proc 2 1K  1K2  512
  session33 5K  6K 2102  128
 pgrp39 5K  6K 2278  128
   module   17111K 11K  171  64
   ip6ndp 3 1K  1K4  64,128,512
 temp 954K286K   156410 
16,32,64,128,256,512,1024,2048,4096,8192,16384,32768
   devbuf   474   965K997K 2333 
16,32,64,128,256,512,1024,2048,4096,8192,32768
lockf 6 1K  1K29935  64
   feeder48 1K  1K   48  16
   linker6513K 18K   85  16,32,256,1024,4096,8192
   KTRACE   10013K 13K  100  128
  ithread40 7K  7K   

Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-18 Thread Ben Stuyts
Terry,

At 23:07 18/10/2002, you wrote:

Ben Stuyts wrote:
 Furthermore, this might be interesting: the last vmstat -m log
 before the panic. Maybe someone can check if these values are reasonable?
 The system has 64 MB memory and has been up for about 24 hrs with almost no
 load.
sem 344456  5390K   5390K   344456  16,1024,4096

Almost 5.3M of unswappable physical memory dedicated to semaphores
seems like a bit much.


Yes, and it increases continuously, for example when I fetch new mail (over 
pop) from my windows pc. The pc stores this again on a network drive, so 
both qpopper and smbd are involved. For example, vmstat -m says:

vmstat -m | grep sem
  sem 155886  2443K   2443K   155886  16,1024,4096

Now when I do a fetch-mail with Eudora on my pc, the same command says.

vmstat -m | grep sem
  sem 156178  2448K   2448K   156178  16,1024,4096

I can repeat this at will, and each time I lose 4-5 KB. qpopper is started
from inetd, and smbd runs as a daemon. I tried stopping smbd:

[terminus.stuyts.nl etc/rc.d]90: sudo /usr/local/etc/rc.d/samba.sh stop
[terminus.stuyts.nl etc/rc.d]91: !vm
vmstat -m | grep sem
  sem 156524  2453K   2453K   156524  16,1024,4096

It doesn't free the sem allocated memory.

But without knowing what software you are running, it's hard to say
if the number is unreasonable, or not.


Well, it is really a lightly loaded server, just serving one windows pc 
here at home. Here is a ps, and the only thing that's missing from it is 
the occasional pop session. Also note that this system is not connected to 
the internet, so the http that's running is mostly for my own pleasure (and 
proxy/cache). I do run ppp and uucp every now and then.

USERPID %CPU %MEM   VSZ  RSS  TT  STAT STARTED  TIME COMMAND
dnetc   503 94.2  0.8   960  460  ??  RNs  Thu09PM 1529:56.87 
/usr/local/distributed.net/dnetc -quiet
root 10  0.0  0.0 0   12  ??  DL1Jan70   0:00.00  (ktrace)
root  1  0.0  0.3   700  184  ??  ILs   1Jan70   0:01.26 /sbin/init --
root 11  0.0  0.0 0   12  ??  RL1Jan70   1:24.27  (idle)
root 12  0.0  0.0 0   12  ??  WL1Jan70   1:01.85  (swi1: net)
root 13  0.0  0.0 0   12  ??  WL1Jan70   7:49.87  (swi6: 
tty:sio clock)
root 15  0.0  0.0 0   12  ??  DL1Jan70   0:17.51  (random)
root 18  0.0  0.0 0   12  ??  WL1Jan70   0:35.60  (swi3: cambio)
root 23  0.0  0.0 0   12  ??  DL1Jan70   0:33.97  (usb0)
root 24  0.0  0.0 0   12  ??  DL1Jan70   0:00.00  (usbtask)
root 25  0.0  0.0 0   12  ??  WL1Jan70   0:15.98  (irq12: sym0)
root 26  0.0  0.0 0   12  ??  WL1Jan70   0:33.34  (irq9: xl0)
root 27  0.0  0.0 0   12  ??  WL1Jan70   0:00.04  (irq1: atkbd0)
root 28  0.0  0.0 0   12  ??  WL1Jan70   0:00.00  (irq6: fdc0)
root 30  0.0  0.0 0   12  ??  WL1Jan70   0:00.25  (swi0: tty:sio)
root  2  0.0  0.0 0   12  ??  DL1Jan70   0:51.73  (pagedaemon)
root  3  0.0  0.0 0   12  ??  DL1Jan70   0:00.42  (vmdaemon)
root  4  0.0  0.0 0   12  ??  RL1Jan70   0:01.95  (pagezero)
root  5  0.0  0.0 0   12  ??  DL1Jan70   0:05.29  (bufdaemon)
root  6  0.0  0.0 0   12  ??  DL1Jan70   1:26.74  (syncer)
root  7  0.0  0.0 0   12  ??  DL1Jan70   0:04.12  (vnlru)
root123  0.0  0.0   2208  ??  IWs  - 0:00.00 adjkerntz -i
root194  0.0  0.4   628  244  ??  Is   Thu09PM   0:09.18 /sbin/natd 
-dynamic -log -n tun0
root241  0.0  0.7  1180  420  ??  Ss   Thu09PM   0:04.76 
/usr/sbin/syslogd -s -v
root255  0.0  2.6  2856 1580  ??  Is   Thu09PM   0:23.02 
/usr/sbin/named -d 1
root263  0.0  0.0  1332   12  ??  Is   Thu09PM   0:00.06 /usr/sbin/rpcbind
root340  0.0  0.0  1204   12  ??  Is   Thu09PM   0:00.03 
/usr/sbin/mountd -r
root342  0.0  0.0  1164   12  ??  Is   Thu09PM   0:00.30 nfsd: master 
(nfsd)
root343  0.0  0.0  11168  ??  IW   - 0:00.00 nfsd: server 
(nfsd)
root344  0.0  0.0  11168  ??  IW   - 0:00.00 nfsd: server 
(nfsd)
root345  0.0  0.0  11168  ??  IW   - 0:00.00 nfsd: server 
(nfsd)
root347  0.0  0.0  11168  ??  IW   - 0:00.00 nfsd: server 
(nfsd)
root376  0.0  0.0  1188   12  ??  Is   Thu09PM   0:00.05 /usr/sbin/lpd
root380  0.0  0.3  1188  168  ??  SThu09PM   0:02.57 /usr/sbin/lpd
root396  0.0  1.3  1552  804  ??  Ss   Thu09PM   0:26.59 /usr/sbin/ntpd 
-p /var/run/ntpd.pid
root418  0.0  0.1  1132   64  ??  Is   Thu09PM   0:00.97 /usr/sbin/usbd
root437  0.0  1.4  3036  820  ??  Ss   Thu09PM   0:19.39 sendmail: 
accepting connections (sendmail)
smmsp   440  0.0  0.9  3012  528  ??  Is   Thu09PM   0:00.38 sendmail: 
Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail)
root467  0.0  1.5  2332  908  ??  Ss   Thu09PM   0:25.86 
/usr/local/sbin/httpd
www 485  0.0  0.0  2356   12  ??  IThu09PM   0:00.01 

Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-18 Thread Terry Lambert
Ben Stuyts wrote:
 Almost 5.3M of unswappable physical memory dedicated to semaphores
 seems like a bit much.
 
 Yes, and it increases continuously, for example when I fetch new mail (over
 pop) from my windows pc. The pc stores this again on a network drive, so
 both qpopper and smbd are involved. For example, vmstat -m says:
 
 vmstat -m | grep sem
sem 155886  2443K   2443K   155886  16,1024,4096
 
 Now when I do a fetch-mail with Eudora on my pc, the same command says.
 
 vmstat -m | grep sem
sem 156178  2448K   2448K   156178  16,1024,4096
 
 I can repeat this at will, and each time I lose 4-5 KB. qpopper is started
 from inetd, and smbd runs as a daemon. I tried stopping smbd:

None of us have been able to repeat your problem, up to now.  I
suppose now that we know you are running qpopper on -current,
we could repeat the problem, but, frankly, you already have a
test environment set up, and it would be a lot of work for us
to duplicate it, and even so, we won't know for sure if we
could repeat the problem.

Have you checked out your source tree with a date tag, so that
it's possible for everyone else to check out and get the same
source files?  Line number references in tracebacks are pretty
useless, if the lines don't match.


Unless you can identify the exact number of bytes being consumed,
and then identify a kernel structure used in the semaphore code
that is equal to that size, or of which that size is a multiple,
with a number of events equal to the quotient, then those numbers
alone are no good.
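
To illustrate the kind of size matching described here, a small sketch
that divides the MemUse column by the InUse column of a vmstat -m line.
avg_alloc_bytes() is a hypothetical helper; the inputs one would feed it
are the sem figures posted earlier in this thread, and since MemUse is
rounded to whole K the result is only approximate.

```c
#include <stdio.h>

/*
 * Average size in bytes of one live allocation, computed from the
 * InUse and MemUse (in K) columns of a "vmstat -m" line.
 */
static double
avg_alloc_bytes(unsigned long inuse, unsigned long memuse_kib)
{
    if (inuse == 0)
        return (0.0);
    return ((double)memuse_kib * 1024.0 / (double)inuse);
}

/* Print the average for one sample, labelled by where it came from. */
static void
report_sample(const char *label, unsigned long inuse, unsigned long kib)
{
    printf("%-12s %8lu allocations, %6luK, %.2f bytes each\n",
        label, inuse, kib, avg_alloc_bytes(inuse, kib));
}
```

For the posted samples (155886 in use at 2443K, 344456 in use at 5390K)
the average comes out just over 16 bytes, i.e. the smallest bucket in
the Size(s) column.  That points at many small allocations being leaked
rather than a few large ones, but it does not by itself say which
structure is involved.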

This is why everyone keeps asking you to run the kernel debugger,
so that you can tell us exactly the code that's failing, and why,
and why a stack backtrace more detailed than "it contained a call
to sem" is important.

This problem is evidently a memory leak in the semaphore code;
but that does not mean that the crash that results will be in
any way related to where the leak occurs.

In other words, the crash is a secondary effect.


Only by fully understanding the crash will anyone be able to help
you with the root cause.

I understand that it's frustrating to go step by step, when you
think you have isolated the problem to a smaller area, but the
information you gather from outside that area will tell you about
the inside much more clearly than staring at the outside of a
black box where we know the problem lives.

The only alternative to rewriting the black box from scratch, or
grovelling through it with a line-by-line code review (I'm not
interested in doing that; perhaps you could interest the author
of the changes that resulted in the problem) is to find a smoking
gun, and work from that, instead.

If this problem is in the way of you getting work done (one wonders
why you are using -current, if you need to get work done), then my
best suggestion to you is to back out the changes Alfred made, one
by one, and when it stops having the problem, you will have identified
a very small patch that causes the problem.


 But without knowing what software you are running, it's hard to say
 if the number is unreasonable, or not.
 
 Well, it is really a lightly loaded server, just serving one windows pc
 here at home. Here is a ps, and the only thing that's missing from it is
 the occasional pop session. Also note that this system is not connected to
 the internet, so the http that's running is mostly for my own pleasure (and
 proxy/cache). I do run ppp and uucp every now and then.

Perhaps I wasn't clear.

Not knowing what calls your software makes that cause the problem
to occur, it is not possible for us to create a cut-down test case
in less than 30 lines of C source code, so that we can repeat the
problem at will, without secondary effects.  As it is, you only
*suppose* that the qpopper usage alone is sufficient to cause the
problem; even if you are correct, that's insufficient to identify
where the problem is... it may not even really be in the semaphore
source code at all.. maybe it's in kevent code, for unfreed events,
etc..

I think you need to go back one email:

|  Just had another panic, same kmem_malloc(). I did a trace but forgot to
|  write the traceback down.
| 
| Wait until the next one, and remember to write it down; preferably,
| obtain a system dump image, so you can examine it with the debugger,
| and make sure that the kernel you are running has a debuggable
| counterpart already there (i.e. you used config -g to create the
| kernel you are running).

-- Terry




Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-18 Thread Jake Burkholder
Apparently, On Sat, Oct 19, 2002 at 12:19:57AM +0200,
Ben Stuyts said words to the effect of;

 Terry,
 
 At 23:07 18/10/2002, you wrote:
 Ben Stuyts wrote:
   Furthermore, this might be interesting: the last vmstat -m log
   before the panic. Maybe someone can check if these values are reasonable?
   The system has 64 MB memory and has been up for about 24 hrs with almost no
   load.
  sem 344456  5390K   5390K   344456  16,1024,4096
 
 Almost 5.3M of unswappable physical memory dedicated to semaphores
 seems like a bit much.
 
 Yes, and it increases continuously, for example when I fetch new mail (over 
 pop) from my windows pc. The pc stores this again on a network drive, so 
 both qpopper and smbd are involved. For example, vmstat -m says:
 

semop() leaks memory.  An important free() was removed by alfred in
rev 1.55.  Try this.

Jake

Index: sysv_sem.c
===
RCS file: /home/ncvs/src/sys/kern/sysv_sem.c,v
retrieving revision 1.55
diff -u -r1.55 sysv_sem.c
--- sysv_sem.c  13 Aug 2002 08:47:17 -  1.55
+++ sysv_sem.c  19 Oct 2002 01:20:35 -
@@ -1128,6 +1128,8 @@
td->td_retval[0] = 0;
 done2:
mtx_unlock(sema_mtxp);
+   if (sops)
+   free(sops, M_SEM);
return (error);
 }
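
The shape of the bug is easier to see in a stripped-down user-space
analogue.  This is only a sketch and not the sysv_sem.c code: do_ops()
and live_bufs are hypothetical names, and malloc()/free() stand in for
the kernel malloc(9)/free(9) calls with M_SEM.  The point is that every
exit path funnels through the one label where the buffer is released.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter standing in for the InUse column of vmstat -m. */
static long live_bufs;

/* Hypothetical stand-in for a call that copies in nops records. */
static int
do_ops(const short *uops, size_t nops)
{
    short *ops = NULL;
    int error = 0;

    if (nops == 0) {
        error = EINVAL;
        goto done;      /* ops is still NULL on this path */
    }
    ops = malloc(nops * sizeof(*ops));
    if (ops == NULL) {
        error = ENOMEM;
        goto done;
    }
    live_bufs++;
    memcpy(ops, uops, nops * sizeof(*ops));
    if (ops[0] < 0) {
        error = ERANGE;
        goto done;      /* early exit must still reach the free */
    }
    /* ... act on ops ... */
done:
    if (ops != NULL) {  /* guard only keeps the counter honest */
        free(ops);
        live_bufs--;
    }
    return (error);
}
```

Dropping the free() at the done label is harmless to callers, since the
return values stay the same, which is why a leak like this only shows up
as a slowly growing counter.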
 




Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-18 Thread Alfred Perlstein
* Jake Burkholder [EMAIL PROTECTED] [021018 18:26] wrote:
 semop() leaks memory.  An important free() was removed by alfred in
 rev 1.55.  Try this.

Oh c'mon, isn't MP-safeness a bit more important than some
little memory leak? ram is cheap! processors aren't!

Seriously, I just checked in a slightly different fix (based on jake's
sleuthing)...  please let me know if it works for you guys.

thanks,
-Alfred




Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-17 Thread Ben Stuyts
Hello Alfred,

On Wed, Oct 16, 2002 at 02:26:19PM -0700, Alfred Perlstein wrote:
 * Ben Stuyts [EMAIL PROTECTED] [021016 14:05] wrote:
  
  No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:
  
 sem 167344  2622K   2622K   167344  16,1024,4096
  ---
 sem 235512  3687K   3687K   235512  16,1024,4096
  
  So it looks indeed like sem is the problem,
 
 what does
 sysctl -a | grep ^p10
 say?

p1003_1b.asynchronous_io: 0
p1003_1b.mapped_files: 1
p1003_1b.memlock: 0
p1003_1b.memlock_range: 0
p1003_1b.memory_protection: 0
p1003_1b.message_passing: 0
p1003_1b.prioritized_io: 0
p1003_1b.priority_scheduling: 1
p1003_1b.realtime_signals: 0
p1003_1b.semaphores: 0
p1003_1b.fsync: 0
p1003_1b.shared_memory_objects: 1
p1003_1b.synchronized_io: 0
p1003_1b.timers: 0
p1003_1b.aio_listio_max: 0
p1003_1b.aio_max: 0
p1003_1b.aio_prio_delta_max: 0
p1003_1b.delaytimer_max: 0
p1003_1b.mq_open_max: 0
p1003_1b.pagesize: 4096
p1003_1b.rtsig_max: 0
p1003_1b.sem_nsems_max: 0
p1003_1b.sem_value_max: 0
p1003_1b.sigqueue_max: 0
p1003_1b.timer_max: 0

 My guess is that you don't have the module in question loaded.
 
 If you do, then why?  (it's marked experimental)

The only modules loaded are:

[terminus.stuyts.nl boot/kernel]21: kldstat
Id Refs AddressSize Name
 13 0xc010 3daa00   kernel
 21 0xc12fd000 2000 green_saver.ko

 And why aren't these bug reports a lot more detailed? (meaning why
 aren't you actually giving a hypothesis as to why the code is
 broken?)

I think it was Jeff Roberson hinting at that. I am only reporting a problem
and I hope I can help fixing it. I have however no knowledge of the kernel
internals, so forgive me for being too vague and let me know what more
information you need.

 *grumble*

Sorry...

Kind regards,
Ben




Re: [Ugly PATCH] Again: panic kmem_malloc(): dmesg and kernel config

2002-10-17 Thread Ben Stuyts
Some info I did not include in the previous messages:
dmesg output and kernel config.

[terminus.stuyts.nl boot/kernel]26: dmesg
Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #5: Sun Oct  6 01:50:54 CEST 2002
[EMAIL PROTECTED]:/var/obj/usr/src/sys/TERMINUS
Preloaded elf kernel /boot/kernel/kernel at 0xc04dc000.
Timecounter i8254  frequency 1193182 Hz
Timecounter TSC  frequency 233864671 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (233.86-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x634  Stepping = 4
  Features=0x80f9ffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,MMX
real memory  = 67108864 (65536K bytes)
avail memory = 59920384 (58516K bytes)
Pentium Pro MTRR support enabled
npx0: math processor on motherboard
npx0: INT 16 interface
Using $PIR table, 6 entries at 0xc00fdc00
pcib0: Intel 82443LX (440 LX) host to PCI bridge at pcibus 0 on motherboard
pci0: PCI bus on pcib0
pcib1: PCIBIOS PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
isab0: PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel PIIX4 ATA33 controller port 0xf000-0xf00f at device 7.1 on pci0
atapci0: Busmastering DMA not supported
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0x6400-0x641f irq 11 at device 
7.2 on pci0
usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
pci0: bridge, PCI-unknown at device 7.3 (no driver attached)
sym0: 875 port 0x6800-0x68ff mem 0xe800-0xe8000fff,0xe8001000-0xe80010ff irq 12 
at device 11.0 on pci0
sym0: Symbios NVRAM, ID 7, Fast-20, SE, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
xl0: 3Com 3c905-TX Fast Etherlink XL port 0x6c00-0x6c3f irq 9 at device 13.0 on pci0
/usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from 
/usr/src/sys/pci/if_xl.c:1264
/usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from 
/usr/src/sys/pci/if_xl.c:1264
lock order reversal
 1st 0xc0ba1bd4 xl0 (network driver) @ /usr/src/sys/pci/if_xl.c:1264
 2nd 0xc03d2b00 allproc (allproc) @ /usr/src/sys/kern/kern_fork.c:318
/usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from 
/usr/src/sys/pci/if_xl.c:1264
/usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from 
/usr/src/sys/pci/if_xl.c:1264
/usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from 
/usr/src/sys/pci/if_xl.c:1264
xl0: Ethernet address: 00:60:08:a5:d4:ff
miibus0: MII bus on xl0
nsphy0: DP83840 10/100 media interface on miibus0
nsphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
/usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from 
/usr/src/sys/pci/if_xl.c:647
pci0: display, VGA at device 15.0 (no driver attached)
orm0: Option ROMs at iomem 0xc8000-0xcbfff,0xc-0xc7fff on isa0
atkbdc0: Keyboard controller (i8042) at port 0x64,0x60 on isa0
atkbd0: AT Keyboard irq 1 on atkbdc0
kbd0 at atkbd0
fdc0: enhanced floppy controller (i82077, NE72065 or clone) at port 
0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5 drive on fdc0 drive 0
ppc0: Parallel port at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
ppbus0: IEEE1284 device found /NIBBLE
Probing for PnP devices on ppbus0:
ppbus0: EPSON Stylus Photo EX PRINTER ESCPL2,BDC
lpt0: Printer on ppbus0
lpt0: Interrupt-driven port
sc0: System console on isa0
sc0: VGA 16 virtual consoles, flags=0x200
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown: PNP0303 can't assign resources (port)
unknown: PNP0501 can't assign resources (port)
unknown: PNP0700 can't assign resources (port)
unknown: PNP0400 can't assign resources (port)
unknown: PNP0501 can't assign resources (port)
Timecounters tick every 10.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, 
logging unlimited
Waiting 5 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
Mounting root from ufs:/dev/da0s1a
da2 at sym0 bus 0 target 3 lun 0
da2: QUANTUM FIREBALL_TM3200S 300N Fixed Direct Access SCSI-2 device 
da2: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled
da2: 3067MB (6281856 512 byte sectors: 255H 63S/T 391C)
da1 at sym0 bus 0 target 2 lun 0
da1: QUANTUM FIREBALL ST3.2S 0F0C Fixed Direct Access SCSI-2 device 
da1: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled
da1: 3090MB (6328861 512 byte sectors: 255H 63S/T 393C)
da0 at sym0 bus 0 target 1 lun 0
da0: IBM DCAS-34330W S61A 

Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-16 Thread Jeff Roberson


On Wed, 16 Oct 2002, Ben Stuyts wrote:

 I just got the same panic without your patch. (I wanted to verify that it
 was still panic-ing with the latest src tree.) I am now building a kernel
 with your patch.

 I'll also run your vmstat script that you posted in a similar thread. One
 of the big memory users seems to be sem, and it's growing. Almost every
 time I do a vmstat -m, sem usage has grown a few k.


[snip]
sem 167320  2622K   2622K   167320  16,1024,4096
[snip]

Thank you for looking into this.  It definitely looks like a memory leak.
I forwarded this to alfred.  He was just working on semaphores so he may
know something about it.


 I'll see what the stats are tomorrow.

 Kind regards,
 Ben

Much appreciated.

Cheers,
Jeff





Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-16 Thread Ben Stuyts

At 21:20 11/10/2002, Terry Lambert wrote:
Please find a (relatively bogus) patch attached, which could cause
things to block for a long time, but will avoid the panic.

Terry,

I just got the same panic without your patch. (I wanted to verify that it 
was still panic-ing with the latest src tree.) I am now building a kernel 
with your patch.

I'll also run your vmstat script that you posted in a similar thread. One 
of the big memory users seems to be sem, and it's growing. Almost every 
time I do a vmstat -m, sem usage has grown a few k.

         Type  InUse MemUse HighUse Requests  Size(s)
     atkbddev      2     1K      1K        2  32
   pfs_fileno      1    32K     32K        1  32768
     nexusdev      2     1K      1K        2  16
      memdesc      1     4K      4K        1  4096
    legacydrv      3     1K      1K        3  16
    VM pgdata      1     4K      4K        1  4096
    pfs_nodes     20     3K      3K       20  128
MSDOSFS mount      1     8K      8K        1  8192
    UFS mount     12    23K     39K       14  256,2048,4096,16384
    UFS ihash      1    16K     16K        1  16384
  UFS dirhash     57    11K     11K      117  16,32,64,128,512
     FFS node   4976   933K    936K    40528  128,256
       dirrem      0     0K     31K     5522  32
        mkdir      0     0K      3K      520  32
       diradd     14     1K      7K     3118  32
     freefile      0     0K     26K     4839  32
     freeblks      1     1K    186K     3820  256
     freefrag      6     1K      1K     2494  32
   allocindir     10     1K     86K     8596  64
     indirdep      2     1K    876K      577  32,8192
  allocdirect     23     3K     16K     8457  128
    bmsafemap      3     1K      3K      365  32
       newblk      1     1K      1K    17054  64,256
     inodedep     16    18K    168K     9570  128,16384
      pagedep      2     3K      7K      874  64,2048
     p1003.1b      1     1K      1K        1  16
   NFS daemon      5     3K      3K        5  256,512
  NFS srvsock      2     1K      1K        2  128
 ip6_moptions      1     1K      1K        1  16
    in6_multi     10     1K      1K       10  16,64
     syncache      1     8K      8K        1  8192
  IpFw/IpAcct     30     4K      4K       30  64,128
     in_multi      2     1K      1K        2  32
     routetbl     41     6K      6K       76  16,32,64,128,256
           lo      1     1K      1K        1  512
        clone      3    12K     12K        3  4096
  ether_multi     35     2K      2K       35  16,32,64
       ifaddr     22     7K      7K       22  32,256,512,2048
          BPF      6     9K      9K        6  128,256,4096
        mount     20     4K      4K       24  16,32,128,512
       vnodes     23     6K      6K      137  16,32,64,128,256
cluster_save buffer      0     0K      1K     1183  32,64
     vfscache   2634   197K    198K    32833  64,128,256,32768
   BIO buffer     21    26K    205K     1130  512,1024,2048
        DEVFS    121    22K     22K      121  16,32,128,8192
          pcb     38     5K      5K       58  16,32,64,2048
       soname      4     1K      1K     1415  16,32,128
         ptys      2     1K      1K        2  512
         ttys    614    81K     81K     1121  128,512
          shm      3    18K     19K        8  16,1024,16384
          sem 167320  2622K   2622K   167320  16,1024,4096
          msg      4    25K     25K        4  512,4096,16384
     ioctlops      0     0K      1K       22  512,1024
       USBdev      1     1K      2K        4  128,512
          USB     15    21K     22K    15345  16,32,128,256,4096
    taskqueue      1     1K      1K        1  128
         sbuf      0     0K      5K        2  32,4096
         rman     99     7K      7K      496  16,64,128
      mbufmgr    106    15K     15K      106  32,64,128,2048,8192
         kobj    127   508K    508K      127  4096
 eventhandler     22     2K      2K       22  32,128
          bus    470    39K     40K     1363  16,32,64,128,256,512,2048,4096,8192
         SWAP      2    73K     73K        2  64
    sysctltmp      0     0K      4K     8856  16,32,64,128,256,512,1024,4096
       sysctl      0     0K      1K      386  16,32,64
      uidinfo      7     1K      1K      525  32,128
         cred     34     5K      5K    18178  128
      subproc    114    11K     14K    10613  64,256
         proc      2     1K      1K        2  512
      session     33     5K      5K       68  128
         pgrp     40     5K      6K      117  128
       module    171    11K     11K      171  64
       ip6ndp      3     1K      1K        4  64,128,512
         temp     11    54K     55K    19887  16,32,64,128,256,512,1024,2048,4096,8192,16384,32768
       devbuf    473   964K    997K     2268  16,32,64,128,256,512,1024,2048,4096,8192,32768
        lockf      6     1K      1K      549  64
       feeder     48     1K      1K       48  16
       linker     65    13K     18K       85  16,32,256,1024,4096,8192
       KTRACE    100    13K     13K      100  128
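
One rough way to read a listing like the one above is to look for types
whose InUse count equals their Requests count at a large scale, which
means nothing allocated under that type has been freed so far. The Python
sketch below is only an illustration and is not from the thread: the
names parse_row and never_freed and the 10000 cutoff are assumptions, and
the sample rows are copied from the listing with the columns separated by
whitespace.

```python
from typing import NamedTuple


class MallocType(NamedTuple):
    name: str
    in_use: int        # allocations currently outstanding
    mem_use_kb: int    # current memory use, in KB
    high_use_kb: int   # peak memory use, in KB
    requests: int      # total allocation requests so far


def parse_row(line: str) -> MallocType:
    """Parse one whitespace-separated `vmstat -m` row.

    Splitting from the right keeps type names that contain spaces
    (for example "UFS mount") in one piece.
    """
    name, in_use, mem, high, requests, _sizes = line.rsplit(None, 5)
    return MallocType(
        name.strip(),
        int(in_use),
        int(mem.rstrip("K")),
        int(high.rstrip("K")),
        int(requests),
    )


def never_freed(rows, min_requests=10000):
    """Return names of types where every request is still in use.

    min_requests is a hypothetical cutoff that skips small, static
    allocations such as kobj (127 of 127), which are not leaks.
    """
    return [
        r.name for r in rows
        if r.in_use == r.requests and r.requests >= min_requests
    ]


# Sample rows taken from the listing in this message.
SAMPLE = """\
    UFS mount     12    23K     39K       14  256,2048,4096,16384
     FFS node   4976   933K    936K    40528  128,256
          sem 167320  2622K   2622K   167320  16,1024,4096
         kobj    127   508K    508K      127  4096
         cred     34     5K      5K    18178  128
"""

rows = [parse_row(line) for line in SAMPLE.splitlines()]
print(never_freed(rows))  # prints ['sem']
```

This is a heuristic only: a type can leak while still freeing some of its
allocations, so a growing InUse across repeated samples is the stronger
signal.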
 

Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-16 Thread Ben Stuyts

At 22:00 16/10/2002, Jeff Roberson wrote:

On Wed, 16 Oct 2002, Ben Stuyts wrote:

  I'll also run your vmstat script that you posted in a similar thread. One
  of the big memory users seems to be sem, and it's growing. Almost every
  time I do a vmstat -m, sem usage has grown a few k.
 

[snip]
 sem 167320  2622K   2622K   167320  16,1024,4096
[snip]

Thank you for looking into this.  It definitely looks like a memory leak.
I forwarded this to alfred.  He was just working on semaphores so he may
know something about it.

 
  I'll see what the stats are tomorrow.
 
Much appreciated.

No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:

sem 167344  2622K   2622K   167344  16,1024,4096
---
sem 235512  3687K   3687K   235512  16,1024,4096

So it does indeed look like sem is the problem.

Kind regards,
Ben
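
The two samples above are enough for a quick back-of-the-envelope check.
The Python sketch below is only illustrative and is not part of the
original report; the function name leak_rate is made up here, and the
1.5 hour interval is the figure quoted in the message.

```python
def leak_rate(in_use_before, in_use_after, mem_kb_before, mem_kb_after, hours):
    """Summarize the growth between two `vmstat -m` samples of one type."""
    new_allocs = in_use_after - in_use_before
    new_kb = mem_kb_after - mem_kb_before
    return {
        "new_allocs": new_allocs,
        "new_kb": new_kb,
        # Average size of each allocation that was never released.
        "bytes_per_alloc": new_kb * 1024 / new_allocs,
        "allocs_per_hour": new_allocs / hours,
    }


# Figures for "sem" from the two samples, taken 1.5 hours apart.
stats = leak_rate(167344, 235512, 2622, 3687, hours=1.5)
print(stats["new_allocs"])               # 68168
print(stats["new_kb"])                   # 1065
print(round(stats["bytes_per_alloc"]))   # 16
print(round(stats["allocs_per_hour"]))   # 45445
```

An average of about 16 bytes per outstanding allocation matches the
smallest bucket in the Size(s) column (16). That suggests, without
proving it, that the growth comes from one small allocation repeated many
times.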





Re: [Ugly PATCH] Again: panic kmem_malloc()

2002-10-16 Thread Alfred Perlstein

* Ben Stuyts [EMAIL PROTECTED] [021016 14:05] wrote:
 At 22:00 16/10/2002, Jeff Roberson wrote:
 
 On Wed, 16 Oct 2002, Ben Stuyts wrote:
 
  I'll also run your vmstat script that you posted in a similar thread. One
  of the big memory users seems to be sem, and it's growing. Almost every
  time I do a vmstat -m, sem usage has grown a few k.
 
 
 [snip]
 sem 167320  2622K   2622K   167320  16,1024,4096
 [snip]
 
 Thank you for looking into this.  It definitely looks like a memory leak.
 I forwarded this to alfred.  He was just working on semaphores so he may
 know something about it.
 
 
  I'll see what the stats are tomorrow.
 
 Much appreciated.
 
 No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:
 
 sem 167344  2622K   2622K   167344  16,1024,4096
 ---
 sem 235512  3687K   3687K   235512  16,1024,4096
 
 So it does indeed look like sem is the problem.
 
 Kind regards,
 Ben

What does

sysctl -a | grep ^p10

say?

My guess is that you don't have the module in question loaded.

If you do, then why?  (It's marked experimental.)

And why aren't these bug reports a lot more detailed? (Meaning, why
aren't you actually giving a hypothesis as to why the code is
broken?)

*grumble*

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology,
 start asking why software is ignoring 30 years of accumulated wisdom.'
