Re: [Ugly PATCH] Again: panic kmem_malloc()
In message: <[EMAIL PROTECTED]>
            Terry Lambert <[EMAIL PROTECTED]> writes:
: "M. Warner Losh" wrote:
: > : > : +       if (sops)
: > : > : +               free(sops, M_SEM);
: > : >
: > : > The kernel free() groks free(NULL, M_FOO), so the if isn't needed.
: > :
: > : Wow. That's bogus. It should panic.
: >
: > It isn't bogus. free(NULL) is defined to be OK in ansi-c. The kernel
: > just mirrors that.
:
: The free(NULL) in ANSI C is to permit invocation of the garbage
: collector; there are very specific semantics involved. Specifically,
: if you do not call free(NULL), you are *guaranteed* that a malloc()
: followed by a free() followed by a subsequent malloc(), if the size
: of the area allocated by the subsequent malloc() is less than or
: equal to the size of the area freed, *will not fail*.

C99 just says:

section 7.20.3.2:

    [#2] The free function causes the space pointed to by ptr to be
    deallocated, that is, made available for further allocation. If
    ptr is a null pointer, no action occurs. Otherwise, if the
    argument does not match a pointer earlier returned by the
    calloc, malloc, or realloc function, or if the space has been
    deallocated by a call to free or realloc, the behavior is
    undefined.

"If ptr is a null pointer, no action occurs" doesn't sound like GC to
me. Like I said, free(NULL) is well defined, and unambiguous, in ANSI
C99. In fact, I see nothing in the final C99 spec that even comes
close to what you are talking about. Maybe C89 did that (I'm too lazy
to walk downstairs and find it), but it too is irrelevant as the base
system moves towards C99 compliance.

The rest is bogus too. free(NULL, M_FOO) is well defined in the kernel
and does the right thing and likely isn't going to change.

Warner

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
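For readers following the free(NULL) argument, here is a minimal user-space sketch of the C99 rule quoted above. It only illustrates the standard one-argument free(); the kernel's two-argument free(ptr, M_FOO) is not modelled. The names struct op, alloc_ops, release_ops and demo_free_null are made up for this example and do not come from the thread or from FreeBSD.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a small operations array; not a kernel type. */
struct op {
    short num;
    short val;
};

/* Allocate n zeroed entries; returns NULL when n is 0 or allocation fails. */
struct op *alloc_ops(size_t n)
{
    if (n == 0)
        return NULL;
    return calloc(n, sizeof(struct op));
}

/*
 * Release a buffer that may or may not have been allocated.
 * No "if (ops)" guard is needed: C99 7.20.3.2 says that free() of a
 * null pointer performs no action.  Returns 0 for convenience.
 */
int release_ops(struct op *ops)
{
    free(ops);
    return 0;
}

/* Small self-check exercising both the NULL and the non-NULL path. */
int demo_free_null(void)
{
    struct op *none = alloc_ops(0);   /* NULL: nothing was allocated */
    struct op *some = alloc_ops(4);   /* normally a real buffer */

    assert(none == NULL);
    release_ops(none);                /* no action occurs */
    release_ops(some);                /* ordinary deallocation (or no-op) */
    return 0;
}
```

Whether silently accepting NULL is good kernel policy is exactly what the thread goes on to dispute; the sketch only shows what the standard wording permits in user space.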
Re: [Ugly PATCH] Again: panic kmem_malloc()
"M. Warner Losh" wrote:
> : > : +       if (sops)
> : > : +               free(sops, M_SEM);
> : >
> : > The kernel free() groks free(NULL, M_FOO), so the if isn't needed.
> :
> : Wow. That's bogus. It should panic.
>
> It isn't bogus. free(NULL) is defined to be OK in ansi-c. The kernel
> just mirrors that.

The free(NULL) in ANSI C is to permit invocation of the garbage
collector; there are very specific semantics involved. Specifically,
if you do not call free(NULL), you are *guaranteed* that a malloc()
followed by a free() followed by a subsequent malloc(), if the size
of the area allocated by the subsequent malloc() is less than or
equal to the size of the area freed, *will not fail*.

Of course, FreeBSD is a memory overcommit system, and fails to
maintain this guarantee, as required by the standard (e.g. only do
garbage collection when it is signalled that it is OK for a
subsequent re-malloc() to fail, because the GC'ed memory has been
released to the system).

This is OK; we all realize that the standard, which permits a NULL
argument to free(), allows this value for reasons of compatibility
with historical source code.

But that begs the question: does the kernel interface also allow it
for the purposes of compatibility with legacy code? This seems
unlikely in the extreme.

Does the kernel interface use this as a trigger, as the user space
interface historically did, to perform garbage collection? This also
seems unlikely.

Does it do it so that people can write code that doesn't check return
values, and get away with it when they shouldn't? This seems highly
likely.

> : Or we should fix all of libc to take NULL arguments for strings,
> : and treat them as if they were actually "".
>
> That's bogus.

I agree that it's bogus, but it's the same argument in user space as
in kernel space.

Actually, it's not the same: the kernel argument is much poorer, not
having legacy code it needs to support.

In user space, there is plenty of legacy code that acts this way; in
fact, one could trap a zero dereference (one does; one just faults on
it, currently), map a page full of zeros at page zero, and then a
dereference would in fact be giving a pointer to a NULL string.

SVR4 does this, as a kernel option for compatibility with legacy
software. It is tunable to be able to turn it off, and you can not
only run legacy software which will not run in FreeBSD ABI
compatibility (hardly compatible, that...), but you can know from the
memory map of the process, as examined through /proc, that a NULL
dereference has occurred.

So it should arguably be controllable via sysctl, minimally for IBCS2
and similar ABI modules, for user space.

But it's still unjustified in kernel space. And panicking on attempts
to free NULL pointers would be a nice way of avoiding cascade
failures later on, and keep the problem from being hidden a long ways
away from its effect.

-- Terry
Re: [Ugly PATCH] Again: panic kmem_malloc(): SOLVED
* Ben Stuyts <[EMAIL PROTECTED]> [021019 07:16] wrote:
> At 13:34 19/10/2002, Ben Stuyts wrote:
> >At 04:15 19/10/2002, Alfred Perlstein wrote:
> >>* Jake Burkholder <[EMAIL PROTECTED]> [021018 18:26] wrote:
> >>> semop() leaks memory. An important free() was removed by alfred in
> >>> rev 1.55. Try this.
> >>
> >>Seriously, I just checked in a slightly different fix (based on jake's
> >>sleuthing)... please let me know if it works for you guys.
> >
> >Thanks Alfred, I am building the kernel right now with your fix. Later
> >today I will let you know if this fixes the problems I've been seeing.
>
> Looks good, the machine has been running for a couple of hours now, and
> vmstat -m says:
>
> sem 4 7K 8K 3556 16,1024,4096
>
> Before it used to be 2 - 5 MB allocated, and qpopper/smbd no longer eat
> memory each time they are invoked.
>
> Many thanks!

Great! Thanks for the bug report and my apologies for jumping down
your throat initially.

Best of luck to you.

-Alfred
Re: [Ugly PATCH] Again: panic kmem_malloc(): SOLVED
At 13:34 19/10/2002, Ben Stuyts wrote:
>At 04:15 19/10/2002, Alfred Perlstein wrote:
>>* Jake Burkholder <[EMAIL PROTECTED]> [021018 18:26] wrote:
>>> semop() leaks memory. An important free() was removed by alfred in
>>> rev 1.55. Try this.
>>
>>Seriously, I just checked in a slightly different fix (based on jake's
>>sleuthing)... please let me know if it works for you guys.
>
>Thanks Alfred, I am building the kernel right now with your fix. Later
>today I will let you know if this fixes the problems I've been seeing.

Looks good, the machine has been running for a couple of hours now, and
vmstat -m says:

sem 4 7K 8K 3556 16,1024,4096

Before it used to be 2 - 5 MB allocated, and qpopper/smbd no longer eat
memory each time they are invoked.

Many thanks!

Ben
Re: [Ugly PATCH] Again: panic kmem_malloc()
At 04:15 19/10/2002, Alfred Perlstein wrote:
>* Jake Burkholder <[EMAIL PROTECTED]> [021018 18:26] wrote:
>> semop() leaks memory. An important free() was removed by alfred in
>> rev 1.55. Try this.
>
>Oh c'mon, isn't MP-safeness a bit more important than some little
>memory leak? RAM is cheap! Processors aren't!
>
>Seriously, I just checked in a slightly different fix (based on jake's
>sleuthing)... please let me know if it works for you guys.

Thanks Alfred, I am building the kernel right now with your fix. Later
today I will let you know if this fixes the problems I've been seeing.

Kind regards,
Ben
Re: [Ugly PATCH] Again: panic kmem_malloc()
* Jake Burkholder <[EMAIL PROTECTED]> [021018 18:26] wrote:
> semop() leaks memory. An important free() was removed by alfred in
> rev 1.55. Try this.

Oh c'mon, isn't MP-safeness a bit more important than some little
memory leak? RAM is cheap! Processors aren't!

Seriously, I just checked in a slightly different fix (based on jake's
sleuthing)... please let me know if it works for you guys.

thanks,
-Alfred
Re: [Ugly PATCH] Again: panic kmem_malloc()
Apparently, On Sat, Oct 19, 2002 at 12:19:57AM +0200,
	Ben Stuyts said words to the effect of;

> Terry,
>
> At 23:07 18/10/2002, you wrote:
> >Ben Stuyts wrote:
> > > Furthermore, this might be interesting: the last vmstat -m log
> > > before the panic. Maybe someone can check if these values are reasonable?
> > > The system has 64 MB memory and has been up for about 24 hrs with almost no
> > > load.
> > >sem 344456 5390K 5390K 344456 16,1024,4096
> >
> >Almost 5.3M of unswappable physical memory dedicated to semaphores
> >seems like a bit much.
>
> Yes, and it increases continuously, for example when I fetch new mail (over
> pop) from my windows pc. The pc stores this again on a network drive, so
> both qpopper and smbd are involved. For example, vmstat -m says:
>

semop() leaks memory. An important free() was removed by alfred in
rev 1.55. Try this.

Jake

Index: sysv_sem.c
===
RCS file: /home/ncvs/src/sys/kern/sysv_sem.c,v
retrieving revision 1.55
diff -u -r1.55 sysv_sem.c
--- sysv_sem.c	13 Aug 2002 08:47:17 -	1.55
+++ sysv_sem.c	19 Oct 2002 01:20:35 -
@@ -1128,6 +1128,8 @@
 	td->td_retval[0] = 0;
 done2:
 	mtx_unlock(sema_mtxp);
+	if (sops)
+		free(sops, M_SEM);
 	return (error);
 }
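The shape of the fix in the patch above (release the temporary buffer at the common exit label, so every early exit frees it) can be sketched in user space as follows. This is only an analogue under stated assumptions: do_ops, tracked_alloc, tracked_free, live_buffers and MAX_OPS are hypothetical names invented here, and the body does not reproduce the real semop() logic in sysv_sem.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OPS 64              /* arbitrary limit for this sketch */

/* Counts buffers currently outstanding, so a leak becomes observable. */
long live_buffers = 0;

void *tracked_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_buffers++;
    return p;
}

void tracked_free(void *p)
{
    if (p != NULL) {
        assert(live_buffers > 0);
        live_buffers--;
    }
    free(p);                    /* free(NULL) is a no-op in user space */
}

/*
 * Hypothetical operation shaped like the patched function: copy the
 * caller's request into a temporary buffer, validate it, and leave
 * through one label that always releases the buffer.
 */
int do_ops(const short *user_ops, size_t nops)
{
    short *ops = NULL;
    int error = 0;

    if (nops == 0 || nops > MAX_OPS) {
        error = EINVAL;
        goto done;              /* ops is still NULL here */
    }
    ops = tracked_alloc(nops * sizeof(*ops));
    if (ops == NULL) {
        error = ENOMEM;
        goto done;
    }
    memcpy(ops, user_ops, nops * sizeof(*ops));
    if (ops[0] < 0) {
        error = EINVAL;         /* early exit after the allocation */
        goto done;
    }
    /* ... the real work would happen here ... */
done:
    tracked_free(ops);          /* runs on every path */
    return error;
}
```

If the tracked_free() call at done: is deleted, live_buffers grows by one on every call that got past the allocation, which is the same pattern of steady growth reported for the sem type in vmstat -m.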
Re: [Ugly PATCH] Again: panic kmem_malloc()
Ben Stuyts wrote:
> >Almost 5.3M of unswappable physical memory dedicated to semaphores
> >seems like a bit much.
>
> Yes, and it increases continuously, for example when I fetch new mail (over
> pop) from my windows pc. The pc stores this again on a network drive, so
> both qpopper and smbd are involved. For example, vmstat -m says:
>
> vmstat -m | grep sem
>sem 155886 2443K 2443K 155886 16,1024,4096
>
> Now when I do a fetch-mail with Eudora on my pc, the same command says.
>
> vmstat -m | grep sem
>sem 156178 2448K 2448K 156178 16,1024,4096
>
> I can repeat this at will, and each time I lose 4-5 KB. qpopper is started
> from inetd, and smbd runs as a daemon. I tried stopping smbd:

None of us have been able to repeat your problem, up to now.

I suppose now that we know you are running qpopper on -current, we
could repeat the problem, but, frankly, you already have a test
environment set up, and it would be a lot of work for us to duplicate
it, and even so, we won't know for sure if we could repeat the
problem.

Have you checked out your source tree with a date tag, so that it's
possible for everyone else to check out and get the same source
files? Line number references in tracebacks are pretty useless, if
the lines don't match.

Unless you can identify the exact number of bytes being consumed, and
then identify a kernel structure used in the semaphore code that is
equal to that size, or for which that size is a least common
multiple, and there are a number of events equal to the size of the
divisor, then that's no good.

This is why everyone keeps asking you to run the kernel debugger, so
that you can tell us exactly the code that's failing, and why, and
why a stack backtrace, more detailed than "it contained a call to
sem" is important.

This problem is evidently a memory leak in the semaphore code; but
that does not mean that the crash that results will be in any way
related to where the leak occurs. In other words, the crash is a
secondary effect.

Only by fully understanding the crash will anyone be able to help you
with the root cause.

I understand that it's frustrating to go step by step, when you think
you have isolated the problem to a smaller area, but the information
you gather from outside that area will tell you about the inside much
more clearly than staring at the outside of a black box where we know
the problem lives.

The only alternative to rewriting the black box from scratch, or
grovelling through it with a line-by-line code review (I'm not
interested in doing that; perhaps you could interest the author of
the changes that resulted in the problem) is to find a smoking gun,
and work from that, instead.

If this problem is in the way of you getting work done (one wonders
why you are using -current, if you need to get work done), then my
best suggestion to you is to back out the changes Alfred made, one by
one, and when it stops having the problem, you will have identified a
very small patch that causes the problem.

> >But without knowing what software you are running, it's hard to say
> >if the number is unreasonable, or not.
>
> Well, it is really a lightly loaded server, just serving one windows pc
> here at home. Here is a ps, and the only thing that's missing from it is
> the occasional pop session. Also note that this system is not connected to
> the internet, so the http that's running is mostly for my own pleasure (and
> proxy/cache). I do run ppp and uucp every now and then.

Perhaps I wasn't clear. Not knowing what calls your software makes
that cause the problem to occur, it is not possible for us to create
a cut-down test case in less than 30 lines of C source code, so that
we can repeat the problem at will, without secondary effects.

As it is, you only *suppose* that the qpopper usage alone is
sufficient to cause the problem; even if you are correct, that's
insufficient to identify where the problem is... it may not even
really be in the semaphore source code at all... maybe it's in kevent
code, for unfreed events, etc.

I think you need to go back one email:

| > Just had another panic, same kmem_malloc(). I did a trace but forgot to
| > write the traceback down.
|
| Wait until the next one, and remember to write it down; preferably,
| obtain a system dump image, so you can examine it with the debugger,
| and make sure that the kernel you are running has a debuggable
| counterpart already there (i.e. you used "config -g" to create the
| kernel you are running).

-- Terry
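A cut-down test case of the kind asked for above might look roughly like the following sketch, which does nothing but issue System V semop() calls on a private semaphore set. It is an assumption that this workload resembles what qpopper or smbd actually do; run_semops is a name invented here, and the sketch relies on a freshly created semaphore starting at zero, which is typical but not promised by POSIX.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * Create a private one-semaphore set, perform `rounds` up/down pairs
 * with semop(), then remove the set.  Returns the number of semop()
 * calls that succeeded, or -1 if the set could not be created.
 */
int run_semops(int rounds)
{
    struct sembuf up   = { .sem_num = 0, .sem_op =  1, .sem_flg = 0 };
    struct sembuf down = { .sem_num = 0, .sem_op = -1, .sem_flg = IPC_NOWAIT };
    int ok = 0;
    int id;

    assert(rounds >= 0);
    id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    for (int i = 0; i < rounds; i++) {
        if (semop(id, &up, 1) == 0)      /* increment the semaphore */
            ok++;
        if (semop(id, &down, 1) == 0)    /* decrement it again, never block */
            ok++;
    }

    (void)semctl(id, 0, IPC_RMID);       /* always remove the set */
    return ok;
}
```

Calling run_semops() with a large count from a small main(), and comparing "vmstat -m | grep sem" before and after, would show whether plain semop() traffic is enough to make the sem line grow on an affected kernel.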
Re: [Ugly PATCH] Again: panic kmem_malloc()
Terry, At 23:07 18/10/2002, you wrote: Ben Stuyts wrote: > Furthermore, this might be interesting: the last vmstat -m log > before the panic. Maybe someone can check if these values are reasonable? > The system has 64 MB memory and has been up for about 24 hrs with almost no > load. >sem344456 5390K 5390K 344456 16,1024,4096 Almost 5.3M of unswappable physical memory dedicated to semaphores seems like a bit much. Yes, and it increases continuously, for example when I fetch new mail (over pop) from my windows pc. The pc stores this again on a network drive, so both qpopper and smbd are involved. For example, vmstat -m says: vmstat -m | grep sem sem155886 2443K 2443K 155886 16,1024,4096 Now when I do a fetch-mail with Eudora on my pc, the same command says. vmstat -m | grep sem sem156178 2448K 2448K 156178 16,1024,4096 I can repeat this at will, and each time I loose 4-5 KB. qpopper is started from inetd, and smbd runs as a daemon. I tried stopping smbd: [terminus.stuyts.nl etc/rc.d]90: sudo /usr/local/etc/rc.d/samba.sh stop [terminus.stuyts.nl etc/rc.d]91: !vm vmstat -m | grep sem sem156524 2453K 2453K 156524 16,1024,4096 It doesn't free the sem allocated memory. But without knowing what software you are running, it's hard to say if the number is unreasonable, or not. Well, it is really a lightly loaded server, just serving one windows pc here at home. Here is a ps, and the only thing that's missing from it is the occasional pop session. Also note that this system is not connected to the internet, so the http that's running is mostly for my own pleasure (and proxy/cache). I do run ppp and uucp every now and then. USERPID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND dnetc 503 94.2 0.8 960 460 ?? RNs Thu09PM 1529:56.87 /usr/local/distributed.net/dnetc -quiet root 10 0.0 0.0 0 12 ?? DL1Jan70 0:00.00 (ktrace) root 1 0.0 0.3 700 184 ?? ILs 1Jan70 0:01.26 /sbin/init -- root 11 0.0 0.0 0 12 ?? RL1Jan70 1:24.27 (idle) root 12 0.0 0.0 0 12 ?? 
WL1Jan70 1:01.85 (swi1: net) root 13 0.0 0.0 0 12 ?? WL1Jan70 7:49.87 (swi6: tty:sio clock) root 15 0.0 0.0 0 12 ?? DL1Jan70 0:17.51 (random) root 18 0.0 0.0 0 12 ?? WL1Jan70 0:35.60 (swi3: cambio) root 23 0.0 0.0 0 12 ?? DL1Jan70 0:33.97 (usb0) root 24 0.0 0.0 0 12 ?? DL1Jan70 0:00.00 (usbtask) root 25 0.0 0.0 0 12 ?? WL1Jan70 0:15.98 (irq12: sym0) root 26 0.0 0.0 0 12 ?? WL1Jan70 0:33.34 (irq9: xl0) root 27 0.0 0.0 0 12 ?? WL1Jan70 0:00.04 (irq1: atkbd0) root 28 0.0 0.0 0 12 ?? WL1Jan70 0:00.00 (irq6: fdc0) root 30 0.0 0.0 0 12 ?? WL1Jan70 0:00.25 (swi0: tty:sio) root 2 0.0 0.0 0 12 ?? DL1Jan70 0:51.73 (pagedaemon) root 3 0.0 0.0 0 12 ?? DL1Jan70 0:00.42 (vmdaemon) root 4 0.0 0.0 0 12 ?? RL1Jan70 0:01.95 (pagezero) root 5 0.0 0.0 0 12 ?? DL1Jan70 0:05.29 (bufdaemon) root 6 0.0 0.0 0 12 ?? DL1Jan70 1:26.74 (syncer) root 7 0.0 0.0 0 12 ?? DL1Jan70 0:04.12 (vnlru) root123 0.0 0.0 2208 ?? IWs - 0:00.00 adjkerntz -i root194 0.0 0.4 628 244 ?? Is Thu09PM 0:09.18 /sbin/natd -dynamic -log -n tun0 root241 0.0 0.7 1180 420 ?? Ss Thu09PM 0:04.76 /usr/sbin/syslogd -s -v root255 0.0 2.6 2856 1580 ?? Is Thu09PM 0:23.02 /usr/sbin/named -d 1 root263 0.0 0.0 1332 12 ?? Is Thu09PM 0:00.06 /usr/sbin/rpcbind root340 0.0 0.0 1204 12 ?? Is Thu09PM 0:00.03 /usr/sbin/mountd -r root342 0.0 0.0 1164 12 ?? Is Thu09PM 0:00.30 nfsd: master (nfsd) root343 0.0 0.0 11168 ?? IW - 0:00.00 nfsd: server (nfsd) root344 0.0 0.0 11168 ?? IW - 0:00.00 nfsd: server (nfsd) root345 0.0 0.0 11168 ?? IW - 0:00.00 nfsd: server (nfsd) root347 0.0 0.0 11168 ?? IW - 0:00.00 nfsd: server (nfsd) root376 0.0 0.0 1188 12 ?? Is Thu09PM 0:00.05 /usr/sbin/lpd root380 0.0 0.3 1188 168 ?? SThu09PM 0:02.57 /usr/sbin/lpd root396 0.0 1.3 1552 804 ?? Ss Thu09PM 0:26.59 /usr/sbin/ntpd -p /var/run/ntpd.pid root418 0.0 0.1 1132 64 ?? Is Thu09PM 0:00.97 /usr/sbin/usbd root437 0.0 1.4 3036 820 ?? Ss Thu09PM 0:19.39 sendmail: accepting connections (sendmail) smmsp 440 0.0 0.9 3012 528 ?? 
Is Thu09PM 0:00.38 sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail) root467 0.0 1.5 2332 908 ?? Ss Thu09PM 0:25.86 /usr/local/sbin/httpd www 485 0.0 0.0 2356 12 ?? IThu09PM 0:00.01 /us
Re: [Ugly PATCH] Again: panic kmem_malloc()
This is a repost. Forgive me if you see it twice, but it didn't turn up in the -current list. Hi, Just had another panic, same kmem_malloc(). I did a trace but forgot to write the traceback down. In any case, there was a semop() call in the traceback. Furthermore, this might be interesting: the last vmstat -m log before the panic. Maybe someone can check if these values are reasonable? The system has 64 MB memory and has been up for about 24 hrs with almost no load. [terminus.stuyts.nl ben/bin]4: cat vmstatlog.4 Type InUse MemUse HighUse Requests Size(s) atkbddev 2 1K 1K2 32 pfs_fileno 132K 32K1 32768 nexusdev 2 1K 1K2 16 memdesc 1 4K 4K1 4096 legacydrv 3 1K 1K3 16 VM pgdata 1 4K 4K1 4096 pfs_nodes20 3K 3K 20 128 MSDOSFS mount 1 8K 8K1 8192 UFS mount1223K 39K 14 256,2048,4096,16384 UFS ihash 116K 16K1 16384 UFS dirhash 18936K 52K 1695 16,32,64,128,256,512 FFS node 11086 2079K 2096K 1000908 128,256 newdirblk 0 0K 1K5 16 dirrem 5 1K 34K18098 32 mkdir 0 0K 3K 524 32 diradd 0 0K 9K18045 32 freefile 5 1K 31K12022 32 freeblks 7 2K247K10187 256 freefrag 2 1K 2K36405 32 allocindir 4 1K204K 257072 64 indirdep 2 1K876K 2172 32,8192 allocdirect 1 1K 33K51562 128 bmsafemap 8 1K 3K 6606 32 newblk 1 1K 1K 308635 64,256 inodedep1218K168K31012 128,16384 pagedep10 3K 7K 6114 64,2048 p1003.1b 1 1K 1K1 16 NFS daemon 5 3K 3K5 256,512 NFS srvsock 2 1K 1K2 128 ip6_moptions 1 1K 1K1 16 in6_multi10 1K 1K 10 16,64 syncache 1 8K 8K1 8192 IpFw/IpAcct30 4K 4K 30 64,128 in_multi 2 1K 1K2 32 routetbl41 6K 6K 78 16,32,64,128,256 lo 1 1K 1K1 512 clone 312K 12K3 4096 ether_multi35 2K 2K 35 16,32,64 ifaddr22 7K 7K 22 32,256,512,2048 BPF 6 9K 9K6 128,256,4096 mount20 4K 4K 24 16,32,128,512 vnodes23 6K 6K 137 16,32,64,128,256 cluster_save buffer 0 0K 1K 9793 32,64 vfscache 5226 381K436K 534189 64,128,256,32768 BIO buffer 810K317K 4611 512,1024,2048 DEVFS 12122K 22K 121 16,32,128,8192 pcb38 5K 6K 1913 16,32,64,2048 soname 4 1K 1K39624 16,32,128 ptys 2 1K 1K2 512 ttys 48865K 85K 6431 128,512 shm 318K 
19K9 16,1024,16384 sem344456 5390K 5390K 344456 16,1024,4096 msg 425K 25K4 512,4096,16384 ioctlops 0 0K 1K 22 512,1024 USBdev 1 1K 2K4 128,512 USB1521K 22K 701353 16,32,128,256,4096 taskqueue 1 1K 1K1 128 sbuf 0 0K 5K 34 32,64,4096 rman99 7K 7K 496 16,64,128 mbufmgr 11616K 16K 116 32,64,128,2048,8192 kobj 127 508K508K 127 4096 eventhandler22 2K 2K 22 32,128 bus 47039K 40K 1363 16,32,64,128,256,512,2048,4096,8192 SWAP 273K 73K2 64 sysctltmp 0 0K 4K 802428 16,32,64,128,256,512,1024,4096 sysctl 0 0K 1K19359 16,32,64 uidinfo 7 1K 1K 7121 32,128 cred38 5K 9K 142297 128 subproc 10511K 15K52100 64,256 proc 2 1K 1K2 512 session33 5K 6K 2102 128 pgrp39 5K 6K 2278 128 module 17111K 11K 171 64 ip6ndp 3 1K 1K4 64,128,512 temp 954K286K 156410 16,32,64,128,256,512,1024,2048,4096,8192,16384,32768 devbuf 474 965K997K 2333 16,32,64,128,256,512,1024,2048,4096,8192,32768 lockf 6 1K 1K29935 64 feeder48 1K 1K 48 16 linker6513K 18K 85 16,32,256,1024,4096,8192 KTRACE 10013K 13K 100 128 ithread40 7K 7K 41
Re: [Ugly PATCH] Again: panic kmem_malloc(): dmesg and kernel config
Some info I did not include in the previous messages: dmesg output and kernel config. [terminus.stuyts.nl boot/kernel]26: dmesg Copyright (c) 1992-2002 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 5.0-CURRENT #5: Sun Oct 6 01:50:54 CEST 2002 [EMAIL PROTECTED]:/var/obj/usr/src/sys/TERMINUS Preloaded elf kernel "/boot/kernel/kernel" at 0xc04dc000. Timecounter "i8254" frequency 1193182 Hz Timecounter "TSC" frequency 233864671 Hz CPU: Pentium II/Pentium II Xeon/Celeron (233.86-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x634 Stepping = 4 Features=0x80f9ff real memory = 67108864 (65536K bytes) avail memory = 59920384 (58516K bytes) Pentium Pro MTRR support enabled npx0: on motherboard npx0: INT 16 interface Using $PIR table, 6 entries at 0xc00fdc00 pcib0: at pcibus 0 on motherboard pci0: on pcib0 pcib1: at device 1.0 on pci0 pci1: on pcib1 isab0: at device 7.0 on pci0 isa0: on isab0 atapci0: port 0xf000-0xf00f at device 7.1 on pci0 atapci0: Busmastering DMA not supported ata0: at 0x1f0 irq 14 on atapci0 ata1: at 0x170 irq 15 on atapci0 uhci0: port 0x6400-0x641f irq 11 at device 7.2 on pci0 usb0: on uhci0 usb0: USB revision 1.0 uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered pci0: at device 7.3 (no driver attached) sym0: <875> port 0x6800-0x68ff mem 0xe800-0xe8000fff,0xe8001000-0xe80010ff irq 12 at device 11.0 on pci0 sym0: Symbios NVRAM, ID 7, Fast-20, SE, parity checking sym0: open drain IRQ line driver, using on-chip SRAM sym0: using LOAD/STORE-based firmware. 
xl0: <3Com 3c905-TX Fast Etherlink XL> port 0x6c00-0x6c3f irq 9 at device 13.0 on pci0 /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264 /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264 lock order reversal 1st 0xc0ba1bd4 xl0 (network driver) @ /usr/src/sys/pci/if_xl.c:1264 2nd 0xc03d2b00 allproc (allproc) @ /usr/src/sys/kern/kern_fork.c:318 /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264 /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264 /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264 xl0: Ethernet address: 00:60:08:a5:d4:ff miibus0: on xl0 nsphy0: on miibus0 nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:647 pci0: at device 15.0 (no driver attached) orm0: at iomem 0xc8000-0xcbfff,0xc-0xc7fff on isa0 atkbdc0: at port 0x64,0x60 on isa0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 fdc0: at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0 fdc0: FIFO enabled, 8 bytes threshold fd0: <1440-KB 3.5" drive> on fdc0 drive 0 ppc0: at port 0x378-0x37f irq 7 on isa0 ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode ppbus0: IEEE1284 device found /NIBBLE Probing for PnP devices on ppbus0: ppbus0: PRINTER ESCPL2,BDC lpt0: on ppbus0 lpt0: Interrupt-driven port sc0: on isa0 sc0: VGA <16 virtual consoles, flags=0x200> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 16550A sio1 at port 0x2f8-0x2ff irq 3 on isa0 sio1: type 16550A vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0 unknown: can't assign resources (port) unknown: can't assign resources (port) unknown: can't assign resources (port) unknown: can't assign resources (port) unknown: can't assign resources (port) Timecounters tick every 10.000 msec ipfw2 initialized, divert 
enabled, rule-based forwarding enabled, default to deny, logging unlimited Waiting 5 seconds for SCSI devices to settle (noperiph:sym0:0:-1:-1): SCSI BUS reset delivered. Mounting root from ufs:/dev/da0s1a da2 at sym0 bus 0 target 3 lun 0 da2: Fixed Direct Access SCSI-2 device da2: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled da2: 3067MB (6281856 512 byte sectors: 255H 63S/T 391C) da1 at sym0 bus 0 target 2 lun 0 da1: Fixed Direct Access SCSI-2 device da1: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled da1: 3090MB (6328861 512 byte sectors: 255H 63S/T 393C) da0 at sym0 bus 0 target 1 lun 0 da0: Fixed Direct Access SCSI-2 device da0: 40.000MB/s transfers (20.000MHz, offset 15, 16bit) da0: 4134MB (8467200 512 byte sectors: 255H 63S/T 527C) WARNING: / was not properly dismounted cd0 at sym0 bus 0 target 4 lun 0 cd0: Removable CD-ROM SCSI-2 device cd0: 20.000MB/s transfers (20.000MHz, offset 15) cd0: Attempt to query device size failed: NOT READY, Medium not present WARNING: /usr was not properly dismounted WARNING: /var was not properly dismounted /var: mount pending error: blocks 4 files 1 /var: superblock summary recomputed acquiring duplicate lock of same type: "inp" 1st inp @ /usr/s
Re: [Ugly PATCH] Again: panic kmem_malloc()
Hello Alfred,

On Wed, Oct 16, 2002 at 02:26:19PM -0700, Alfred Perlstein wrote:
> * Ben Stuyts <[EMAIL PROTECTED]> [021016 14:05] wrote:
> >
> > No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:
> >
> > < sem 167344 2622K 2622K 167344 16,1024,4096
> > ---
> > > sem 235512 3687K 3687K 235512 16,1024,4096
> >
> > So it looks indeed like sem is the problem,
>
> what does
> sysctl -a | grep ^p10
> say?

p1003_1b.asynchronous_io: 0
p1003_1b.mapped_files: 1
p1003_1b.memlock: 0
p1003_1b.memlock_range: 0
p1003_1b.memory_protection: 0
p1003_1b.message_passing: 0
p1003_1b.prioritized_io: 0
p1003_1b.priority_scheduling: 1
p1003_1b.realtime_signals: 0
p1003_1b.semaphores: 0
p1003_1b.fsync: 0
p1003_1b.shared_memory_objects: 1
p1003_1b.synchronized_io: 0
p1003_1b.timers: 0
p1003_1b.aio_listio_max: 0
p1003_1b.aio_max: 0
p1003_1b.aio_prio_delta_max: 0
p1003_1b.delaytimer_max: 0
p1003_1b.mq_open_max: 0
p1003_1b.pagesize: 4096
p1003_1b.rtsig_max: 0
p1003_1b.sem_nsems_max: 0
p1003_1b.sem_value_max: 0
p1003_1b.sigqueue_max: 0
p1003_1b.timer_max: 0

> My guess is that you don't have the module in question loaded.
>
> If you do, then why? (it's marked experimental)

The only modules loaded are:

[terminus.stuyts.nl boot/kernel]21: kldstat
Id Refs Address    Size     Name
 1    3 0xc010     3daa00   kernel
 2    1 0xc12fd000 2000     green_saver.ko

> And why aren't these bug reports a lot more detailed? (meaning why
> aren't you actually giving a hypothesis as to why the code is
> broken?)

I think it was Jeff Roberson hinting at that. I am only reporting a
problem and I hope I can help fix it. I have, however, no knowledge of
the kernel internals, so forgive me for being too vague and let me know
what more information you need.

> *grumble*

Sorry...

Kind regards,
Ben
Re: [Ugly PATCH] Again: panic kmem_malloc()
* Ben Stuyts <[EMAIL PROTECTED]> [021016 14:05] wrote:
> At 22:00 16/10/2002, Jeff Roberson wrote:
>
> >On Wed, 16 Oct 2002, Ben Stuyts wrote:
> >
> >> I'll also run your vmstat script that you posted in a similar thread. One
> >> of the big memory users seems to be sem, and it's growing. Almost every
> >> time I do a vmstat -m, sem usage has grown a few k.
> >>
> >
> >[snip]
> >>sem 167320 2622K 2622K 167320 16,1024,4096
> >[snip]
> >
> >Thank you for looking into this. It definitely looks like a memory leak.
> >I forwarded this to alfred. He was just working on semaphores so he may
> >know something about it.
> >
> >>
> >> I'll see what the stats are tomorrow.
> >>
> >Much appreciated.
>
> No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:
>
> < sem 167344 2622K 2622K 167344 16,1024,4096
> ---
> > sem 235512 3687K 3687K 235512 16,1024,4096
>
> So it looks indeed like sem is the problem,
>
> Kind regards,
> Ben

what does
sysctl -a | grep ^p10
say?

My guess is that you don't have the module in question loaded.

If you do, then why? (it's marked experimental)

And why aren't these bug reports a lot more detailed? (meaning why
aren't you actually giving a hypothesis as to why the code is
broken?)

*grumble*

--
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using "1970s technology,"
start asking why software is ignoring 30 years of accumulated wisdom.'
Re: [Ugly PATCH] Again: panic kmem_malloc()
At 22:00 16/10/2002, Jeff Roberson wrote:

>On Wed, 16 Oct 2002, Ben Stuyts wrote:
>
> > I'll also run your vmstat script that you posted in a similar thread. One
> > of the big memory users seems to be sem, and it's growing. Almost every
> > time I do a vmstat -m, sem usage has grown a few k.
> >
>
>[snip]
> >sem 167320 2622K 2622K 167320 16,1024,4096
>[snip]
>
>Thank you for looking into this. It definitely looks like a memory leak.
>I forwarded this to alfred. He was just working on semaphores so he may
>know something about it.
>
> >
> > I'll see what the stats are tomorrow.
> >
>Much appreciated.

No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:

< sem 167344 2622K 2622K 167344 16,1024,4096
---
> sem 235512 3687K 3687K 235512 16,1024,4096

So it looks indeed like sem is the problem,

Kind regards,
Ben
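The two sem samples quoted above can be turned into a rough per-allocation size and leak rate with a few lines of arithmetic. The sketch below assumes the columns are InUse and MemUse as in the vmstat -m header elsewhere in the thread, that MemUse is rounded to whole kilobytes, and that the samples really are about 1.5 hours apart; struct sem_sample, bytes_per_alloc and kb_per_hour are names made up for this example.

```c
#include <assert.h>

/* One "sem" row from vmstat -m: allocation count and memory in KB. */
struct sem_sample {
    long in_use;
    long mem_kb;
};

/* Average bytes per additional allocation between two samples. */
double bytes_per_alloc(struct sem_sample before, struct sem_sample after)
{
    long allocs = after.in_use - before.in_use;
    long bytes = (after.mem_kb - before.mem_kb) * 1024L;

    assert(allocs > 0);
    return (double)bytes / (double)allocs;
}

/* Growth in KB per hour over an interval given in hours. */
double kb_per_hour(struct sem_sample before, struct sem_sample after,
                   double hours)
{
    assert(hours > 0.0);
    return (double)(after.mem_kb - before.mem_kb) / hours;
}
```

With before = {167344, 2622} and after = {235512, 3687}, the difference is 68168 allocations and 1065K, which works out to very nearly 16 bytes per allocation and about 710K per hour. Sixteen is also the smallest entry in the Size(s) column for sem, which is at least consistent with one small buffer being lost per call.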
Re: [Ugly PATCH] Again: panic kmem_malloc()
At 21:20 11/10/2002, Terry Lambert wrote:
>Please find a (relatively bogus) patch attached, which could cause
>things to block for a long time, but will avoid the panic.

Terry,

I just got the same panic without your patch. (I wanted to verify that it was still panicking with the latest src tree.) I am now building a kernel with your patch.

I'll also run the vmstat script that you posted in a similar thread. One of the big memory users seems to be sem, and it's growing. Almost every time I do a vmstat -m, sem usage has grown a few K.

Type                  InUse  MemUse HighUse Requests  Size(s)
atkbddev                  2      1K      1K        2  32
pfs_fileno                1     32K     32K        1  32768
nexusdev                  2      1K      1K        2  16
memdesc                   1      4K      4K        1  4096
legacydrv                 3      1K      1K        3  16
VM pgdata                 1      4K      4K        1  4096
pfs_nodes                20      3K      3K       20  128
MSDOSFS mount             1      8K      8K        1  8192
UFS mount                12     23K     39K       14  256,2048,4096,16384
UFS ihash                 1     16K     16K        1  16384
UFS dirhash              57     11K     11K      117  16,32,64,128,512
FFS node               4976    933K    936K    40528  128,256
dirrem                    0      0K     31K     5522  32
mkdir                     0      0K      3K      520  32
diradd                   14      1K      7K     3118  32
freefile                  0      0K     26K     4839  32
freeblks                  1      1K    186K     3820  256
freefrag                  6      1K      1K     2494  32
allocindir               10      1K     86K     8596  64
indirdep                  2      1K    876K      577  32,8192
allocdirect              23      3K     16K     8457  128
bmsafemap                 3      1K      3K      365  32
newblk                    1      1K      1K    17054  64,256
inodedep                 16     18K    168K     9570  128,16384
pagedep                   2      3K      7K      874  64,2048
p1003.1b                  1      1K      1K        1  16
NFS daemon                5      3K      3K        5  256,512
NFS srvsock               2      1K      1K        2  128
ip6_moptions              1      1K      1K        1  16
in6_multi                10      1K      1K       10  16,64
syncache                  1      8K      8K        1  8192
IpFw/IpAcct              30      4K      4K       30  64,128
in_multi                  2      1K      1K        2  32
routetbl                 41      6K      6K       76  16,32,64,128,256
lo                        1      1K      1K        1  512
clone                     3     12K     12K        3  4096
ether_multi              35      2K      2K       35  16,32,64
ifaddr                   22      7K      7K       22  32,256,512,2048
BPF                       6      9K      9K        6  128,256,4096
mount                    20      4K      4K       24  16,32,128,512
vnodes                   23      6K      6K      137  16,32,64,128,256
cluster_save buffer       0      0K      1K     1183  32,64
vfscache               2634    197K    198K    32833  64,128,256,32768
BIO buffer               21     26K    205K     1130  512,1024,2048
DEVFS                   121     22K     22K      121  16,32,128,8192
pcb                      38      5K      5K       58  16,32,64,2048
soname                    4      1K      1K     1415  16,32,128
ptys                      2      1K      1K        2  512
ttys                    614     81K     81K     1121  128,512
shm                       3     18K     19K        8  16,1024,16384
sem                  167320   2622K   2622K   167320  16,1024,4096
msg                       4     25K     25K        4  512,4096,16384
ioctlops                  0      0K      1K       22  512,1024
USBdev                    1      1K      2K        4  128,512
USB                      15     21K     22K    15345  16,32,128,256,4096
taskqueue                 1      1K      1K        1  128
sbuf                      0      0K      5K        2  32,4096
rman                     99      7K      7K      496  16,64,128
mbufmgr                 106     15K     15K      106  32,64,128,2048,8192
kobj                    127    508K    508K      127  4096
eventhandler             22      2K      2K       22  32,128
bus                     470     39K     40K     1363  16,32,64,128,256,512,2048,4096,8192
SWAP                      2     73K     73K        2  64
sysctltmp                 0      0K      4K     8856  16,32,64,128,256,512,1024,4096
sysctl                    0      0K      1K      386  16,32,64
uidinfo                   7      1K      1K      525  32,128
cred                     34      5K      5K    18178  128
subproc                 114     11K     14K    10613  64,256
proc                      2      1K      1K        2  512
session                  33      5K      5K       68  128
pgrp                     40      5K      6K      117  128
module                  171     11K     11K      171  64
ip6ndp                    3      1K      1K        4  64,128,512
temp                     11     54K     55K    19887  16,32,64,128,256,512,1024,2048,4096,8192,16384,32768
devbuf                  473    964K    997K     2268  16,32,64,128,256,512,1024,2048,4096,8192,32768
lockf                     6      1K      1K      549  64
feeder                   48      1K      1K       48  16
linker                   65     13K     18K       85  16,32,256,1024,4096,8192
KTRACE                  100     13K     13K      100  128
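One way to read the vmstat -m output above: for sem, InUse equals Requests (167320 each), which suggests nothing of that type has ever been freed. The following is only a rough sketch of that heuristic in C; the struct and function names (mt_row, mt_never_freed, mt_first_suspect) and the threshold parameter are made up for illustration and are not part of any FreeBSD interface. Small boot-time types such as kobj (127/127) also never free anything, hence the minimum-requests cutoff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One row of "vmstat -m" output, reduced to the fields used here.
 * Field names are illustrative, not a kernel structure. */
struct mt_row {
    const char   *type;      /* malloc type name, e.g. "sem" */
    unsigned long in_use;    /* InUse: allocations currently live */
    unsigned long requests;  /* Requests: allocations ever made */
};

/* A type is suspicious when it has made many requests and every single
 * one of them is still live, i.e. nothing was ever freed. */
static bool
mt_never_freed(const struct mt_row *r, unsigned long min_requests)
{
    assert(r != NULL);
    return (r->requests >= min_requests && r->in_use == r->requests);
}

/* Return the first suspicious row, or NULL if there is none. */
static const struct mt_row *
mt_first_suspect(const struct mt_row *rows, size_t n,
    unsigned long min_requests)
{
    size_t i;

    assert(rows != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (mt_never_freed(&rows[i], min_requests))
            return (&rows[i]);
    return (NULL);
}
```

Fed the FFS node, kobj, sem and cred rows from the table with a cutoff of 1000 requests, only sem is flagged. This is a hint for where to look, not proof of a leak; repeated samples showing steady growth are the stronger evidence.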
Re: [Ugly PATCH] Again: panic kmem_malloc()
On Wed, 16 Oct 2002, Ben Stuyts wrote:
> I just got the same panic without your patch. (I wanted to verify that it
> was still panicking with the latest src tree.) I am now building a kernel
> with your patch.
>
> I'll also run your vmstat script that you posted in a similar thread. One
> of the big memory users seems to be sem, and it's growing. Almost every
> time I do a vmstat -m, sem usage has grown a few k.
>
[snip]
> sem 167320 2622K 2622K 167320 16,1024,4096
[snip]

Thank you for looking into this. It definitely looks like a memory leak. I forwarded this to Alfred. He was just working on semaphores, so he may know something about it.

> I'll see what the stats are tomorrow.
>
> Kind regards,
> Ben

Much appreciated.

Cheers,
Jeff
Re: [Ugly PATCH] Re: Again: panic kmem_malloc()
Jeff Roberson wrote:
> On Fri, 11 Oct 2002, Terry Lambert wrote:
> > Ben Stuyts wrote:
> > > Is there a way to check the free list of the kernel? Maybe I can find out
> > > what action triggers eating all its memory.
>
> Maybe you should just increase the size of your kmem_map? I'll look into
> a better fix but that should do it short term.

I think he's in an overload condition. Basically, it will always use all available resources, and then some. So increasing "all available" still leaves you with your foot shot off when the system asks for "and then some". 8-(.

> > 1) page_alloc() in uma is using kmem_malloc() without the
> >    M_NOWAIT flag
>
> Why is this bogus?

Because it's possible to panic under heavy load. The only thing heavy load should cause is slowdown... ideally, proportional to the load... ;^).

> Yes, I agree, it should wait. It might be interesting to see what the
> effects of allocating straight out of the kernel_map would be. Or perhaps
> positioning the kmem_map in such a way that it might be able to expand. I
> don't like the hard limit here any more than anyone else does.

I looked at this. I don't think you can reasonably expand maps other than the kernel_map. The allocations are wrong... kernel_map is, unfortunately, "special".

This is about the 1,000,000th time I've wanted to establish mappings for all physical RAM, and maybe the entire address space, up front, instead of leaving it to demand-time. The problem with this is that you are talking 4M (or 8M, if you use PSE-36 or PAE). Then *everything* becomes allocable at interrupt time, and there's no such thing as tiered access semantics, only whether or not a mapping is assigned to the pool you want to allocate out of.

> On reasonable architectures your only worry is wasting pages and not kva.

Heh... "It's not an unreasonable architecture, you're just using an unreasonable amount of memory on a reasonable architecture".

> > Jeffrey Roberson is going to need to fix UMA allocations, per the
> > comment in this patch, for a more permanent fix. I've specifically
> > Cc:'ed him on this message.
>
> Thanks for looking into this terry. Please, call me Jeff though.

No problem; I copied the name out of the source file. I tend to use "Terrence" in source files (you can see this in init_main.c and other places where I did enough work that I felt it merited a copyright to be able to give it away).

-- Terry
Re: [Ugly PATCH] Re: Again: panic kmem_malloc()
On Fri, 11 Oct 2002, Terry Lambert wrote:

> Ben Stuyts wrote:
> > Is there a way to check the free list of the kernel? Maybe I can find out
> > what action triggers eating all its memory.

Maybe you should just increase the size of your kmem_map? I'll look into a better fix but that should do it short term.

> ] panic: kmem_malloc(4096): kmem_map too small: 28246016 total allocated.
>
> That's easy: you're calling kmem_malloc() without M_NOWAIT.
>
> That function only operates on the maps kmem_map or mb_map.
>
> It calls vm_map_findspace(), which fails to find space in the
> map.
>
> vm_map_findspace() fails to add space to the map, because it
> only adds space to the map if the map is kernel_map; all other
> maps fail catastrophically.

Well, for the old allocator this was a panicking situation. It never returned memory and kva back to kmem_map. So you were pretty much done.

> The map you are calling it with is kmem_map (if it were mb_map,
> you would get an "Out of mbuf clusters" message on your console,
> and the allocation would fail, regardless of the value of the
> M_NOWAIT flags bit (mbuf allocations do not properly honor the
> lack of an M_NOWAIT flags bit)).

It is documented that they do not. Notice we only have M_TRYWAIT now for mbufs.

> The panic message occurs because you asked it to wait for memory
> to be available.
>
> But the code is stupid, and refuses to wait for memory to be
> available, in the case that space can not be found in the map,
> because it does not properly realize that the freeing of memory
> elsewhere can result in freed space in the map. So it calls
> "panic" instead of waiting.

It will wait for pages to be available. It just won't wait for KVA to be available. This was somewhat less bogus with the old allocator.

> Therefore, it's technically illegal to call kmem_malloc() with
> a third argument that does not include the M_NOWAIT bit, even
> though the function is documented, and obviously intended, to
> permit the use of this flags bit.

No, that's not true. It will wait if you're low on pages. It will not wait if you've run out of kva. There's a big difference. Also, WAITOK more appropriately means "Never return NULL even if you have to panic." You just assumed it would mean "Never return NULL even if you have to wait". ;-) I agree that it should mean the latter though.

> In -current, there is exactly one place where kmem_malloc() is
> called with the kmem_map as its first argument: in the function
> page_alloc() in vm/uma_core.c.
>
> So, you have two bogus things happening:
>
> 1) page_alloc() in uma is using kmem_malloc() without the
>    M_NOWAIT flag

Why is this bogus?

> 2) kmem_malloc() without the M_NOWAIT flag panics, INSTEAD
>    OF FRICKING WAITING, LIKE YOU ARE TELLING IT TO DO.
>    :-0_ <- Dr. Evil

Yes, I agree, it should wait. It might be interesting to see what the effects of allocating straight out of the kernel_map would be. Or perhaps positioning the kmem_map in such a way that it might be able to expand. I don't like the hard limit here any more than anyone else does.

On reasonable architectures your only worry is wasting pages and not kva. Now with UMA the VM can tell it that it's using too many pages, and so the system is self-tuning. We'll have to do something to help kva-crippled architectures (x86) in the near term.

> Probably, page_alloc should be rewritten to not use kmem_malloc(),
> and to use the kmem_alloc_wait() instead.
>
> Please find a (relatively bogus) patch attached, which could cause
> things to block for a long time, but will avoid the panic.
>
> Jeffrey Roberson is going to need to fix UMA allocations, per the
> comment in this patch, for a more permanent fix. I've specifically
> Cc:'ed him on this message.

Thanks for looking into this, Terry. Please, call me Jeff though.
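The distinction drawn in this message (kmem_malloc() sleeps when it is short of pages, but not when the map is out of KVA) can be written down as a small decision function. This is a user-space model only, pieced together from the descriptions in this thread; sk_map, sk_outcome, SK_NOWAIT and the byte-count fields are hypothetical stand-ins and not the kernel's data structures or flag values.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NOWAIT 0x1   /* hypothetical stand-in for M_NOWAIT */

/* What a request ends up doing in this model. */
enum sk_outcome {
    SK_OK,              /* allocation succeeds immediately */
    SK_FAIL,            /* caller gets NULL back */
    SK_SLEEP_FOR_PAGES, /* caller sleeps until pages are freed */
    SK_PANIC            /* "kmem_map too small" */
};

/* Two independent resources: address space in the map and physical
 * pages to back it, both counted in bytes for simplicity. */
struct sk_map {
    size_t kva_free;
    size_t pages_free;
};

static enum sk_outcome
sk_kmem_malloc_outcome(const struct sk_map *m, size_t bytes, int flags)
{
    assert(m != NULL);

    if (m->kva_free < bytes) {
        /* No room in the map: waiting is not attempted at all. */
        return ((flags & SK_NOWAIT) ? SK_FAIL : SK_PANIC);
    }
    if (m->pages_free < bytes) {
        /* Room in the map but no pages: this case does wait. */
        return ((flags & SK_NOWAIT) ? SK_FAIL : SK_SLEEP_FOR_PAGES);
    }
    return (SK_OK);
}
```

In this model a waiting request with KVA available but no free pages yields SK_SLEEP_FOR_PAGES, while the same request against a map with no KVA left yields SK_PANIC, which is the case reported at the start of the thread.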
[Ugly PATCH] Re: Again: panic kmem_malloc()
Ben Stuyts wrote:
> Is there a way to check the free list of the kernel? Maybe I can find out
> what action triggers eating all its memory.

] panic: kmem_malloc(4096): kmem_map too small: 28246016 total allocated.

That's easy: you're calling kmem_malloc() without M_NOWAIT.

That function only operates on the maps kmem_map or mb_map.

It calls vm_map_findspace(), which fails to find space in the map.

vm_map_findspace() fails to add space to the map, because it only adds space to the map if the map is kernel_map; all other maps fail catastrophically.

The map you are calling it with is kmem_map (if it were mb_map, you would get an "Out of mbuf clusters" message on your console, and the allocation would fail, regardless of the value of the M_NOWAIT flags bit (mbuf allocations do not properly honor the lack of an M_NOWAIT flags bit)).

The panic message occurs because you asked it to wait for memory to be available.

But the code is stupid, and refuses to wait for memory to be available, in the case that space can not be found in the map, because it does not properly realize that the freeing of memory elsewhere can result in freed space in the map. So it calls "panic" instead of waiting.

Therefore, it's technically illegal to call kmem_malloc() with a third argument that does not include the M_NOWAIT bit, even though the function is documented, and obviously intended, to permit the use of this flags bit.

In -current, there is exactly one place where kmem_malloc() is called with the kmem_map as its first argument: in the function page_alloc() in vm/uma_core.c.

So, you have two bogus things happening:

1) page_alloc() in uma is using kmem_malloc() without the
   M_NOWAIT flag

2) kmem_malloc() without the M_NOWAIT flag panics, INSTEAD
   OF FRICKING WAITING, LIKE YOU ARE TELLING IT TO DO.
   :-0_ <- Dr. Evil

Probably, page_alloc should be rewritten to not use kmem_malloc(), and to use the kmem_alloc_wait() instead.

Please find a (relatively bogus) patch attached, which could cause things to block for a long time, but will avoid the panic.

Jeffrey Roberson is going to need to fix UMA allocations, per the comment in this patch, for a more permanent fix. I've specifically Cc:'ed him on this message.

-- Terry

Index: uma_core.c
===================================================================
RCS file: /cvs/src/sys/vm/uma_core.c,v
retrieving revision 1.38
diff -c -r1.38 uma_core.c
*** uma_core.c  28 Sep 2002 17:15:33 -0000  1.38
--- uma_core.c  11 Oct 2002 15:19:15 -0000
***************
*** 781,787 ****
        void *p;        /* Returned page */

        *pflag = UMA_SLAB_KMEM;
!       p = (void *) kmem_malloc(kmem_map, bytes, wait);

        return (p);
  }
--- 781,798 ----
        void *p;        /* Returned page */

        *pflag = UMA_SLAB_KMEM;
!       /*
!        * XXX Bogus
!        * kmem_malloc() can panic if called without M_NOWAIT on kmem_map;
!        * work around this by calling kmem_alloc_wait() instead.  This is
!        * really bogus, because it can hang indefinitely.  Jeffrey Roberson
!        * needs to fix this to do all UMA allocations out of kernel_map,
!        * instead, so pmap_growkernel() can be used, instead of hanging.
!        */
!       if (wait & M_NOWAIT)
!               p = (void *) kmem_malloc(kmem_map, bytes, wait);
!       else
!               p = (void *) kmem_alloc_wait(kmem_map, bytes);

        return (p);
  }
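The trade-off the patch makes (a possible indefinite sleep instead of a panic) can be illustrated with a small user-space simulation. This is a sketch under stated assumptions, not kernel code: sk_kmap, sk_try_alloc, sk_page_alloc, SK_M_NOWAIT, the tick loop and the freed_per_tick parameter are all invented here to stand in for "other threads eventually release space in the map".

```c
#include <assert.h>
#include <stddef.h>

#define SK_M_NOWAIT 0x0001  /* stand-in flag; not the kernel's value */

/* A fixed-size map that cannot grow, like kmem_map in this thread. */
struct sk_kmap {
    size_t size;    /* total bytes the map can hold */
    size_t used;    /* bytes currently allocated */
};

/* Single allocation attempt; returns 1 on success, 0 if the map is full. */
static int
sk_try_alloc(struct sk_kmap *m, size_t bytes)
{
    if (m->size - m->used < bytes)
        return (0);
    m->used += bytes;
    return (1);
}

/*
 * Model of the patched page_alloc() dispatch.
 * Returns the number of ticks slept before the request succeeded,
 * -1 if a non-sleeping request failed, or
 * -2 if a sleeping request was still blocked after max_ticks
 *    (the "can hang indefinitely" case from the patch comment).
 */
static long
sk_page_alloc(struct sk_kmap *m, size_t bytes, int wait,
    size_t freed_per_tick, long max_ticks)
{
    long tick;

    assert(m != NULL && m->used <= m->size);

    if (wait & SK_M_NOWAIT)
        return (sk_try_alloc(m, bytes) ? 0 : -1);

    for (tick = 0; tick <= max_ticks; tick++) {
        if (sk_try_alloc(m, bytes))
            return (tick);
        /* "Sleep" one tick while others release some space. */
        m->used -= (freed_per_tick < m->used) ? freed_per_tick : m->used;
    }
    return (-2);
}
```

With a full 28246016-byte map and a 4096-byte request, a non-sleeping caller fails at once, a sleeping caller succeeds after a few ticks if something frees 1024 bytes per tick, and a sleeping caller never succeeds if nothing is ever freed. The last case is why the patch calls itself bogus: with a leak, as suspected later in the thread, the sleeper may never wake.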
Re: Again: panic kmem_malloc()
At 00:23 11/10/2002, Terry Lambert wrote:
>Robert Watson wrote:
> > I've run into this on a couple of boxes, but those boxes were diskless
> > root boxes, and used md backed ffs for /tmp and /var. Apparently if you
> > do that, you're likely to exceed the kernel's auto-tuned kmem map size.
> > That said, they didn't do it as frequently, so perhaps there's been a
> > chance. A glance at the malloc buckets on the machine suggested that this
> > wasn't a memory leak (the normal candidate in this sort of scenario).
>
>Use of swapping on additional space not known to the kernel at
>boot time, will also cause this;

That's not happening on my box. It has 64 MB real memory and a 128 MB swap area, mounted like this:

/dev/da0s1b    none    swap    sw    0    0

pstat -s says:

Device         1K-blocks     Used    Avail Capacity  Type
/dev/da0s1b       131072    10244   120828     8%    Interleaved

Is there a way to check the free list of the kernel? Maybe I can find out what action triggers eating all its memory.

Ben
Re: Again: panic kmem_malloc()
Robert Watson wrote:
> I've run into this on a couple of boxes, but those boxes were diskless
> root boxes, and used md backed ffs for /tmp and /var. Apparently if you
> do that, you're likely to exceed the kernel's auto-tuned kmem map size.
> That said, they didn't do it as frequently, so perhaps there's been a
> chance. A glance at the malloc buckets on the machine suggested that this
> wasn't a memory leak (the normal candidate in this sort of scenario).

Use of swapping on additional space not known to the kernel at boot time will also cause this; the auto-sizing of the map can only know about the things it knows about, so adding more later will consume the available KVA space without growing it (all physical memory and all swap are assumed to have mappings allocated, since the mappings need to be filled in at fault time).

It's very tempting to separate the mapping allocations; this would be a pretty big chore. 8-(.

-- Terry
Re: Again: panic kmem_malloc()
At 23:55 10/10/2002, Robert Watson wrote:
>On Thu, 10 Oct 2002, Ben Stuyts wrote:
>
> > panic: kmem_malloc(4096): kmem_map too small: 28246016 total allocated.
>
>I've run into this on a couple of boxes, but those boxes were diskless
>root boxes, and used md backed ffs for /tmp and /var. Apparently if you
>do that, you're likely to exceed the kernel's auto-tuned kmem map size.
>That said, they didn't do it as frequently, so perhaps there's been a
>chance. A glance at the malloc buckets on the machine suggested that this
>wasn't a memory leak (the normal candidate in this sort of scenario).

There's nothing special going on here. No diskless, no md, not many processes. Here's a process list in case this helps:

USER    PID %CPU %MEM   VSZ   RSS  TT   STAT STARTED       TIME COMMAND
dnetc   502 98.0  0.8   960   508  ??   RNs   9:27PM  159:14.77 /usr/local/distributed.net/dnetc -quiet
root     10  0.0  0.0     0    12  ??   DL    1Jan70    0:00.00 (ktrace)
root      1  0.0  0.4   700   220  ??   ILs   1Jan70    0:00.32 /sbin/init --
root     11  0.0  0.0     0    12  ??   RL    1Jan70    1:19.57 (idle)
root     12  0.0  0.0     0    12  ??   WL    1Jan70    0:06.32 (swi1: net)
root     13  0.0  0.0     0    12  ??   WL    1Jan70    0:48.51 (swi6: tty:sio clock)
root     15  0.0  0.0     0    12  ??   DL    1Jan70    0:02.37 (random)
root     18  0.0  0.0     0    12  ??   WL    1Jan70    0:05.64 (swi3: cambio)
root     23  0.0  0.0     0    12  ??   DL    1Jan70    0:03.42 (usb0)
root     24  0.0  0.0     0    12  ??   DL    1Jan70    0:00.00 (usbtask)
root     25  0.0  0.0     0    12  ??   WL    1Jan70    0:03.83 (irq12: sym0)
root     26  0.0  0.0     0    12  ??   WL    1Jan70    0:02.96 (irq9: xl0)
root     27  0.0  0.0     0    12  ??   WL    1Jan70    0:00.00 (irq1: atkbd0)
root     28  0.0  0.0     0    12  ??   WL    1Jan70    0:00.00 (irq6: fdc0)
root     30  0.0  0.0     0    12  ??   WL    1Jan70    0:00.02 (swi0: tty:sio)
root      2  0.0  0.0     0    12  ??   DL    1Jan70    0:01.28 (pagedaemon)
root      3  0.0  0.0     0    12  ??   DL    1Jan70    0:00.00 (vmdaemon)
root      4  0.0  0.0     0    12  ??   RL    1Jan70    0:00.95 (pagezero)
root      5  0.0  0.0     0    12  ??   DL    1Jan70    0:00.32 (bufdaemon)
root      6  0.0  0.0     0    12  ??   DL    1Jan70    0:06.65 (syncer)
root      7  0.0  0.0     0    12  ??   DL    1Jan70    0:00.00 (vnlru)
root    123  0.0  0.1   220    76  ??   Is   11:27PM    0:00.00 adjkerntz -i
root    194  0.0  0.3   628   180  ??   Is    9:27PM    0:01.16 /sbin/natd -dynamic -log -n tun0
root    241  0.0  1.1  1180   688  ??   Ss    9:27PM    0:01.09 /usr/sbin/syslogd -s -v
root    255  0.0  3.0  2536  1788  ??   Ss    9:27PM    0:04.59 /usr/sbin/named -d 1
root    263  0.0  1.3  1332   768  ??   Is    9:27PM    0:00.06 /usr/sbin/rpcbind
root    340  0.0  1.1  1204   684  ??   Is    9:27PM    0:00.03 /usr/sbin/mountd -r
root    342  0.0  1.2  1164   704  ??   Is    9:27PM    0:00.30 nfsd: master (nfsd)
root    343  0.0  1.0  1116   620  ??   I     9:27PM    0:00.00 nfsd: server (nfsd)
root    344  0.0  1.0  1116   620  ??   I     9:27PM    0:00.00 nfsd: server (nfsd)
root    345  0.0  1.0  1116   620  ??   I     9:27PM    0:00.00 nfsd: server (nfsd)
root    346  0.0  1.0  1116   620  ??   I     9:27PM    0:00.00 nfsd: server (nfsd)
root    376  0.0  1.1  1188   664  ??   Is    9:27PM    0:00.05 /usr/sbin/lpd
root    395  0.0  1.5  1552   920  ??   Ss    9:27PM    0:02.51 /usr/sbin/ntpd -p /var/run/ntpd.pid
root    401  0.0  1.6  1616   972  ??   I     9:27PM    0:00.17 /usr/sbin/ntpd -p /var/run/ntpd.pid
root    417  0.0  0.9  1132   552  ??   Is    9:27PM    0:00.09 /usr/sbin/usbd
root    436  0.0  2.7  3036  1624  ??   Ss    9:27PM    0:02.26 sendmail: accepting connections (sendmail)
smmsp   439  0.0  2.5  3012  1480  ??   Is    9:27PM    0:00.06 sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail)
root    466  0.0  2.7  2332  1652  ??   Ss    9:27PM    0:02.34 /usr/local/sbin/httpd
www     484  0.0  2.8  2356  1676  ??   I     9:27PM    0:00.01 /usr/local/sbin/httpd
www     485  0.0  2.8  2356  1676  ??   I     9:27PM    0:00.01 /usr/local/sbin/httpd
www     486  0.0  2.8  2356  1676  ??   I     9:27PM    0:00.01 /usr/local/sbin/httpd
www     487  0.0  2.8  2356  1676  ??   I     9:27PM    0:00.01 /usr/local/sbin/httpd
www     488  0.0  2.8  2356  1676  ??   I     9:27PM    0:00.01 /usr/local/sbin/httpd
root    489  0.0  2.0  1960  1228  ??   Is    9:27PM    0:00.11 /usr/local/sbin/dhcpd
news    494  0.0  1.1  1372   688  con- I     9:27PM    0:00.03 nntpd: accepting connections: loadav -1 (nntpd)
root    500  0.0  1.9  2672  1168  ??   Is    9:27PM    0:00.01 /usr/local/sbin/smbd -D
root    503  0.0  2.0  2232  1224  ??   Ss    9:27PM    0:01.70 /usr/local/sbin/nmbd -D
root    507  0.0  2.0  2248  1196  ??   I     9:27PM    0:00.02 /usr/local/sbin/nmbd -D
root    509  0.0  1.3  2452   804  ??   Is    9:27PM    0:00.02 /usr/X11R6/bin/xdm
root    522  0.0  1.3  1324   768  ??   Is    9:27PM    0:00.18 /usr/sbin/inetd -wW
root    533  0
Re: Again: panic kmem_malloc()
I've run into this on a couple of boxes, but those boxes were diskless root boxes, and used md backed ffs for /tmp and /var. Apparently if you do that, you're likely to exceed the kernel's auto-tuned kmem map size. That said, they didn't do it as frequently, so perhaps there's been a chance. A glance at the malloc buckets on the machine suggested that this wasn't a memory leak (the normal candidate in this sort of scenario).

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Network Associates Laboratories

On Thu, 10 Oct 2002, Ben Stuyts wrote:

> Hi,
>
> A couple of days ago I reported a panic, which I just got again:
>
> panic: kmem_malloc(4096): kmem_map too small: 28246016 total allocated.
>
> I don't know where to start looking for this, so I'd appreciate some help.
> This is on a lightly loaded server. I've pasted the dmesg below. Latest
> cvsup is oct 5.
>
> Thanks,
> Ben
>
> Copyright (c) 1992-2002 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD 5.0-CURRENT #5: Sun Oct 6 01:50:54 CEST 2002
> [EMAIL PROTECTED]:/var/obj/usr/src/sys/TERMINUS
> Preloaded elf kernel "/boot/kernel/kernel" at 0xc04dc000.
> Timecounter "i8254" frequency 1193182 Hz > Timecounter "TSC" frequency 233864867 Hz > CPU: Pentium II/Pentium II Xeon/Celeron (233.86-MHz 686-class CPU) >Origin = "GenuineIntel" Id = 0x634 Stepping = 4 >Features=0x80f9ff > real memory = 67108864 (65536K bytes) > avail memory = 59920384 (58516K bytes) > Pentium Pro MTRR support enabled > npx0: on motherboard > npx0: INT 16 interface > Using $PIR table, 6 entries at 0xc00fdc00 > pcib0: at pcibus 0 on motherboard > pci0: on pcib0 > pcib1: at device 1.0 on pci0 > pci1: on pcib1 > isab0: at device 7.0 on pci0 > isa0: on isab0 > atapci0: port 0xf000-0xf00f at device 7.1 on > pci0 > atapci0: Busmastering DMA not supported > ata0: at 0x1f0 irq 14 on atapci0 > ata1: at 0x170 irq 15 on atapci0 > uhci0: port 0x6400-0x641f irq 11 > at device 7.2 on pci0 > usb0: on uhci0 > usb0: USB revision 1.0 > uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 > uhub0: 2 ports with 2 removable, self powered > pci0: at device 7.3 (no driver attached) > sym0: <875> port 0x6800-0x68ff mem > 0xe800-0xe8000fff,0xe8001000-0xe80010ff irq 12 at device 11.0 on pci0 > sym0: Symbios NVRAM, ID 7, Fast-20, SE, parity checking > sym0: open drain IRQ line driver, using on-chip SRAM > sym0: using LOAD/STORE-based firmware. 
> xl0: <3Com 3c905-TX Fast Etherlink XL> port 0x6c00-0x6c3f irq 9 at device > 13.0 on pci0 > /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from > /usr/src/sys/pci/if_xl.c:1264 > /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from > /usr/src/sys/pci/if_xl.c:1264 > lock order reversal > 1st 0xc0ba1bd4 xl0 (network driver) @ /usr/src/sys/pci/if_xl.c:1264 > 2nd 0xc03d2b00 allproc (allproc) @ /usr/src/sys/kern/kern_fork.c:318 > /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from > /usr/src/sys/pci/if_xl.c:1264 > /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from > /usr/src/sys/pci/if_xl.c:1264 > /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from > /usr/src/sys/pci/if_xl.c:1264 > xl0: Ethernet address: 00:60:08:a5:d4:ff > miibus0: on xl0 > nsphy0: on miibus0 > nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto > /usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from > /usr/src/sys/pci/if_xl.c:647 > pci0: at device 15.0 (no driver attached) > orm0: at iomem 0xc8000-0xcbfff,0xc-0xc7fff on isa0 > atkbdc0: at port 0x64,0x60 on isa0 > atkbd0: irq 1 on atkbdc0 > kbd0 at atkbd0 > fdc0: at port > 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0 > fdc0: FIFO enabled, 8 bytes threshold > fd0: <1440-KB 3.5" drive> on fdc0 drive 0 > ppc0: at port 0x378-0x37f irq 7 on isa0 > ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode > lpt0: on ppbus0 > lpt0: Interrupt-driven port > sc0: on isa0 > sc0: VGA <16 virtual consoles, flags=0x200> > sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 > sio0: type 16550A > sio1 at port 0x2f8-0x2ff irq 3 on isa0 > sio1: type 16550A > vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0 > unknown: can't assign resources (port) > unknown: can't assign resources (port) > unknown: can't assign resources (port) > unknown: can
Again: panic kmem_malloc()
Hi,

A couple of days ago I reported a panic, which I just got again:

panic: kmem_malloc(4096): kmem_map too small: 28246016 total allocated.

I don't know where to start looking for this, so I'd appreciate some help. This is on a lightly loaded server. I've pasted the dmesg below. Latest cvsup is Oct 5.

Thanks,
Ben

Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #5: Sun Oct 6 01:50:54 CEST 2002
    [EMAIL PROTECTED]:/var/obj/usr/src/sys/TERMINUS
Preloaded elf kernel "/boot/kernel/kernel" at 0xc04dc000.
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 233864867 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (233.86-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x634 Stepping = 4
  Features=0x80f9ff
real memory = 67108864 (65536K bytes)
avail memory = 59920384 (58516K bytes)
Pentium Pro MTRR support enabled
npx0: on motherboard
npx0: INT 16 interface
Using $PIR table, 6 entries at 0xc00fdc00
pcib0: at pcibus 0 on motherboard
pci0: on pcib0
pcib1: at device 1.0 on pci0
pci1: on pcib1
isab0: at device 7.0 on pci0
isa0: on isab0
atapci0: port 0xf000-0xf00f at device 7.1 on pci0
atapci0: Busmastering DMA not supported
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: port 0x6400-0x641f irq 11 at device 7.2 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
pci0: at device 7.3 (no driver attached)
sym0: <875> port 0x6800-0x68ff mem 0xe800-0xe8000fff,0xe8001000-0xe80010ff irq 12 at device 11.0 on pci0
sym0: Symbios NVRAM, ID 7, Fast-20, SE, parity checking
sym0: open drain IRQ line driver, using on-chip SRAM
sym0: using LOAD/STORE-based firmware.
xl0: <3Com 3c905-TX Fast Etherlink XL> port 0x6c00-0x6c3f irq 9 at device 13.0 on pci0
/usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264
/usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264
lock order reversal
 1st 0xc0ba1bd4 xl0 (network driver) @ /usr/src/sys/pci/if_xl.c:1264
 2nd 0xc03d2b00 allproc (allproc) @ /usr/src/sys/kern/kern_fork.c:318
/usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264
/usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264
/usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:1264
xl0: Ethernet address: 00:60:08:a5:d4:ff
miibus0: on xl0
nsphy0: on miibus0
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
/usr/src/sys/vm/uma_core.c:1307: could sleep with "xl0" locked from /usr/src/sys/pci/if_xl.c:647
pci0: at device 15.0 (no driver attached)
orm0: at iomem 0xc8000-0xcbfff,0xc-0xc7fff on isa0
atkbdc0: at port 0x64,0x60 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
fdc0: at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
lpt0: on ppbus0
lpt0: Interrupt-driven port
sc0: on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown: can't assign resources (port)
unknown: can't assign resources (port)
unknown: can't assign resources (port)
unknown: can't assign resources (port)
unknown: can't assign resources (port)
Timecounters tick every 10.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, logging unlimited
Waiting 5 seconds for SCSI devices to settle
(noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
Mounting root from ufs:/dev/da0s1a
da2 at sym0 bus 0 target 3 lun 0
da2: Fixed Direct Access SCSI-2 device
da2: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled
da2: 3067MB (6281856 512 byte sectors: 255H 63S/T 391C)
da1 at sym0 bus 0 target 2 lun 0
da1: Fixed Direct Access SCSI-2 device
da1: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled
da1: 3090MB (6328861 512 byte sectors: 255H 63S/T 393C)
da0 at sym0 bus 0 target 1 lun 0
da0: Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers (20.000MHz, offset 15, 16bit)
da0: 4134MB (8467200 512 byte sectors: 255H 63S/T 527C)
WARNING: / was not properly dismounted
cd0 at sym0 bus 0 target 4 lun 0
cd0: Removable CD-ROM SCSI-2 device
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device s