Re: [Ugly PATCH] Again: panic kmem_malloc()
M. Warner Losh wrote:
> : > : + if (sops)
> : > : +         free(sops, M_SEM);
> : >
> : > The kernel free() groks free(NULL, M_FOO), so the if isn't needed.
> :
> : Wow. That's bogus. It should panic.
>
> It isn't bogus. free(NULL) is defined to be OK in ansi-c. The kernel
> just mirrors that.

The free(NULL) in ANSI C is to permit invocation of the garbage
collector; there are very specific semantics involved. Specifically,
if you do not call free(NULL), you are *guaranteed* that a malloc()
followed by a free() followed by a subsequent malloc(), if the size
of the area allocated by the subsequent malloc() is less than or
equal to the size of the area freed, *will not fail*.

Of course, FreeBSD is a memory overcommit system, and fails to
maintain this guarantee, as required by the standard (e.g. only do
garbage collection when it is signalled that it is OK for a subsequent
re-malloc() to fail, because the GC'ed memory has been released to
the system).

This is OK; we all realize that the standard, which permits a NULL
argument to free(), allows this value for reasons of compatibility
with historical source code.

But that begs the question: does the kernel interface also allow it
for the purposes of compatibility with legacy code? This seems
unlikely in the extreme.

Does the kernel interface use this as a trigger, as the user space
interface historically did, to perform garbage collection? This also
seems unlikely.

Does it do it so that people can write code that doesn't check return
values, and get away with it when they shouldn't? This seems highly
likely.

> : Or we should fix all of libc to take NULL arguments for strings,
> : and treat them as if they were actually "". That's bogus.
>
> I agree that it's bogus, but it's the same argument in user space as
> in kernel space.

Actually, it's not the same: the kernel argument is much poorer, not
having legacy code it needs to support.

In user space, there is plenty of legacy code that acts this way; in
fact, one could trap a zero dereference (one does; one just faults on
it, currently), map a page full of zeros at page zero, and then a
dereference would in fact be giving a pointer to a NULL string.

SVR4 does this, as a kernel option for compatibility with legacy
software. It is tunable to be able to turn it off, and you can not
only run legacy software which will not run in FreeBSD ABI
compatibility (hardly compatible, that...), but you can know from the
memory map of the process, as examined through /proc, that a NULL
dereference has occurred.

So it should arguably be controllable via sysctl, minimally for IBCS2
and similar ABI modules, for user space.

But it's still unjustified in kernel space. And panic'ing on attempts
to free NULL pointers would be a nice way of avoiding cascade failures
later on, and keep the problem from being hidden a long way away from
its effect.

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
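The fail-early argument in the message above can be illustrated with a small user-space sketch. It is only an analogue under stated assumptions: strict_free() and STRICT_FREE() are hypothetical names, not FreeBSD interfaces, and abort() stands in for a kernel panic().

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a FreeBSD interface): free a pointer, but
 * treat a NULL argument as a caller bug and stop right where the
 * mistake is made, instead of letting it surface somewhere else later.
 */
static void
strict_free(void *ptr, const char *file, int line)
{
	if (ptr == NULL) {
		fprintf(stderr, "strict_free(NULL) at %s:%d\n", file, line);
		abort();	/* user-space stand-in for panic() */
	}
	free(ptr);
}

/* Record the call site so the report points at the offending caller. */
#define STRICT_FREE(ptr)	strict_free((ptr), __FILE__, __LINE__)
```

A caller would write STRICT_FREE(buf) wherever it believes buf must be valid; the trade-off is that every legitimately optional buffer then needs an explicit NULL check at the call site.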
Re: [Ugly PATCH] Again: panic kmem_malloc()
In message: [EMAIL PROTECTED]
            Terry Lambert [EMAIL PROTECTED] writes:
: M. Warner Losh wrote:
: > : > : + if (sops)
: > : > : +         free(sops, M_SEM);
: > : >
: > : > The kernel free() groks free(NULL, M_FOO), so the if isn't needed.
: > :
: > : Wow. That's bogus. It should panic.
: >
: > It isn't bogus. free(NULL) is defined to be OK in ansi-c. The kernel
: > just mirrors that.
:
: The free(NULL) in ANSI C is to permit invocation of the garbage
: collector; there are very specific semantics involved. Specifically,
: if you do not call free(NULL), you are *guaranteed* that a malloc()
: followed by a free() followed by a subsequent malloc(), if the size
: of the area allocated by the subsequent malloc() is less than or
: equal to the size of the area freed, *will not fail*.

C99 just says:

section 7.20.3.2:

    [#2] The free function causes the space pointed to by ptr to
    be deallocated, that is, made available for further
    allocation. If ptr is a null pointer, no action occurs.
    Otherwise, if the argument does not match a pointer earlier
    returned by the calloc, malloc, or realloc function, or if
    the space has been deallocated by a call to free or realloc,
    the behavior is undefined.

"If ptr is a null pointer, no action occurs" doesn't sound like GC to
me. Like I said, free(NULL) is well defined, and unambiguous, in ansi
c 99. In fact, I see nothing in the final c99 spec that even comes
close to what you are talking about. Maybe c89 did that (I'm too lazy
to walk down stairs and find it), but it too is irrelevant as the base
system moves towards c99 compliance.

The rest is bogus too. free(NULL, M_FOO) is well defined in the kernel
and does the right thing and likely isn't going to change.

Warner
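As a minimal user-space sketch of the C99 wording quoted above: because a null argument means "no action occurs", a cleanup helper can be called more than once without a guard. The helper names release_string() and copy_string() are hypothetical and exist only for this illustration; this shows the standard library behaviour, not the kernel's free(ptr, type) interface.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: release a heap string and clear the caller's
 * pointer.  A second call reduces to free(NULL), for which C99 says
 * "no action occurs", so no "if (*sp)" guard is required.
 */
static void
release_string(char **sp)
{
	assert(sp != NULL);
	free(*sp);
	*sp = NULL;
}

/* Hypothetical helper: duplicate a string with malloc(); NULL on failure. */
static char *
copy_string(const char *src)
{
	char *dst = malloc(strlen(src) + 1);

	if (dst != NULL)
		strcpy(dst, src);
	return (dst);
}
```

Clearing the pointer after the first release is what makes the repeat call a free(NULL) rather than a double free, which the same paragraph of the standard leaves undefined.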
Re: [Ugly PATCH] Again: panic kmem_malloc()
At 04:15 19/10/2002, Alfred Perlstein wrote:
> * Jake Burkholder [EMAIL PROTECTED] [021018 18:26] wrote:
> > semop() leaks memory. An important free() was removed by alfred in
> > rev 1.55. Try this.
>
> Oh c'mon, isn't MP-safeness a bit more important than some little
> memory leak? RAM is cheap! Processors aren't!
>
> Seriously, I just checked in a slightly different fix (based on jake's
> sleuthing)... please let me know if it works for you guys.

Thanks Alfred, I am building the kernel right now with your fix. Later
today I will let you know if this fixes the problems I've been seeing.

Kind regards,
Ben
Re: [Ugly PATCH] Again: panic kmem_malloc(): SOLVED
At 13:34 19/10/2002, Ben Stuyts wrote:
> At 04:15 19/10/2002, Alfred Perlstein wrote:
> > * Jake Burkholder [EMAIL PROTECTED] [021018 18:26] wrote:
> > > semop() leaks memory. An important free() was removed by alfred in
> > > rev 1.55. Try this.
> >
> > Seriously, I just checked in a slightly different fix (based on jake's
> > sleuthing)... please let me know if it works for you guys.
>
> Thanks Alfred, I am building the kernel right now with your fix. Later
> today I will let you know if this fixes the problems I've been seeing.

Looks good, the machine has been running for a couple of hours now, and
vmstat -m says:

sem 4 7K 8K 3556 16,1024,4096

Before it used to be 2 - 5 MB allocated, and qpopper/smbd no longer eat
memory each time they are invoked.

Many thanks!
Ben
Re: [Ugly PATCH] Again: panic kmem_malloc(): SOLVED
* Ben Stuyts [EMAIL PROTECTED] [021019 07:16] wrote:
> At 13:34 19/10/2002, Ben Stuyts wrote:
> > At 04:15 19/10/2002, Alfred Perlstein wrote:
> > > * Jake Burkholder [EMAIL PROTECTED] [021018 18:26] wrote:
> > > > semop() leaks memory. An important free() was removed by alfred in
> > > > rev 1.55. Try this.
> > >
> > > Seriously, I just checked in a slightly different fix (based on jake's
> > > sleuthing)... please let me know if it works for you guys.
> >
> > Thanks Alfred, I am building the kernel right now with your fix. Later
> > today I will let you know if this fixes the problems I've been seeing.
>
> Looks good, the machine has been running for a couple of hours now, and
> vmstat -m says:
>
> sem 4 7K 8K 3556 16,1024,4096
>
> Before it used to be 2 - 5 MB allocated, and qpopper/smbd no longer eat
> memory each time they are invoked.
>
> Many thanks!

Great! Thanks for the bug report and my apologies for jumping down your
throat initially.

Best of luck to you.

-Alfred
Re: [Ugly PATCH] Again: panic kmem_malloc()
This is a repost. Forgive me if you see it twice, but it didn't turn up in the -current list. Hi, Just had another panic, same kmem_malloc(). I did a trace but forgot to write the traceback down. In any case, there was a semop() call in the traceback. Furthermore, this might be interesting: the last vmstat -m log before the panic. Maybe someone can check if these values are reasonable? The system has 64 MB memory and has been up for about 24 hrs with almost no load. [terminus.stuyts.nl ben/bin]4: cat vmstatlog.4 Type InUse MemUse HighUse Requests Size(s) atkbddev 2 1K 1K2 32 pfs_fileno 132K 32K1 32768 nexusdev 2 1K 1K2 16 memdesc 1 4K 4K1 4096 legacydrv 3 1K 1K3 16 VM pgdata 1 4K 4K1 4096 pfs_nodes20 3K 3K 20 128 MSDOSFS mount 1 8K 8K1 8192 UFS mount1223K 39K 14 256,2048,4096,16384 UFS ihash 116K 16K1 16384 UFS dirhash 18936K 52K 1695 16,32,64,128,256,512 FFS node 11086 2079K 2096K 1000908 128,256 newdirblk 0 0K 1K5 16 dirrem 5 1K 34K18098 32 mkdir 0 0K 3K 524 32 diradd 0 0K 9K18045 32 freefile 5 1K 31K12022 32 freeblks 7 2K247K10187 256 freefrag 2 1K 2K36405 32 allocindir 4 1K204K 257072 64 indirdep 2 1K876K 2172 32,8192 allocdirect 1 1K 33K51562 128 bmsafemap 8 1K 3K 6606 32 newblk 1 1K 1K 308635 64,256 inodedep1218K168K31012 128,16384 pagedep10 3K 7K 6114 64,2048 p1003.1b 1 1K 1K1 16 NFS daemon 5 3K 3K5 256,512 NFS srvsock 2 1K 1K2 128 ip6_moptions 1 1K 1K1 16 in6_multi10 1K 1K 10 16,64 syncache 1 8K 8K1 8192 IpFw/IpAcct30 4K 4K 30 64,128 in_multi 2 1K 1K2 32 routetbl41 6K 6K 78 16,32,64,128,256 lo 1 1K 1K1 512 clone 312K 12K3 4096 ether_multi35 2K 2K 35 16,32,64 ifaddr22 7K 7K 22 32,256,512,2048 BPF 6 9K 9K6 128,256,4096 mount20 4K 4K 24 16,32,128,512 vnodes23 6K 6K 137 16,32,64,128,256 cluster_save buffer 0 0K 1K 9793 32,64 vfscache 5226 381K436K 534189 64,128,256,32768 BIO buffer 810K317K 4611 512,1024,2048 DEVFS 12122K 22K 121 16,32,128,8192 pcb38 5K 6K 1913 16,32,64,2048 soname 4 1K 1K39624 16,32,128 ptys 2 1K 1K2 512 ttys 48865K 85K 6431 128,512 shm 318K 
19K9 16,1024,16384 sem344456 5390K 5390K 344456 16,1024,4096 msg 425K 25K4 512,4096,16384 ioctlops 0 0K 1K 22 512,1024 USBdev 1 1K 2K4 128,512 USB1521K 22K 701353 16,32,128,256,4096 taskqueue 1 1K 1K1 128 sbuf 0 0K 5K 34 32,64,4096 rman99 7K 7K 496 16,64,128 mbufmgr 11616K 16K 116 32,64,128,2048,8192 kobj 127 508K508K 127 4096 eventhandler22 2K 2K 22 32,128 bus 47039K 40K 1363 16,32,64,128,256,512,2048,4096,8192 SWAP 273K 73K2 64 sysctltmp 0 0K 4K 802428 16,32,64,128,256,512,1024,4096 sysctl 0 0K 1K19359 16,32,64 uidinfo 7 1K 1K 7121 32,128 cred38 5K 9K 142297 128 subproc 10511K 15K52100 64,256 proc 2 1K 1K2 512 session33 5K 6K 2102 128 pgrp39 5K 6K 2278 128 module 17111K 11K 171 64 ip6ndp 3 1K 1K4 64,128,512 temp 954K286K 156410 16,32,64,128,256,512,1024,2048,4096,8192,16384,32768 devbuf 474 965K997K 2333 16,32,64,128,256,512,1024,2048,4096,8192,32768 lockf 6 1K 1K29935 64 feeder48 1K 1K 48 16 linker6513K 18K 85 16,32,256,1024,4096,8192 KTRACE 10013K 13K 100 128 ithread40 7K 7K
Re: [Ugly PATCH] Again: panic kmem_malloc()
Terry, At 23:07 18/10/2002, you wrote: Ben Stuyts wrote: Furthermore, this might be interesting: the last vmstat -m log before the panic. Maybe someone can check if these values are reasonable? The system has 64 MB memory and has been up for about 24 hrs with almost no load. sem344456 5390K 5390K 344456 16,1024,4096 Almost 5.3M of unswappable physical memory dedicated to semaphores seems like a bit much. Yes, and it increases continuously, for example when I fetch new mail (over pop) from my windows pc. The pc stores this again on a network drive, so both qpopper and smbd are involved. For example, vmstat -m says: vmstat -m | grep sem sem155886 2443K 2443K 155886 16,1024,4096 Now when I do a fetch-mail with Eudora on my pc, the same command says. vmstat -m | grep sem sem156178 2448K 2448K 156178 16,1024,4096 I can repeat this at will, and each time I loose 4-5 KB. qpopper is started from inetd, and smbd runs as a daemon. I tried stopping smbd: [terminus.stuyts.nl etc/rc.d]90: sudo /usr/local/etc/rc.d/samba.sh stop [terminus.stuyts.nl etc/rc.d]91: !vm vmstat -m | grep sem sem156524 2453K 2453K 156524 16,1024,4096 It doesn't free the sem allocated memory. But without knowing what software you are running, it's hard to say if the number is unreasonable, or not. Well, it is really a lightly loaded server, just serving one windows pc here at home. Here is a ps, and the only thing that's missing from it is the occasional pop session. Also note that this system is not connected to the internet, so the http that's running is mostly for my own pleasure (and proxy/cache). I do run ppp and uucp every now and then. USERPID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND dnetc 503 94.2 0.8 960 460 ?? RNs Thu09PM 1529:56.87 /usr/local/distributed.net/dnetc -quiet root 10 0.0 0.0 0 12 ?? DL1Jan70 0:00.00 (ktrace) root 1 0.0 0.3 700 184 ?? ILs 1Jan70 0:01.26 /sbin/init -- root 11 0.0 0.0 0 12 ?? RL1Jan70 1:24.27 (idle) root 12 0.0 0.0 0 12 ?? 
WL1Jan70 1:01.85 (swi1: net) root 13 0.0 0.0 0 12 ?? WL1Jan70 7:49.87 (swi6: tty:sio clock) root 15 0.0 0.0 0 12 ?? DL1Jan70 0:17.51 (random) root 18 0.0 0.0 0 12 ?? WL1Jan70 0:35.60 (swi3: cambio) root 23 0.0 0.0 0 12 ?? DL1Jan70 0:33.97 (usb0) root 24 0.0 0.0 0 12 ?? DL1Jan70 0:00.00 (usbtask) root 25 0.0 0.0 0 12 ?? WL1Jan70 0:15.98 (irq12: sym0) root 26 0.0 0.0 0 12 ?? WL1Jan70 0:33.34 (irq9: xl0) root 27 0.0 0.0 0 12 ?? WL1Jan70 0:00.04 (irq1: atkbd0) root 28 0.0 0.0 0 12 ?? WL1Jan70 0:00.00 (irq6: fdc0) root 30 0.0 0.0 0 12 ?? WL1Jan70 0:00.25 (swi0: tty:sio) root 2 0.0 0.0 0 12 ?? DL1Jan70 0:51.73 (pagedaemon) root 3 0.0 0.0 0 12 ?? DL1Jan70 0:00.42 (vmdaemon) root 4 0.0 0.0 0 12 ?? RL1Jan70 0:01.95 (pagezero) root 5 0.0 0.0 0 12 ?? DL1Jan70 0:05.29 (bufdaemon) root 6 0.0 0.0 0 12 ?? DL1Jan70 1:26.74 (syncer) root 7 0.0 0.0 0 12 ?? DL1Jan70 0:04.12 (vnlru) root123 0.0 0.0 2208 ?? IWs - 0:00.00 adjkerntz -i root194 0.0 0.4 628 244 ?? Is Thu09PM 0:09.18 /sbin/natd -dynamic -log -n tun0 root241 0.0 0.7 1180 420 ?? Ss Thu09PM 0:04.76 /usr/sbin/syslogd -s -v root255 0.0 2.6 2856 1580 ?? Is Thu09PM 0:23.02 /usr/sbin/named -d 1 root263 0.0 0.0 1332 12 ?? Is Thu09PM 0:00.06 /usr/sbin/rpcbind root340 0.0 0.0 1204 12 ?? Is Thu09PM 0:00.03 /usr/sbin/mountd -r root342 0.0 0.0 1164 12 ?? Is Thu09PM 0:00.30 nfsd: master (nfsd) root343 0.0 0.0 11168 ?? IW - 0:00.00 nfsd: server (nfsd) root344 0.0 0.0 11168 ?? IW - 0:00.00 nfsd: server (nfsd) root345 0.0 0.0 11168 ?? IW - 0:00.00 nfsd: server (nfsd) root347 0.0 0.0 11168 ?? IW - 0:00.00 nfsd: server (nfsd) root376 0.0 0.0 1188 12 ?? Is Thu09PM 0:00.05 /usr/sbin/lpd root380 0.0 0.3 1188 168 ?? SThu09PM 0:02.57 /usr/sbin/lpd root396 0.0 1.3 1552 804 ?? Ss Thu09PM 0:26.59 /usr/sbin/ntpd -p /var/run/ntpd.pid root418 0.0 0.1 1132 64 ?? Is Thu09PM 0:00.97 /usr/sbin/usbd root437 0.0 1.4 3036 820 ?? Ss Thu09PM 0:19.39 sendmail: accepting connections (sendmail) smmsp 440 0.0 0.9 3012 528 ?? 
Is Thu09PM 0:00.38 sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail) root467 0.0 1.5 2332 908 ?? Ss Thu09PM 0:25.86 /usr/local/sbin/httpd www 485 0.0 0.0 2356 12 ?? IThu09PM 0:00.01
Re: [Ugly PATCH] Again: panic kmem_malloc()
Ben Stuyts wrote:
> > Almost 5.3M of unswappable physical memory dedicated to semaphores
> > seems like a bit much.
>
> Yes, and it increases continuously, for example when I fetch new mail
> (over pop) from my windows pc. The pc stores this again on a network
> drive, so both qpopper and smbd are involved. For example, vmstat -m says:
>
> vmstat -m | grep sem
> sem 155886 2443K 2443K 155886 16,1024,4096
>
> Now when I do a fetch-mail with Eudora on my pc, the same command says.
>
> vmstat -m | grep sem
> sem 156178 2448K 2448K 156178 16,1024,4096
>
> I can repeat this at will, and each time I lose 4-5 KB. qpopper is
> started from inetd, and smbd runs as a daemon. I tried stopping smbd:

None of us have been able to repeat your problem, up to now.

I suppose now that we know you are running qpopper on -current, we
could repeat the problem, but, frankly, you already have a test
environment set up, and it would be a lot of work for us to duplicate
it, and even so, we won't know for sure if we could repeat the problem.

Have you checked out your source tree with a date tag, so that it's
possible for everyone else to check out and get the same source files?
Line number references in tracebacks are pretty useless, if the lines
don't match.

Unless you can identify the exact number of bytes being consumed, and
then identify a kernel structure used in the semaphore code that is
equal to that size, or for which that size is a least common multiple,
and there are a number of events equal to the size of the divisor, then
that's no good.

This is why everyone keeps asking you to run the kernel debugger, so
that you can tell us exactly the code that's failing, and why, and why
a stack backtrace more detailed than "it contained a call to sem" is
important.

This problem is evidently a memory leak in the semaphore code; but that
does not mean that the crash that results will be in any way related to
where the leak occurs. In other words, the crash is a secondary effect.
Only by fully understanding the crash will anyone be able to help you
with the root cause.

I understand that it's frustrating to go step by step, when you think
you have isolated the problem to a smaller area, but the information
you gather from outside that area will tell you about the inside much
more clearly than staring at the outside of a black box where we know
the problem lives.

The only alternative to rewriting the black box from scratch, or
grovelling through it with a line-by-line code review (I'm not
interested in doing that; perhaps you could interest the author of the
changes that resulted in the problem) is to find a smoking gun, and
work from that, instead.

If this problem is in the way of you getting work done (one wonders why
you are using -current, if you need to get work done), then my best
suggestion to you is to back out the changes Alfred made, one by one,
and when it stops having the problem, you will have identified a very
small patch that causes the problem.

> > But without knowing what software you are running, it's hard to say
> > if the number is unreasonable, or not.
>
> Well, it is really a lightly loaded server, just serving one windows pc
> here at home. Here is a ps, and the only thing that's missing from it is
> the occasional pop session. Also note that this system is not connected
> to the internet, so the http that's running is mostly for my own pleasure
> (and proxy/cache). I do run ppp and uucp every now and then.

Perhaps I wasn't clear. Not knowing what calls your software makes that
cause the problem to occur, it is not possible for us to create a
cut-down test case in less than 30 lines of C source code, so that we
can repeat the problem at will, without secondary effects.

As it is, you only *suppose* that the qpopper usage alone is sufficient
to cause the problem; even if you are correct, that's insufficient to
identify where the problem is... it may not even really be in the
semaphore source code at all... maybe it's in kevent code, for unfreed
events, etc.

I think you need to go back one email:

| > Just had another panic, same kmem_malloc(). I did a trace but forgot to
| > write the traceback down.
|
| Wait until the next one, and remember to write it down; preferably,
| obtain a system dump image, so you can examine it with the debugger,
| and make sure that the kernel you are running has a debuggable
| counterpart already there (i.e. you used config -g to create the
| kernel you are running).

-- Terry
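The size arithmetic asked for above can be sketched from the two vmstat -m samples quoted in this message. This is only a rough estimate under stated assumptions: the struct and function names are hypothetical, the two numeric columns are read as InUse and MemUse, and MemUse is rounded to whole kilobytes, so small deltas are imprecise.

```c
#include <assert.h>

/* One "sem" row from vmstat -m: live allocations and memory use in KB. */
struct sem_sample {
	long in_use;
	long mem_kb;
};

/* Average bytes held per live allocation (integer division). */
static long
avg_bytes_per_alloc(struct sem_sample s)
{
	assert(s.in_use > 0);
	return (s.mem_kb * 1024 / s.in_use);
}

/* Allocations gained between two samples. */
static long
allocs_gained(struct sem_sample before, struct sem_sample after)
{
	return (after.in_use - before.in_use);
}

/* Kilobytes gained between two samples. */
static long
kb_gained(struct sem_sample before, struct sem_sample after)
{
	return (after.mem_kb - before.mem_kb);
}
```

With the samples {155886, 2443} and {156178, 2448}, one mail fetch adds 292 allocations and 5 KB, and the average works out to 16 bytes per live allocation, which lines up with the smallest entry, 16, in the Size(s) column. That narrows the search to small allocations in the semaphore code, but it does not by itself identify the leaking call.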
Re: [Ugly PATCH] Again: panic kmem_malloc()
Apparently, On Sat, Oct 19, 2002 at 12:19:57AM +0200,
        Ben Stuyts said words to the effect of;

> Terry,
>
> At 23:07 18/10/2002, you wrote:
> > Ben Stuyts wrote:
> > > Furthermore, this might be interesting: the last vmstat -m log before
> > > the panic. Maybe someone can check if these values are reasonable? The
> > > system has 64 MB memory and has been up for about 24 hrs with almost
> > > no load.
> > >
> > > sem 344456 5390K 5390K 344456 16,1024,4096
> >
> > Almost 5.3M of unswappable physical memory dedicated to semaphores
> > seems like a bit much.
>
> Yes, and it increases continuously, for example when I fetch new mail
> (over pop) from my windows pc. The pc stores this again on a network
> drive, so both qpopper and smbd are involved. For example, vmstat -m says:

semop() leaks memory. An important free() was removed by alfred in
rev 1.55. Try this.

Jake

Index: sysv_sem.c
===
RCS file: /home/ncvs/src/sys/kern/sysv_sem.c,v
retrieving revision 1.55
diff -u -r1.55 sysv_sem.c
--- sysv_sem.c 13 Aug 2002 08:47:17 - 1.55
+++ sysv_sem.c 19 Oct 2002 01:20:35 -
@@ -1128,6 +1128,8 @@
  td->td_retval[0] = 0;
 done2:
  mtx_unlock(sema_mtxp);
+ if (sops)
+         free(sops, M_SEM);
  return (error);
 }
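The shape of the leak and of the fix can be sketched in user space. This is not the real semop(): do_ops(), struct op, tracked_alloc(), tracked_free() and the outstanding counter are hypothetical names for illustration, and the counter merely plays the role of the InUse column in vmstat -m.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the operation array that gets copied in. */
struct op {
	unsigned short num;
	short val;
	short flg;
};

static long outstanding;	/* live allocations, like InUse in vmstat -m */

static void *
tracked_alloc(size_t size)
{
	void *p = malloc(size);

	if (p != NULL)
		outstanding++;
	return (p);
}

static void
tracked_free(void *p)
{
	if (p == NULL)
		return;		/* mirror free(NULL): nothing to do */
	assert(outstanding > 0);
	outstanding--;
	free(p);
}

/*
 * Copy the caller's operations and "process" them.  Every path leaves
 * through done2; with free_on_exit == 0 the copy is never released,
 * which models the leak described in the message above.
 */
static int
do_ops(const struct op *uops, size_t nops, int free_on_exit)
{
	struct op *sops = NULL;
	int error = 0;

	if (uops == NULL || nops == 0) {
		error = -1;
		goto done2;	/* sops is still NULL here */
	}
	sops = tracked_alloc(nops * sizeof(*sops));
	if (sops == NULL) {
		error = -1;
		goto done2;
	}
	memcpy(sops, uops, nops * sizeof(*sops));
	/* ... the real work on sops would happen here ... */
done2:
	if (free_on_exit)
		tracked_free(sops);
	return (error);
}
```

Calling do_ops() repeatedly with free_on_exit set to 0 makes outstanding climb by one per successful call, the same one-way growth the sem row showed; with it set to 1 the counter returns to its starting value after every call, including the early-error path where sops is still NULL.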
Re: [Ugly PATCH] Again: panic kmem_malloc()
* Jake Burkholder [EMAIL PROTECTED] [021018 18:26] wrote:
> semop() leaks memory. An important free() was removed by alfred in
> rev 1.55. Try this.

Oh c'mon, isn't MP-safeness a bit more important than some little
memory leak? RAM is cheap! Processors aren't!

Seriously, I just checked in a slightly different fix (based on jake's
sleuthing)... please let me know if it works for you guys.

thanks,
-Alfred
Re: [Ugly PATCH] Again: panic kmem_malloc()
Hello Alfred,

On Wed, Oct 16, 2002 at 02:26:19PM -0700, Alfred Perlstein wrote:
> * Ben Stuyts [EMAIL PROTECTED] [021016 14:05] wrote:
> > No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:
> >
> > sem 167344 2622K 2622K 167344 16,1024,4096
> > ---
> > sem 235512 3687K 3687K 235512 16,1024,4096
> >
> > So it looks indeed like sem is the problem,
>
> what does sysctl -a | grep ^p10 say?

p1003_1b.asynchronous_io: 0
p1003_1b.mapped_files: 1
p1003_1b.memlock: 0
p1003_1b.memlock_range: 0
p1003_1b.memory_protection: 0
p1003_1b.message_passing: 0
p1003_1b.prioritized_io: 0
p1003_1b.priority_scheduling: 1
p1003_1b.realtime_signals: 0
p1003_1b.semaphores: 0
p1003_1b.fsync: 0
p1003_1b.shared_memory_objects: 1
p1003_1b.synchronized_io: 0
p1003_1b.timers: 0
p1003_1b.aio_listio_max: 0
p1003_1b.aio_max: 0
p1003_1b.aio_prio_delta_max: 0
p1003_1b.delaytimer_max: 0
p1003_1b.mq_open_max: 0
p1003_1b.pagesize: 4096
p1003_1b.rtsig_max: 0
p1003_1b.sem_nsems_max: 0
p1003_1b.sem_value_max: 0
p1003_1b.sigqueue_max: 0
p1003_1b.timer_max: 0

> My guess is that you don't have the module in question loaded. If you
> do, then why? (it's marked experimental)

The only modules loaded are:

[terminus.stuyts.nl boot/kernel]21: kldstat
Id Refs Address    Size     Name
 1    3 0xc010     3daa00   kernel
 2    1 0xc12fd000 2000     green_saver.ko

> And why aren't these bug reports a lot more detailed? (meaning why
> aren't you actually giving a hypothesis as to why the code is broken?)

I think it was Jeff Roberson hinting at that. I am only reporting a
problem and I hope I can help fix it. I have however no knowledge of
the kernel internals, so forgive me for being too vague and let me know
what more information you need.

> *grumble*

Sorry...

Kind regards,
Ben
Re: [Ugly PATCH] Again: panic kmem_malloc(): dmesg and kernel config
Some info I did not include in the previous messages: dmesg output and kernel config. [terminus.stuyts.nl boot/kernel]26: dmesg Copyright (c) 1992-2002 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 5.0-CURRENT #5: Sun Oct 6 01:50:54 CEST 2002 [EMAIL PROTECTED]:/var/obj/usr/src/sys/TERMINUS Preloaded elf kernel /boot/kernel/kernel at 0xc04dc000. Timecounter i8254 frequency 1193182 Hz Timecounter TSC frequency 233864671 Hz CPU: Pentium II/Pentium II Xeon/Celeron (233.86-MHz 686-class CPU) Origin = GenuineIntel Id = 0x634 Stepping = 4 Features=0x80f9ffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,MMX real memory = 67108864 (65536K bytes) avail memory = 59920384 (58516K bytes) Pentium Pro MTRR support enabled npx0: math processor on motherboard npx0: INT 16 interface Using $PIR table, 6 entries at 0xc00fdc00 pcib0: Intel 82443LX (440 LX) host to PCI bridge at pcibus 0 on motherboard pci0: PCI bus on pcib0 pcib1: PCIBIOS PCI-PCI bridge at device 1.0 on pci0 pci1: PCI bus on pcib1 isab0: PCI-ISA bridge at device 7.0 on pci0 isa0: ISA bus on isab0 atapci0: Intel PIIX4 ATA33 controller port 0xf000-0xf00f at device 7.1 on pci0 atapci0: Busmastering DMA not supported ata0: at 0x1f0 irq 14 on atapci0 ata1: at 0x170 irq 15 on atapci0 uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0x6400-0x641f irq 11 at device 7.2 on pci0 usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0 usb0: USB revision 1.0 uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered pci0: bridge, PCI-unknown at device 7.3 (no driver attached) sym0: 875 port 0x6800-0x68ff mem 0xe800-0xe8000fff,0xe8001000-0xe80010ff irq 12 at device 11.0 on pci0 sym0: Symbios NVRAM, ID 7, Fast-20, SE, parity checking sym0: open drain IRQ line driver, using on-chip SRAM sym0: using LOAD/STORE-based firmware. 
xl0: 3Com 3c905-TX Fast Etherlink XL port 0x6c00-0x6c3f irq 9 at device 13.0 on pci0 /usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from /usr/src/sys/pci/if_xl.c:1264 /usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from /usr/src/sys/pci/if_xl.c:1264 lock order reversal 1st 0xc0ba1bd4 xl0 (network driver) @ /usr/src/sys/pci/if_xl.c:1264 2nd 0xc03d2b00 allproc (allproc) @ /usr/src/sys/kern/kern_fork.c:318 /usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from /usr/src/sys/pci/if_xl.c:1264 /usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from /usr/src/sys/pci/if_xl.c:1264 /usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from /usr/src/sys/pci/if_xl.c:1264 xl0: Ethernet address: 00:60:08:a5:d4:ff miibus0: MII bus on xl0 nsphy0: DP83840 10/100 media interface on miibus0 nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto /usr/src/sys/vm/uma_core.c:1307: could sleep with xl0 locked from /usr/src/sys/pci/if_xl.c:647 pci0: display, VGA at device 15.0 (no driver attached) orm0: Option ROMs at iomem 0xc8000-0xcbfff,0xc-0xc7fff on isa0 atkbdc0: Keyboard controller (i8042) at port 0x64,0x60 on isa0 atkbd0: AT Keyboard irq 1 on atkbdc0 kbd0 at atkbd0 fdc0: enhanced floppy controller (i82077, NE72065 or clone) at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0 fdc0: FIFO enabled, 8 bytes threshold fd0: 1440-KB 3.5 drive on fdc0 drive 0 ppc0: Parallel port at port 0x378-0x37f irq 7 on isa0 ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode ppbus0: IEEE1284 device found /NIBBLE Probing for PnP devices on ppbus0: ppbus0: EPSON Stylus Photo EX PRINTER ESCPL2,BDC lpt0: Printer on ppbus0 lpt0: Interrupt-driven port sc0: System console on isa0 sc0: VGA 16 virtual consoles, flags=0x200 sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 16550A sio1 at port 0x2f8-0x2ff irq 3 on isa0 sio1: type 16550A vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0 unknown: PNP0303 can't assign resources (port) 
unknown: PNP0501 can't assign resources (port) unknown: PNP0700 can't assign resources (port) unknown: PNP0400 can't assign resources (port) unknown: PNP0501 can't assign resources (port) Timecounters tick every 10.000 msec ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, logging unlimited Waiting 5 seconds for SCSI devices to settle (noperiph:sym0:0:-1:-1): SCSI BUS reset delivered. Mounting root from ufs:/dev/da0s1a da2 at sym0 bus 0 target 3 lun 0 da2: QUANTUM FIREBALL_TM3200S 300N Fixed Direct Access SCSI-2 device da2: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled da2: 3067MB (6281856 512 byte sectors: 255H 63S/T 391C) da1 at sym0 bus 0 target 2 lun 0 da1: QUANTUM FIREBALL ST3.2S 0F0C Fixed Direct Access SCSI-2 device da1: 20.000MB/s transfers (20.000MHz, offset 15), Tagged Queueing Enabled da1: 3090MB (6328861 512 byte sectors: 255H 63S/T 393C) da0 at sym0 bus 0 target 1 lun 0 da0: IBM DCAS-34330W S61A
Re: [Ugly PATCH] Again: panic kmem_malloc()
On Wed, 16 Oct 2002, Ben Stuyts wrote:
> I just got the same panic without your patch. (I wanted to verify that
> it was still panic-ing with the latest src tree.) I am now building a
> kernel with your patch.
>
> I'll also run your vmstat script that you posted in a similar thread.
> One of the big memory users seems to be sem, and it's growing. Almost
> every time I do a vmstat -m, sem usage has grown a few k.
>
[snip]
> sem 167320 2622K 2622K 167320 16,1024,4096
[snip]

Thank you for looking into this. It definitely looks like a memory
leak. I forwarded this to alfred. He was just working on semaphores so
he may know something about it.

> I'll see what the stats are tomorrow.
>
> Kind regards,
> Ben

Much appreciated.

Cheers,
Jeff
Re: [Ugly PATCH] Again: panic kmem_malloc()
At 21:20 11/10/2002, Terry Lambert wrote: Please find a (relatively bogus) patch attached, which could cause things to block for a long time, but will avoid the panic. Terry, I just got the same panic without your patch. (I wanted to verify that it was still panic-ing with the latest src tree.) I am now building a kernel with your patch. I'll also run your vmstat script that you posted in a similar thread. One of the big memory users seems to be sem, and it's growing. Almost every time I do a vmstat -m, sem usage has grown a few k. Type InUse MemUse HighUse Requests Size(s) atkbddev 2 1K 1K2 32 pfs_fileno 132K 32K1 32768 nexusdev 2 1K 1K2 16 memdesc 1 4K 4K1 4096 legacydrv 3 1K 1K3 16 VM pgdata 1 4K 4K1 4096 pfs_nodes20 3K 3K 20 128 MSDOSFS mount 1 8K 8K1 8192 UFS mount1223K 39K 14 256,2048,4096,16384 UFS ihash 116K 16K1 16384 UFS dirhash5711K 11K 117 16,32,64,128,512 FFS node 4976 933K936K40528 128,256 dirrem 0 0K 31K 5522 32 mkdir 0 0K 3K 520 32 diradd14 1K 7K 3118 32 freefile 0 0K 26K 4839 32 freeblks 1 1K186K 3820 256 freefrag 6 1K 1K 2494 32 allocindir10 1K 86K 8596 64 indirdep 2 1K876K 577 32,8192 allocdirect23 3K 16K 8457 128 bmsafemap 3 1K 3K 365 32 newblk 1 1K 1K17054 64,256 inodedep1618K168K 9570 128,16384 pagedep 2 3K 7K 874 64,2048 p1003.1b 1 1K 1K1 16 NFS daemon 5 3K 3K5 256,512 NFS srvsock 2 1K 1K2 128 ip6_moptions 1 1K 1K1 16 in6_multi10 1K 1K 10 16,64 syncache 1 8K 8K1 8192 IpFw/IpAcct30 4K 4K 30 64,128 in_multi 2 1K 1K2 32 routetbl41 6K 6K 76 16,32,64,128,256 lo 1 1K 1K1 512 clone 312K 12K3 4096 ether_multi35 2K 2K 35 16,32,64 ifaddr22 7K 7K 22 32,256,512,2048 BPF 6 9K 9K6 128,256,4096 mount20 4K 4K 24 16,32,128,512 vnodes23 6K 6K 137 16,32,64,128,256 cluster_save buffer 0 0K 1K 1183 32,64 vfscache 2634 197K198K32833 64,128,256,32768 BIO buffer2126K205K 1130 512,1024,2048 DEVFS 12122K 22K 121 16,32,128,8192 pcb38 5K 5K 58 16,32,64,2048 soname 4 1K 1K 1415 16,32,128 ptys 2 1K 1K2 512 ttys 61481K 81K 1121 128,512 shm 318K 19K8 16,1024,16384 sem167320 
2622K 2622K 167320 16,1024,4096 msg 425K 25K4 512,4096,16384 ioctlops 0 0K 1K 22 512,1024 USBdev 1 1K 2K4 128,512 USB1521K 22K15345 16,32,128,256,4096 taskqueue 1 1K 1K1 128 sbuf 0 0K 5K2 32,4096 rman99 7K 7K 496 16,64,128 mbufmgr 10615K 15K 106 32,64,128,2048,8192 kobj 127 508K508K 127 4096 eventhandler22 2K 2K 22 32,128 bus 47039K 40K 1363 16,32,64,128,256,512,2048,4096,8192 SWAP 273K 73K2 64 sysctltmp 0 0K 4K 8856 16,32,64,128,256,512,1024,4096 sysctl 0 0K 1K 386 16,32,64 uidinfo 7 1K 1K 525 32,128 cred34 5K 5K18178 128 subproc 11411K 14K10613 64,256 proc 2 1K 1K2 512 session33 5K 5K 68 128 pgrp40 5K 6K 117 128 module 17111K 11K 171 64 ip6ndp 3 1K 1K4 64,128,512 temp1154K 55K19887 16,32,64,128,256,512,1024,2048,4096,8192,16384,32768 devbuf 473 964K997K 2268 16,32,64,128,256,512,1024,2048,4096,8192,32768 lockf 6 1K 1K 549 64 feeder48 1K 1K 48 16 linker6513K 18K 85 16,32,256,1024,4096,8192 KTRACE 10013K 13K 100 128
Re: [Ugly PATCH] Again: panic kmem_malloc()
At 22:00 16/10/2002, Jeff Roberson wrote:
> On Wed, 16 Oct 2002, Ben Stuyts wrote:
> > I'll also run your vmstat script that you posted in a similar thread.
> > One of the big memory users seems to be sem, and it's growing. Almost
> > every time I do a vmstat -m, sem usage has grown a few k.
> >
> [snip]
> > sem 167320 2622K 2622K 167320 16,1024,4096
> [snip]
>
> Thank you for looking into this. It definitely looks like a memory
> leak. I forwarded this to alfred. He was just working on semaphores so
> he may know something about it.
>
> > I'll see what the stats are tomorrow.
>
> Much appreciated.

No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:

sem 167344 2622K 2622K 167344 16,1024,4096
---
sem 235512 3687K 3687K 235512 16,1024,4096

So it looks indeed like sem is the problem,

Kind regards,
Ben
Re: [Ugly PATCH] Again: panic kmem_malloc()
* Ben Stuyts [EMAIL PROTECTED] [021016 14:05] wrote:
> At 22:00 16/10/2002, Jeff Roberson wrote:
> > On Wed, 16 Oct 2002, Ben Stuyts wrote:
> > > I'll also run your vmstat script that you posted in a similar thread.
> > > One of the big memory users seems to be sem, and it's growing. Almost
> > > every time I do a vmstat -m, sem usage has grown a few k.
> > >
> > [snip]
> > > sem 167320 2622K 2622K 167320 16,1024,4096
> > [snip]
> >
> > Thank you for looking into this. It definitely looks like a memory
> > leak. I forwarded this to alfred. He was just working on semaphores
> > so he may know something about it.
> >
> > > I'll see what the stats are tomorrow.
> >
> > Much appreciated.
>
> No need to wait for tomorrow. :-) Just 1.5 hours later, vmstat -m says:
>
> sem 167344 2622K 2622K 167344 16,1024,4096
> ---
> sem 235512 3687K 3687K 235512 16,1024,4096
>
> So it looks indeed like sem is the problem,
>
> Kind regards,
> Ben

what does sysctl -a | grep ^p10 say?

My guess is that you don't have the module in question loaded. If you
do, then why? (it's marked experimental)

And why aren't these bug reports a lot more detailed? (meaning why
aren't you actually giving a hypothesis as to why the code is broken?)

*grumble*

--
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology,
start asking why software is ignoring 30 years of accumulated wisdom.'