Re: A possible bug in the interrupt thread preemption code [Was:
On 22-Feb-01 Dag-Erling Smorgrav wrote:
> John Baldwin <[EMAIL PROTECTED]> writes:
>> On 22-Feb-01 Maxim Sobolev wrote:
>> > It already has INVARIANTS, MUTEX_DEBUG, WITNESS and WITNESS_DDB.
>> Hmm, ouch, you don't want MUTEX_DEBUG, that'll slow your system to a crawl.
>
> For the same reason, you probably want WITNESS_SKIPSPIN.

Not really.  WITNESS doesn't really bog down spin mutexes all that much.  It
has a very simple order checking that is nothing like the order checking for
sleep mutexes.  The killer for MUTEX_DEBUG is that each mtx_init() involves
walking a linked list of _all_ of the mutexes in the system and comparing each
one with the one being init'd to check for a duplicate init.

> WITNESS_DDB is a bad idea, BTW, there's a (presumably harmless) lock
> order reversal in the FS code that you're practically guaranteed to
> hit during boot.

Well, they aren't necessarily harmless, but they've been around for a very long
time, so if they do cause lockups, they are rare at least.

> DES
> --
> Dag-Erling Smorgrav - [EMAIL PROTECTED]

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
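[Archive note] The cost described in the message above, where every mtx_init()
walks a list of all mutexes to look for a duplicate init, can be sketched in
user-space C. This is only an illustration under stated assumptions, not the
kernel's MUTEX_DEBUG code: struct ex_mtx, ex_all_mtx, ex_compares and
ex_mtx_init() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex; only the list linkage matters here. */
struct ex_mtx {
    struct ex_mtx *next;
};

/* Hypothetical global list of every initialised mutex. */
static struct ex_mtx *ex_all_mtx = NULL;

/* Counts comparisons so the cost of the duplicate check is visible. */
static unsigned long ex_compares = 0;

/*
 * Initialise a mutex with a duplicate-init check.
 * Returns 0 on success, -1 if the mutex is already on the list.
 * Each call walks the whole list, so n inits cost O(n^2) comparisons.
 */
static int ex_mtx_init(struct ex_mtx *m)
{
    struct ex_mtx *p;

    for (p = ex_all_mtx; p != NULL; p = p->next) {
        ex_compares++;
        if (p == m)
            return -1;          /* duplicate init detected */
    }
    m->next = ex_all_mtx;       /* push onto the head of the list */
    ex_all_mtx = m;
    return 0;
}
```

With this scheme, initialising n distinct mutexes performs n(n-1)/2
comparisons in total, which is consistent with the slowdown the message
attributes to MUTEX_DEBUG on a system with many mutexes.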
Re: A possible bug in the interrupt thread preemption code [Was:
On 22-Feb-01 Maxim Sobolev wrote:
> John Baldwin wrote:
>
>> On 22-Feb-01 Maxim Sobolev wrote:
>> >> > Here it is (from DDB):
>> >> > panic(c027de93,c0297409,c027f878,368,80286)
>> >> > _mtx_assert(c02ea000,9,c027f878,368,80286)
>> >> > mi_switch(c32c5da0,3,c02cea44,c357be98)
>> >> > ithread_schedule(c0747c00,1)
>> >> > sched_ithd(e)
>> >> > Xresume14()
>> >> > --- interrupt, eip = 0xc025b60f, esp = 0x80296, ebp = 0xc357bf08 ---
>> >> > trap(18, 10, 10,c01597b6,20)
>> >> > calltrap()
>> >> > --- trap 0x9, eip = 0xc025a5de, esp = 0xc357bf50, ebp = 0xc357bf64 ---
>> >> > sw1b(c0146cbc,c0146cbc,c32c5da0,c357bf94)
>> >> > ithread_loop(c0747c00,c357bfa8)
>> >> > fork_exit(c0146cbc,c0747c00,c357bfa8)
>> >> > fork_trampoline()
>> >>
>> >> *sigh*  This is why enabling interrupts in trap() is such a bad idea.
>> >> If we get a trap in the scheduler, then lots of bad crap starts to
>> >> happen because we can get an interrupt while we are in a trap. :(
>> >> Can you compile your kernel with INVARIANTS on though, as I think the
>> >> kernel should've panic'd earlier if it is doing what I think it is
>> >> doing.
>> >
>> > It already has INVARIANTS, MUTEX_DEBUG, WITNESS and WITNESS_DDB.
>>
>> Hmm, ouch, you don't want MUTEX_DEBUG, that'll slow your system to a crawl.
>
> It doesn't really matter, because the system can't even boot into
> single-user due to the panic.
>
>> >> Also, if you are feeling industrious, edit sys/i386/i386/trap.c and
>> >> comment out the enable_intr() call near the beginning of the trap()
>> >> function right after the printf for 'kernel trap %d with interrupts
>> >> disabled'.
>> >
>> > Ok, I'll try so.
>> >
>> > -Maxim
>>
>> It will still panic, just hopefully a better panic.
>
> I did understand that, but the panic I see after the change is exactly the
> same as before. Any other ideas?

A recursive sched_lock?  Erm, well, stick these options in your kernel config:

options KTR
options KTR_EXTEND
options KTR_COMPILE=KTR_LOCK
options KTR_MASK=KTR_MASK

Then when it panics, use the 'show ktr' command to list the mutex operations up
until that point.  Hopefully you can see where it is grabbing sched_lock the
first time and then not releasing it.  Also, has the backtrace changed at all?
If not, then you may have commented out the wrong enable_intr(). :)

> -Maxim

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Re: A possible bug in the interrupt thread preemption code [Was:
On 22-Feb-01 Maxim Sobolev wrote:
> John Baldwin wrote:
>>
>> A recursive sched_lock?  Erm, well, stick these options in your kernel
>> config:
>>
>> options KTR
>> options KTR_EXTEND
>> options KTR_COMPILE=KTR_LOCK
>> options KTR_MASK=KTR_MASK
>
> Bah, it doesn't even compile with these options:
> cc -c -pipe -O -march=pentium -Wall -Wredundant-decls -Wnested-externs
> -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual
> -fformat-extensions -ansi -nostdinc -I- -I. -I../.. -I../../dev
> -I../../../include -I../../contrib/dev/acpica/Subsystem/Include -D_KERNEL
> -include opt_global.h -elf -mpreferred-stack-boundary=2 ../../kern/kern_ktr.c
> ../../kern/kern_ktr.c: In function `__Tunable_ktr_mask':
> ../../kern/kern_ktr.c:95: `KTR_MASK' undeclared (first use in this function)
> ../../kern/kern_ktr.c:95: (Each undeclared identifier is reported only once
> ../../kern/kern_ktr.c:95: for each function it appears in.)
> *** Error code 1
> 1 error

Oh, whoops, that should be:

options KTR_MASK=KTR_LOCK

> -Maxim

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Release b0rked..
===> lib/libgssapi
rm -f .depend
mkdep -f .depend -a -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/krb5 -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/asn1 -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/roken -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/des -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/include -I/usr/obj/usr/src/kerberos5/lib/libgssapi/../../lib/libasn1 -I/usr/src/kerberos5/lib/libgssapi/../../include -I/usr/src/kerberos5/lib/libgssapi/../../include -DHAVE_CONFIG_H -DINET6 /usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/8003.c ...
In file included from /usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/krb5/krb5_locl.h:142,
                 from /usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/gssapi_locl.h:39,
                 from /usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/8003.c:34:
/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/krb5/krb5.h:43: krb5_err.h: No such file or directory
/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/krb5/krb5.h:44: heim_err.h: No such file or directory
In file included from /usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/gssapi_locl.h:39,
                 from /usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/8003.c:34:
/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/krb5/krb5_locl.h:143: krb5_err.h: No such file or directory
... (repeat about 50 times)
mkdep: compile failed
*** Error code 1

Stop in /usr/src/kerberos5/lib/libgssapi.
*** Error code 1

Stop in /usr/src/kerberos5/lib.
*** Error code 1

Stop in /usr/src/kerberos5.
*** Error code 1

Stop in /usr/src/release.
*** Error code 1

Stop in /usr/src/release.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.Baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: System hangs with -current ...
On 22-Feb-01 The Hermit Hacker wrote:
>
> Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
> finally ...
>
> The various KTR_ that you mention below, these are kernel settings that I
> compile into the kernel?

Yes.  You want this:

options KTR
options KTR_EXTEND
options KTR_COMPILE=0x1208

The mtx_quiet.patch is old and won't apply to current now I'm afraid.

> On Tue, 2 Jan 2001, John Baldwin wrote:
>
>>
>> On 02-Jan-01 The Hermit Hacker wrote:
>> >
>> > Over the past several months, as others have reported, I've been getting
>> > system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
>> > ctl-alt-esc doesn't break me to the debugger ...
>> >
>> > I'm not complaining about the hangs, if I was overly concerned, I'd run
>> > -STABLE, but I'm wondering how one goes about providing debug information
>> > on them other than through DDB?
>>
>> Not easily. :(  If you can make the problem easily repeatable, then you
>> can try turning on KTR in your kernel (see NOTES, you will need
>> KTR_EXTEND), setting up a serial console that you log the output of,
>> create a shell script that runs the following commands:
>>
>> #!/bin/sh
>>
>> # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
>> sysctl -w debug.ktr_mask=0x1208
>> sysctl -w debug.ktr_verbose=2
>>
>> run_magic_command_that_hangs_my_machine
>>
>> and run the script.  You probably want to run it over a tty or remote
>> login so that the serial console output is just the logging (warning, it
>> will be very verbose!).  Also, you probably want to use
>> http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
>> irrelevant and cluttery mutex trace messages.  Note that having this much
>> logging on will probably slow the machine to a crawl as well, so you may
>> have to just start this up and go off and do something else until it
>> hangs. :-/  Another alternative is to rig up an NMI debouncer and use it
>> to break into the debugger.  Then you can start poking around to see who
>> owns sched_lock, etc.
>>
>> > Thanks ...
>>
>> --
>>
>> John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
>> PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
>> "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
>>
>
> Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
> Systems Administrator @ hub.org
> primary: [EMAIL PROTECTED]           secondary: scrappy@{freebsd|postgresql}.org
>

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: System hangs with -current ...
On 23-Feb-01 The Hermit Hacker wrote:
> On Thu, 22 Feb 2001, John Baldwin wrote:
>
>>
>> On 22-Feb-01 The Hermit Hacker wrote:
>> >
>> > Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
>> > finally ...
>> >
>> > The various KTR_ that you mention below, these are kernel settings that I
>> > compile into the kernel?
>>
>> Yes.  You want this:
>>
>> options KTR
>> options KTR_EXTEND
>> options KTR_COMPILE=0x1208
>
> okay, just so that I understand ... I compile my kernel with these
> options, and then run the two sysctl commands you list below?  the
> KTR_COMPILE arg looks similar to the ktr_mask one below, which is why I'm
> confirming ...

Yes.  KTR_COMPILE controls which KTR tracepoints are actually compiled into the
kernel.  The ktr_mask sysctl controls a runtime mask that lets you choose which
of the compiled-in masks you want to enable.  I have manpages for this stuff,
but they are waiting for doc guys to review them.

>> The mtx_quiet.patch is old and won't apply to current now I'm afraid.
>>
>> > On Tue, 2 Jan 2001, John Baldwin wrote:
>> >
>> >>
>> >> On 02-Jan-01 The Hermit Hacker wrote:
>> >> >
>> >> > Over the past several months, as others have reported, I've been
>> >> > getting system hangs using 5.0-CURRENT w/ SMP ... I've got DDB
>> >> > enabled, but ctl-alt-esc doesn't break me to the debugger ...
>> >> >
>> >> > I'm not complaining about the hangs, if I was overly concerned, I'd
>> >> > run -STABLE, but I'm wondering how one goes about providing debug
>> >> > information on them other than through DDB?
>> >>
>> >> Not easily. :(  If you can make the problem easily repeatable, then
>> >> you can try turning on KTR in your kernel (see NOTES, you will need
>> >> KTR_EXTEND), setting up a serial console that you log the output of,
>> >> create a shell script that runs the following commands:
>> >>
>> >> #!/bin/sh
>> >>
>> >> # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
>> >> sysctl -w debug.ktr_mask=0x1208
>> >> sysctl -w debug.ktr_verbose=2
>> >>
>> >> run_magic_command_that_hangs_my_machine
>> >>
>> >> and run the script.  You probably want to run it over a tty or remote
>> >> login so that the serial console output is just the logging (warning,
>> >> it will be very verbose!).  Also, you probably want to use
>> >> http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of
>> >> the irrelevant and cluttery mutex trace messages.  Note that having
>> >> this much logging on will probably slow the machine to a crawl as
>> >> well, so you may have to just start this up and go off and do
>> >> something else until it hangs. :-/  Another alternative is to rig up
>> >> an NMI debouncer and use it to break into the debugger.  Then you can
>> >> start poking around to see who owns sched_lock, etc.
>> >>
>> >> > Thanks ...

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
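[Archive note] The two-level filtering described in the message above, where
KTR_COMPILE selects which tracepoint classes exist in the kernel and the
debug.ktr_mask sysctl selects which of those are active at runtime, can be
sketched as a pair of bitmask tests. This is a user-space illustration under
stated assumptions, not kernel code: the EX_ names and ex_trace_enabled() are
hypothetical, and the bit values are chosen only so that they OR to the 0x1208
used in the thread; check sys/ktr.h for the real definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative class bits (assumed values; see sys/ktr.h for the real ones). */
#define EX_KTR_LOCK 0x0008u
#define EX_KTR_INTR 0x0200u
#define EX_KTR_PROC 0x1000u

/* Stand-in for "options KTR_COMPILE=0x1208": fixed when the kernel is built. */
#define EX_KTR_COMPILE (EX_KTR_LOCK | EX_KTR_INTR | EX_KTR_PROC)

/* Stand-in for the debug.ktr_mask sysctl: changeable at runtime. */
static uint32_t ex_ktr_mask = 0;

/*
 * Returns 1 if a tracepoint of class `cls` would log, 0 otherwise.
 * A class must be both compiled in and enabled in the runtime mask.
 */
static int ex_trace_enabled(uint32_t cls)
{
    if ((EX_KTR_COMPILE & cls) == 0)
        return 0;                       /* not compiled in at all */
    return (ex_ktr_mask & cls) != 0;    /* compiled in; check runtime mask */
}
```

Under these assumptions, setting the runtime mask to 0x1208 enables all three
compiled-in classes, while a class bit that was never compiled in stays silent
no matter what the runtime mask says.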
Re: Release b0rked..
On 23-Feb-01 John Hay wrote:
>> ===> lib/libgssapi
>> rm -f .depend
>> mkdep -f .depend -a
> ...
>> /usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/krb5/krb5.h:43:
>> krb5_err.h: No such file or directory
>> /usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/krb5/krb5.h:44:
>> heim_err.h: No such file or directory
>
> You can get past this error with this patch, but it will just die a little
> later in another part of kerberos. You should have better luck by building
> a release with NOKERBEROS=YES. I was able to build one yesterday.

Hrm, I already got release fixed here locally with this patch which looks more
like what the other libraries do:

Index: kerberos5/lib/libgssapi//Makefile
===
RCS file: /host/ares/usr/home/ncvs/src/kerberos5/lib/libgssapi/Makefile,v
retrieving revision 1.1
diff -u -r1.1 Makefile
--- kerberos5/lib/libgssapi//Makefile   2001/02/13 16:56:50     1.1
+++ kerberos5/lib/libgssapi//Makefile   2001/02/23 02:34:46
@@ -7,7 +7,8 @@
        -I${KRB5DIR}/lib/roken \
        -I${KRB5DIR}/lib/des\
        -I${KRB5DIR}/include\
-       -I${ASN1OBJDIR}
+       -I${ASN1OBJDIR} \
+       -I${.OBJDIR}
 
 SRCS=  \
        8003.c \
@@ -49,8 +50,10 @@
        wrap.c \
        address_to_krb5addr.c
 
-INCLUDES=${KRB5DIR}/lib/gssapi/gssapi.h
+INCLUDES=${KRB5DIR}/lib/gssapi/gssapi.h heim_err.h krb5_err.h
 
 .include
 
 .PATH: ${KRB5DIR}/lib/gssapi
+
+beforedepend all: heim_err.h krb5_err.h

Oops, and I've already committed this.  I'm sure it will get backed out if it
isn't correct.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.Baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Re: dirty buffers on reboot again?'
On 23-Feb-01 Soren Schmidt wrote:
> It seems David O'Brien wrote:
>> On Thu, Feb 22, 2001 at 03:20:10PM -0800, Matthew Jacob wrote:
>> > login: (da0:ahc0:0:0:0): tagged openings now 16
>> ...
>> > syncing disks... 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
>> > giving up on 3 buffers
>> ...
>> > I'm seeing a lot of this again. Anyone else?
>>
>> Yep.  ahc controller also.  By chance is that the commonality to this
>> problem?
>
> No, it's generically broken...

Hmm, I must have Magic Boxes(tm) because none of them exhibit this problem.

> -Søren

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.Baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: cvs commit: src/sys/kern kern_mutex.c
On 24-Feb-01 John Baldwin wrote:
> jhb         2001/02/24 11:36:13 PST
>
>   Modified files:
>     sys/kern             kern_mutex.c
>   Log:
>   ...
>   - Make the _mtx_assert() function be compiled in if INVARIANT_SUPPORT is
>     defined rather than if INVARIANTS is defined so that a KLD compiled
>     with INVARIANTS that uses mtx_assert() can be used with a kernel that
>     just has INVARIANT_SUPPORT compiled in.

With this, building a kernel with INVARIANTS is back to requiring
INVARIANT_SUPPORT to be present in the kernel.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Re: Today's panic :-)
On 24-Feb-01 Julian Elischer wrote:
> Warner Losh wrote:
>>
>> I've added INVARIANTS and WITNESS to my kernel.  Today I get a random
>> panic on boot sometimes:
>>
>> lock order reversal (this doesn't cause the panic, but
>> does seem to happen all the time)
>> 1st vnode interlock last acquired @ ../../usr/ffs/ffs_fsops.c:396
>> 2nd 0xc04837a0 mntvnode @ ../../ufs/ffs/ffs_vfsops.c:457
>> 3rd 0xc80b9e8c vnode interlock @ ../../kern/vfs_subr.c:1872
>
> I keep dropping into ddb with:

Don't use WITNESS_DDB and you won't drop into ddb.  We're not far enough along
to make that very livable yet. :(

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Re: Today's panic :-)
On 24-Feb-01 Warner Losh wrote:
> In message <[EMAIL PROTECTED]> Bruce Evans writes:
>: It seems to be another trap while holding sched_lock.  This should be
>: fatal, but the problem is only detected because trap() enables
>: interrupts.  Then an interrupt causes bad things to happen.  Unfortunately,
>: the above omits the critical information: the instruction at sw1b+0x6b.
>: There is no instruction at that address here.  It is apparently just an
>: access to a swapped-out page for the new process.  I can't see how this
>: ever worked.  The page must be faulted in, but this can't be done while
>: sched_lock is held (not to mention after we have committed to switching
>: contexts).
>
> sw1b+0x6b is ltr %si
>
> I note that this doesn't happen when the disks are clean on boot, but
> does happen when they are dirty.  The kernel is as of a cvsup 3pm MST
> today.  The kernel from 1am last night doesn't seem to have this
> problem.

Other people have reported this and I can't reproduce it.  The one case I
managed to track down so far involved proc0 having pcb_ext bogusly set,
resulting in cpu_switch() setting up a bogus GDT entry for a TSS and thus
generating a GPF, which is the trap you see.  The enable_intr() in trap() then
sends things downhill fast.  I'm not sure yet why processes are having pcb_ext
bogusly set.  Hmm.  Make sure you have rev 1.35 or later of pcb.h.  Also, try
building a kernel from scratch from fresh sources.

> Warner

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: panic -- sched lock recursed
On 26-Feb-01 Steve Kargl wrote:
> It appears that lpd is again triggering a panic.  Sources
> are from 25 Feb 01 at 1039 PST.  Kernel.debug, vmcore.0, kernel.0
> available for the asking.

This is a different panic from the dreaded 'ltr' panics everyone has been
seeing, but the enable_intr() in trap keeps hiding it.  I think I'll go commit
a hack for that right now.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: does SMP i386 work at all right now?
On 26-Feb-01 Matthew Jacob wrote:
>
> I finally realized I hadn't been running my 2xPPro box with SMP enabled for a
> month or so. The top of tree, with the exception of Jake's last fix, hangs:
>
> Mounting root from ufs:/dev/da0a
> da0s1: type 0xa5, start 63, end = 8385929, size 8385867 : OK
> WARNING: / was not properly dismounted
> p/MsPb:i nC/PiUn1i taLiatu:n cted!
> ic_initialize():
> lint0: 0x00010700 lint1: 0x00010400 TPR: 0x0010 SVR: 0x01ff
> swapon: adding /dev/da0b as swap device
> Automatic boot in progress...
>
>
> (note the mangled serial output)

It's not mangled per se, it's a panic message from one CPU while the other CPU
was initializing.  It looks like using a serial console exacerbated the race on
the kernel console output and ended up dropping characters.  Of the 'SMP: AP
CPU #1 Launched!' message you seem to have 'MP: CPU1 Launced!' with
'p/sbin/initaittic_initialize()' mixed in with it.  Hmm, are you booting
verbose?  Hmm, a verbose boot on my dual 600 with a non-serial console has
this:

SMP: AP CPU #1 Lstart_init: trying /sbin/init
uanched!
SMP: CPU1 apic_initialize():
...

So perhaps it is not panicking, but is just the usual console mangling that
results from multiple processors writing to the kernel console at the same
time.  I'm not sure why it hangs for you though.

> Is this supposed to work right now?

Yes.

> -matt

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: make kernel failure: pecoff: machine/lock.h
On 27-Feb-01 Leif Neland wrote:
> This happens with both my custom and GENERIC kernel.
>
> It has failed for some days, and also with source cvsup'ed today.
> A kernel built with "make buildkernel -k" works...
>
> Leif

Have you tried running make depend?

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Re: make kernel failure: pecoff: machine/lock.h
On 27-Feb-01 Leif Neland wrote:
>
>
> On Tue, 27 Feb 2001, Gary Jennejohn wrote:
>
>> John Baldwin writes:
>> >
>> > On 27-Feb-01 Leif Neland wrote:
>> > > This happens with both my custom and GENERIC kernel.
>> > >
>> > > It has failed for some days, and also with source cvsup'ed today.
>> > > A kernel built with "make buildkernel -k" works...
>> > >
>> > > Leif
>> >
>> > Have you tried running make depend?
>> >
>>
>>
>> Failing that, try deleting your /sys/compile/ directory
>> and re-config'ing your kernel. This has always worked for me.
>>
> I'm building the kernel "the new way", ie cd /usr/src
> make buildkernel KERNCONF=
>
> So the kernel is built in /usr/obj/usr/src/sys/GENERIC
>
> I deleted this, which buildkernel does itself, and config'ing it does too,
> and as I expected, it didn't make any difference.
>
> Leif

Ok.  It may be that we are overflowing the kernel stack and corrupting the pcb
in the process.  One idea atm is to move the pcb off of the stack (since it
stores persistent data it's a bad place for it anyways) and to add a red zone
at the bottom of the stack to catch overflows.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Re: make kernel failure: pecoff: machine/lock.h
On 27-Feb-01 Leif Neland wrote:
>
>
> On Tue, 27 Feb 2001, John Baldwin wrote:
>
>>
>> On 27-Feb-01 Leif Neland wrote:
>> >
>> >
>> > On Tue, 27 Feb 2001, Gary Jennejohn wrote:
>> >
>> >> John Baldwin writes:
>> >> >
>> >> > On 27-Feb-01 Leif Neland wrote:
>> >> > > This happens with both my custom and GENERIC kernel.
>> >> > >
>> >> > > It has failed for some days, and also with source cvsup'ed today.
>> >> > > A kernel built with "make buildkernel -k" works...
>> >> > >
>> >> > > Leif
>> >> >
>> >> > Have you tried running make depend?
>> >> >
>> >>
>> >>
>> >> Failing that, try deleting your /sys/compile/ directory
>> >> and re-config'ing your kernel. This has always worked for me.
>> >>
>> > I'm building the kernel "the new way", ie cd /usr/src
>> > make buildkernel KERNCONF=
>> >
>> > So the kernel is built in /usr/obj/usr/src/sys/GENERIC
>> >
>> > I deleted this, which buildkernel does itself, and config'ing it does too,
>> > and as I expected, it didn't make any difference.
>> >
>> > Leif
>>
>> Ok.  It may be that we are overflowing the kernel stack and corrupting the
>> pcb in the process.  One idea atm is to move the pcb off of the stack
>> (since it stores persistent data it's a bad place for it anyways) and to
>> add a red zone at the bottom of the stack to catch overflows.
>>
> Do you really think it is something this complicated?
> To me it just sounds like a makefile bug, as going to the pecoff directory
> and typing make gives the same error. But what do I know...

Oh, crossed wires.  I was referring to the 'ltr' panics.  Umm, you should only
get this error if you have a stale .depend file.  Note that config -r doesn't
exist anymore, so the .depend file doesn't actually get deleted automatically
by config or buildkernel.  Can you build a kernel the old way?

> Leif

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Re: make kernel failure: pecoff: machine/lock.h
On 28-Feb-01 Bruce Evans wrote:
> On Tue, 27 Feb 2001, John Baldwin wrote:
>
>> Ok.  It may be that we are overflowing the kernel stack and corrupting the
>> pcb in the process.  One idea atm is to move the pcb off of the stack
>> (since it stores persistent data it's a bad place for it anyways) and to
>> add a red zone at the bottom of the stack to catch overflows.
>
> Most of the pcb actually has the same persistence as the kernel stack
> (both mainly store the process's context while the process is in the
> kernel).  But it is silly to put the pcb below the stack instead of
> above it.  Perhaps the idea is to get a panic sooner when something
> is corrupted.

That is the idea.  Not all of the pcb is just used while in the kernel.  The
pcb_ext that points to a TSS on the i386, for example.  The problem I think
people are having with the ltr panic is that the stack gets deep enough to
overwrite that field of the pcb, and we die later on when we try to access an
invalid pointer there.  Perhaps pcb_ext, pcb_ldt, and other things that are
persistent across kernel entry/exit should be stored in p_md instead of p_addr.
However, I would like the machine to panic when it overflows the stack rather
than trash the pcb, yes.

> Bruce

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: lock order reversal under -current
On 28-Feb-01 Michael Reifenberger wrote:
> Hi,
> with -current sources (as of -now) I get during startup:
>
> lock order reversal
> 1st vnode interlock last acquired @ ../../kern/vfs_vnops.c:625
> 2nd 0xc0306840 mntvnode @ ../../ufs/ffs/ffs_vfsops.c:940
> 3rd 0xcbd20a0c vnode interlock @ ../../ufs/ffs/ffs_vfsops.c:949
> 32
>
> Is that bad?

Yes and no.  It's a bug, yes, but it has probably been around since at least
4.4BSD, so you can ignore it for now.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: Never Mind: Re: (device hints) okay, what's wrong with this?
On 01-Mar-01 Matthew Jacob wrote:
> Clearly I'm going blind. Sorry for the noise.

You're not the only one.  Was hard for me to see it as well. :)

>> I have 3 hints in /boot/device.hints
>>
>> hint.isp.0.portwnn="w5000"
>> hint.isp.0.nodewnn="w50000001"
>> hint.isp.0.role=3
>>
>>
>> resource_get_int picks up 'hint.isp.0.role=3' with no problem.
>> resource_get_string fails to pick up either hint.isp.0.portwnn
>> or hint.isp.0.nodewwn. Nor does a getenv on a handcrafted
>> "hint.isp.0.portwwn" string work. What am I doing wrong?
>>
>> What's this?
>>
>> -matt
>>

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: New entropy harvesting sysctl's enabled in rc
On 01-Mar-01 Doug Barton wrote:
> I was unable to test the ppp bits, but I've every reason to believe that
> this will work. Comments and suggestions are welcome. The goal is to turn
> on the appropriate harvesters for ethernet, and/or ppp/slip/tun based on
> the presence of a configured device of that nature. So, the ethernet bits
> check to see if there is an ethernet card configured, and turns on that
> harvester if so. The same should be true for the ppp harvester, based on
> the suggestions I received for detecting whether a tun device is or will be
> in use.

Erm, it doesn't hurt to turn the sysctl on if no device is present, so why not
just turn it on w/o bothering to check?  If there's no ether device, then
having the sysctl on or off doesn't change anything, except that with it off,
entropy won't be collected if one plugs in an ether device or kldloads a
driver later.  Whereas if you just turn the sysctl on whenever network entropy
is enabled, regardless of what devices are configured, it will all just work
and DTRT.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.Baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: undefined reference _mtx_assert
On 01-Mar-01 Alexander Leidinger wrote:
> Hi,
>
> since 2 or 3 days I can't build a new kernel:

Add INVARIANT_SUPPORT to your kernel config, and you should be reading your
cvs-all and -current mail. :)

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.Baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
RE: System hangs with -current ...
On 01-Mar-01 The Hermit Hacker wrote:
>
> any comments on this?  any way of doing this without a serial console?
>
> thanks ...

The data is too much to make a normal console feasible, although you could try
cranking up the console to the highest res (80x60 or 132x60, etc.) you can and
let it freeze and then write down those 60 lines and maybe that will be enough
to figure it out.  However, if it's looping this won't work. :(  I've no idea
atm why the serial console isn't working for you.

> On Wed, 28 Feb 2001, The Hermit Hacker wrote:
>
>>
>> Yup, definitely doesn't like me using the console ... just tried it again,
>> and its as if it can't scroll up the screen to send more data or
>> something?
>>
>> I just rebooted, and then ssh'd in from remote ... type'd the two sysctl
>> commands, and got:
>>
>> cpu1 ../../i386/i386/trap.c.181 GOT (spin) sched lock [0xc0320f20] r=0 at
>> ../../i386/i386/trap.c:181
>> cpcsocp/../i386/i386/trap.c.217 REL (spin) sched l
>>
>> on my screen ... type'd exactly as seen ... and that's it ... console is
>> now locked again ...
>>
>> On Tue, 27 Feb 2001, The Hermit Hacker wrote:
>>
>> >
>> > Okay, can't seem to find a 9pin->9pin NULL modem cable in this 'pit of the
>> > earth' town, so figured I'd do the sysctl commands on my console and use
>> > an ssh connection into the machine to run the 'hanging sequence' ... the
>> > console flashed a bunch of 'debugging info' and then hung solid ... I
>> > could still login remotely and whatnot, type commands, just nothing was
>> > happening on the console, couldn't change vty's, nothing ...
>> >
>> > is it supposed to do that? *raised eyebrow*
>> >
>> > On Thu, 22 Feb 2001, John Baldwin wrote:
>> >
>> > >
>> > > On 23-Feb-01 The Hermit Hacker wrote:
>> > > > On Thu, 22 Feb 2001, John Baldwin wrote:
>> > > >
>> > > >>
>> > > >> On 22-Feb-01 The Hermit Hacker wrote:
>> > > >> >
>> > > >> > Okay, I have to pick up a NULL modem cable tomorrow and dive into
>> > > >> > this ...
>> > > >> > finally ...
>> > > >> > >> > > >> > The various KTR_ that you mention below, these are kernel settings >> > > >> > that I >> > > >> > compile into the kernel? >> > > >> >> > > >> Yes. You want this: >> > > >> >> > > >> options KTR >> > > >> options KTR_EXTEND >> > > >> options KTR_COMPILE=0x1208 >> > > > >> > > > okay, just so that I understand ... I compile my kernel with these >> > > > options, and then run the two sysctl commands you list below? the >> > > > KTR_COMPILE arg looks similar to the ktr_mask one below, which is why >> > > > I'm >> > > > confirming ... >> > > >> > > Yes. KTR_COMPILE controls what KTR tracepoints are actually compiled >> > > into >> > > the kernel. The ktr_mask sysctl controls a runtime mask that lets you >> > > choose >> > > which of the compiled in masks you want to enable. I have manpages for >> > > this >> > > stuff, but they are waiting for doc guys to review them. >> > > >> > > >> The mtx_quiet.patch is old and won't apply to current now I'm afraid. >> > > >> >> > > >> > On Tue, 2 Jan 2001, John Baldwin wrote: >> > > >> > >> > > >> >> >> > > >> >> On 02-Jan-01 The Hermit Hacker wrote: >> > > >> >> > >> > > >> >> > Over the past several months, as others have reported, I've been >> > > >> >> > getting >> > > >> >> > system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, >> > > >> >> > but >> > > >> >> > ctl-alt-esc doesn't break me to the debugger ... >> > > >> >> > >> > > >> >> > I'm not complaining about the hangs, if I was overly concerned, >> > > >> >> > I'd run >> > > >> >> > -STABLE, but I'm wondering how one goes about providing debug >> > > >> >
Re: undefined reference _mtx_assert
On 02-Mar-01 Warner Losh wrote:
> In message <[EMAIL PROTECTED]> John Baldwin writes:
>: Add INVARIANT_SUPPORT to your kernel config, and you should be reading your
>: cvs-all and -current mail. :)
>
> Is there a reason to not have INVARIANT_SUPPORT be default for a while
> in -current?

I'm already working on this. See www.freebsd.org/~jhb/patches/newkern.patch. This is verified to work on x86 just fine (it affects release, so it requires more testing than might seem obvious at first). It should work on the alpha, but I can't get my alpha machine to stay up through a buildworld. Right now it is getting a page fault while holding a spinlock, which blows up since it tries to grab a sleep lock with interrupts disabled. Then again, it may just be at a raised IPL above 0 but not at IPL_HIGH. Not sure yet. *sigh*

> Warner

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: Scheduler panic
On 02-Mar-01 Jake Burkholder wrote:
>> > On Sun, Feb 25, 2001 at 10:29:42PM -0800, Kris Kennaway wrote:
>> > > This is on a UP system.
>> >
>> > Had another one of these, under the same conditions. Both times I was
>> > running more(1) on a stdin stream which was generated by a "find |
>> > grep | more" operation, and I suspended the process with ^Z,
>> > triggering the panic. Perhaps this will help in tracking down the
>> > root cause.
>>
>> I'm pretty sure I know what this is; I'll work up a patch tonight.
>>
>
> Sorry this is taking so long. It's turned out to be a little more
> complex to fix properly than I originally thought. We're going to
> have to change the way one of the fields of struct proc (p_pptr)
> is locked. The problem is that a process is getting preempted
> when it's not SRUN, which should be protected by the scheduler
> lock so that the preemption can't occur.
>
> This is the best workaround I can think of:
>
> Index: kern/kern_intr.c
> ===
> RCS file: /home/ncvs/src/sys/kern/kern_intr.c,v
> retrieving revision 1.47
> diff -u -r1.47 kern_intr.c
> --- kern/kern_intr.c    2001/02/28 02:53:43     1.47
> +++ kern/kern_intr.c    2001/03/02 02:28:08
> @@ -366,7 +366,7 @@
>          */
>         ithread->it_need = 1;
>         mtx_lock_spin(&sched_lock);
> -       if (p->p_stat == SWAIT) {
> +       if (p->p_stat == SWAIT && curproc->p_stat == SRUN) {
>                 CTR1(KTR_INTR, __func__ ": setrunqueue %d", p->p_pid);
>                 p->p_stat = SRUN;
>                 setrunqueue(p);
>
> Jake

Eek, this is wrong. We need to always put it on the runqueue; the trick is that we just need to avoid the actual task switch. This is what I have here:

@@ -369,7 +374,7 @@
                CTR1(KTR_INTR, __func__ ": setrunqueue %d", p->p_pid);
                p->p_stat = SRUN;
                setrunqueue(p);
-               if (do_switch) {
+               if (do_switch && curproc->p_stat == SRUN) {
                        saveintr = sched_lock.mtx_saveintr;
                        mtx_intr_enable(&sched_lock);
                        if (curproc != PCPU_GET(idleproc))

(Among other fixes.) I'll try and get this committed tonight if no one screams bloody murder.
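The difference between the two patches can be sketched as a small userland model. This is only an illustration under stated assumptions: the type `proc_model`, the `on_runq` flag and the function `sched_ithd_model()` are hypothetical stand-ins, not the kernel's structures, and the real code also deals with locking and interrupt state that is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified process states; the kernel has more. */
enum pstat { SWAIT, SRUN, SSLEEP };

/* Hypothetical stand-in for the few struct proc fields that matter here. */
struct proc_model {
	enum pstat p_stat;
	bool on_runq;		/* stands in for the effect of setrunqueue() */
};

/*
 * Model of the decision in the second patch: a waiting interrupt thread
 * is always made runnable, but an immediate switch to it is only
 * requested when the current process is itself SRUN.
 * Returns true when a context switch should happen right away.
 */
bool
sched_ithd_model(struct proc_model *ithd, const struct proc_model *cur,
    bool do_switch)
{
	assert(ithd != NULL && cur != NULL);
	if (ithd->p_stat != SWAIT)
		return false;		/* already runnable, nothing to do */
	ithd->p_stat = SRUN;		/* always put it on the run queue ... */
	ithd->on_runq = true;
	/* ... but do not preempt a current process that is not SRUN. */
	return do_switch && cur->p_stat == SRUN;
}
```

With the first patch the interrupt thread would stay in SWAIT whenever the current process was not SRUN; in this model it is queued either way and only the switch is skipped.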
--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Labeling Vinum partitions in the sysinstall(8) [patch]
On 02-Mar-01 Maxim Sobolev wrote:
> Hi folks,
>
> I'm currently creating a Vinum(4) configuration wizard for sysinstall(8),
> which would simplify Vinum configuration procedure for the vinum newbies.
> So far I finished a patch that allows create vinum partitions using
> sysinstall's disklabel editor and would like to commit it. Please review
> attached patches.

Heh, I wrote http://www.FreeBSD.org/~jhb/patches/sysinstall.vinum.patch probably a year ago now, but because it only changes the disklabel editor and doesn't add a full vinum configurator the patch was rejected. :-/ Hopefully you will have better luck than I did...

> Thanks!
>
> -Maxim

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: mount: /dev/ad0s1e: File name too long
On 02-Mar-01 Edwin Culp wrote:
> I just found something new in current. When I rebooted with today's current,
> it put me into single user with the following message:
>
> mount: /dev/ad0s1e: File name too long
>
> The problem seems to be the directory that I have been mounting it under for
> a couple of years, /var/ftp/release. If I make it shorter, like /mnt, it
> works fine.
>
> Not a big deal, easy to work around and I haven't made a release since the
> beginning of February :-).

Blame Adrian Chadd ([EMAIL PROTECTED]) :) Apparently the limit he's enforcing on mount names is rather short... :)

> ed

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Problem with sio in -current ... possible cause of hangs?
On 03-Mar-01 The Hermit Hacker wrote:
>
> Morning all ...
>
> I'm trying to get my serial console to work on my desktop, and
> appear to be failing miserably at even just getting it to accept a 'getty'
> serial connection, let alone serial console ...
>
> First, my X/mouse runs on /dev/ttyd1 ... if I startx, my mouse
> does work, but X hangs *very* quickly. Based on this, I know that
> /dev/ttyd1 does work, at least for a short time.
>
> Now, to confirm ... a NULL modem cable *is* pin 2->3, 3->2, right?
> rx->tx, tx->rx? I've tested the cable using a multi-meter, just to make
> sure that it is doing what I expect ...
>
> If I plug my cable from /dev/ttyd0 -> /dev/ttyd1 on the same
> machine, run getty on /dev/ttyd1 and use kermit to connect to /dev/cuaa0,
> I get no response back, which is why I'm wondering about sio ...

Try turning clocal off on the host and port you are running kermit on. Even then, I still have yet to get getty to work at all; it's always stuck in 'siodcd'. I've noticed via debugging output that the DCD change bit does raise for a read, but that the DCD status bit stays at zero the entire time. The sio driver seems to ignore the change bit and only read the status bit, so it thinks DCD is never raised and hangs forever on open. Note that I can get a getty fine on a serial console, just not on a /dev/ttydX that's not also the serial console. :( I've had this problem since before PRE_SMPNG, however.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Using serial console to debug system hangs ...
On 04-Mar-01 The Hermit Hacker wrote: > > Well, after some hurdles with getting the serial console to work, I've now > go it to work ... I put the two sysctl commands into a file so that I > could run it as a script: > >#!/bin/sh > sysctl -w debug.ktr.mask=0x1208 > sysctl -w debug.ktr.verbose=2 > > When I 'try' to run it, I get all the 'KTR'(?) messages on my serial > console, something is changing/happening so fast that my ssh connection > into the machine hangs before I finish typing in the shell script: > > > enable_kernel_debug: 3 lines, 72 characters. > thelab# !./ > Put the sysctl's and the command that hangs the machine into one script and run that one shell script.. > needless to say, running the command to hang the computer is proving > difficult :) > > Then again, if I do a cold boot of the machine, the messages stop > scrolling up the console, but a cut-n-paste of them is sort of illegible: > > > k0clo.c/k...c/.k4e3r8n /RkEeLr n(_scplionc)k .scch2e0d9 lRoEcLk > (s[p0xicn0) 32c1al1l8o0]ut r[=00x ca0t 31.d./8.2.0/] ker=r0n/ akte > r.n._c/l.o./ckk.ecrn:4/k3e8r > e_cpluo1c k..c/:.2.0/9k > rcnp/uk0e r.n./.l.o/ckke.rcn./3k5e0r nG_OcTl o(cskp.icn.)2 0s3c > hGeOdT l(oscpki n[)0 xccall2o1u1t8 0[]0 xrc=003 1adt8 2.0.]/ ..0/k aertn > ./.k/er..n_/ckllorcnk/.kecr:n35_0c > .ok.uc1: 2.0.3/ > .c/pkue0r.n/.k/.er./n_kcelrnoc/kke.crn.4_c38l ocRkEL. c(.s2p09in )R > ELs ch(sepd iln)oc kc al[0loxuct03 [210x18c003] 1rd=802 0a] t r=..0/ a..t > /.k.er/.n./.ckerenrn_c/lkoerckn_.ccl:4oc3k8 > :20u91 > .c.p/u.0. /.k.e/r.n.//kkeerrnn_/ckleorcnk_.ccl.o3c5k0. cG.O2T0 3( > sGpOiTn )( sspcihne)d claolclko u[t0 x[c00x3213118d08]2 0r]= 0r =a0t a.t. > ///.k.e/0ke/rkne/rkne_rcnl_occlko.cck:.3c5:02 > p3c > uc1p u.0. 
///.k.e/rkne/rkne/rkne_rcnl_occlko.cck..4c3.82 RE LR > E(Ls p(isnp)i ns)c hceadl lloouctk [[00xxcc003312d1812800]] rr==00 aatt > ////kkeerrnn//kkeerrnn__cclloocckk..cc::24398 > > ccppuu01 ////kkeerrnn//kkeerrnn__cclloocckk..cc..230530 > GGOOTT ((ssppiinn)) csaclhleodu tl o[c0kx c[003x1cd0832201]1 8r0=]0 ra=t0 > .a.t > Hmm, it's colliding with itself a lot. Unfortunately, to make this useful over the serial console, you need to shut up all the sio lock messages. Hmmm, well for now try just using a 'debug.ktr.mask' of 0x1200 to skip all the mutex operations. If we need them later on, then I will try and get some other work done to make it easier to shut up certain mutexes in the log output without having to change each individual mutex operation. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: Using serial console to debug system hangs ...
On 04-Mar-01 The Hermit Hacker wrote:
>
> Wow, that was painful ... after 2 hrs, I got as far as:

Yeah, it spews out a lot of crap. :-/ You prolly want to use a 115200 serial console if at all possible. Should've mentioned that earlier..

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Using serial console to debug system hangs ...
On 04-Mar-01 The Hermit Hacker wrote:
> On Sat, 3 Mar 2001, John Baldwin wrote:
>
>>
>> On 04-Mar-01 The Hermit Hacker wrote:
>> >
>> > Wow, that was painful ... after 2 hrs, I got as far as:
>>
>> Yeah, it spews out a lot of crap. :-/ You prolly want to use a 115200
>> serial console if at all possible. Should've mentioned that earlier..
>
> Okay, reading NOTES, it says that 9600 is the default ... does that mean I
> should be able to attach at 115200 and it should auto-upgrade, or do I
> have to recompile kernel with CONSPEED=115200 for this?

Either recompile or use the loader tunable 'machdep.conspeed', I think. However, you'll probably want the bootstrap to work on the serial console as well, in which case you need to set the speed in make.conf (see /etc/defaults/make.conf) and recompile and reinstall boot2 and the loader.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Problem with sio in -current ... possible cause of hangs?
On 04-Mar-01 Bruce Evans wrote:
> On Sat, 3 Mar 2001, John Baldwin wrote:
>
>> On 03-Mar-01 The Hermit Hacker wrote:
>> > If I plug my cable from /dev/ttyd0 -> /dev/ttyd1 on the same
>> > machine, run getty on /dev/ttyd1 and use kermit to connect to /dev/cuaa0,
>> > I get no response back, which is why I'm wondering about sio ...
>>
>> Try turning clocal off on the host and port you are running kermit on. Even
>
> Or just use /dev/cuaa*.

Erm, well, that is what I use, but clocal defaults to off on those I think.

>> then, I still have yet to get getty to work at all, it's always stuck in
>> 'siodcd'. I've noticed via debugging output that the DCD change bit does
>> raise for a read, but that the DCD status bit stays at zero the entire
>> time. The
>
> This seems to indicate a cabling problem. DCD and DCD change shouldn't
> change for a read; they should only change when "carrier" changes.

I meant 'read' as in a read of the status register containing the DCD.

>> sio driver seems to ignore the change bit and only read the status bit, so
>> it thinks DCD is never raised and hangs forever on open. Note that I can
>> get a
>
> Very short transients in DCD would be missed by the driver. The DCD change
> bit is for not missing transients, but in normal modem applications missing
> transients is probably a feature. Anyway, it's not clear what state change
> should occur in the driver if the DCD state hasn't changed when the driver
> looks at it.
>
>> getty fine on a serial console, just not on a /dev/ttydX that's not also the
>> serial console. :( I've had this problem since before PRE_SMPNG however.
>
> Certainly a hardware problem :-).

Well, it seems to be now, as the serial console in the loader doesn't even work. The odd thing is that the same exact cables in the same exact setup worked on the SMPng tree a week before PRE_SMPNG, because I used it for remote gdb and serial console. About a week before the commit I started getting the siodcd hangs and haven't been able to shake them ever since. :(

> Bruce

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
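The status-bit versus change-bit distinction discussed above can be illustrated with a short sketch. It assumes the standard 16550-style modem status register layout (delta-DCD in bit 3, DCD level in bit 7); the macro and function names are made up for this example and are not the sio driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 16550-style modem status register bits (names are illustrative). */
#define MSR_DELTA_DCD	0x08u	/* DCD changed since the register was last read */
#define MSR_DCD		0x80u	/* current level of the DCD line */

/* What a driver that looks only at the level bit concludes. */
bool
carrier_present(uint8_t msr)
{
	return (msr & MSR_DCD) != 0;
}

/* Whether the line changed at some point since the previous read. */
bool
carrier_changed(uint8_t msr)
{
	return (msr & MSR_DELTA_DCD) != 0;
}

/*
 * The case described in the thread: the change bit is set, yet the
 * level reads zero, so a level-only check reports "no carrier".
 */
bool
changed_but_low(uint8_t msr)
{
	assert((msr & (uint8_t)~(MSR_DELTA_DCD | MSR_DCD)) == 0 || true);
	return carrier_changed(msr) && !carrier_present(msr);
}
```

A register value of 0x08 is the situation John reports: something toggled DCD, but an open that waits on the level bit alone would keep sleeping.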
RE: Using serial console to debug system hangs ...
On 04-Mar-01 The Hermit Hacker wrote: > On Sun, 4 Mar 2001, Manfred Antar wrote: > >> You have to recompile the boot stuff also, after changing the line in >> make.conf: >> # The default serial console speed is 9600. Set the speed to a larger value >> # for better interactive response. >> # >> BOOT_COMCONSOLE_SPEED= 57600 >> >> Then cd /sys/boot ; make depend all install. >> I forget if you then need to relabel the disk or not. >> ie : >> disklabel -B da0 > > this worked great, thanks ...disklabel wasn't required, it appears ... It is to update boot2. However, you don't have to do that if you don't want. You can just boot on a normal console then break into the loader and type 'set console=comconsole' to switch over to the serial console. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: Using serial console to debug system hangs ...
On 04-Mar-01 The Hermit Hacker wrote:
> On Sat, 3 Mar 2001, John Baldwin wrote:
>
>>
>> On 04-Mar-01 The Hermit Hacker wrote:
>> >
>> > Wow, that was painful ... after 2 hrs, I got as far as:
>>
>> Yeah, it spews out a lot of crap. :-/ You prolly want to use a 115200
>> serial console if at all possible. Should've mentioned that earlier..
>
> Okay, I'm up to 115200, still a lot of crap ;) Any lower mask I can set
> things at to get enough info, without so much flowing up the screen? I
> can understand why she's so unresponsive, just wondering if there is a way
> of reducing the amount of debugging without losing too much ...

Well. Currently what 0x1200 logs is KTR_PROC and KTR_INTR, so it logs all incoming threaded interrupts as well as context switches, etc. If the machine hangs, it is very helpful to know a) whether we are still getting interrupts at all (such as clock interrupts, which I can tell because the softclock swi thread gets scheduled to execute) and b) what processes were doing what when the machine hung.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
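How the masks in this thread combine can be sketched in a few lines of C. The thread only states that 0x1200 covers KTR_PROC and KTR_INTR and that dropping 0x8 from 0x1208 silences the mutex operations; which of the two remaining bits belongs to which class is an assumption here, and the `TRACE_*` names and helper functions are hypothetical.

```c
#include <assert.h>

#define TRACE_LOCK	0x0008u	/* mutex operations: 0x1208 minus 0x1200 */
#define TRACE_INTR	0x0200u	/* assumed bit for KTR_INTR */
#define TRACE_PROC	0x1000u	/* assumed bit for KTR_PROC */

/*
 * A tracepoint class can only fire if it was compiled in (KTR_COMPILE)
 * and is also enabled in the runtime mask (debug.ktr.mask).
 */
unsigned
effective_mask(unsigned compiled, unsigned runtime)
{
	return compiled & runtime;
}

/* Returns 1 if the given class would be logged, 0 otherwise. */
int
class_enabled(unsigned compiled, unsigned runtime, unsigned cls)
{
	assert(cls != 0);
	return (effective_mask(compiled, runtime) & cls) != 0;
}
```

For a kernel built with KTR_COMPILE=0x1208, setting debug.ktr.mask to 0x1200 keeps the process and interrupt records while the mutex records stay compiled in but switched off.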
RE: More on system hangs ... IRQ related?
On 04-Mar-01 The Hermit Hacker wrote:
>
> Morning ...

What exactly hangs the machine, just starting X? Can you get it to hang doing, say, a buildworld?

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: 5.0-20010304-CURRENT panics during boot on Sony Vaio
On 04-Mar-01 Tom Uffner wrote:
> all of the snapshots since the 24th have exhibited this same or
> very similar behavior.

Does it happen for snapshots before the 24th?

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: More on system hangs ... IRQ related?
On 04-Mar-01 The Hermit Hacker wrote:
> On Sun, 4 Mar 2001, Alex Zepeda wrote:
>
>> On Sun, Mar 04, 2001 at 03:18:39PM -0400, The Hermit Hacker wrote:
>>
>> > My sound card overlaps with fxp0, and fxp1 overlaps with the
>> > HighPoint controller ...
>> >
>> > Grasping at straws here ...
>>
>> Ditch the HPT366, it's crap and will cause system instabilities with
>> "fast" hard drives. I'm not sure what you have attached to it, but I've
>> had problems with {either,both} an IBM ATA100 HDD and a Western Digital
>> ATA66 drive attached. Apparently the ATA100 counterpart from HighPoint
>> isn't so bad.
>
> Okay, everything in my box is SCSI, so I'm not suspecting the HPT... it's
> more the pcm & fxp that I'm thinking ...
>
> how does the OS handle having both devices hit simultaneously on the same
> IRQ?

The IRQ fires and is masked. We then run both handlers, one after the other, and when they have finished, re-enable the IRQ.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
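The mask, run-all-handlers, unmask sequence described in the reply can be modelled in plain C. This is a toy userland sketch: `struct irq_line`, `irq_add_handler()`, `irq_fire()` and `count_handler()` are invented for illustration and do not correspond to the kernel's interrupt thread code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLERS 4

typedef void (*intr_handler_t)(void *arg);

/* Hypothetical model of one interrupt line shared by several devices. */
struct irq_line {
	bool masked;
	size_t nhandlers;
	intr_handler_t handler[MAX_HANDLERS];
	void *arg[MAX_HANDLERS];
};

/* Register a handler on the line; returns 0 on success, -1 if full. */
int
irq_add_handler(struct irq_line *irq, intr_handler_t h, void *arg)
{
	if (irq->nhandlers == MAX_HANDLERS)
		return -1;
	irq->handler[irq->nhandlers] = h;
	irq->arg[irq->nhandlers] = arg;
	irq->nhandlers++;
	return 0;
}

/* Mask the line, run every handler in registration order, then unmask. */
void
irq_fire(struct irq_line *irq)
{
	size_t i;

	irq->masked = true;
	for (i = 0; i < irq->nhandlers; i++) {
		assert(irq->masked);	/* the line stays masked throughout */
		irq->handler[i](irq->arg[i]);
	}
	irq->masked = false;
}

/* Example handler: counts how often its device was serviced. */
void
count_handler(void *arg)
{
	++*(int *)arg;
}
```

Because every handler on the line runs on each interrupt, a sound card and a network card sharing an IRQ are both polled even when only one of them raised it.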
Re: 5.0-20010304-CURRENT panics during boot on Sony Vaio
On 05-Mar-01 Tom Uffner wrote:
> John Baldwin wrote:
>>
>> On 04-Mar-01 Tom Uffner wrote:
>> > all of the snapshots since the 24th have exhibited this same or
>> > very similar behavior.
>>
>> Does it happen for snapshots before the 24th?
>>
> no, it does not, at least not for the 5.0-20010210-CURRENT snap.
> it boots from the floppies and once installed, from the disk.
>
> oh well, so much for the idea that it would be easier to get past
> the libc change by installing a snapshot...

Can you try cvsupping the src/sys tree one day at a time to see what day the kernel starts breaking for you?

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: More on system hangs ... IRQ related?
On 05-Mar-01 GH wrote:
> On Sun, Mar 04, 2001 at 10:40:37PM -0800, some SMTP stream spewed forth:
>> On Sun, Mar 04, 2001 at 11:42:22PM -0400, The Hermit Hacker wrote:
>> > Okay, are there any known problems with the SB128 cards? Figuring that it
>> > couldn't hurt to remove it, I did ... so far, X hasn't hung ... not
>>
>> Hum... interesting. I also have a PCI SB128 card and one hang when I was
>> using mpg123 and then started a newfs. The SB128's IRQ isn't shared with
>> anything else. But I was also having much hangs on heavy disk traffic.
>>
>> http://people.freebsd.org/~jhb/intr2.patch has fixed my problems so far.
>
> This seems to be my luck:
> --
> The file
>
> http://people.freebsd.org/~jhb/intr2.patch
>
> does not exist at this server.
> --

It's ~jhb/patches/intr2.patch, but it has been committed, so just cvsup.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
HEADSUP: Kernel b0rked for a little while..
On 07-Mar-01 John Baldwin wrote: > jhb 2001/03/06 18:59:54 PST > > Modified files: > sys/kern kern_sig.c > Log: > - Proc locking. Most of signal handling is now MP safe and doesn't require > Giant. The only exception is the CANSIGNAL() macro. Unlocking the proc > lock around sendsig() in trapsignal() is also questionable. Note that > the functions sigexit(), psignal(), and issignal() must be called with > the proc lock of the process in question held. postsig() and > trapsignal() should not be called with the proc lock held, but they > also do not require Giant anymore either. > - Remove spl's that are now no longer needed as they are fully replaced. > > Revision ChangesPath > 1.110 +163 -71 src/sys/kern/kern_sig.c Until I finish committing all the changes to lock the process when calling psignal() and sigexit(), a kernel with INVARIANTS will panic. I'll send another message when I'm done. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: HEADSUP: Kernel b0rked for a little while..
On 07-Mar-01 John Baldwin wrote: > > On 07-Mar-01 John Baldwin wrote: >> jhb 2001/03/06 18:59:54 PST >> >> Modified files: >> sys/kern kern_sig.c >> Log: >> - Proc locking. Most of signal handling is now MP safe and doesn't >> require >> Giant. The only exception is the CANSIGNAL() macro. Unlocking the proc >> lock around sendsig() in trapsignal() is also questionable. Note that >> the functions sigexit(), psignal(), and issignal() must be called with >> the proc lock of the process in question held. postsig() and >> trapsignal() should not be called with the proc lock held, but they >> also do not require Giant anymore either. >> - Remove spl's that are now no longer needed as they are fully replaced. >> >> Revision ChangesPath >> 1.110 +163 -71 src/sys/kern/kern_sig.c > > Until I finish committing all the changes to lock the process when calling > psignal() and sigexit(), a kernel with INVARIANTS will panic. I'll send > another message when I'm done. Ok, it should work again. I'll be making one or two more commits in a bit, but the kernel as it stands now should work fine. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: Reproducable panic
On 07-Mar-01 Dag-Erling Smorgrav wrote: > At some point in the past 24 hours, someone broke the kernel so I can > no longer run linux-opera: > > root@des /var/crash# gdb -k > GNU gdb 4.18 > Copyright 1998 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "i386-unknown-freebsd". > (kgdb) source ~des/kgdb > (kgdb) kernel 5 > IdlePTD 4034560 > initial pcb at 326020 > panicstr: from debugger > panic messages: > --- > panic: cpu_switch: not SRUN > panic: from debugger > Uptime: 7m24s > > dumping to dev ad0b, offset 131104 > dump ata0: resetting devices .. done > 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 > 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 > 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 > 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 > 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 > 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 > 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 > 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 > 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 > --- >#0 dumpsys () at ../../kern/kern_shutdown.c:478 > 478 if (dumping++) { > (kgdb) where >#0 dumpsys () at ../../kern/kern_shutdown.c:478 >#1 0xc019131b in boot (howto=260) at ../../kern/kern_shutdown.c:321 >#2 0xc01916e5 in panic (fmt=0xc02940b4 "from debugger") > at ../../kern/kern_shutdown.c:571 >#3 0xc011f1d5 in db_panic (addr=-1071226260, have_addr=0, count=-1, > modif=0xd06ffc34 "") at ../../ddb/db_command.c:433 >#4 0xc011f175 in db_command 
(last_cmdp=0xc02cd880, cmd_table=0xc02cd6e0, > aux_cmd_tablep=0xc0310cdc) at ../../ddb/db_command.c:333 >#5 0xc011f23a in db_command_loop () at ../../ddb/db_command.c:455 >#6 0xc0121403 in db_trap (type=3, code=0) at ../../ddb/db_trap.c:71 >#7 0xc0265ffe in kdb_trap (type=3, code=0, regs=0xd06ffd34) > at ../../i386/i386/db_interface.c:164 >#8 0xc0274a3b in trap (frame={tf_fs = 24, tf_es = 16, tf_ds = 16, > tf_edi = -798482176, tf_esi = 256, tf_ebp = -797966976, > tf_isp = -797967008, tf_ebx = 2, tf_edx = 1017, tf_ecx = 1021, > tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1071226260, tf_cs = > 8, > tf_eflags = 70, tf_esp = -1070857153, tf_ss = -1070967837}) > at ../../i386/i386/trap.c:608 >#9 0xc026626c in Debugger (msg=0xc02a53e3 "panic") at machine/cpufunc.h:60 >#10 0xc01916dc in panic (fmt=0xc0272cdd "cpu_switch: not SRUN") > at ../../kern/kern_shutdown.c:569 >#11 0xc0272cdd in sw0_2 () >#12 0xc019b3d8 in msleep (ident=0xc0356158, mtx=0xd0682100, priority=344, > wmesg=0xc02a7fd8 "poll", timo=201) at ../../kern/kern_synch.c:459 >#13 0xc01b00f3 in poll (p=0xd0681fe0, uap=0xd06fff80) > at ../../kern/sys_generic.c:927 >#14 0xc0276510 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, > tf_edi = 142872464, tf_esi = 2000, tf_ebp = 142872424, > tf_isp = -797966380, tf_ebx = 142872464, tf_edx = 2000, tf_ecx = 1, > tf_eax = 168, tf_trapno = 12, tf_err = 2, tf_eip = 678761248, > tf_cs = 31, tf_eflags = 647, tf_esp = 142872400, tf_ss = 47}) > at ../../i386/i386/trap.c:1184 >#15 0xc026690d in syscall_with_err_pushed () >#16 0x285cac35 in ?? () > (kgdb) p *(struct proc *)0xd0681fe0 Wrong process. This is the process that we just put to sleep. The panic is complaining about the new process we just chose to run. I'll try and add in an appropriate KASSERT() to the runqueue code to actually check this before we get into cpu_switch and print out what process we are switching to, what its p_stat is, etc. This may be my fault in the recent change to linux_machdep.c. 
Hmmm, nope: mtx_lock_spin(&sched_lock); p2->p_stat = SRUN; setrunqueue(p2); mtx_unlock_spin(&sched_lock); It should be fine. :( (The change was to linux_clone()). -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: Panic mounting msdos fs
On 08-Mar-01 Andrea Campi wrote:
> Yesterday -current:
>
> # mount /msdos
>
> Acquring duplicate lock of same type: "lockmgr interlock"
>  1st @ ../../kern/kern_lock.c:239
>  2nd @ ../../kern/kern_lock.c:239
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x0
> fault code              = supervisor write, page not present
> instruction pointer     = 0x8:0xc015c3f5
> stack pointer           = 0x10:0xc7821ce4
> frame pointer           = 0x10:0xc7821cf0
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 489 (mount_msdos)
> kernel: type 12 trap, code=0
> Stopped at witness_exit+0x23d: movl %eax,0(%edx)
> db>

(kgdb) l *(witness_exit+0x23d)
0xc0200e41 is in witness_exit (../../kern/kern_mutex.c:1303).
1300            if ((flags & MTX_NOSWITCH) == 0 && !mtx_legal2block() && !cold)
1301                    panic("switchable mtx_unlock() of %s when not legal @ %s:%d",
1302                        m->mtx_description, file, line);
1303            LIST_REMOVE(m, mtx_held);
1304            m->mtx_held.le_prev = NULL;
1305    }

Hmmm. It dereferenced NULL in the LIST_REMOVE(). Did you get a dump by any chance?

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Panic mounting msdos fs
On 08-Mar-01 Andrea Campi wrote:
> Yesterday -current:

It seems modules are broken with witness right now. Before, you were ok as long as you didn't unload the darn things; now they seem to be toast altogether, so you will want to use a static kernel until this is fixed.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: Panic mounting msdos fs
On 09-Mar-01 Andrea Campi wrote:
> Hi John,
>
> didn't have time to reply to your previous email (I was sleeping ;-)) before
> getting this. Do you still need the dump?

No, jlemon and I figured it out on irc. I'm not sure why it's broken yet.

>> On 08-Mar-01 Andrea Campi wrote:
>> > Yesterday -current:
>>
>> It seems modules are broken with witness right now. Before you were ok as
>> long as you didn't unload the darn things, now they seem to be toast
>> altogether, so you will want to use a static kernel for now until this is
>> fixed.
>
> Doesn't it warrant a very big HEADSUP?

Probably. They've already been partially broken, though the partial brokenness that I know how to fix will hopefully be fixed before too long. I'm not sure yet why sticking a mutex in a kld causes us to walk off a NULL pointer.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: Entropy harvesting? Grim reaper is more like it...
On 09-Mar-01 Matthew Jacob wrote: > On Fri, 9 Mar 2001, Mark Murray wrote: > >> > I changed nothing from whatever the default is. It seems like a bit of >> > POLA to >> > freeze now. >> > >> > But I'll check this - if I can get that machine up again :-)... >> >> OK - if this is the entropy driver, then typing about 2 lines of shit >> will unlock it. > > Neither ^T or typing does anything. I'll have to do some surgery from another > boot disk to find out what's what. Uk. Erm, just so you know. The 4100 here at WC doesn't even make it past the SCSI probe due to interrupt issues. If it's running really up to date current, try changing sys/alpha/include/mutex.h to define mtx_intr_enable() to nothing, which will hackishly run ithreads with a raised IPL and might solve the problem if it's an interrupt storm you are seeing. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: -CURRENT no longer boots
On 12-Mar-01 Alexander N. Kabaev wrote: > Do you have WITNESS_SKIPSPIN option in your kernel config? > > Here is what supposedly causing the trouble: > > a) the process p_spinlocks variable is initialized to one in fork1 during >the process creation > b) the sched_lock is released later in fork_exit, but the process' >p_spinlocks field is not decreased because sched_lock is not tracked by > the >witness subsystem > b) process tried to grab Giant but sees p_spinlocks > 0 ... instant panic :) c) As part of the new witness code, move p_spinlocks (well, a variation thereof) to be a per-CPU variable. > The quick and dirty fix is to either set debug.witness_skipspin=0 in > /boot/loader.conf or modify witness_enter function to ignore p_skipspin > counter > if debug.witness_skipspin is non-zero. Just don't use the skipspin stuff, it shouldn't hurt at all. The new witness code will hopefully be in by the end of the week. *crosses fingers* -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
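The counter imbalance described in steps a) and b) above can be modelled in a few lines of userland C. This is only a sketch of the bookkeeping: struct model_proc and the model_* helpers are hypothetical names, and the real witness code does considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-process spin lock count. */
struct model_proc {
	int p_spinlocks;
};

/* fork1() hands the child sched_lock, so the count starts at one. */
static void
model_fork1(struct model_proc *p)
{
	p->p_spinlocks = 1;
}

/*
 * Releasing a spin lock.  With skipspin set, witness does not track
 * spin locks, so the count is never decremented.
 */
static void
model_spin_exit(struct model_proc *p, bool skipspin)
{
	if (skipspin)
		return;
	assert(p->p_spinlocks > 0);
	p->p_spinlocks--;
}

/* Taking a sleep mutex such as Giant is only legal with no spin locks held. */
static bool
model_sleep_lock_ok(const struct model_proc *p)
{
	return (p->p_spinlocks == 0);
}
```

With skipspin set, the count stays at one after fork_exit() drops sched_lock, so the later Giant acquire looks illegal even though no spin lock is actually held.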
Re: panic trying to play Civillization (with trace, etc.)
On 12-Mar-01 Dag-Erling Smorgrav wrote: > Mikhail Teterin <[EMAIL PROTECTED]> writes: >> > If you can, please reproduce the panic on a kernel compiled with the >> > INVARIANTS, INVARIANT_SUPPORT and WITNESS options. >> Well, with this options on, the machine does not crash, but the >> program segfaults on startup: > > The trace you're showing looks like it's from a shell script that > starts civctp. I need to see the trace from the civctp binary itself. > >> lock order reversal >> 1st lockmgr interlock last acquired @ ../../kern/kern_lock.c:239 >> 2nd 0xcefa0520 process lock @ ../../kern/kern_sig.c:183 >> 3rd 0xc1029f80 lockmgr interlock @ ../../kern/kern_lock.c:560 > > Haven't seen this one before... If it's reproducible, could you do the > following: It's stupidness due to proctree and allproc locks being backed by lockmgr I think. I'm waiting on looking at this one until proctree and allproc are converted to sx locks. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: random as module needs work
On 13-Mar-01 Andrew Gallatin wrote: > Gdb says: > > (gdb) l* 0xfc42f824 > 0xfc42f824 is in name2oid (../../kern/kern_sysctl.c:621). > 616 *p = '\0'; > 617 > 618 oidp = SLIST_FIRST(lsp); > 619 > 620 while (oidp && *len < CTL_MAXNAME) { > 621 if (strcmp(name, oidp->oid_name)) { > 622 oidp = SLIST_NEXT(oidp, oid_link); > 623 continue; > 624 } > 625 *oid++ = oidp->oid_number; Perhaps static sysctls in modules are broken for some reason? The sysctls were all recently changed from dynamic to static. > When I boot into single user mode and try to load the module after boot, this > happens: > Enter full pathname of shell or RETURN for /bin/sh: ># kldload random > panic: cpu_fork: curproc This is a bug. For kernel threads, we fork off of proc0, not curproc, so that check in the alpha cpu_fork() is bogus. > syncing disks... > done > Uptime: 27s -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: new breakage in mounting root? a devfs issue?
On 13-Mar-01 Matthew Jacob wrote: > > To refresh memory: > >> fatal kernel trap: >> >> trap entry = 0x4 (unaligned access fault) >> a0 = 0xc3615fe1a88f382 >> a1 = 0x29 >> a2 = 0x1b >> pc = 0xfc467578 >> ra = 0xfc4627c4 >> curproc= 0xfe0009f5dbe0 >> pid = 1, comm = init >> >> Stopped at vfs_object_create+0x38: jsr >> ra,(pv),vfs_object_create+0x3c >> >> db> t >> vfs_object_create() at vfs_object_create+0x38 >> getnewvnode() at getnewvnode+0x564 >> devfs_allocv() at devfs_allocv+0xe0 >> devfs_root() at devfs_root+0x38 >> devfs_mount() at devfs_mount+0xf0 >> vfs_mount() at vfs_mount+0x910 >> mount() at mount+0xd8 >> syscall() at syscall+0x3f4 >> XentSys1() at XentSys1+0x10 > > > Interestingly enough, as Christian had also reported, a build of a GENERIC > kernel seems to solve this problem. > > This is almost more alarming than a potential bug in vfs_object_create- as > the > difference between the config file I was using should not cause this. > > *I* sure can't spot what config option might be different. I also had done a > complete removal of the build directory and complete fresh build of GPLUS. > (sounds of hair tearing). Can you possibly try to narrow the differences down by trying out various kernel configs in between GPLUS and GENERIC? -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: new breakage in mounting root? a devfs issue?
On 13-Mar-01 Matthew Jacob wrote: >> Can you possibly try to narrow the differences down by trying out various >> kernel >> configs in between GPLUS and GENERIC? > > Actually- look at the diffs at least and tell me which you think it might > be. All of the diffs are either kernel support flavors for alpha, which > shouldn't matter to devfs for a damn, or drivers. > > If nobody can get to this, I'll try and look at this further, but this is > looking very very strange. It could be some driver screwing up with makedev() though one would think we'd have hit that before now. It could be something really odd relating to the size of the kernel. It could be the maxusers change resulting in kernel memory being laid out differently. *shrug* I didn't see anything in that diff that would have broken this either. Does it happen w/o devfs? I'm updating my alpha to today's current, so perhaps I'll run into this here. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Latest version of mega header file POSIX update
On 15-Mar-01 Garrett Wollman wrote: > The patch has now gotten too large for some e-mail systems, so I'm > making it available via the Web at > <http://khavrinen.lcs.mit.edu/includes.patch>. I don't think the sys/conf/Makefile.i386 change is needed. :) Nothing else jumped out at me while I glanced over it however, and it seems fine at first glance. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Latest version of mega header file POSIX update
On 16-Mar-01 Garrett Wollman wrote: > < > said: > >> Nothing else jumped out at me while I glanced over it however, and >> it seems fine at first glance. > > But did you *test* it? I know it compiles. No, not yet. I can try it out on my SMP and alpha testboxes here, though the witness_exit panic deadlocks my alpha under heavy load. :-P > -GAWollman -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: very strange problem with ps
On 17-Mar-01 David Malone wrote: > On Fri, Mar 16, 2001 at 06:21:46PM -0800, Brooks Davis wrote: >> Ah, you are correct. I should have tried that. What a strange bug. > > It happens for any option which causes the sysctl to return no > processes to libkvm. (Try ps -p 10). I think the following > patch should fix the problem. > > (Kirk changed the way the struct proc size was checked, and the > old way happened to work OK if no data was returned. Kirk, should > I go ahead and commit this?) > > David. I actually prefer the ESRCH patch as a) it better describes what happens and b) it returns a proper error when no processes are found, making it easier for other programs to detect this error condition. Programs should already be checking for an error return from the sysctlbyname() that they use to get this (or else they allow for kvm to inform them of errors) and thus won't need to add in special case checks for 'size > 0'. errno is the standard way of returning errors after all. :) -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
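To make the ESRCH argument concrete, here is a small userland sketch of the calling convention being preferred. model_get_proc() is a hypothetical stand-in for the lookup, not the real sysctl handler or any libkvm function; the point is only that the caller checks the return value and errno instead of special-casing an empty result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical lookup: find wanted_pid in a table of pids.
 * Returns 0 and stores the pid in *out on success; returns -1 with
 * errno set to ESRCH when no process matches.
 */
static int
model_get_proc(const int *table, size_t n, int wanted_pid, int *out)
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < n; i++) {
		if (table[i] == wanted_pid) {
			*out = table[i];
			return (0);
		}
	}
	errno = ESRCH;
	return (-1);
}
```

A caller that already tests for -1 picks up the "no such process" case for free and needs no separate 'size > 0' check.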
RE: Interesting backtrace...
On 18-Mar-01 Dag-Erling Smorgrav wrote: > I finally caught a backtrace from one of those recurring stack smash > panics. I've been getting a few of these every day for a couple of > weeks now but never caught a dump; I caught this one by typing 'panic' > immediately instead of trying to get a trace at the ddb prompt first. > > These panics invariably start like this (always the same eip): > > kernel: type 12 trap, code=0 > Stopped at -0xfc81:kernel: type 12 trap, code=0 > db> > > Anyway, here's the backtrace: >#12 0xc023c8bb in vm_fault (map=0xd0768a00, vaddr=138502144, > fault_type=2 '\002', fault_flags=8) at ../../vm/vm_page.h:493 pmap_zero_page(VM_PAGE_TO_PHYS(m)); Can you throw some extra tests in there to make sure m isn't NULL? Also, you might want to check VM_PAGE_TO_PHYS(m) for any weird values. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
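The kind of extra test suggested above might look roughly like the following userland sketch. struct model_vm_page, the 4096-byte page size, and the max_phys bound are all assumptions made for illustration; in the kernel the equivalent checks would be assertions placed before the pmap_zero_page() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u                  /* assumed page size */
#define MODEL_PAGE_MASK (MODEL_PAGE_SIZE - 1)

/* Hypothetical stand-in for a vm_page with its physical address. */
struct model_vm_page {
	uint64_t phys_addr;
};

/*
 * Returns 1 if the page pointer and its physical address look sane:
 * non-NULL, page aligned, and below max_phys.  Returns 0 otherwise.
 */
static int
model_page_sane(const struct model_vm_page *m, uint64_t max_phys)
{
	assert(max_phys != 0);
	if (m == NULL)
		return (0);
	if ((m->phys_addr & MODEL_PAGE_MASK) != 0)
		return (0);
	if (m->phys_addr >= max_phys)
		return (0);
	return (1);
}
```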
RE: Here's another one for you...
On 19-Mar-01 Dag-Erling Smorgrav wrote: > SMP box with a bleeding-edge -CURRENT kernel, patched to avoid the > i586_bzero() problem: > > panic: mutex_enter: recursion on non-recursive mutex process lock @ > ../../i386/i386/trap.c:854 > cpuid = 1; lapic.id = 0100 > Debugger("panic") That's a later symptom of a problem. We recursed on the proc lock doing the PHOLD before we handled the page fault. > CPU1 stopping CPUs: 0x0001... stopped. > Stopped at Debugger+0x45: pushl %ebx > db> show mutex > "panic" (0xc030b1e0) locked at ../../kern/kern_shutdown.c:544 > "process lock" (0xd3f15000) locked at ../../i386/i386/machdep.c:625 This is in sendsig(): p = curproc; PROC_LOCK(p); psp = p->p_sigacts; if (SIGISMEMBER(psp->ps_osigset, sig)) { ... > "Giant" (0xc0309ac0) locked at ../../i386/i386/trap.c:1169 > db> trace > Debugger(c027d5e1) at Debugger+0x45 > panic(c027c420,c027a154,c02997d0,356,d3f14ee0) at panic+0x144 > witness_enter(d3f15000,0,c02997d0,356) at witness_enter+0x355 > trap_pfault(d7345d4c,0,0) at trap_pfault+0x143 > trap(18,10,10,d7345fa8,0) at trap+0x978 > calltrap() at calltrap+0x5 > --- trap 0xc, eip = 0, esp = 0xd7345d8c, ebp = 0xd7345ed8 --- > (null)(805c3e0,e,d7345f10,0,4) at 0 > postsig(e) at postsig+0x40b Hmmm. An eip of 0 is bad. This could be just another instance of the bzero bug just in another place. You probably want to change the code that actually sets *bzero to i586_bzero (and same for any other ops that use floating point). The code in question for this lies in i386/isa/npx.c. It seems we use the fp regs for copyin/copyout and bcopy as well. I would just change line 458 of npx.c to say '#ifdef I586_CPU_XXX' for now as your temporary patch (then you don't need to patch pmap_zero_page() anymore.) -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
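The "recursion on non-recursive mutex" check that fired here can be illustrated with a toy userland model. Everything below is hypothetical (struct model_mtx, model_lock(), the return codes); it only mirrors the sequence in the trace, where the process lock is taken in sendsig() and then a page fault handler tries to take it again on behalf of the same process.

```c
#include <assert.h>
#include <stddef.h>

/* Toy mutex: records the owner and whether recursion is permitted. */
struct model_mtx {
	const void *owner;  /* NULL when unowned */
	int recursive;      /* nonzero if the owner may re-acquire */
	int depth;          /* extra acquisitions by the owner */
};

/*
 * Try to acquire m on behalf of curproc.
 * Returns 0 on success, 1 if another owner holds it (would block),
 * and -1 for recursion on a non-recursive mutex, which is where the
 * kernel panics instead of returning.
 */
static int
model_lock(struct model_mtx *m, const void *curproc)
{
	assert(m != NULL && curproc != NULL);
	if (m->owner == curproc) {
		if (!m->recursive)
			return (-1);
		m->depth++;
		return (0);
	}
	if (m->owner != NULL)
		return (1);
	m->owner = curproc;
	return (0);
}
```

In this model the first model_lock() call corresponds to PROC_LOCK(p) in sendsig(), and the second call, made from the fault path for the same process, is the one that returns -1.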
Re: CURRENT instability
On 19-Mar-01 Pierre Beyssac wrote: > On Mon, Mar 19, 2001 at 10:16:00PM +0100, Dag-Erling Smorgrav wrote: >> Dag-Erling Smorgrav <[EMAIL PROTECTED]> writes: >> > Try this workaround (apply with 'patch -l'): >> >> Here's a better workaround. Revert the previous patch and apply this >> one: > > Ok, thanks, note that your previous patch works fine, at least my > make world is still running :-) The previous patch is not sufficient. It only fixes one instance of bzero, but currently all instances of bzero, bcopy, copyin, and copyout are broken on the 586 and his second patch fixes all of them. > I'll try this one ASAP. > >> +#ifdef I586_CPU_DOES_NOT_WORK > -- > Pierre Beyssac[EMAIL PROTECTED] -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Whatever happened to CTM?
On 20-Mar-01 Michael C . Wu wrote: > On Tue, Mar 20, 2001 at 02:07:13AM +0200, Vladimir Kushnir scribbled: >| Is there anything wrong with our CTM system now? There doesn't seem to be >| any deltas (either src-cur, or ports-cur) since Mar 12 :-( > > For all connections greater than 9600baud modems, we recommend > using CVSup to get src-all and ports-all updated. At the worst case, > be able to CVSup a ports-all collection within an hour, with heavy > packet loss and low bandwidth. > > i.e. CTM sucks, don't use it. :) cvsup is not available via e-mail for those who may only have e-mail access for one reason or another. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: Here's another one for you...
On 20-Mar-01 Bruce Evans wrote: > On Mon, 19 Mar 2001, John Baldwin wrote: > >> Hmmm. An eip of 0 is bad. This could be just another instance of the bzero >> bug just in another place. You probably want to change the code that >> actually >> sets *bzero to i586_bzero (and same for any other ops that use floating >> point). >> The code in question for this lies in i386/isa/npx.c. It seems we use the >> fp >> regs for copyin/copyout and bcopy as well. I would just change line 458 of >> npx.c to say '#ifdef I586_CPU_XXX' for now as your temporary patch (then you >> don't need to patch pmap_zero_page() anymore.) > > There is no need to change anything. Just disable the fp optimizations > using the npx flags. That works, too, but until i586_* are fixed they need to default to off, not to on. :) I'm not suggesting committing this, just suggesting a local hack for testing anyways. > Bruce -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: hello
On 21-Mar-01 zer0byte wrote: > hi i just switch to FreeBSD 5.0 current.. > just wanna let you guys know =) You sure you want to do that? :) -current isn't very friendly right now. You have read about what current is in the handbook, right? -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: World breaks in sbin/fsdb [PATCH]
On 21-Mar-01 Ollivier Robert wrote: > cvs diff: Diffing . > Index: fsdbutil.c > === > RCS file: /home/ncvs/src/sbin/fsdb/fsdbutil.c,v > retrieving revision 1.10 > diff -u -2 -r1.10 fsdbutil.c > --- fsdbutil.c 2000/05/01 20:01:16 1.10 > +++ fsdbutil.c 2001/03/21 13:42:01 > @@ -35,4 +35,5 @@ > > #include > +#include > #include > #include > @@ -43,4 +44,5 @@ > > #include > +#include > > #include "fsdb.h" > > -- > Ollivier ROBERT -=- Eurocontrol EEC/ITM -=- > [EMAIL PROTECTED] > FreeBSD caerdonn.eurocontrol.fr 5.0-CURRENT #46: Wed Jan 3 15:52:00 CET 2001 If it fixes it, please commit. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: boom in a syscalll
On 21-Mar-01 Matthew Jacob wrote: > Fatal trap 12: page fault while in kernel mode > cpuid = 0; lapic.id = > fault virtual address = 0x14 NULL pointer deref.. > fault code = supervisor read, page not present > instruction pointer = 0x8:0xc019d7fa > stack pointer = 0x10:0xc8f45ea4 > frame pointer = 0x10:0xc8f45eb0 > code segment = base 0x0, limit 0xf, type 0x1b > = DPL 0, pres 1, def32 1, gran 1 > processor eflags = resume, IOPL = 0 > current process = 331 (bash) > kernel: type 12 trap, code=0 > > CPU0 stopping CPUs: 0x0002... stopped. > Stopped at propagate_priority+0x6e: cmpl 0x14(%esi),%ebx > db> t > propagate_priority(c8b4c100) at propagate_priority+0x6e (kgdb) l *propagate_priority+0x6e 0xc019fdce is in propagate_priority (../../kern/kern_mutex.c:201). 201 MPASS(p->p_magic == P_MAGIC); Well, err, maybe this line, though considering none of the rest of my line numbers match up, I doubt it is this one. :( Could you pull up kgdb on your kernel.debug and find out which line this died in? It could be that p is NULL (which could be an uninitialized mutex that a mtx_lock later blocked on). In fact, this is quite likely just using a mutex that hasn't been init'd. Might want to add in some code to try to display the description of a mutex if p == NULL (though it is probably invalid, too.) Another take might be to add assertions to the start of mtx_lock_flags() and mtx_lock_spin_flags() that panic if mtx_lock == 0. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
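The assertion idea at the end of that message can be sketched in userland as follows. The premise, taken from the suggestion itself, is that an initialized but unowned mutex carries a nonzero cookie in mtx_lock, so a value of 0 means the mutex was never initialized or has been zeroed. The MODEL_MTX_UNOWNED value, the struct layout, and the return codes are assumptions for illustration; the kernel version would panic rather than return -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MTX_UNOWNED ((uintptr_t)0x8)  /* assumed nonzero "unowned" cookie */

/* Hypothetical mutex: mtx_lock holds the cookie or the owner's address. */
struct model_mtx {
	uintptr_t mtx_lock;
	const char *description;
};

static void
model_mtx_init(struct model_mtx *m, const char *description)
{
	m->mtx_lock = MODEL_MTX_UNOWNED;
	m->description = description;
}

/*
 * Returns 0 if acquired, 1 if already owned (would block), and -1 if
 * mtx_lock == 0, i.e. the mutex was never initialized or was zeroed.
 */
static int
model_mtx_lock(struct model_mtx *m, uintptr_t owner)
{
	assert(owner != 0 && owner != MODEL_MTX_UNOWNED);
	if (m->mtx_lock == 0)
		return (-1);
	if (m->mtx_lock != MODEL_MTX_UNOWNED)
		return (1);
	m->mtx_lock = owner;
	return (0);
}
```

Catching the zero value at lock time reports the problem at the first use of the bad mutex, rather than later when priority propagation follows a bogus owner pointer.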
RE: FOLLOWUP: RE: boom in a syscalll
On 22-Mar-01 Matthew Jacob wrote: > > Followup- updated kernel, rebuilt, and the same thing that triggered this > before (^Z in vi) happened again, but this time with a different traceback: I wish I could reproduce this. :( > Fatal trap 12: page fault while in kernel mode > cpuid = 0; lapic.id = > fault virtual address = 0x14 > fault code = supervisor read, page not present > instruction pointer = 0x8:0xc019d9f2 > stack pointer = 0x10:0xc7fc5f30 > frame pointer = 0x10:0xc7fc5f3c > code segment= base 0x0, limit 0xf, type 0x1b > = DPL 0, pres 1, def32 1, gran 1 > processor eflags= resume, IOPL = 0 > current process = 19 (irq2: fxp0 isp1) > kernel: type 12 trap, code=0 > > CPU0 stopping CPUs: 0x0002... stopped. > Stopped at propagate_priority+0x6e:cmpl0x14(%esi),%ebx Same place as last time. It looks like a mutex is either being used before it is initialized or being zero'd, or that curproc is NULL at some point. :( If a mutex is getting zero'd, then that might explain the other panics in witness_exit(). -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: newcard/cardbus instabilities
On 23-Mar-01 Bill Paul wrote: >> >> That's a bit ugly. >> >> > xl0: <3Com 3c575C Fast Etherlink XL> port 0x3000-0x307f mem >> > 0x4402-0x4403,0x44002480-0x440024ff,0x44002400-0x4400247f irq 10 >> > at >> > device 0.0 on cardbus1 >> > xl0: chip is in D6 power mode -- setting to D0 >> >> I'm a bit worried about this; "D6" doesn't really exist, so it's possible >> that something is going wrong here. >> >> Bill; you might have some better ideas than I do. Suggestions? > > My suggestion? Chop out the power management stuff in xl_attach() > and see what happens. The xl driver is using the pci_get_powerstate() > and pci_set_powerstate() routines right now in order to check for PCI > NICs that have been forced into the D3 state by Windoze during shutdown. > However, those functions are internal to the PCI bus code, and I'm not > sure what will happen when you try to use them with devices that are > children of a cardbus bus. > > So, edit /sys/pci/if_xl.c, find the xl_attach() function, and comment > out/#ifdef out/delete the section that checks the power state of the card. > Like Mike says, the D6 state is bogus. > > Unfortunately, I can't test this myself at the moment since I find myself > without a laptop. I might be able to coerce^Wconvince John Baldwin to > let me test this with his though. You'll have to coerce^Wconvince Mike or Warner to fix cardbus resource allocation so that cardbus cards don't try to stomp on PCI devices on my machine and freeze it until I eject the card. :) > -Bill -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: spin lock panic ...
On 24-Mar-01 The Hermit Hacker wrote: > > Over the past few days, I reformatted my computer clean, installed > 4.2-RELEASE onto her and just finished upgrading to the latest 5.x kernel > and world ... went to ports/x11/XFree86-4 and did a 'make install' ... > after awhile, it panic'd as below: > > panic: spin lock sched lock held by 0xcb332840 for > 5 seconds > cpuid = 1; lapic.id = 0100 > Debugger("panic") > > CPU1 stopping CPUs: 0x0001... stopped. > > Now, I do have DDB enabled in the kernel ... but, for the life of me, I > can't seem to find any docs on the keystroke required to drop into it :( > 'man ddb' doesn't document it that I can find, nor does LINT ... Uhh, if it panic'd, it should already be in ddb at a db> prompt assuming you have DDB in your kernel. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: spin lock panic ...
On 24-Mar-01 The Hermit Hacker wrote: > On Fri, 23 Mar 2001, John Baldwin wrote: > >> >> On 24-Mar-01 The Hermit Hacker wrote: >> > >> > Over the past few days, I reformatted my computer clean, installed >> > 4.2-RELEASE onto her and just finished upgrading to the latest 5.x kernel >> > and world ... went to ports/x11/XFree86-4 and did a 'make install' ... >> > after awhile, it panic'd as below: >> > >> > panic: spin lock sched lock held by 0xcb332840 for > 5 seconds >> > cpuid = 1; lapic.id = 0100 >> > Debugger("panic") >> > >> > CPU1 stopping CPUs: 0x0001... stopped. >> > >> > Now, I do have DDB enabled in the kernel ... but, for the life of me, I >> > can't seem to find any docs on the keystroke required to drop into it :( >> > 'man ddb' doesn't document it that I can find, nor does LINT ... >> >> Uhh, if it panic'd, it should already be in ddb at a db> prompt assuming you >> have DDB in your kernel. > > didn't ... last line was as above ... I've even tested my DDB to make sure > I can get to the db> prompt after alfred reminded me of the ctl-alt-esc to > get there, and it works ... Well, that means it most likely deadlocked trying to get into the debugger. I haven't seen any deadlocks like this in months though. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: panic: resource_list_alloc
On 25-Mar-01 The Hermit Hacker wrote: > > just upgraded my tree and did a reinstall ... trace is: > > resource_list_alloc(c0d9eec0,c0d90180,c0d99b80,4,c0d4a30c) at > resource_list_alloc+0xd3 > isa_alloc_resource() @ +0xd0 > bus_alloc_resource() @ +0x5f > opti_detect @ +0x99 This is the second such one I've seen in opti_detect. I'm guessing that 'mms' is a sound driver? If so, can you try taking it out of your kernel to see if that is what is causing the panic? > mss_detect @ +0x52 > mss_probe @ +0x30a > device_probe_child @ +0xca > device_probe_and_attach @ +0x41 > isa_probe_children @ +0xde > configure @ +0x32 > mi_startup @ +0x6e > begin @ +0x29 > > Marc G. Fournier ICQ#7615664 IRC Nick: > Scrappy > Systems Administrator @ hub.org > primary: [EMAIL PROTECTED] secondary: > scrappy@{freebsd|postgresql}.org > > > To Unsubscribe: send mail to [EMAIL PROTECTED] > with "unsubscribe freebsd-current" in the body of the message -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: panic: resource_list_alloc
On 25-Mar-01 The Hermit Hacker wrote: > > doing so right now ... one quick/stupid question ... how does one > 'reinstall' a new kernel so that you don't lose the /boot/kernel.old (aka > backup that worked)? I've been moving files around before installing the > rebuilt kernel, but that doesn't sound very efficient ... :) > > thanks .. # cd /sys/compile/FOO # < build like normal > # make reinstall Also, if you want to be super careful, keep a /boot/kernel.good lying around that has a known working kernel in it. > On Sat, 24 Mar 2001, John Baldwin wrote: > >> >> On 25-Mar-01 The Hermit Hacker wrote: >> > >> > just upgraded my tree and did a reinstall ... trace is: >> > >> > resource_list_alloc(c0d9eec0,c0d90180,c0d99b80,4,c0d4a30c) at >> > resource_list_alloc+0xd3 >> > isa_alloc_resource() @ +0xd0 >> > bus_alloc_resource() @ +0x5f >> > opti_detect @ +0x99 >> >> This is the second such one I've seen in opti_detect. I'm guessing that >> 'mms' >> is a sound driver? If so, can you try taking it out of your kernel to see >> if >> that is what is causing the panic? >> >> > mss_detect @ +0x52 >> > mss_probe @ +0x30a >> > device_probe_child @ +0xca >> > device_probe_and_attach @ +0x41 >> > isa_probe_children @ +0xde >> > configure @ +0x32 >> > mi_startup @ +0x6e >> > begin @ +0x29 >> > >> > Marc G. Fournier ICQ#7615664 IRC Nick: >> > Scrappy >> > Systems Administrator @ hub.org >> > primary: [EMAIL PROTECTED] secondary: >> > scrappy@{freebsd|postgresql}.org >> > >> > >> > To Unsubscribe: send mail to [EMAIL PROTECTED] >> > with "unsubscribe freebsd-current" in the body of the message >> >> -- >> >> John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ >> PGP Key: http://www.baldwin.cx/~john/pgpkey.asc >> "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ >> > > Marc G. Fournier ICQ#7615664 IRC Nick: > Scrappy > Systems Administrator @ hub.org > primary: [EMAIL PROTECTED] secondary: > scrappy@{freebsd|postgresql}.org > > > To Unsubscribe: send mail to [EMAIL PROTECTED] > with "unsubscribe freebsd-current" in the body of the message -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: panic: resource_list_alloc
On 25-Mar-01 The Hermit Hacker wrote: > > removing pcm fixes the panic, it appears ... David O`Brien just confirmed it on his box as well. > On Sat, 24 Mar 2001, John Baldwin wrote: > >> >> On 25-Mar-01 The Hermit Hacker wrote: >> > >> > just upgraded my tree and did a reinstall ... trace is: >> > >> > resource_list_alloc(c0d9eec0,c0d90180,c0d99b80,4,c0d4a30c) at >> > resource_list_alloc+0xd3 >> > isa_alloc_resource() @ +0xd0 >> > bus_alloc_resource() @ +0x5f >> > opti_detect @ +0x99 >> >> This is the second such one I've seen in opti_detect. I'm guessing that >> 'mms' >> is a sound driver? If so, can you try taking it out of your kernel to see >> if >> that is what is causing the panic? >> >> > mss_detect @ +0x52 >> > mss_probe @ +0x30a >> > device_probe_child @ +0xca >> > device_probe_and_attach @ +0x41 >> > isa_probe_children @ +0xde >> > configure @ +0x32 >> > mi_startup @ +0x6e >> > begin @ +0x29 >> > >> > Marc G. Fournier ICQ#7615664 IRC Nick: >> > Scrappy >> > Systems Administrator @ hub.org >> > primary: [EMAIL PROTECTED] secondary: >> > scrappy@{freebsd|postgresql}.org >> > >> > >> > To Unsubscribe: send mail to [EMAIL PROTECTED] >> > with "unsubscribe freebsd-current" in the body of the message >> >> -- >> >> John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ >> PGP Key: http://www.baldwin.cx/~john/pgpkey.asc >> "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ >> > > Marc G. Fournier ICQ#7615664 IRC Nick: > Scrappy > Systems Administrator @ hub.org > primary: [EMAIL PROTECTED] secondary: > scrappy@{freebsd|postgresql}.org > -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: top output broked?
On 27-Mar-01 Alfred Perlstein wrote: > > PID USERNAME PRI NICE SIZERES STATE C TIME WCPUCPU COMMAND > 824 root -80 1048K 596K biord 0 0:38 0.00% 0.00% find > 385 root 40 32740K 31944K select 1 0:32 0.00% 0.00% XFree86 > 836 root -80 532K 276K biord 1 0:07 0.00% 0.00% nfsd > 14848 root 960 26912K 26832K RUN1 0:04 0.00% 0.00% ld > 424 bright 40 2120K 1340K select 0 0:04 0.00% 0.00% rxvt > > > no cpu time, known issue? Not one that I've seen: PID USERNAME PRI NICE SIZERES STATE C TIME WCPUCPU COMMAND 11 root -160 0K 0K CPU0 0 79.5H 49.37% 49.37% idle: cpu0 10 root -160 0K 0K RUN1 79.4H 48.19% 48.19% idle: cpu1 13 root -48 -167 0K 0K WAIT 0 62:53 0.00% 0.00% swi6: tty:s 15 root 760 0K 0K sleep 0 6:07 0.00% 0.00% random 5 root 200 0K 0K syncer 1 2:47 0.00% 0.00% syncer 20 root -68 -187 0K 0K WAIT 1 1:18 0.00% 0.00% irq18: fxp0 19 root -64 -183 0K 0K WAIT 0 0:53 0.00% 0.00% irq16: ahc0 12 root -44 -163 0K 0K WAIT 0 0:52 0.00% 0.00% swi1: net 18 root -36 -155 0K 0K WAIT 1 0:49 0.00% 0.00% swi3: cambi 4 root -160 0K 0K psleep 0 0:41 0.00% 0.00% bufdaemon 283 root 40 552K 388K select 0 0:10 0.00% 0.00% dhclient If you run 'top -S' does all your time show up in the idle processes like it does here? -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Current for production?
On 27-Mar-01 Gabriel Ambuehl wrote: > While I'm writing this: what is the general opinion about having > CURRENT on production servers (I'd really love to deploy the ACLs > ASAP)? I don't plan to use SMP and can wait for snapshots til the > RELEASE comes... Don't. ACL's are still not production quality yet, and the SMP work breaks UP kernels just as bad as SMP kernels when it breaks. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: top output broked?
On 27-Mar-01 David Wolfskill wrote: >>Date: Tue, 27 Mar 2001 08:33:10 -0800 (PST) >>From: John Baldwin <[EMAIL PROTECTED]> > >>Not one that I've seen: > >> PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND >> 11 root -160 0K 0K CPU0 0 79.5H 49.37% 49.37% idle: cpu0 >> 10 root -160 0K 0K RUN1 79.4H 48.19% 48.19% idle: cpu1 >> 13 root -48 -167 0K 0K WAIT 0 62:53 0.00% 0.00% swi6: tty:s >> 15 root 760 0K 0K sleep 0 6:07 0.00% 0.00% random >> 5 root 200 0K 0K syncer 1 2:47 0.00% 0.00% syncer >> 20 root -68 -187 0K 0K WAIT 1 1:18 0.00% 0.00% irq18: fxp0 >> 19 root -64 -183 0K 0K WAIT 0 0:53 0.00% 0.00% irq16: ahc0 >> 12 root -44 -163 0K 0K WAIT 0 0:52 0.00% 0.00% swi1: net >> 18 root -36 -155 0K 0K WAIT 1 0:49 0.00% 0.00% swi3: cambi >> 4 root -160 0K 0K psleep 0 0:41 0.00% 0.00% bufdaemon >> 283 root 40 552K 388K select 0 0:10 0.00% 0.00% dhclient > >>If you run 'top -S' does all your time show up in the idle processes like it does here? > > Hmm... mine looks like that (modulo #CPUs), except when I'm actually > making it do some work (re-building the kernel, in this case). What I > see ("top -S") looks like: > > last pid: 9546; load averages: 0.97, 0.64, 0.30 up 0+00:08:32 08:51:47 > 77 processes: 3 running, 57 sleeping, 2 zombie, 15 waiting > CPU states: 91.1% user, 0.0% nice, 5.4% system, 0.4% interrupt, 3.1% idle This is probably right. I don't know why you are seeing such weirdness, however. Are your world and kernel out of sync? It's a nice (mis)feature now that if items in the middle of kinfo_proc change size, it still tries to use the misordered data rather than complaining about it like it used to. :-P See my other e-mail where top on my laptop doles out time to userland tasks ok. > I confess a degree of skepticism :-} I agree. -- John Baldwin <[EMAIL PROTECTED]>
Re: top output broked?
On 27-Mar-01 David Wolfskill wrote: >>Date: Tue, 27 Mar 2001 03:18:10 -0800 >>From: Alfred Perlstein <[EMAIL PROTECTED]> > >> PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND >> 824 root -80 1048K 596K biord 0 0:38 0.00% 0.00% find >> 385 root 40 32740K 31944K select 1 0:32 0.00% 0.00% XFree86 >> 836 root -80 532K 276K biord 1 0:07 0.00% 0.00% nfsd >>14848 root 960 26912K 26832K RUN1 0:04 0.00% 0.00% ld >> 424 bright 40 2120K 1340K select 0 0:04 0.00% 0.00% rxvt > >>no cpu time, known issue? > > I get non-zero values from time to time; in particular, I fired up an > xterm & did a "while (1)" loop in it, and the CPU times increased in a > gratifying manner. :-} > > However, the usual values I'm seeing are rather lower than I would > expect, and lower than the same machine running -STABLE (within the last > several days, by my recollection). > > As a reality check, I'm trying "vmstat 5", and it's consistently > reporting either 99 or 100% idle. There -- I got both it & top to > report something noticeable: I fired up netscape > > Maybe it really *is* using CPU much more efficiently...? No, I didn't > think so, but it was a nice thought :-) > > Oh: recent CVSup history (I hadn't noticed the behavior in the > -CURRENT I built yesterday): > > CVSup started at Sun Mar 25 23:47:00 PST 2001 > CVSup ended at Sun Mar 25 23:52:25 PST 2001 > CVSup started at Mon Mar 26 23:47:00 PST 2001 > CVSup ended at Mon Mar 26 23:53:39 PST 2001 Keep in mind that we no longer charge interrupt time to the process being interrupted; instead, all that interrupt handling has been pushed off into ithreads. Same for software interrupt threads. That said, I don't see how X is so idle, it's certainly not on my laptop: PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND 454 john 40 0K 43464K select 1:57 4.05% 4.05% XFree86 461 john 40 17076K 16144K select 0:35 0.39% 0.39% enlightenment 492 john 4 10 3072K 2040K select 0:28 0.10% 0.10% E-ScreenSave. 1022 john 40 7764K 7008K select 0:09 0.10% 0.10% xfmail 398 root 40 984K 564K select 0:06 0.00% 0.00% moused > Cheers, > david -- John Baldwin <[EMAIL PROTECTED]>
Re: top output broked?
On 27-Mar-01 David Wolfskill wrote: >>Date: Tue, 27 Mar 2001 10:21:45 -0800 (PST) >>From: John Baldwin <[EMAIL PROTECTED]> > >>Keep in mind that we no longer charge interrupt time to the process being >>interrupted, instead all that interrupt handling has been pushed off into >>ithreads. Same for software interrupt threads. > > OK; that's a good & useful thing to keep in mind. And I did see some > IRQ-related entries in top's output. Are they getting %CPU, though? When running top -S, the CPU %'s should always add up to about 100 (with fudges for rounding errors). >>That said, I don't see how X is so idle, it's certainly not on my laptop: > >> PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND >> 454 john 40 0K 43464K select 1:57 4.05% 4.05% XFree86 >> 461 john 40 17076K 16144K select 0:35 0.39% 0.39% enlightenment >> 492 john 4 10 3072K 2040K select 0:28 0.10% 0.10% E-ScreenSave. > > Eh... the "enlightenment" line may provide a clue there. I use tvtwm as > a window manager. :-} (I figure anything that could be marginally > acceptable on a (maxed out) 24 MB Sun 3/60 ought to be adequate for this > 750 MHz/256 MB laptop) Heh, but I figured Alfred was in X when he was running top, so X must've been doing _some_ screen updates, and not just have 0.00% CPU time. :-P -- John Baldwin <[EMAIL PROTECTED]>
Re: top output broked?
On 27-Mar-01 David Wolfskill wrote: >>Date: Tue, 27 Mar 2001 11:56:38 -0800 (PST) >>From: John Baldwin <[EMAIL PROTECTED]> > >>> OK; that's a good & useful thing to keep in mind. And I did see some >>> IRQ-related entries in top's output. > >>Are they getting %CPU though. When running top -S, the CPU %'s should always >>add up to about 100 (with fudges for rounding errors). > > Well, as noted in another note a little prior to this one, the -CURRENT > behavior I'm seeing isn't all *that* different from the -STABLE behavior > -- in each case, the sum of what "top" reports for CPU % is normally small. -STABLE doesn't have idle processes. :) >>> Eh... the "enlightenment" line may provide a clue there. I use tvtwm as >>> a window manager. :-} (I figure anything that could be marginally >>> acceptable on a (maxed out) 24 MB Sun 3/60 ought to be adequate for this >>> 750 MHz/256 MB laptop) > >>Heh, but I figured Alfred was in X when he was running top, so X must've been >>doing _some_ screen updates, and not just have 0.00% CPU time. :-P > > Well, that gets into a matter of perspective, since the amount of CPU > resource required to do the screen updates (vs. what is available) could > well be 0.00 (to 2 decimals).... :-) (Kinda like the ratio of a > circle's circumference to its diameter is "3" to a single significant > figure.) > > (I was in X at the time, too.) Fair enough.. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
HEADS UP: I'm breaking the kernel again
That's right, more SMPng breakage is on the way. Well, hopefully not bad breakage. On a more serious note, I've just spammed sys/alpha/alpha with parts of the critical_enter/exit change which I meant to commit anyways, so I'm going to go ahead and finish committing that right now. The kernel probably won't compile until it is all checked in. Once I get this in I will hopefully also get a chance to check in the new witness code tonight (or tomorrow morning as the case may be) as well. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: HEADS UP: I'm breaking the kernel again
On 28-Mar-01 John Baldwin wrote: > That's right, more SMPng breakage is on the way. Well, hopefully not bad > breakage. On a more serious note, I've just spammed sys/alpha/alpha with > parts > of the critical_enter/exit change which I meant to commit anyways, so I'm > going > to go ahead and finish committing that right now. The kernel probably won't > compile until it is all checked in. Once I get this in I will hopefully also > get a chance to check in the new witness code tonight (or tomorrow morning as > the case may be) as well. Well, it should compile and run now. I have more changes that I hope to commit now, but I'll try not to break the kernel compile while I'm doing them. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: HEADS UP: I'm breaking the kernel again
On 28-Mar-01 John Baldwin wrote: > > On 28-Mar-01 John Baldwin wrote: >> That's right, more SMPng breakage is on the way. Well, hopefully not bad >> breakage. On a more serious note, I've just spammed sys/alpha/alpha with >> parts >> of the critical_enter/exit change which I meant to commit anyways, so I'm >> going >> to go ahead and finish committing that right now. The kernel probably won't >> compile until it is all checked in. Once I get this in I will hopefully >> also >> get a chance to check in the new witness code tonight (or tomorrow morning >> as >> the case may be) as well. > > Well, it should compile and run now. I have more changes that I hope to > commit now, but I'll try not to break the kernel compile while I'm doing > them. s/now/later this evening (or morning)/ -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: Fun way to panic -current
On 28-Mar-01 Terry Lambert wrote: > Run the 4.3 mountd on it. > > Boom! Kernel memory allocation way too large; unrecoverable! Yes, struct ucred sucks. In -current the userland now uses a static struct xucred that doesn't contain things like mutexes, and thus mountd shouldn't crash in -current anymore when struct ucred changes size. Too bad we can't retrofit that. :( -- John Baldwin <[EMAIL PROTECTED]>
Re: NEWCARD broken in -current
On 28-Mar-01 Jesper Skriver wrote: > On Wed, Mar 28, 2001 at 10:15:21PM +0200, Niels Chr. Bank-Pedersen wrote: >> On Wed, Mar 28, 2001 at 10:09:28PM +0200, Jesper Skriver wrote: >> > cc -c -O -pipe -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -fformat-extensions -ansi -nostdinc -I- -I. -I/usr/src/sys -I/usr/src/sys/dev -I/usr/src/sys/../include -I/usr/src/sys/contrib/dev/acpica/Subsystem/Include -D_KERNEL -include opt_global.h -elf -mpreferred-stack-boundary=2 /usr/src/sys/dev/pccbb/pccbb.c >> > In file included from /usr/src/sys/dev/pccbb/pccbb.c:56: >> > /usr/src/sys/sys/mutex.h:87: field `mtx_object' has incomplete type >> > /usr/src/sys/dev/pccbb/pccbb.c: In function `pccbb_detach': >> > /usr/src/sys/dev/pccbb/pccbb.c:533: warning: implicit declaration of function `MPASS' >> > /usr/src/sys/dev/pccbb/pccbb.c:533: warning: implicit declaration of function `LOCK_LOG_LOCK' >> > /usr/src/sys/dev/pccbb/pccbb.c:533: warning: implicit declaration of function `WITNESS_LOCK' >> > /usr/src/sys/dev/pccbb/pccbb.c:539: warning: implicit declaration of function `WITNESS_UNLOCK' >> > *** Error code 1 >> >> You'll need to #include and >> in /usr/src/sys/dev/pccbb/pccbb.c > > Right, the below fixes it, should I commit ? > > Index: src/sys/dev/pccbb/pccbb.c > === > RCS file: /home/ncvs/src/sys/dev/pccbb/pccbb.c,v > retrieving revision 1.12 > diff -u -r1.12 pccbb.c > --- src/sys/dev/pccbb/pccbb.c 2001/02/09 06:08:52 1.12 > +++ src/sys/dev/pccbb/pccbb.c 2001/03/28 20:51:20 > @@ -53,6 +53,8 @@ > #include > #include > #include > +#include > +#include > #include > > #include Please sort the includes, or at least attempt to, as that is style(9). You'll need to move lock.h up before malloc.h. types.h is special: it belongs as the very first header unless sys/param.h is the first header (param.h includes types.h). -- John Baldwin <[EMAIL PROTECTED]>
Re: NEWCARD broken in -current
On 29-Mar-01 Jesper Skriver wrote: > On Thu, Mar 29, 2001 at 01:13:09PM -0800, John Baldwin wrote: >> > >> > Index: src/sys/dev/pccbb/pccbb.c >> > === >> > RCS file: /home/ncvs/src/sys/dev/pccbb/pccbb.c,v >> > retrieving revision 1.12 >> > diff -u -r1.12 pccbb.c >> > --- src/sys/dev/pccbb/pccbb.c 2001/02/09 06:08:52 1.12 >> > +++ src/sys/dev/pccbb/pccbb.c 2001/03/28 20:51:20 >> > @@ -53,6 +53,8 @@ >> > #include >> > #include >> > #include >> > +#include >> > +#include >> > #include >> > >> > #include >> >> Please sort the includes or at least attempt to as that is style(9). You'll need to move lock.h up before malloc.h. types.h is special, it belongs as the very first header unless sys/param.h is the first header (param.h includes types.h). > > The above was committed, so would the below fix it right ? > > Index: src/sys/dev/pccbb/pccbb.c > === > RCS file: /home/ncvs/src/sys/dev/pccbb/pccbb.c,v > retrieving revision 1.13 > diff -u -r1.13 pccbb.c > --- src/sys/dev/pccbb/pccbb.c 2001/03/29 10:23:45 1.13 > +++ src/sys/dev/pccbb/pccbb.c 2001/03/29 22:06:51 > @@ -52,9 +52,8 @@ > #include > #include > #include > -#include > -#include > #include > +#include > #include > > #include Sure, as long as it compiles. :) -- John Baldwin <[EMAIL PROTECTED]>
Re: i586 FP optimizations hosed.
On 30-Mar-01 David O'Brien wrote: > On Fri, Mar 30, 2001 at 07:45:43AM +0200, Mark Murray wrote: >> I thought the 586 FP stuff was disabled? > > Nope. Depending on how current you are, it was either left broken. > I committed BDE's fix to exception.s that fixed things for K6-2 users. It looks like it is just broken in the SMP case. -- John Baldwin <[EMAIL PROTECTED]>
RE: hmm.... spinlocks..
On 30-Mar-01 Matthew Jacob wrote: > > > pic_initialize(): > lint0: 0x00010700 lint1: 0x00010400 TPR: 0x0010 SVR: 0x01ff > kernel trap 12 with interrupts disabled > panic: spin lock sched lock held by 0xc7ba8a60 for > 5 seconds > cpuid = 0; lapic.id = > Debugger("panic") > > CPU0 stopping CPUs: 0x0002... stopped. > Stopped at Debugger+0x45: pushl %ebx > db> t > Debugger(c02e3759) at Debugger+0x45 > panic(c02e2d60,c02fb929,c7ba8a60,c7ba7b80,fffea000) at panic+0xd0 > _mtx_lock_spin(c0357400,0,80246,c02e3a48,2fb) at _mtx_lock_spin+0x6e > wakeup(c3709378,c3709378,c0eec000,c0f71400,c3709454) at wakeup+0x67 > bufdone(c3709378,c7fbff40,c0131efb,c3709378,c0f71400) at bufdone+0x385 > bufdonebio(c3709378) at bufdonebio+0xe > dadone(c0f5d100,c0f71400) at dadone+0x1f7 > camisr(c032a834) at camisr+0x231 > ithread_loop(c0b2f080,c7fbffa8) at ithread_loop+0x247 > fork_exit(c019898c,c0b2f080,c7fbffa8) at fork_exit+0x83 > fork_trampoline() at fork_trampoline+0x8 Did you get a crashdump (probably not.) If you look at sched_lock, one of the words will be a pointer to the process owning the lock in question. Unfortunately it's not the first word anymore (something I may change in the future). On the alpha it would be 'sched_lock+48'. The pointer there points to the process owning the lock (and you can look up the process via ps). If you have a crash dump then I have some gdb macros that make it easy to get a backtrace of that process. If not, then, well, it gets harder. :-P Hmm, it might be nice to be able to ask ddb to give a backtrace of any arbitrary process. Maybe I'll add a new command for that.. The trick is that we want to know who grabbed sched_lock where and then started spinning with it. Using KTR with KTR_LOCK turned on and using the 'show ktr' command in ddb could also be used to see which process was the last to grab sched_lock and where it was grabbed. 
-- John Baldwin <[EMAIL PROTECTED]>
RE: hmm.... spinlocks..
On 31-Mar-01 Matthew Jacob wrote: > > On Fri, 30 Mar 2001, John Baldwin wrote: > >> On 30-Mar-01 Matthew Jacob wrote: >> > >> > pic_initialize(): >> > lint0: 0x00010700 lint1: 0x00010400 TPR: 0x0010 SVR: 0x01ff >> > kernel trap 12 with interrupts disabled >> > panic: spin lock sched lock held by 0xc7ba8a60 for > 5 seconds >> > cpuid = 0; lapic.id = >> > Debugger("panic") >> > >> > CPU0 stopping CPUs: 0x0002... stopped. >> > Stopped at Debugger+0x45: pushl %ebx >> > db> t >> > Debugger(c02e3759) at Debugger+0x45 >> > panic(c02e2d60,c02fb929,c7ba8a60,c7ba7b80,fffea000) at panic+0xd0 >> > _mtx_lock_spin(c0357400,0,80246,c02e3a48,2fb) at _mtx_lock_spin+0x6e >> > wakeup(c3709378,c3709378,c0eec000,c0f71400,c3709454) at wakeup+0x67 >> > bufdone(c3709378,c7fbff40,c0131efb,c3709378,c0f71400) at bufdone+0x385 >> > bufdonebio(c3709378) at bufdonebio+0xe >> > dadone(c0f5d100,c0f71400) at dadone+0x1f7 >> > camisr(c032a834) at camisr+0x231 >> > ithread_loop(c0b2f080,c7fbffa8) at ithread_loop+0x247 >> > fork_exit(c019898c,c0b2f080,c7fbffa8) at fork_exit+0x83 >> > fork_trampoline() at fork_trampoline+0x8 >> >> Did you get a crashdump (probably not.) If you look at sched_lock, one of the words will be a pointer to the process owning the lock in question. Unfortunately it's not the first word anymore (something I may change in the future). On the alpha it would be 'sched_lock+48'. The pointer there points to the process owning the lock (and you can look up the process via ps). If you have a crash dump then I have some gdb macros that make it easy to get a backtrace of that process. If not, then, well, it gets harder. :-P Hmm, it might be nice to be able to ask ddb to give a backtrace of any arbitrary process. Maybe I'll add a new command for that.. >> >> The trick is that we want to know who grabbed sched_lock where and then started spinning with it. Using KTR with KTR_LOCK turned on and using the 'show ktr' command in ddb could also be used to see which process was the last to grab sched_lock and where it was grabbed. > > If it happens again, I'll try. This was, btw, i386. Oh, duh. Umm, on i386 it's 'sched_lock + 28'. -- John Baldwin <[EMAIL PROTECTED]>
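A minimal sketch of what reading the owner out of 'sched_lock + N' amounts to, assuming the owner pointer lives in a single word at some byte offset inside the mutex and that the low bits of that word are flag bits. The struct layouts, the flag mask and the helper name owner_at_offset() are hypothetical stand-ins, not the real kernel definitions; the real offset (48 on alpha, 28 on i386 per the mails above) depends on the kernel's actual struct mtx.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct proc. */
struct proc_sketch {
	int p_pid;
};

/* Assumed debugging header that sits in front of the lock word. */
struct lock_object_sketch {
	const char *lo_name;
	unsigned    lo_flags;
};

/* Hypothetical mutex layout: header first, then the owner word. */
struct mtx_sketch {
	struct lock_object_sketch mtx_object;
	uintptr_t                 mtx_lock;	/* owner pointer | flag bits (assumed) */
};

/* Assumption: the two low bits of the owner word carry flags. */
#define SKETCH_FLAGMASK	((uintptr_t)0x3)

/*
 * Read the word at a byte offset from the start of the lock and strip
 * the assumed flag bits, leaving the pointer to the owning process.
 */
static struct proc_sketch *
owner_at_offset(const void *lock, size_t off)
{
	uintptr_t word;

	memcpy(&word, (const char *)lock + off, sizeof(word));
	return ((struct proc_sketch *)(word & ~SKETCH_FLAGMASK));
}
```

With a real kernel the offset would come from the actual structure definition rather than being typed in by hand, which is why it differs between alpha and i386. The resulting pointer is what gets matched against the process list.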
Re: i586 FP optimizations hosed.
On 31-Mar-01 Bruce Evans wrote: > On Fri, 30 Mar 2001, John Baldwin wrote: > >> On 30-Mar-01 David O'Brien wrote: >> > On Fri, Mar 30, 2001 at 07:45:43AM +0200, Mark Murray wrote: >> >> I thought the 586 FP stuff was disabled? >> > >> > Nope. Depending on how current you are, it was either left broken. >> > I committed BDE's fix to exception.s that fixed things for K6-2 users. >> >> It looks like it is just broken in the SMP case. > > It is more just broken than before in the SMP case. Preemptive context > switching in the kernel did most of the breaking, and recent changes > added sanity checks that detected a very broken case. With preemptive > context switching, the following can happen: > > - we start using the FPU on a CPU with a free FPU (we used to free the > FPU in some cases; now we only use optimizations in bcopy/bzero if > the FPU was free to begin with). > - we do a preemptive context switch and come back using a different FPU. > > The different CPU might even be unfree, and that case is now detected. > In other cases, we just corrupt data by using different FPU registers :-(. Ugh. Hrm, then we need to either disable interrupts inside of i586_* or set a hard affinity flag in the process such that all other CPU's will ignore it and only p_lastcpu will run it next. > Bruce -- John Baldwin <[EMAIL PROTECTED]>
Re: is it supposed to be this broken?
On 31-Mar-01 Warner Losh wrote: > In message <[EMAIL PROTECTED]> Alfred Perlstein writes: >: This is cute... >: hint.ppc.1.disabled="1" >: ppc1: at port 0x378-0x37f,0x778-0x77b irq 7 drq >: 3 on >: isa0 > > That should work. > > However, if it doesn't, consider removing the 'at' lines from your > hints file. > > Warner It doesn't work because he didn't specify all of the hints, so the fd0 "device" that he has in his hints file isn't a perfect match to the fd0 device that comes from the PnP BIOS. If he specifies all the resources like the normal hints file then it should work fine, I think. Though as Mike points out, it doesn't change the fact that the IRQ is already allocated off to something else. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: i586 FP optimizations hosed.
On 03-Apr-01 Bruce Evans wrote: > On Mon, 2 Apr 2001, John Baldwin wrote: > >> On 31-Mar-01 Bruce Evans wrote: >> > [about i586-optimized copying and bzeroing] >> > - we start using the FPU on a CPU with a free FPU (we used to free the >> > FPU in some cases; now we only use optimizations in bcopy/bzero if >> > the FPU was free to begin with). >> > - we do a preemptive context switch and come back using a different FPU. >> > >> > The different CPU might even be unfree, and that case is now detected. >> > In other cases, we just corrupt data by using different FPU registers :-(. >> >> Ugh. Hrm, then we need to either disable interrupts inside of i586_* or set a > > This would break fast interrupts :-). > >> hard affinity flag in the process such that all other CPU's will ignore it and >> only p_lastcpu will run it next. > > There are many other possibilities: > - don't use these routines. > - don't use these routines for the SMP case. > - disable preemptive context switching for the CPU that is using the FPU. > The hard affinity flag could be used for this as a special case. > - acquire sched_lock so that all sorts of context switching are disabled > for all CPUs. > - don't attempt to save the FPU state reentrantly, since this doesn't work > with preemptive context switching unless interrupt handlers also save the > state reentrantly, which they shouldn't do because it is too wasteful. > Instead, save the state in the pcb as is already done in copy{in,out} > so that cpu_switch() handles it. This may be too wasteful too. > - as in previous possibility, but avoid switching the entire state. For > the FPU, the entire state must be switched, but for SSE individual > registers can be saved and restored. Saving and restoring individual > registers reentrantly would be easy but no longer works for the SMP case. > Switching a subset of the state would not be so easy. Hm. I think I'm liking the next to last. Even if there is additional overhead, it should still outperform generic_bcopy and friends on the CPU's in question, right? > Bruce -- John Baldwin <[EMAIL PROTECTED]>
RE: selwakeup()
On 05-Apr-01 Garrett Wollman wrote: > < <[EMAIL PROTECTED]> said: > >> If I'm reading this backtrace right, the thread handling the sound >> hardware called selwakeup() (frame #19). This called pfind() (frame >> #18), which tries to lock allproc. > > selwakeup() shouldn't need to call pfind(). Because the process table > is in type-stable memory, it should be sufficient to keep a reference > to the caller's proc structure and check to see whether its pid is the > same one as in the selinfo. The locking that selwakeup() already > needs to do should be sufficient to avoid a race. > > (In 4.4BSD, process structures were not type-stable so this technique > could not have been used.) There are probably several other places that pfind is called that this check should also be adequate for as well. The ones in syscons for example. > -GAWollman -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
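A minimal sketch of the check Garrett describes, assuming the select record keeps both a pointer to the waiting process's slot and the pid that was seen when the pointer was recorded. Because the slot is type-stable (it remains a process structure even after it is freed and reused), the pointer can be dereferenced safely and a pid mismatch reveals that the slot now belongs to someone else. All struct, field and function names here are hypothetical stand-ins, not the kernel's own, and the locking that selwakeup() needs is left out.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a process slot in type-stable memory. */
struct proc_sketch {
	pid_t p_pid;
};

/* Hypothetical select record: remembered slot plus the pid seen then. */
struct selinfo_sketch {
	struct proc_sketch *si_proc;
	pid_t               si_pid;
};

/* Remember who is selecting, instead of remembering only the pid. */
static void
sel_record(struct selinfo_sketch *sip, struct proc_sketch *p)
{
	sip->si_proc = p;
	sip->si_pid = p->p_pid;
}

/*
 * Return the process to wake, or NULL if nobody is recorded or the
 * slot has been reused by a different pid. No table lookup is needed.
 */
static struct proc_sketch *
sel_target(const struct selinfo_sketch *sip)
{
	struct proc_sketch *p = sip->si_proc;

	if (p == NULL || p->p_pid != sip->si_pid)
		return (NULL);
	return (p);
}
```

The point of the design is that the wakeup path never has to take the lock protecting the global process list, which is what the backtrace in this thread got stuck on.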
RE: selwakeup()
On 05-Apr-01 John Baldwin wrote: > > On 05-Apr-01 Garrett Wollman wrote: >> <> <[EMAIL PROTECTED]> said: >> >>> If I'm reading this backtrace right, the thread handling the sound >>> hardware called selwakeup() (frame #19). This called pfind() (frame >>> #18), which tries to lock allproc. >> >> selwakeup() shouldn't need to call pfind(). Because the process table >> is in type-stable memory, it should be sufficient to keep a reference >> to the caller's proc structure and check to see whether its pid is the >> same one as in the selinfo. The locking that selwakeup() already >> needs to do should be sufficient to avoid a race. >> >> (In 4.4BSD, process structures were not type-stable so this technique >> could not have been used.) > > There are probably several other places that pfind is called that this check > should also be adequate for as well. The ones in syscons for example. As a safety check we should probably zero the pid right before zfree()'ing a proc in wait() however, so that a stale pointer to a free'd process doesn't have a valid pid if we do this. >> -GAWollman -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: selwakeup()
On 05-Apr-01 Garrett Wollman wrote: > < > said: > >> As a safety check we should probably zero the pid right before zfree()'ing a >> proc in wait() however, so that a stale pointer to a free'd process doesn't >> have a valid pid if we do this. > > Should not be necessary. Here is the logic: Ah, forgot about the p_stat check. -- John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/ PGP Key: http://www.baldwin.cx/~john/pgpkey.asc "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
RE: Problems with fsck after dirpref changes
On 10-Apr-01 Niels Chr. Bank-Pedersen wrote: > > Is it me fsck'ing up, or is fsck(8) lagging behind in the > dirpref changes? > > > Automatic boot in progress... > /dev/da0s1a: BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN > FIRST ALTERNATE > > /dev/da0s1a: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY. > /dev/da0s1a: Automatic file system check failed . . . help! > Enter full pathname of shell or RETURN for /bin/sh: > # fsck_ffs -b 32 / > Alternate super block location: 32 > ** /dev/da0s1a > ** Last Mounted on > ** Root file system > ** Phase 1 - Check Blocks and Sizes > ** Phase 2 - Check Pathnames > ** Phase 3 - Check Connectivity > ** Phase 4 - Check Reference Counts > ** Phase 5 - Check Cyl groups > FREE BLK COUNT(S) WRONG IN SUPERBLK > SALVAGE? [yn] y > > 2683 files, 136083 used, 399724 free (1164 frags, 49820 blocks, 0.2% > fragmentation) > > UPDATE STANDARD SUPERBLOCK? [yn] y You didn't want to do this. This is probably why you panic'd. > http://www.openbsd.org/cgi-bin/cvsweb/src/sbin/fsck_ffs/setup.c.diff?r1=1.8&r2=1.9&f=h Yep, my fsck works again (well, it doesn't blow up at least), will commit it in a second. -- John Baldwin <[EMAIL PROTECTED]>
RE: Problems with -CURRENT
On 13-Apr-01 Matthew Schlegel wrote: > I have been working on doing an update to the latest -CURRENT (last cvsup for > this upgrade attempt was this morning at about 9:30 PDT) for the last > couple days from : > FreeBSD msops.crossgain.com 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Wed Jun 28 13:23:44 PDT 2000 > > Right now I can boot into single user mode just fine, but the moment any > process requests an inode on my / partition, I get a kernel panic (page > fault). Right now the world consistently falls apart in ffs_valloc, but I'm > not sure how to further track it down at this point. > > I am able to get inodes on another partition that was created using the newfs > from the new world, so I'm wondering if there may have been some filesystem > change between June of last year and now that could be causing the problems. > I would appreciate it if someone could point me in the right direction with > this so I can get moved over. Rebuild and install fsck and fsck your filesystems. This is the dirpref stuff most likely biting you. Warner, we should probably add a warning about the dirpref changes to UPDATING, since if you fsck a filesystem with the old fsck and new kernel and you overwrite your superblock with the alternate, you hose the filesystem, resulting in these panics. :( > Stack trace with pointers resolved: > ffs_valloc(0, 8180, 21, vcp_create_desc 8) > ufs_makeinode > ufs_create > ufs_vnoperate > vn_open > open > > The config I am using is attached to this email as well. > > -- > Matthew Schlegel -- John Baldwin <[EMAIL PROTECTED]>
RE: WITNESS + WITNESS_SKIPSPIN = panic
On 14-Apr-01 Peter Jeremy wrote: > Is there any progress on fixing this? > > Peter It panics? I'll see if I can reproduce this on Monday. I never use skipspin. -- John Baldwin <[EMAIL PROTECTED]>
Re: WITNESS + WITNESS_SKIPSPIN = panic
On 16-Apr-01 Peter Jeremy wrote:
> On 2001-Apr-14 18:54:28 -0700, John Baldwin <[EMAIL PROTECTED]> wrote:
>>
>>On 14-Apr-01 Peter Jeremy wrote:
>>> Is there any progress on fixing this?
>>>
>>> Peter
>>
>>It panics? I'll see if I can reproduce this on Monday. I never use
>>skipspin.
>
> A similar problem was reported here in mid-March, ending with the
> following message:
>
> On Mon, 12 Mar 2001 10:49:51 -0800 (PST), in
> <[EMAIL PROTECTED]>, John Baldwin <[EMAIL PROTECTED]> wrote:
>>Just don't use the skipspin stuff, it shouldn't hurt at all. The new witness
>>code will hopefully be in by the end of the week. *crosses fingers*
>
> I bumped into the same problem last week and couldn't find anything
> that looked like a change in the skipspin behaviour since mid-March.
>
> Having looked in more detail at the previous thread, I suspect I may
> be seeing something different. In my case, the kernel is panicking
> very early during the boot process in either line 302 or 305 of
> /sys/kern/subr_witness.c in witness_initialize():
>
> 299         /* First add in all the specified order lists. */
> 300         for (order = order_lists; order->w_name != NULL; order++) {
> 301                 w = enroll(order->w_name, order->w_class);
> 302                 w->w_file = "order list";
> 303                 for (order++; order->w_name != NULL; order++) {
> 304                         w1 = enroll(order->w_name, order->w_class);
> 305                         w1->w_file = "order list";
> 306                         itismychild(w, w1);
> 307                         w = w1;
> 308                 }
> 309         }
>
> The problem is that enroll() will return NULL for spinlocks when
> witness_skipspin is set, but the above code always assumes it
> can dereference the result from enroll(). (There are two other
> calls to enroll() where a NULL return appears to be acceptable).

Argh, ok.

> I don't understand the mutex initialisation well enough to be able
> to readily work out the correct fix.

I'll fix it later on today.

> Peter

-- John Baldwin <[EMAIL PROTECTED]>
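Editorial sketch: the failure mode Peter describes (dereferencing a NULL return from enroll() when spin locks are being skipped) can be illustrated with a small stand-alone C model. Everything below is hypothetical: the structures are simplified stand-ins, not the real witness code; `skipspin`, `pool`, and `add_order_list()` are invented names; and the itismychild() linkage is left out. It shows one plausible shape of a guard, not necessarily the fix that was committed.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a witness record. */
struct witness {
	const char *w_name;
	const char *w_file;
	int         w_spin;
};

/* Hypothetical order-list entry; a NULL w_name terminates the list. */
struct order_entry {
	const char *w_name;
	int         is_spin;
};

static int skipspin = 1;              /* models witness_skipspin */
static struct witness pool[16];       /* toy allocator for the sketch */
static size_t pool_used;

/* Returns NULL for spin locks when skipspin is set, as in the report. */
static struct witness *
enroll(const char *name, int is_spin)
{
	struct witness *w;

	if (is_spin && skipspin)
		return (NULL);
	if (pool_used == sizeof(pool) / sizeof(pool[0]))
		return (NULL);
	w = &pool[pool_used++];
	w->w_name = name;
	w->w_file = NULL;
	w->w_spin = is_spin;
	return (w);
}

/*
 * Walk one order list and tag every enrolled entry, skipping entries
 * that enroll() declined to track.  Returns the number enrolled.
 */
static int
add_order_list(const struct order_entry *order)
{
	struct witness *w;
	int n = 0;

	for (; order->w_name != NULL; order++) {
		w = enroll(order->w_name, order->is_spin);
		if (w == NULL)
			continue;     /* the check the quoted loop lacks */
		w->w_file = "order list";
		n++;
	}
	return (n);
}
```

With `skipspin` set, a list containing one spin lock and two sleep locks enrolls only the two sleep locks instead of faulting on the first entry.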
RE: lock messages from today's -CURRENT
On 16-Apr-01 David Wolfskill wrote: > I saw that jhb committed some changes as of r1.307 of > src/sys/i386/conf/GENERIC, so I replicated those changes (that I didn't > already have) to my kernel config. > > Got -CURRENT built & running; the messages below (bracketed by "normal" > messages, to supply a little context) appear to be documenting weirdnesses, > but I don't see other evidence of problems: > > Apr 16 08:57:44 localhost /boot/kernel/kernel: pccard: card inserted, slot 1 > Apr 16 08:57:44 localhost /boot/kernel/kernel: acquring duplicate lock of > same type: "allproc" > Apr 16 08:57:44 localhost /boot/kernel/kernel: 1st @ > /usr/src/sys/kern/kern_proc.c:584 > Apr 16 08:57:44 localhost /boot/kernel/kernel: 2nd @ > /usr/src/sys/kern/kern_proc.c:143 > Apr 16 08:57:44 localhost /boot/kernel/kernel: lock order reversal > Apr 16 08:57:44 localhost /boot/kernel/kernel: 1st vnode interlock last > acquired @ /usr/src/sys/kern/vfs_vnops.c:625 > Apr 16 08:57:44 localhost /boot/kernel/kernel: 2nd 0xc0459b80 mntvnode @ > /usr/src/sys/ufs/ffs/ffs_vfsops.c:954 > Apr 16 08:57:44 localhost /boot/kernel/kernel: 3rd 0xce7d54ac vnode interlock > @ /usr/src/sys/ufs/ffs/ffs_vfsops.c:963 > Apr 16 08:57:44 localhost savecore: magic number mismatch (8b98aa0d != > 8fca0101) > Apr 16 08:57:44 localhost savecore: no core dump These are known right now. The allproc one is because witness is still somewhat bogus with respect to sx locks and treats the recursive shared lock as duplicate acquires of the same lock. The other reversals, which are vfs related, have been around since probably at least 4.4BSD or perhaps earlier. -- John Baldwin <[EMAIL PROTECTED]>
Re: One more typo in src/release/Makefile, rev 1.612? (w/patch)
On 16-Apr-01 David O'Brien wrote: > On Mon, Apr 16, 2001 at 09:53:43AM -0700, Bruce A. Mah wrote: >> Thanks for fixing the typo in src/release/Makefile. I think however the >> real cause of the error that people were seeing is a typo on the line > > Damnit, I *tested* this and things landed in the right place. Grrr... > Ok, no more hacking until I get a CVSup with the latest release/Makefile > and I'll kick off a fresh release to test. Also, Bruce's fix is not entirely correct as it breaks for the non-debug kernel case. I've already sent you a mail about that; this is just to let everyone know that it should be fixed shortly. :) -- John Baldwin <[EMAIL PROTECTED]>
RE: kernel panic in -current, ithread or newcard related ?
On 15-Apr-01 Jesper Skriver wrote: > About every other time I boot my IBM ThinkPad 600E I get this panic > (hand typed, as I don't have a second machine here to be able to use a > serial console). > > Fatal trap 12: page fault while in kernel mode > Fault virtual address = 0x28 It's a null pointer dereference. If you've compiled a debug kernel then do 'gdb -k /usr/obj/usr/src/sys/TAM2/kernel.debug' and then do 'l *csa_readio+0x17' to find the offending line. It's usually pretty easy to figure out then. -- John Baldwin <[EMAIL PROTECTED]>
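Editorial sketch: a fault address as small as 0x28 is the usual signature of reading a structure member through a NULL base pointer, because the address touched is the base (0) plus the member's offset. The C fragment below illustrates that arithmetic. The structure layout is entirely hypothetical and is not the real csa(4) resource structure; `example_res` and `fault_address()` are invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; NOT the real driver structure. */
struct example_res {
	char  pad[0x28];   /* members that happen to fill the first 0x28 bytes */
	void *io;          /* member that gets read through the bad pointer */
};

/* Address the CPU would touch when evaluating base->io. */
static uintptr_t
fault_address(uintptr_t base)
{
	return (base + offsetof(struct example_res, io));
}
```

Under this assumed layout, `fault_address(0)` is 0x28, matching the kind of "Fault virtual address" shown in the panic; a low address like this points at a NULL structure pointer rather than a wild one.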
Re: One more typo in src/release/Makefile, rev 1.612? (w/patch)
On 17-Apr-01 Bruce A. Mah wrote: > If memory serves me right, John Baldwin wrote: > >> Also, Bruce's fix is not entirely correct as it breaks for the non-debug >> kernel case, > > Hmmm? I didn't know there was a choice on debug/non-debug kernels > during a "make release", but I defer to the experts. Yes, it's determined by putting 'makeoptions DEBUG=-g' in the kernel config file. -- John Baldwin <[EMAIL PROTECTED]>
Re: kernel panic in -current, ithread or newcard related ?
On 17-Apr-01 Jesper Skriver wrote:
> On Mon, Apr 16, 2001 at 08:10:37PM -0700, John Baldwin wrote:
>>
>> On 15-Apr-01 Jesper Skriver wrote:
>> > About every other time I boot my IBM ThinkPad 600E I get this panic
>> > (hand typed, as I don't have a second machine here to be able to use a
>> > serial console).
>> >
>> > Fatal trap 12: page fault while in kernel mode
>> > Fault virtual address = 0x28
>>
>> It's a null pointer dereference. If you've compiled a debug kernel then do
>> 'gdb -k /usr/obj/usr/src/sys/TAM2/kernel.debug' and then do
>> 'l *csa_readio+0x17' to find the offending line. It's usually pretty easy
>> to figure out then.
>
> It's not obvious to me, a newbie in kernel debugging; how is the below
> (from the trace) related?
>
> /Jesper
>
> (kgdb) l *csa_readio+0x17
> 0xc0159cd3 is in csa_readio (machine/bus_at386.h:205).
> 200 }
> 201
> 202 static __inline u_int32_t
> 203 bus_space_read_4(bus_space_tag_t tag, bus_space_handle_t handle,
> 204                  bus_size_t offset)
> 205 {
> 206 #if defined(_I386_BUS_PIO_H_)
> 207 #if defined(_I386_BUS_MEMIO_H_)
> 208         if (tag == I386_BUS_SPACE_IO)
> 209 #endif

Hmm, well, looking in dev/sound/pci/csa.c at the csa_readio() function,
bus_space_read_4() is called once:

        if (offset < BA0_AC97_RESET)
                return bus_space_read_4(rman_get_bustag(resp->io),
                    rman_get_bushandle(resp->io), offset) & 0x;
        else {
                if (csa_readcodec(resp, offset, &ul))
                        ul = 0;
                return (ul);
        }

My guess is that resp is NULL here. At this point, you may want to poke
Cameron Grant <[EMAIL PROTECTED]> with a bug report as he is Mr. Sound and he
probably knows what has gone wrong at this point.

-- John Baldwin <[EMAIL PROTECTED]>
FW: Snapshot Log - current broke
-FW: <[EMAIL PROTECTED]>-
Date: Tue, 17 Apr 2001 08:50:35 -0700 (PDT)
From: Deimos Root <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Snapshot Log

rm -f vt220l.816
uudecode /usr/src/usr.sbin/pcvt/fonts/vt220l.814.uu
uudecode /usr/src/usr.sbin/pcvt/fonts/vt220l.810.uu
uudecode /usr/src/usr.sbin/pcvt/fonts/vt220l.816.uu
===> usr.sbin/pcvt/kcon
cc -O -pipe -I/usr/src/usr.sbin/pcvt/kcon/../keycap -DKEYB_DEVICE=\"/dev/ttyv0\" -I/usr/obj/usr/src/i386/usr/include -c /usr/src/usr.sbin/pcvt/kcon/kcon.c
gzip -cn /usr/src/usr.sbin/pcvt/kcon/kcon.1 > kcon.1.gz
cc -O -pipe -I/usr/src/usr.sbin/pcvt/kcon/../keycap -DKEYB_DEVICE=\"/dev/ttyv0\" -I/usr/obj/usr/src/i386/usr/include -o kcon kcon.o /usr/obj/usr/src/usr.sbin/pcvt/kcon/../keycap/libkeycap.a
===> usr.sbin/pcvt/loadfont
cc -O -pipe -I/usr/obj/usr/src/i386/usr/include -c /usr/src/usr.sbin/pcvt/loadfont/loadfont.c
gzip -cn /usr/src/usr.sbin/pcvt/loadfont/loadfont.1 > loadfont.1.gz
cc -O -pipe -I/usr/obj/usr/src/i386/usr/include -o loadfont loadfont.o
===> usr.sbin/pcvt/scon
cc -O -pipe -I/usr/obj/usr/src/i386/usr/include -c /usr/src/usr.sbin/pcvt/scon/scon.c
gzip -cn /usr/src/usr.sbin/pcvt/scon/scon.1 > scon.1.gz
cc -O -pipe -I/usr/obj/usr/src/i386/usr/include -o scon scon.o
===> usr.sbin/pcvt/userkeys
cc -O -pipe -I/usr/obj/usr/src/i386/usr/include -c /usr/src/usr.sbin/pcvt/userkeys/vt220keys.c
gzip -cn /usr/src/usr.sbin/pcvt/userkeys/vt220keys.1 > vt220keys.1.gz
cc -O -pipe -I/usr/obj/usr/src/i386/usr/include -o vt220keys vt220keys.o
===> usr.sbin/pcvt/vttest
cc -O -pipe -traditional -DUSEMYSTTY -I/usr/obj/usr/src/i386/usr/include -c /usr/src/usr.sbin/pcvt/vttest/main.c
cc -O -pipe -traditional -DUSEMYSTTY -I/usr/obj/usr/src/i386/usr/include -c /usr/src/usr.sbin/pcvt/vttest/esc.c
gzip -cn /usr/src/usr.sbin/pcvt/vttest/vttest.1 > vttest.1.gz
In file included from /usr/src/usr.sbin/pcvt/vttest/header.h:26,
                 from /usr/src/usr.sbin/pcvt/vttest/esc.c:1:
/usr/obj/usr/src/i386/usr/include/stdio.h:302: syntax error before `char'
In file included from /usr/src/usr.sbin/pcvt/vttest/header.h:26,
                 from /usr/src/usr.sbin/pcvt/vttest/main.c:20:
/usr/obj/usr/src/i386/usr/include/stdio.h:302: syntax error before `char'
*** Error code 1
*** Error code 1
2 errors
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2

Stop in /usr/src/release.
release started at 06:00:00 on 04/17/01
release died at 08:50:35 on 04/17/01
------End of forwarded message-

-- John Baldwin <[EMAIL PROTECTED]>
RE: Breakage in today's -CURRENT
On 19-Apr-01 Greg Lehey wrote: > I've just built a couple of worlds from -CURRENT cvsupped at 2030 UTC > on the 18th, and at 0600 UTC on the 19th. In each case, I have > massive problems, apparently with the synchronization. Here's some > log file output: > > Apr 19 18:11:34 zaphod /boot/kernel/kernel: > fforrwwarardd__hsatradtcclloocckk:: ch ecchkesctkasttea te 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: ffrwaordr_whaarrdd_cstocakt:c > lock: checkstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: heckstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: forward_statclock: checkstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: forward_statclock: checkstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: fforwoarrwd_rds_hatacrldckckl:o > cckheckstate : c0h > Apr 19 18:11:34 zaphod /boot/kernel/kernel: eckstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: fforwarodr_whaarrdd_cstoactk:lo > ccheckstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: k: checkstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: forfwarod_rhaarrdcd_sltoactkc:l > occk: checkstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: heckstate 0 > Apr 19 18:11:34 zaphod /boot/kernel/kernel: forward_statclock: checkstate 0 > > These blocks repeat exactly every 30 seconds. Also, the display is > dead: the keyboard responds to NumLock and ScrollLock, but the last > line on the bottom of the display consists of random data in bright. > I can't enter ddb, or if I do, I can't tell that I've made it. I can > rlogin with no problems. zaphod is an Abit BP6 with 2 Celerons. > > Greg Try bumping the timeouts in forward_hardclock() and forward_statclock() in mp_machdep.c. I'm currently overhauling the clock stuff so that we don't need these timeouts. The problem is that these timeouts are CPU speed sensitive, so while they might be fine for a PPro 200, they are far too short for a P3 600. 
-- John Baldwin <[EMAIL PROTECTED]>
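Editorial sketch: the point about CPU-speed-sensitive timeouts can be made concrete with a little arithmetic. The C fragment below assumes the timeouts are fixed spin-loop iteration counts, which the message suggests but does not state; the function names and the cycles-per-iteration figure are invented for illustration and are not taken from mp_machdep.c.

```c
#include <stdint.h>

/*
 * Wall-clock time, in microseconds, that a fixed-count spin loop lasts.
 * cycles_per_iter is an assumed average cost of one loop iteration.
 */
static uint64_t
spin_elapsed_us(uint64_t iterations, uint64_t cycles_per_iter,
    uint64_t cpu_mhz)
{
	/* total cycles / (cycles per microsecond) = microseconds */
	return ((iterations * cycles_per_iter) / cpu_mhz);
}

/*
 * Iterations needed to wait at least wait_us on a CPU of the given
 * speed (rounded up), i.e. a count scaled by clock rate.
 */
static uint64_t
spin_budget(uint64_t wait_us, uint64_t cycles_per_iter, uint64_t cpu_mhz)
{
	return ((wait_us * cpu_mhz + cycles_per_iter - 1) / cycles_per_iter);
}
```

Under these assumptions, a count that spins for 3000 microseconds at 200 MHz lasts only 1000 microseconds at 600 MHz, so the other CPU gets a third of the time to respond; keeping the same wait would need three times the iterations.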