Re: panic on 5.2 BETA: blockable sleep lock
Hi Please can someone commit the bktr patch for me to fix 5.2-BETA (as long as re@ approve). I don't have the resources. I'm not suprised that I haven't heard from him because this issue came up at the start of the Thanksgiving holiday weekend. If only it were that simple. Actually I'm English and we don't have Thanksgiving. I'm not doing much work on FreeBSD (or OpenH323) at the moment due to lack of free time and a lack of resources. My new job does not involve FreeBSD (or any unix systems) or Video Conferencing. When I can find spare time at home, I only have a P350 + 56k modem. In the New Year I'll upgrade my PC, find someone to get me ISOs and then I'll be back in business. Roger The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). Strange enough that does not seem to happen with a kernel built without INVARIANTS and WITNESS. Does this make any sense or is this just by chance? You might try the patch below with WITNESS enabled. I don't have the hardware, so I can't test it. It compiles for me, but for all I know it could delete all your files if you run it. Any chance for getting this committed? I've been forwarding these messages to the bktr maintainer listed in /usr/src/MAINTAINERS, in case he isn't subscribed to [EMAIL PROTECTED] I'm not suprised that I haven't heard from him because this issue came up at the start of the Thanksgiving holiday weekend. Commiting the patch will also require re approval because of the code freeze in preparation for 5.2-RELEASE. ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: panic on 5.2 BETA: blockable sleep lock
On 1 Dec, Roger Hardiman wrote: Hi Please can someone commit the bktr patch for me to fix 5.2-BETA (as long as re@ approve). I don't have the resources. If you're happy with the patch, I'll pursue re@ approval for the commit. I'm not suprised that I haven't heard from him because this issue came up at the start of the Thanksgiving holiday weekend. If only it were that simple. Actually I'm English and we don't have Thanksgiving. Sorry, I guess I should have fired up xearth ;-) ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: panic on 5.2 BETA: blockable sleep lock
On 1 Dec, Roger Hardiman wrote: Hi Please can someone commit the bktr patch for me to fix 5.2-BETA (as long as re@ approve). I don't have the resources. The patch has been committed with re@ approval. ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: panic on 5.2 BETA: blockable sleep lock
On Fri, 2003-11-28 at 01:02, Don Lewis wrote: On 27 Nov, Stefan Ehmann wrote: On Wed, 2003-11-26 at 08:33, Don Lewis wrote: The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). Strange enough that does not seem to happen with a kernel built without INVARIANTS and WITNESS. Does this make any sense or is this just by chance? You might try the patch below with WITNESS enabled. I don't have the hardware, so I can't test it. It compiles for me, but for all I know it could delete all your files if you run it. Any chance for getting this committed? ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: panic on 5.2 BETA: blockable sleep lock
On 30 Nov, Stefan Ehmann wrote: On Fri, 2003-11-28 at 01:02, Don Lewis wrote: On 27 Nov, Stefan Ehmann wrote: On Wed, 2003-11-26 at 08:33, Don Lewis wrote: The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). Strange enough that does not seem to happen with a kernel built without INVARIANTS and WITNESS. Does this make any sense or is this just by chance? You might try the patch below with WITNESS enabled. I don't have the hardware, so I can't test it. It compiles for me, but for all I know it could delete all your files if you run it. Any chance for getting this committed? I've been forwarding these messages to the bktr maintainer listed in /usr/src/MAINTAINERS, in case he isn't subscribed to [EMAIL PROTECTED] I'm not suprised that I haven't heard from him because this issue came up at the start of the Thanksgiving holiday weekend. Commiting the patch will also require re approval because of the code freeze in preparation for 5.2-RELEASE. ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: panic on 5.2 BETA: blockable sleep lock
On Fri, 2003-11-28 at 01:02, Don Lewis wrote: On 27 Nov, Stefan Ehmann wrote: On Wed, 2003-11-26 at 08:33, Don Lewis wrote: The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). Strange enough that does not seem to happen with a kernel built without INVARIANTS and WITNESS. Does this make any sense or is this just by chance? You might try the patch below with WITNESS enabled. I don't have the hardware, so I can't test it. It compiles for me, but for all I know it could delete all your files if you run it. Tested it a few times and no panics - seems to work fine. Tested it again without the patch: panics immediately after starting alevt (not xawtv as initially stated). Thanks ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: panic on 5.2 BETA: blockable sleep lock
On Fri, 2003-11-28 at 01:02, Don Lewis wrote: On 27 Nov, Stefan Ehmann wrote: On Wed, 2003-11-26 at 08:33, Don Lewis wrote: The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). Strange enough that does not seem to happen with a kernel built without INVARIANTS and WITNESS. Does this make any sense or is this just by chance? You might try the patch below with WITNESS enabled. I don't have the hardware, so I can't test it. It compiles for me, but for all I know it could delete all your files if you run it. Unfortunately, after running the patched kernel some time I got a slightly different panic: GNU gdb 5.2.1 (FreeBSD) Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-undermydesk-freebsd... panic: blockable sleep lock (sleep mutex) sellck @ /usr/src/sys/kern/sys_generic.c:1145 panic messages: --- panic: blockable sleep lock (sleep mutex) sellck @ /usr/src/sys/kern/sys_generic.c:1145 syncing disks, buffers remaining... 2248 2248 panic: mi_switch: switch in a critical section Uptime: 4m11s Dumping 383 MB 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 --- Reading symbols from /boot/kernel/snd_pcm.ko...done. Loaded symbols for /boot/kernel/snd_pcm.ko Reading symbols from /boot/kernel/snd_csa.ko...done. Loaded symbols for /boot/kernel/snd_csa.ko Reading symbols from /boot/kernel/bktr_mem.ko...done. Loaded symbols for /boot/kernel/bktr_mem.ko Reading symbols from /usr/obj/usr/src/sys/SHOE/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug...done. Loaded symbols for /usr/obj/usr/src/sys/SHOE/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug Reading symbols from /usr/obj/usr/src/sys/SHOE/modules/usr/src/sys/modules/linux/linux.ko.debug...done. Loaded symbols for /usr/obj/usr/src/sys/SHOE/modules/usr/src/sys/modules/linux/linux.ko.debug Reading symbols from /usr/obj/usr/src/sys/SHOE/modules/usr/src/sys/modules/ext2fs/ext2fs.ko.debug...done. Loaded symbols for /usr/obj/usr/src/sys/SHOE/modules/usr/src/sys/modules/ext2fs/ext2fs.ko.debug Reading symbols from /usr/obj/usr/src/sys/SHOE/modules/usr/src/sys/modules/ntfs/ntfs.ko.debug...done. Loaded symbols for /usr/obj/usr/src/sys/SHOE/modules/usr/src/sys/modules/ntfs/ntfs.ko.debug Reading symbols from /boot/kernel/bktr.ko...done. Loaded symbols for /boot/kernel/bktr.ko #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 240 dumping++; (kgdb) bt #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240 #1 0xc050f482 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372 #2 0xc050f7d8 in panic () at /usr/src/sys/kern/kern_shutdown.c:550 #3 0xc05174e5 in mi_switch () at /usr/src/sys/kern/kern_synch.c:470 #4 0xc050f16e in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:312 #5 0xc050f7d8 in panic () at /usr/src/sys/kern/kern_shutdown.c:550 #6 0xc0535e19 in witness_lock (lock=0xc075c8a0, flags=8, file=0xc06c4d72 /usr/src/sys/kern/sys_generic.c, line=1145) at /usr/src/sys/kern/subr_witness.c:609 #7 0xc0505aaa in _mtx_lock_flags (m=0xc075c8a0, opts=0, file=0xc06f10bc äLmÀ\t, line=-1009577024) at /usr/src/sys/kern/kern_mutex.c:221 #8 0xc053a016 in selrecord (selector=0xc3d313c0, sip=0xc075c8a0) at /usr/src/sys/kern/sys_generic.c:1145 #9 0xc4174981 in bktr_close () from /boot/kernel/bktr.ko #10 0xc04c6650 in spec_poll (ap=0xd8cdbafc) at /usr/src/sys/fs/specfs/spec_vnops.c:379 #11 0xc04c5a88 in spec_vnoperate (ap=0x0) at /usr/src/sys/fs/specfs/spec_vnops.c:122 #12 0xc057635c in vn_poll (fp=0x0, events=0, active_cred=0xc412b080, td=0x0) at vnode_if.h:537 #13 0xc0539851 in selscan (td=0xc3d313c0, ibits=0xd8cdbb9c, obits=0xd8cdbb8c, nfd=6) at /usr/src/sys/sys/file.h:272 #14 0xc05393bf in kern_select (td=0xc3d313c0, nd=6, fd_in=0xbfbfeb90, fd_ou=0x0, fd_ex=0x0, tvp=0x0) at /usr/src/sys/kern/sys_generic.c:816 #15 0xc0539026 in select (td=0x0, uap=0xd8cdbd14) at /usr/src/sys/kern/sys_generic.c:720 #16 0xc06827e0 in syscall (frame= {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1, tf_esi = 134598312, tf_ebp = -1077941208, tf_isp = -657605260, tf_ebx = 0, tf_edx = 6, tf_ecx = -1077941360, tf_eax = 93, tf_trapno = 12, tf_err = 2, tf_eip = 673044815, tf_cs = 31, tf_eflags = 658, tf_esp = -1077941444, tf_ss = 47}) at /usr/src/sys/i386/i386/trap.c:1010 #17 0xc067444d in Xint0x80_syscall () at {standard input}:136 (kgdb) ___ [EMAIL PROTECTED] mailing list
Re: panic on 5.2 BETA: blockable sleep lock
On Fri, 2003-11-28 at 10:31, Stefan Ehmann wrote: On Fri, 2003-11-28 at 01:02, Don Lewis wrote: On 27 Nov, Stefan Ehmann wrote: On Wed, 2003-11-26 at 08:33, Don Lewis wrote: The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). Strange enough that does not seem to happen with a kernel built without INVARIANTS and WITNESS. Does this make any sense or is this just by chance? You might try the patch below with WITNESS enabled. I don't have the hardware, so I can't test it. It compiles for me, but for all I know it could delete all your files if you run it. Unfortunately, after running the patched kernel some time I got a slightly different panic: Please ignore the message above - this was the panic from an unpatched kernel - I was debugging the wrong core. Thanks again for quick help. ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: panic on 5.2 BETA: blockable sleep lock
On Wed, 2003-11-26 at 08:33, Don Lewis wrote: The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). Strange enough that does not seem to happen with a kernel built without INVARIANTS and WITNESS. Does this make any sense or is this just by chance? ___ [EMAIL PROTECTED] mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: panic on 5.2 BETA: blockable sleep lock
On 27 Nov, Stefan Ehmann wrote: On Wed, 2003-11-26 at 08:33, Don Lewis wrote: The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). Strange enough that does not seem to happen with a kernel built without INVARIANTS and WITNESS. Does this make any sense or is this just by chance? You might try the patch below with WITNESS enabled. I don't have the hardware, so I can't test it. It compiles for me, but for all I know it could delete all your files if you run it. Index: sys/dev/bktr/bktr_core.c === RCS file: /home/ncvs/src/sys/dev/bktr/bktr_core.c,v retrieving revision 1.131 diff -u -r1.131 bktr_core.c --- sys/dev/bktr/bktr_core.c9 Nov 2003 09:17:21 - 1.131 +++ sys/dev/bktr/bktr_core.c27 Nov 2003 23:58:19 - @@ -526,6 +526,9 @@ } #endif /* FreeBSD or BSDi */ +#ifdef USE_VBIMUTEX + mtx_init(bktr-vbimutex, bktr vbi lock, NULL, MTX_DEF); +#endif /* If this is a module, save the current contiguous memory */ #if defined(BKTR_FREEBSD_MODULE) @@ -807,6 +810,7 @@ * both Odd and Even VBI data is captured. Therefore we do this * in the Even field interrupt handler. */ + LOCK_VBI(bktr); if ( (bktr-vbiflags VBI_CAPTURE) (bktr-vbiflags VBI_OPEN) (field==EVEN_F)) { @@ -826,6 +830,7 @@ } + UNLOCK_VBI(bktr); /* * Register the completed field @@ -1066,8 +1071,13 @@ int vbi_open( bktr_ptr_t bktr ) { - if (bktr-vbiflags VBI_OPEN) /* device is busy */ + + LOCK_VBI(bktr); + + if (bktr-vbiflags VBI_OPEN) {/* device is busy */ + UNLOCK_VBI(bktr); return( EBUSY ); + } bktr-vbiflags |= VBI_OPEN; @@ -1081,6 +1091,8 @@ bzero((caddr_t) bktr-vbibuffer, VBI_BUFFER_SIZE); bzero((caddr_t) bktr-vbidata, VBI_DATA_SIZE); + UNLOCK_VBI(bktr); + return( 0 ); } @@ -1166,8 +1178,12 @@ vbi_close( bktr_ptr_t bktr ) { + LOCK_VBI(bktr); + bktr-vbiflags = ~VBI_OPEN; + UNLOCK_VBI(bktr); + return( 0 ); } @@ -1232,19 +1248,32 @@ int vbi_read(bktr_ptr_t bktr, struct uio *uio, int ioflag) { - int readsize, readsize2; + int readsize, readsize2, start; int status; + /* +* XXX - vbi_read() should be protected against being re-entered +* while it is unlocked for the uiomove. +*/ + LOCK_VBI(bktr); while(bktr-vbisize == 0) { if (ioflag IO_NDELAY) { - return EWOULDBLOCK; + status = EWOULDBLOCK; + goto out; } bktr-vbi_read_blocked = TRUE; +#ifdef USE_VBIMUTEX + if ((status = msleep(VBI_SLEEP, bktr-vbimutex, VBIPRI, vbi, + 0))) { + goto out; + } +#else if ((status = tsleep(VBI_SLEEP, VBIPRI, vbi, 0))) { - return status; + goto out; } +#endif } /* Now we have some data to give to the user */ @@ -1262,19 +1291,28 @@ /* We need to wrap around */ readsize2 = VBI_BUFFER_SIZE - bktr-vbistart; - status = uiomove((caddr_t)bktr-vbibuffer + bktr-vbistart, readsize2, uio); - status += uiomove((caddr_t)bktr-vbibuffer, (readsize - readsize2), uio); + start = bktr-vbistart; + UNLOCK_VBI(bktr); + status = uiomove((caddr_t)bktr-vbibuffer + start, readsize2, uio); + if (status == 0) + status = uiomove((caddr_t)bktr-vbibuffer, (readsize - readsize2), uio); } else { + UNLOCK_VBI(bktr); /* We do not need to wrap around */ status = uiomove((caddr_t)bktr-vbibuffer + bktr-vbistart, readsize, uio); } + LOCK_VBI(bktr); + /* Update the number of bytes left to read */ bktr-vbisize -= readsize; /* Update vbistart */ bktr-vbistart += readsize; bktr-vbistart = bktr-vbistart % VBI_BUFFER_SIZE; /* wrap around if needed */ + +out: + UNLOCK_VBI(bktr); return( status ); Index: sys/dev/bktr/bktr_os.c === RCS file: /home/ncvs/src/sys/dev/bktr/bktr_os.c,v retrieving revision 1.39 diff -u -r1.39 bktr_os.c --- sys/dev/bktr/bktr_os.c 2 Sep 2003 17:30:34 - 1.39 +++ sys/dev/bktr/bktr_os.c 27 Nov 2003 23:49:39 - @@ -499,6 +499,9 @@ printf(bktr%d: i2c_attach: can't attach\n,
Re: panic on 5.2 BETA: blockable sleep lock
On 26 Nov, Stefan Ehmann wrote: I got the following panic twice when starting xawtv using 5.2 BETA (CVS from Oct 23) panic: blockable sleep lock (sleep mutex) sellck @/usr/src/sys/kern/sys_generic.c:1145 I don't think it is directly related to bktr since the last commit there was ~3 months ago. Here is the dmesg output for completeness though. bktr0: BrookTree 878 mem 0xdf103000-0xdf103fff irq 10 at device 12.0 on pci0 bktr0: Card has no configuration EEPROM. Cannot determine card make. bktr0: IMS TV Turbo, Philips FR1236 NTSC FM tuner. Here is a backtrace: GNU gdb 5.2.1 (FreeBSD) Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-undermydesk-freebsd... panic: blockable sleep lock (sleep mutex) sellck @ /usr/src/sys/kern/sys_generic.c:1145 panic messages: -- panic: blockable sleep lock (sleep mutex) sellck @ /usr/src/sys/kern/sys_generic.c:1145 syncing disks, buffers remaining... 3023 3023 panic: mi_switch: switch in a critical section #5 0xc05153b7 in panic () at /usr/src/sys/kern/kern_shutdown.c:550 #6 0xc053c010 in witness_lock (lock=0xc076cf40, flags=8, file=0xc06d2139 /usr/src/sys/kern/sys_generic.c, line=1145) at /usr/src/sys/kern/subr_witness.c:609 #7 0xc050b57a in _mtx_lock_flags (m=0xc3a56dc0, opts=0, file=0xc06ff79c \037#nÀ\t, line=-1065955520) at /usr/src/sys/kern/kern_mutex.c:221 #8 0xc0540436 in selrecord (selector=0xc076cf40, sip=0xc3a56dc0) at /usr/src/sys/kern/sys_generic.c:1145 #9 0xc08509f1 in bktr_poll () from /boot/kernel/bktr.ko The bktr_poll() implementation invokes the macros DISABLE_INTR() and ENABLE_INTR() as a wrapper around a block of code that calls selrecord(). DISABLE_INTR(s); if (events (POLLIN | POLLRDNORM)) { switch ( FUNCTION( minor(dev) ) ) { case VBI_DEV: if(bktr-vbisize == 0) selrecord(td, bktr-vbi_select); else revents |= events (POLLIN | POLLRDNORM); break; } } ENABLE_INTR(s); The implementation of DISABLE_INTR() and ENABLE_INTR() is defined in bktr_os.h as: #if defined(__FreeBSD__) #if (__FreeBSD_version =50) #define DECLARE_INTR_MASK(s)/* no need to declare 's' */ #define DISABLE_INTR(s) critical_enter() #define ENABLE_INTR(s) critical_exit() #else The man page for critical_enter() and friends says: DESCRIPTION These functions are used to prevent preemption in a critical region of code. All that is guaranteed is that the thread currently executing on a CPU will not be preempted. Specifically, a thread in a critical region will not migrate to another CPU while it is in a critical region. The current CPU may still trigger faults and exceptions during a critical section; however, these faults are usually fatal. The cpu_critical_enter() and cpu_critical_exit() functions provide the machine dependent disabling of preemption, normally by disabling inter- rupts on the local CPU. The critical_enter() and critical_exit() functions provide a machine independent wrapper around the machine dependent API. This wrapper cur- rently saves state regarding nested critical sections. Nearly all code should use these versions of the API. Note that these functions are not required to provide any inter-CPU syn- chronization, data protection, or memory ordering guarantees and thus should not be used to protect shared data structures. These functions should be used with care as an infinite loop within a critical region will deadlock the CPU. Also, they should not be inter- locked with operations on mutexes, sx locks, semaphores, or other syn- chronization primitives. The problem is that selrecord() wants to lock a MTX_DEF mutex, which can cause a context switch if the mutex is already locked by another thread. This is contrary to what bktr_poll() wants to accomplish by calling critical_enter(). I'm guessing that bktr_poll() wants to prevent bktr-vbisize from being updated by an interrupt while the above mentioned block of code is executing, possibly to prevent a select wakeup from being missed. If this is correct, the proper fix would probably be for a mutex to be added to the bktr structure, and the DISABLE_INTR() and ENABLE_INTR() calls above to lock and unlock this mutex. Other places in the code that manipulate bktr-vbisize should grab the mutex before doing so, and there are places in the code that are currently calling tsleep() that should probably be