pcm(4) related panic
Hello,

On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic like this:

(watch out for folded lines; the stack backtrace below is rewritten by hand from ddb)

lock order reversal
1st 0xc22a45ac vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
2nd 0xc06c0420 swap_pager swhash (swap_pager swhash) @ \
/usr/src/sys/vm/swap_pager.c:1838
3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
Stack backtrace:
backtrace
witness_lock
_mtx_lock_flags
obj_alloc
slab_zalloc
uma_zone_slab
uma_zalloc_internal
uma_zalloc_arg
swp_pager_meta_build
swap_pager_putpages
default_pager_putpages
vm_pageout_flush
vm_pageout_clean
vm_pageout_scan
vm_pageout
fork_exit
fork_trampoline

Sleeping on swread with the following non-sleepable locks held:
exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
/usr/src/sys/dev/sound/pcm/dsp.c:146
panic: sleeping thread (pid 583) owns a non-sleepable lock
syncing disks, buffers remaining... 1410 1410 panic: mi_switch: \
switch in a critical section
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
[... repeated few more times]
Fatal double fault:
eip = 0xc05e3916
esp = 0xc8db8ff4
ebp = 0xc8db9004
panic: double fault
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
panic: msleep
Uptime: 1m45s
[...]

And the machine suddenly reboots, so there is no coredump. The eip address points close to:

c05e3910 T sc_vtb_putc

To reproduce this panic, just start some audio player app (like xmms) and launch countless memory-eating applications (like mozilla ;). The machine starts swapping, and it panics.
% uname -a
FreeBSD kaszanka.domek 5.2-BETA FreeBSD 5.2-BETA #0: Sun Nov 23 01:23:10\
CET 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/KASZANKA i386

dmesg fragments:

CPU: AMD Athlon(tm) XP 2000+ (1666.73-MHz 686-class CPU)
pcm0: AudioPCI ES1373-B port 0xec00-0xec3f irq 10 at device 8.0 on pci0
pcm0: Cirrus Logic CS4297A AC97 Codec
rl0: RealTek 8139 10/100BaseTX port 0xe800-0xe8ff mem \
0xdf00-0xdfff irq 10 at device 10.0 on pci0
miibus0: MII bus on rl0
rlphy0: RealTek internal media interface on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl1: RealTek 8139 10/100BaseTX port 0xe400-0xe4ff mem \
0xde00-0xdeff irq 10 at device 11.0 on pci0
rlphy1: RealTek internal media interface on miibus1
rlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

Regards,
Artur
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: pcm(4) related panic
Artur Poplawski [EMAIL PROTECTED] wrote:

Hello, On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic like this:
[...]

In the meantime I've managed to get a coredump, by directly calling doadump() from ddb. Results:

[EMAIL PROTECTED]:/usr/obj/usr/src/sys/KASZANKA# gdb -k kernel.debug /var/crash/vmcore.0
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as i386-undermydesk-freebsd...
panic: sleeping thread (pid 568) owns a non-sleepable lock
panic messages:
---
panic: sleeping thread (pid 568) owns a non-sleepable lock

syncing disks, buffers remaining... panic: msleep
Dumping 128 MB
16 32 48 64 80 96 112
---
Reading symbols from /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug...done.
Loaded symbols for /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linprocfs/linprocfs.ko.debug
Reading symbols from /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linux/linux.ko.debug...done.
Loaded symbols for /usr/obj/usr/src/sys/KASZANKA/modules/usr/src/sys/modules/linux/linux.ko.debug
Reading symbols from /boot/kernel/netgraph.ko...done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_ether.ko...done.
Loaded symbols for /boot/kernel/ng_ether.ko
Reading symbols from /boot/kernel/ng_pppoe.ko...done.
Loaded symbols for /boot/kernel/ng_pppoe.ko
Reading symbols from /boot/kernel/ng_socket.ko...done.
Loaded symbols for /boot/kernel/ng_socket.ko
Reading symbols from /boot/kernel/mga.ko...done.
Loaded symbols for /boot/kernel/mga.ko
#0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240
240 dumping++;
(kgdb) where
#0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1 0xc04292cd in db_fncall (dummy1=0, dummy2=0, dummy3=0, dummy4=0xc8dba7bc à×hÀ) at /usr/src/sys/ddb/db_command.c:548
#2 0xc042906a in db_command (last_cmdp
Re: pcm(4) related panic
Mathew Kanner [EMAIL PROTECTED] wrote:

On Nov 25, Don Lewis wrote:

On 25 Nov, Don Lewis wrote:

On 25 Nov, Artur Poplawski wrote:

Artur Poplawski [EMAIL PROTECTED] wrote:

Hello, On 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic like this:

Sleeping on swread with the following non-sleepable locks held:
exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \
/usr/src/sys/dev/sound/pcm/dsp.c:146

This enables the panic.

panic: sleeping thread (pid 583) owns a non-sleepable lock

Then the panic happens when another thread tries to grab the mutex.

The problem is that the pcm code attempts to hold a mutex across a call to uiomove(), which can sleep if the userland buffer that it is trying to access is paged out. Either the buffer has to be pre-wired before calling getchns(), or the mutex has to be dropped around the call to uiomove(). The amount of memory to be wired should be limited to 'sz' as calculated by chn_read() and chn_write(), which complicates the logic. Dropping the mutex probably has other issues.

Following up to myself ...

It might be safe to drop the mutex for the uiomove() call if the code set flags to enforce a limit of one reader and one writer at a time to keep the code from being re-entered. The buffer pointer manipulations in sndbuf_dispose() and sndbuf_acquire() would probably still have to be protected by the mutex.

If this can be made to work, it would probably be preferable to wiring the buffer. It would have a lot less CPU overhead, and would work better with large buffers, which could still be allowed to page normally.

Don, I never would have suspected that uio might sleep and panic, thanks for the clue.

Artur, could you try the attached patch?

I've tried the patch -- and it works great! :-) I was unable to trigger the panic with the patch applied, although I tried really hard -- so I guess the problem is solved.

Mat and Don, I'm really very thankful for your help.
Best regards,
Artur
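The approach Don describes above (drop the mutex around the copy that may sleep, and use a flag to keep a second writer out) can be illustrated with a small userland analogue. This is only a sketch under stated assumptions, not the patch that was tested in this thread: `struct chan`, `chan_init()`, `chan_write()`, `copy_may_sleep()` and the `wr_busy` flag are hypothetical names, a pthread mutex stands in for the pcm channel mutex, and `memcpy()` stands in for uiomove().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CHAN_BUFSZ 4096

/* Hypothetical stand-in for a pcm channel; not the real driver structure. */
struct chan {
    pthread_mutex_t lock;     /* plays the role of the channel mutex */
    bool            wr_busy;  /* true while one writer is inside the copy */
    size_t          fill;     /* bytes currently queued in buf */
    unsigned char   buf[CHAN_BUFSZ];
};

static void chan_init(struct chan *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->wr_busy = false;
    c->fill = 0;
}

/* Stand-in for uiomove(): in the kernel this copy may fault and sleep. */
static void copy_may_sleep(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/*
 * Queue up to len bytes. Returns the number of bytes queued, or -EBUSY
 * if another writer is already in the middle of its copy.
 */
static long chan_write(struct chan *c, const void *src, size_t len)
{
    size_t off, sz;

    assert(c != NULL && src != NULL);

    /* Buffer bookkeeping is read and reserved under the mutex. */
    pthread_mutex_lock(&c->lock);
    if (c->wr_busy) {
        pthread_mutex_unlock(&c->lock);
        return -EBUSY;
    }
    off = c->fill;
    sz = CHAN_BUFSZ - off;
    if (len < sz)
        sz = len;
    c->wr_busy = true;   /* keeps a second writer from re-entering */
    pthread_mutex_unlock(&c->lock);

    /* The mutex is NOT held here, so sleeping would be harmless. */
    copy_may_sleep(c->buf + off, src, sz);

    /* Retake the mutex only to publish the new fill level. */
    pthread_mutex_lock(&c->lock);
    c->fill = off + sz;
    c->wr_busy = false;
    pthread_mutex_unlock(&c->lock);

    return (long)sz;
}
```

In this sketch only writers touch `fill`; a real driver also has a consumer (the interrupt path) moving the buffer pointers, which is why Don notes that the pointer manipulations would still need the mutex. A read path would presumably need its own busy flag, mirroring the "one reader and one writer" limit mentioned above.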
Re: please test pcm patch
Mathew Kanner [EMAIL PROTECTED] wrote:

Hello All,

Please test a pcm patch that releases the channel lock around calls to uiomove. This is a more complete patch than the previous one as it also does the _read routine. I will ask the RE to commit this if I hear a couple of "it works".

It works. ;-)

Artur
Re: LOR w/5.2-BETA
On Thu, 27 Nov 2003 18:01:39 -0500 Jesse Guardiani [EMAIL PROTECTED] wrote:

I got this LOR today after upgrading my IBM Thinkpad A30p from 5.1-RELEASE to 5.2-BETA:

lock order reversal
1st 0xc43d8ad4 vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
2nd 0xc098cf60 swap_pager swhash (swap_pager swhash) @ /usr/src/sys/vm/swap_pager.c:1838
3rd 0xc10368c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
Stack backtrace:
backtrace(c089b8b5,c10368c4,c08afa80,c08afa80,c08b0961) at backtrace+0x17
witness_lock(c10368c4,8,c08b0961,36c,c3c7da80) at witness_lock+0x672
_mtx_lock_flags(c10368c4,0,c08b0961,36c,c3c7da94) at _mtx_lock_flags+0xba
obj_alloc(c3c7da80,1000,d2a6ba03,101,c095b220) at obj_alloc+0x3f
slab_zalloc(c3c7da80,1,8,c08b0961,68c) at slab_zalloc+0xb3
uma_zone_slab(c3c7da80,1,c08b0961,68c,c3c7db20) at uma_zone_slab+0xd6
uma_zalloc_internal(c3c7da80,0,1,5c1,c08ae87a,72e) at uma_zalloc_internal+0x3e
uma_zalloc_arg(c3c7da80,0,1,72e,2) at uma_zalloc_arg+0x3ab
swp_pager_meta_build(c43d8ad4,19,0,2,0) at swp_pager_meta_build+0x174
swap_pager_putpages(c43d8ad4,d2a6bbd0,1,0,d2a6bb40) at swap_pager_putpages+0x32d
default_pager_putpages(c43d8ad4,d2a6bbd0,1,0,d2a6bb40) at default_pager_putpages+0x2e
vm_pageout_flush(d2a6bbd0,1,0,eb,0) at vm_pageout_flush+0x17a
vm_pageout_clean(c160b7a8,0,c08b077c,32a,0) at vm_pageout_clean+0x305
vm_pageout_scan(0,0,c08b077c,5a9,1f4) at vm_pageout_scan+0x64c
vm_pageout(0,d2a6bd48,c0895fd8,311,0) at vm_pageout+0x31b
fork_exit(c07e7e60,0,d2a6bd48) at fork_exit+0xb4
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd2a6bd7c, ebp = 0 ---

Exactly the same thing here:

lock order reversal
1st 0xc0c1f738 vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
2nd 0xc06c06a0 swap_pager swhash (swap_pager swhash) @ /usr/src/sys/vm/swap_pager.c:1838
3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
Stack backtrace:
backtrace(c0632e21,c0c358c4,c063ff1c,c063ff1c,c0640dfd) at backtrace+0x17
witness_lock(c0c358c4,8,c0640dfd,36c,c0c223c0) at witness_lock+0x671
_mtx_lock_flags(c0c358c4,0,c0640dfd,36c,c0c223d4) at _mtx_lock_flags+0xb1
obj_alloc(c0c223c0,1000,c8daba0b,101,0) at obj_alloc+0x3f
slab_zalloc(c0c223c0,1,c0c223d4,8,c0640dfd) at slab_zalloc+0xb1
uma_zone_slab(c0c223c0,1,c0640dfd,68c,c0c22460) at uma_zone_slab+0xd3
uma_zalloc_internal(c0c223c0,0,1,5c1,72e) at uma_zalloc_internal+0x3e
uma_zalloc_arg(c0c223c0,0,1,72e,2) at uma_zalloc_arg+0x39e
swp_pager_meta_build(c0c1f738,0,0,2,0) at swp_pager_meta_build+0x174
swap_pager_putpages(c0c1f738,c8dabbd0,1,0,c8dabb40) at swap_pager_putpages+0x31d
default_pager_putpages(c0c1f738,c8dabbd0,1,0,c8dabb40) at default_pager_putpages+0x2e
vm_pageout_flush(c8dabbd0,1,0,eb,60a) at vm_pageout_flush+0x171
vm_pageout_clean(c0dbabf0,0,c0640c18,32a,0) at vm_pageout_clean+0x2f5
vm_pageout_scan(1,0,c0640c18,5a9,1f4) at vm_pageout_scan+0x655
vm_pageout(0,c8dabd48,c062d6bf,311,0) at vm_pageout+0x318
fork_exit(c05cab80,0,c8dabd48) at fork_exit+0xb4
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xc8dabd7c, ebp = 0 ---

And another one, always showing up at the end of booting process:

lock order reversal
1st 0xc205a764 rtentry (rtentry) @ /usr/src/sys/net/rtsock.c:387
2nd 0xc1c5d47c radix node head (radix node head) @ /usr/src/sys/net/route.c:133
Stack backtrace:
backtrace(c0632e08,c1c5d47c,c063873b,c063873b,c0638791) at backtrace+0x17
witness_lock(c1c5d47c,8,c0638791,85,c1c4ea20) at witness_lock+0x671
_mtx_lock_flags(c1c5d47c,0,c0638791,85,121) at _mtx_lock_flags+0xb1
rtalloc1(c2073e6c,1,0,436,0) at rtalloc1+0x72
rt_setgate(c205a700,c1c4ea20,c2073e6c,184,0) at rt_setgate+0x24c
route_output(c0fb9600,c1d5c690,7c,c0fb9600,1f84) at route_output+0x664
raw_usend(c1d5c690,0,c0fb9600,0,0) at raw_usend+0x73
rts_send(c1d5c690,0,c0fb9600,0,0) at rts_send+0x35
sosend(c1d5c690,0,c8dd9c80,c0fb9600,0) at sosend+0x41d
soo_write(c1ce150c,c8dd9c80,c2073e80,0,c0fad3c0) at soo_write+0x70
dofilewrite(c0fad3c0,c1ce150c,3,bfbfde10,7c) at dofilewrite+0xec
write(c0fad3c0,c8dd9d14,c0644e02,3ee,3) at write+0x6e
syscall(2f,2f,2f,3,3) at syscall+0x292
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (4), eip = 0x2827e6cf, esp = 0xbfbfddcc, ebp = 0xbfbfddf8 ---

Artur