RE: Anybody else seeing a broken /dev/lpt with SMP on -current?
On Fri, 12 Jan 2001, John Baldwin wrote:

> On 13-Jan-01 Jordan Hubbard wrote:
> > I've actually been seeing this for about 2 months now but only just
> > now got motivated enough to enable crashdumps and get some
> > information on what happens whenever I try to use the printer
> > attached to my (sadly :) -current SMP box:
> >
> > IdlePTD 3682304
> > initial pcb at 2e70e0
> > panicstr: page fault
> > panic messages:
> > ---
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; lapic.id =
> > fault virtual address = 0x8640
> > fault code = supervisor write, page not present
> > instruction pointer = 0x8:0xc8dc8676
> > stack pointer = 0x10:0xc8280f88
> > frame pointer = 0x10:0xc8280f9c
> > code segment = base 0x0, limit 0xf, type 0x1b
> >  = DPL 0, pres 1, def32 1, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 12322 (irq7: lpt0)
> > trap number = 12
> > panic: page fault
> > cpuid = 0; lapic.id =
> > boot() called on cpu#0
> >
> > If anybody wants a fuller traceback then I'll compile up a kernel
> > with debugging symbols, but it's going to be pretty sparse anyway
> > since it basically only shows the trap() from the page fault and
> > the subsequent panic.
>
> All the other traces show the kernel having returned to an address
> that is beyond the end of the kernel (which causes the page fault)
> meaning that the stack is fubar'd, so the trace isn't meaningful
> anyways. :(  Knowing how and why the lpd interrupt handler trashes
> the stack is the useful info, and with the stack already trashed, I
> don't know of an easy way to figure that out.  Suggestions welcome.

This may be caused by the lpt driver (ab)using BUS_SETUP_INTR() on every
write().  The interrupt system can't handle this.  I noticed the
following symptoms:

- stray irq7's from when the driver interrupt isn't attached
  (BUS_SETUP_INTR() for ppbus first tears down any previously set up
  handler).
- under UP, a slow memory leak from not freeing ih_name in
  inthand_remove().  Fixed in the enclosed patch.
- under SMP with 1 cpu, panics in various places due to the process
  table filling up with undead ithreads.  Worked around in the enclosed
  patch.  This bug should go away almost automatically when interrupt
  handling actually works.  Use something like
  "dd if=/dev/zero of=/dev/lpt0 bs=1" to see this bug.  Use a small
  value for kern.maxproc to see it quickly.
- "cp /dev/zero /dev/lpt0" caused about 50% interrupt overhead.  Under
  UP, interactive response was not noticeably affected, but under SMP
  with 1 cpu, echoing of keystrokes in /bin/sh in single user mode took
  a few hundred msec.

Index: dev/ppbus/lpt.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ppbus/lpt.c,v
retrieving revision 1.20
diff -c -2 -r1.20 lpt.c
*** dev/ppbus/lpt.c 2000/12/07 22:33:12 1.20
--- dev/ppbus/lpt.c 2001/01/15 02:44:40
***************
*** 70,73 ****
--- 70,76 ----
  #include <sys/conf.h>
  #include <sys/kernel.h>
+ #include <sys/mutex.h>
+ #include <sys/proc.h>
+ #include <sys/resourcevar.h>
  #include <sys/uio.h>
  #include <sys/syslog.h>
***************
*** 759,762 ****
--- 762,797 ----
  			device_printf(lptdev, "handler registration failed, polled mode.\n");
  			sc->sc_irq &= ~LP_USE_IRQ;
+ 		}
+ 
+ 		/*
+ 		 * XXX setting up interrupts is a very expensive operation and
+ 		 * shouldn't be done here.  Despite its name, BUS_SETUP_INTR()
+ 		 * for this bus both sets up and tears down interrupts (it
+ 		 * first tears down any already-setup interrupt).  This
+ 		 * involves exiting from any existing ithread and starting a
+ 		 * new one.  The exit is done lazily, and at least under SMP,
+ 		 * writing tinygrams resulted in ithreads being created faster
+ 		 * than they were destroyed, resulting in assorted panics
+ 		 * depending on where the resource exhaustion was detected.
+ 		 *
+ 		 * Yield so that the ithreads get a chance to exit.
+ 		 *
+ 		 * XXX following grot cloned from uio_yield().
+ 		 */
+ 		{
+ 			struct proc *p;
+ 			int s;
+ 
+ 			p = curproc;
+ 			s = splhigh();
+ 			mtx_enter(&sched_lock, MTX_SPIN);
+ 			DROP_GIANT_NOSWITCH();
+ 			p->p_priority = p->p_usrpri;
+ 			setrunqueue(p);
+ 			p->p_stats->p_ru.ru_nivcsw++;
+ 			mi_switch();
+ 			mtx_exit(&sched_lock, MTX_SPIN);
+ 			PICKUP_GIANT();
+ 			splx(s);
  		}
  	}
Index: i386/isa/intr_machdep.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/isa/intr_machdep.c,v
retrieving
Re: Anybody else seeing a broken /dev/lpt with SMP on -current?
Hi Bruce,

I applied the patch to dev/ppbus/lpt.c and sys/i386/isa/intr_machdep.c.

Before the patch, I got the lpt failure almost immediately.

  df | lpr
  df | lpr
  df | lpr
  lpt .cshrc

would normally do it.

After the patch, it took lots more activity.  I did the above a
half-dozen times, successfully, and then:

  foreach i ( 1 2 3 4 5 6 7 8 9 a b c )
    df | lpr
  end
  printf "\f" | lpr

and this failed.  I had 4 sets of df on the page left to be ejected in
the printer.

tomdean

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
RE: Anybody else seeing a broken /dev/lpt with SMP on -current?
On 14-Jan-01 Garance A Drosihn wrote:
> At 6:55 PM -0800 1/12/01, John Baldwin wrote:
> > On 13-Jan-01 Jordan Hubbard wrote:
> > > If anybody wants a fuller traceback then I'll compile up a
> > > kernel with debugging symbols, but it's going to be pretty
> > > sparse anyway since it basically only shows the trap() from
> > > the page fault and the subsequent panic.
> >
> > All the other traces show the kernel having returned to an
> > address that is beyond the end of the kernel (which causes the
> > page fault) meaning that the stack is fubar'd, so the trace isn't
> > meaningful anyways. :(  Knowing how and why the lpd interrupt
> > handler trashes the stack is the useful info, and with the stack
> > already trashed, I don't know of an easy way to figure that out.
>
> Do you really mean the "lpd interrupt handler", or do you mean
> the "lpt interrupt handler"?  Does this problem only happen when
> lpd is sending data thru /dev/lpt?

lpt interrupt handler, yes.

-- 
John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
RE: Anybody else seeing a broken /dev/lpt with SMP on -current?
At 6:55 PM -0800 1/12/01, John Baldwin wrote:
> On 13-Jan-01 Jordan Hubbard wrote:
> > If anybody wants a fuller traceback then I'll compile up a
> > kernel with debugging symbols, but it's going to be pretty
> > sparse anyway since it basically only shows the trap() from
> > the page fault and the subsequent panic.
>
> All the other traces show the kernel having returned to an
> address that is beyond the end of the kernel (which causes the
> page fault) meaning that the stack is fubar'd, so the trace isn't
> meaningful anyways. :(  Knowing how and why the lpd interrupt
> handler trashes the stack is the useful info, and with the stack
> already trashed, I don't know of an easy way to figure that out.

Do you really mean the "lpd interrupt handler", or do you mean
the "lpt interrupt handler"?  Does this problem only happen when
lpd is sending data thru /dev/lpt?

-- 
Garance Alistair Drosehn          =  [EMAIL PROTECTED]
Senior Systems Programmer         or [EMAIL PROTECTED]
Rensselaer Polytechnic Institute  or [EMAIL PROTECTED]
RE: Anybody else seeing a broken /dev/lpt with SMP on -current?
On 13-Jan-01 Jordan Hubbard wrote:
> I've actually been seeing this for about 2 months now but only just
> now got motivated enough to enable crashdumps and get some
> information on what happens whenever I try to use the printer
> attached to my (sadly :) -current SMP box:
>
> IdlePTD 3682304
> initial pcb at 2e70e0
> panicstr: page fault
> panic messages:
> ---
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; lapic.id =
> fault virtual address = 0x8640
> fault code = supervisor write, page not present
> instruction pointer = 0x8:0xc8dc8676
> stack pointer = 0x10:0xc8280f88
> frame pointer = 0x10:0xc8280f9c
> code segment = base 0x0, limit 0xf, type 0x1b
>  = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 12322 (irq7: lpt0)
> trap number = 12
> panic: page fault
> cpuid = 0; lapic.id =
> boot() called on cpu#0
>
> If anybody wants a fuller traceback then I'll compile up a kernel
> with debugging symbols, but it's going to be pretty sparse anyway
> since it basically only shows the trap() from the page fault and
> the subsequent panic.

All the other traces show the kernel having returned to an address that
is beyond the end of the kernel (which causes the page fault) meaning
that the stack is fubar'd, so the trace isn't meaningful anyways. :(
Knowing how and why the lpd interrupt handler trashes the stack is the
useful info, and with the stack already trashed, I don't know of an easy
way to figure that out.  Suggestions welcome.

> - Jordan

-- 
John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: Anybody else seeing a broken /dev/lpt with SMP on -current?
* John Baldwin [EMAIL PROTECTED] [010112 18:56] wrote:
> On 13-Jan-01 Jordan Hubbard wrote:
> > I've actually been seeing this for about 2 months now but only
> > just now got motivated enough to enable crashdumps and get some
> > information on what happens whenever I try to use the printer
> > attached to my (sadly :) -current SMP box:
>
> All the other traces show the kernel having returned to an address
> that is beyond the end of the kernel (which causes the page fault)
> meaning that the stack is fubar'd, so the trace isn't meaningful
> anyways. :(  Knowing how and why the lpd interrupt handler trashes
> the stack is the useful info, and with the stack already trashed, I
> don't know of an easy way to figure that out.  Suggestions welcome.

printf(9) :)

-Alfred
Re: Anybody else seeing a broken /dev/lpt with SMP on -current?
On 13-Jan-01 Alfred Perlstein wrote:
> * John Baldwin [EMAIL PROTECTED] [010112 18:56] wrote:
> > On 13-Jan-01 Jordan Hubbard wrote:
> > > I've actually been seeing this for about 2 months now but only
> > > just now got motivated enough to enable crashdumps and get some
> > > information on what happens whenever I try to use the printer
> > > attached to my (sadly :) -current SMP box:
> >
> > All the other traces show the kernel having returned to an
> > address that is beyond the end of the kernel (which causes the
> > page fault) meaning that the stack is fubar'd, so the trace isn't
> > meaningful anyways. :(  Knowing how and why the lpd interrupt
> > handler trashes the stack is the useful info, and with the stack
> > already trashed, I don't know of an easy way to figure that out.
> > Suggestions welcome.
>
> printf(9) :)

Maybe if I had a printer lying around. :)  I can send jkh some patches
to dump out stuff, but I was looking more for suggestions on making
sense of the crashdump, not just brute-forcing it. :-P

> -Alfred

-- 
John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/