RE: Anybody else seeing a broken /dev/lpt with SMP on -current?

2001-01-15 Thread Bruce Evans

On Fri, 12 Jan 2001, John Baldwin wrote:

> On 13-Jan-01 Jordan Hubbard wrote:
> > I've actually been seeing this for about 2 months now but only just
> > now got motivated enough to enable crashdumps and get some information
> > on what happens whenever I try to use the printer attached to my (sadly :)
> > -current SMP box:
> >
> > IdlePTD 3682304
> > initial pcb at 2e70e0
> > panicstr: page fault
> > panic messages:
> > ---
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; lapic.id = 
> > fault virtual address   = 0x8640
> > fault code              = supervisor write, page not present
> > instruction pointer     = 0x8:0xc8dc8676
> > stack pointer           = 0x10:0xc8280f88
> > frame pointer           = 0x10:0xc8280f9c
> > code segment            = base 0x0, limit 0xf, type 0x1b
> >                         = DPL 0, pres 1, def32 1, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 12322 (irq7: lpt0)
> > trap number             = 12
> > panic: page fault
> > cpuid = 0; lapic.id = 
> > boot() called on cpu#0
> >
> > If anybody wants a fuller traceback then I'll compile up a kernel with
> > debugging symbols, but it's going to be pretty sparse anyway since it
> > basically only shows the trap() from the page fault and the subsequent
> > panic.
>
> All the other traces show the kernel having returned to an address that is
> beyond the end of the kernel (which causes the page fault) meaning that the
> stack is fubar'd, so the trace isn't meaningful anyways. :(  Knowing how and
> why the lpd interrupt handler trashes the stack is the useful info, and with
> the stack already trashed, I don't know of an easy way to figure that out.
> Suggestions welcome.

This may be caused by the lpt driver (ab)using BUS_SETUP_INTR() on every
write().  The interrupt system can't handle this.  I noticed the following
symptoms:
- stray irq7's from when the driver interrupt isn't attached (BUS_SETUP_INTR()
  for ppbus first tears down any previously set up handler).
- under UP, a slow memory leak from not freeing ih_name in inthand_remove().
  Fixed in the enclosed patch.
- under SMP with 1 cpu, panics in various places due to the process table
  filling up with undead ithreads.  Worked around in the enclosed patch.
  This bug should go away almost automatically when interrupt handling
  actually works.  Use something like "dd if=/dev/zero of=/dev/lpt0 bs=1"
  to see this bug.  Use a small value for kern.maxproc to see it quickly.
- "cp /dev/zero /dev/lpt0 " caused about 50% interrupt overhead.  Under
  UP, interactive response was not noticeably affected, but under SMP with
  1 cpu, echoing of keystrokes in /bin/sh in single user mode took a few
  hundred msec.
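
For anyone who would rather drive the reproduction from C than from dd,
here is a minimal sketch of the same tinygram pattern: one write(2) call
per byte.  The helper name write_tinygrams() is made up for illustration
and is not part of any patch in this thread; it assumes nothing beyond
POSIX write(2).

```c
#include <errno.h>
#include <unistd.h>

/*
 * Write `count` zero bytes to fd, one byte per write(2) call, mimicking
 * "dd if=/dev/zero of=<device> bs=1".  Returns the number of bytes
 * actually written; stops early on the first error other than EINTR.
 * (Hypothetical helper, for illustration only.)
 */
static long
write_tinygrams(int fd, long count)
{
	const char zero = 0;
	long done = 0;

	while (done < count) {
		ssize_t n = write(fd, &zero, 1);

		if (n == 1)
			done++;		/* one tinygram went out */
		else if (n < 0 && errno == EINTR)
			continue;	/* interrupted, retry the same byte */
		else
			break;		/* real error or zero-length write */
	}
	return (done);
}
```

Opening /dev/lpt0 write-only and calling write_tinygrams(fd, 100000)
should produce about the same stream of one-byte writes as the dd command
above.  Whether that triggers the panic depends on the kernel in use.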

Index: dev/ppbus/lpt.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ppbus/lpt.c,v
retrieving revision 1.20
diff -c -2 -r1.20 lpt.c
*** dev/ppbus/lpt.c 2000/12/07 22:33:12 1.20
--- dev/ppbus/lpt.c 2001/01/15 02:44:40
***************
*** 70,73 ****
--- 70,76 ----
  #include <sys/conf.h>
  #include <sys/kernel.h>
+ #include <sys/mutex.h>
+ #include <sys/proc.h>
+ #include <sys/resourcevar.h>
  #include <sys/uio.h>
  #include <sys/syslog.h>
***************
*** 759,762 ****
--- 762,797 ----
  			device_printf(lptdev, "handler registration failed, polled mode.\n");
  			sc->sc_irq &= ~LP_USE_IRQ;
+ 		}
+ 
+ 		/*
+ 		 * XXX setting up interrupts is a very expensive operation and
+ 		 * shouldn't be done here.  Despite its name, BUS_SETUP_INTR()
+ 		 * for this bus both sets up and tears down interrupts (it
+ 		 * first tears down any already-setup interrupt).  This
+ 		 * involves exiting from any existing ithread and starting a
+ 		 * new one.  The exit is done lazily, and at least under SMP,
+ 		 * writing tinygrams resulted in ithreads being created faster
+ 		 * than they were destroyed, resulting in assorted panics
+ 		 * depending on where the resource exhaustion was detected.
+ 		 *
+ 		 * Yield so that the ithreads get a chance to exit.
+ 		 *
+ 		 * XXX following grot cloned from uio_yield().
+ 		 */
+ 		{
+ 			struct proc *p;
+ 			int s;
+ 
+ 			p = curproc;
+ 			s = splhigh();
+ 			mtx_enter(&sched_lock, MTX_SPIN);
+ 			DROP_GIANT_NOSWITCH();
+ 			p->p_priority = p->p_usrpri;
+ 			setrunqueue(p);
+ 			p->p_stats->p_ru.ru_nivcsw++;
+ 			mi_switch();
+ 			mtx_exit(&sched_lock, MTX_SPIN);
+ 			PICKUP_GIANT();
+ 			splx(s);
  		}
  	}
Index: i386/isa/intr_machdep.c
===================================================================
RCS file: /home/ncvs/src/sys/i386/isa/intr_machdep.c,v
retrieving 
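
The comment in the lpt.c hunk describes ithreads being created faster than
they exit.  As a rough user-space illustration only (a toy counting model,
not kernel code and not part of the patch), the sketch below assumes that
each write creates one replacement ithread, leaves the old one waiting to
exit, and that waiting ithreads only go away when the writer yields.  The
function name and the model itself are assumptions made for illustration.

```c
/*
 * Toy model of the resource exhaustion described above.
 *
 * maxproc          - size of the (pretend) process table
 * nwrites          - number of write() calls to simulate
 * yield_each_write - nonzero to let dying ithreads exit after each write
 *
 * Returns the index of the write at which the table would overflow, or
 * nwrites if it never overflows.  (Hypothetical, illustration only.)
 */
static int
writes_until_exhaustion(int maxproc, int nwrites, int yield_each_write)
{
	int live = 1;	/* the currently attached ithread */
	int dying = 0;	/* torn-down ithreads that have not exited yet */
	int i;

	for (i = 0; i < nwrites; i++) {
		/* Teardown: the old ithread is asked to exit, lazily. */
		dying++;
		/* Setup: a new ithread replaces it, so live stays at 1. */
		if (live + dying > maxproc)
			return (i);	/* table full: this is the panic */
		/* Yielding gives the dying ithreads a chance to run. */
		if (yield_each_write)
			dying = 0;
	}
	return (nwrites);
}
```

In this model, a small maxproc without yielding overflows after a handful
of writes, while yielding after every write never overflows.  That matches
the advice above to use a small kern.maxproc to see the bug quickly, but
the real scheduler behaviour is of course more involved.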

Re: Anybody else seeing a broken /dev/lpt with SMP on -current?

2001-01-15 Thread Thomas D. Dean

Hi Bruce,

I applied the patch to dev/ppbus/lpt.c and sys/i386/isa/intr_machdep.c.

Before the patch, I got the lpt failure almost immediately.
  df | lpr
  df | lpr
  df | lpr
  lpr .cshrc

would normally do it.

After the patch, it took lots more activity.  I did the above a
half-dozen times, successfully, and then:

  foreach i ( 1 2 3 4 5 6 7 8 9 a b c )
    df | lpr
  end
  printf "\f" | lpr

and this failed, leaving 4 sets of df output on a page still waiting to be
ejected from the printer.

tomdean


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



RE: Anybody else seeing a broken /dev/lpt with SMP on -current?

2001-01-15 Thread John Baldwin


On 14-Jan-01 Garance A Drosihn wrote:
> At 6:55 PM -0800 1/12/01, John Baldwin wrote:
> > On 13-Jan-01 Jordan Hubbard wrote:
> > > If anybody wants a fuller traceback then I'll compile up a kernel
> > > with debugging symbols, but it's going to be pretty sparse anyway
> > > since it basically only shows the trap() from the page fault and
> > > the subsequent panic.
> >
> > All the other traces show the kernel having returned to an address
> > that is beyond the end of the kernel (which causes the page fault)
> > meaning that the stack is fubar'd, so the trace isn't meaningful
> > anyways. :(  Knowing how and why the lpd interrupt handler trashes
> > the stack is the useful info, and with the stack already trashed,
> > I don't know of an easy way to figure that out.
>
> Do you really mean the "lpd interrupt handler", or do you mean
> the "lpt interrupt handler"?  Does this problem only happen when
> lpd is sending data thru /dev/lpt?

lpt interrupt handler, yes.

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/





RE: Anybody else seeing a broken /dev/lpt with SMP on -current?

2001-01-13 Thread Garance A Drosihn

At 6:55 PM -0800 1/12/01, John Baldwin wrote:
> On 13-Jan-01 Jordan Hubbard wrote:
> > If anybody wants a fuller traceback then I'll compile up a kernel
> > with debugging symbols, but it's going to be pretty sparse anyway
> > since it basically only shows the trap() from the page fault and
> > the subsequent panic.
>
> All the other traces show the kernel having returned to an address
> that is beyond the end of the kernel (which causes the page fault)
> meaning that the stack is fubar'd, so the trace isn't meaningful
> anyways. :(  Knowing how and why the lpd interrupt handler trashes
> the stack is the useful info, and with the stack already trashed,
> I don't know of an easy way to figure that out.

Do you really mean the "lpd interrupt handler", or do you mean
the "lpt interrupt handler"?  Does this problem only happen when
lpd is sending data thru /dev/lpt?
-- 
Garance Alistair Drosehn            =   [EMAIL PROTECTED]
Senior Systems Programmer           or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute    or  [EMAIL PROTECTED]





RE: Anybody else seeing a broken /dev/lpt with SMP on -current?

2001-01-12 Thread John Baldwin


On 13-Jan-01 Jordan Hubbard wrote:
> I've actually been seeing this for about 2 months now but only just
> now got motivated enough to enable crashdumps and get some information
> on what happens whenever I try to use the printer attached to my (sadly :)
> -current SMP box:
>
> IdlePTD 3682304
> initial pcb at 2e70e0
> panicstr: page fault
> panic messages:
> ---
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; lapic.id = 
> fault virtual address   = 0x8640
> fault code              = supervisor write, page not present
> instruction pointer     = 0x8:0xc8dc8676
> stack pointer           = 0x10:0xc8280f88
> frame pointer           = 0x10:0xc8280f9c
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 12322 (irq7: lpt0)
> trap number             = 12
> panic: page fault
> cpuid = 0; lapic.id = 
> boot() called on cpu#0
>
> If anybody wants a fuller traceback then I'll compile up a kernel with
> debugging symbols, but it's going to be pretty sparse anyway since it
> basically only shows the trap() from the page fault and the subsequent
> panic.

All the other traces show the kernel having returned to an address that is
beyond the end of the kernel (which causes the page fault) meaning that the
stack is fubar'd, so the trace isn't meaningful anyways. :(  Knowing how and
why the lpd interrupt handler trashes the stack is the useful info, and with
the stack already trashed, I don't know of an easy way to figure that out.
Suggestions welcome.

> - Jordan

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/





Re: Anybody else seeing a broken /dev/lpt with SMP on -current?

2001-01-12 Thread Alfred Perlstein

* John Baldwin [EMAIL PROTECTED] [010112 18:56] wrote:
 
> On 13-Jan-01 Jordan Hubbard wrote:
> > I've actually been seeing this for about 2 months now but only just
> > now got motivated enough to enable crashdumps and get some information
> > on what happens whenever I try to use the printer attached to my (sadly :)
> > -current SMP box:
> >
>
> All the other traces show the kernel having returned to an address that is
> beyond the end of the kernel (which causes the page fault) meaning that the
> stack is fubar'd, so the trace isn't meaningful anyways. :(  Knowing how and
> why the lpd interrupt handler trashes the stack is the useful info, and with
> the stack already trashed, I don't know of an easy way to figure that out.
> Suggestions welcome.

printf(9)

:)

-Alfred





Re: Anybody else seeing a broken /dev/lpt with SMP on -current?

2001-01-12 Thread John Baldwin


On 13-Jan-01 Alfred Perlstein wrote:
> * John Baldwin [EMAIL PROTECTED] [010112 18:56] wrote:
> >
> > On 13-Jan-01 Jordan Hubbard wrote:
> > > I've actually been seeing this for about 2 months now but only just
> > > now got motivated enough to enable crashdumps and get some information
> > > on what happens whenever I try to use the printer attached to my (sadly :)
> > > -current SMP box:
> > >
> >
> > All the other traces show the kernel having returned to an address that is
> > beyond the end of the kernel (which causes the page fault) meaning that the
> > stack is fubar'd, so the trace isn't meaningful anyways. :(  Knowing how and
> > why the lpd interrupt handler trashes the stack is the useful info, and with
> > the stack already trashed, I don't know of an easy way to figure that out.
> > Suggestions welcome.
>
> printf(9)
>
> :)

Maybe if I had a printer lying around. :)  I can send jkh some patches to dump
out stuff, but I was looking more for suggestions on making sense of the
crashdump, not just brute-forcing it. :-P

> -Alfred

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

