SMP system hangs on current, not stable

2002-01-02 Thread Pete Carah

I have a system using a fairly new Supermicro MB, with 2 P3-1GHZ, and 512mb
ram.  Running stable works fine at least a day or so with LOTS of activity.
Running current it hangs (with no output of any kind, and apparently all 
interrupts disabled) so DDB does me no good...  This requires a fair amount 
of activity (it will usually hang during make -j3 world with 2 copies of
setiathome -nice 19).  Time to hang varies from a half-hour to a couple
of days; hardly ever longer.

Maybe I need an NMI button (or does that work?)

This does not appear to be the procfs thing that Matt has commented on 
(it still occurs after his patch, and occurs without the use of top
or any other procfs reader that I know of).

Dmesg on both current and stable follows, in case it is useful:
--
Current (with verbose):
---
Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #0: Mon Dec 31 10:47:25 PST 2001
[EMAIL PROTECTED]:/usr/src/sys/i386/compile/GOONEY
Preloaded elf kernel /boot/kernel/kernel at 0xc040b000.
Preloaded elf module /boot/kernel/acpi.ko at 0xc040b0a8.
Calibrating clock(s) ... TSC clock: 999455711 Hz, i8254 clock: 1193107 Hz
CLK_USE_I8254_CALIBRATION not specified - using default frequency
Timecounter i8254  frequency 1193182 Hz
CLK_USE_TSC_CALIBRATION not specified - using old calibration method
CPU: Pentium III/Pentium III Xeon/Celeron (999.52-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x68a  Stepping = 10
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 268369920 (262080K bytes)
Physical memory chunk(s):
0x1000 - 0x0009efff, 647168 bytes (158 pages)
0x00435000 - 0x0ffe7fff, 263925760 bytes (64435 pages)
avail memory = 256704512 (250688K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
SMP: CPU0 apic_initialize():
 lint0: 0x0700 lint1: 0x00010400 TPR: 0x0010 SVR: 0x01ff
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00178011, at 0xfec0
bios32: Found BIOS32 Service Directory header at 0xc00faf10
bios32: Entry = 0xfb380 (c00fb380)  Rev = 0  Len = 1
pcibios: PCI BIOS entry at 0xf+0xb3b0
pnpbios: Found PnP BIOS data at 0xc00fbe00
pnpbios: Entry = f:be30  Rev = 1.0
Other BIOS signatures found:
null: <null device, zero device>
random: <entropy source>
mem: <memory & I/O>
Pentium Pro MTRR support enabled
SMP: CPU0 bsp_apic_configure():
 lint0: 0x00010700 lint1: 0x0400 TPR: 0x0010 SVR: 0x01ff
pci_open(1):mode 1 addr port (0x0cf8) is 0x8060
pci_open(1a):   mode1res=0x8000 (0x8000)
pci_cfgcheck:   device 0 [class=06] [hdr=00] is there (id=30911106)
Using $PIR table, 8 entries at 0xc00fdc20
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <VIA694 AWRDACPI> on motherboard
acpi0: power button is handled as a fixed feature programming model.
Timecounter ACPI  frequency 3579545 Hz
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
acpi_cpu0: <CPU> on acpi0
acpi_cpu1: <CPU> on acpi0
acpi_tz0: <thermal zone> on acpi0
acpi_button0: <Power Button> on acpi0
acpi_pcib0: <Host-PCI bridge> port 0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0: physical bus=0
map[10]: type 3, range 32, base f000, size 26, enabled
found-> vendor=0x1106, dev=0x3091, revid=0x01
bus=0, slot=0, func=0
class=06-00-00, hdrtype=0x00, mfdev=0
powerspec 2  supports D0 D3  current D0
found-> vendor=0x1106, dev=0xb091, revid=0x00
bus=0, slot=1, func=0
class=06-04-00, hdrtype=0x01, mfdev=0
IOAPIC #0 intpin 11 - irq 2
Freeing (NOT implemented) redirected PCI irq 11.
map[10]: type 4, range 32, base c000, size  3, enabled
map[14]: type 4, range 32, base c400, size  2, enabled
map[18]: type 4, range 32, base c800, size  3, enabled
map[1c]: type 4, range 32, base cc00, size  2, enabled
map[20]: type 4, range 32, base d000, size  6, enabled
map[24]: type 1, range 32, base f910, size 17, enabled
found-> vendor=0x105a, dev=0x4d30, revid=0x02
bus=0, slot=12, func=0
class=01-04-00, hdrtype=0x00, mfdev=0
intpin=a, irq=2
powerspec 1  supports D0 D3  current D0
map[10]: type 1, range 32, base f912, size 12, enabled
map[14]: type 4, range 32, base d400, size  6, enabled
map[18]: type 1, range 32, base f900, size 20, enabled
found-> vendor=0x8086, dev=0x1229, revid=0x08
bus=0, slot=13, func=0
class=02-00-00, hdrtype=0x00, mfdev=0
intpin=a, irq=5
powerspec 2  supports D0 D1 D2 D3  current D0
found- 

Re: SMP system hangs on current, not stable

2002-01-02 Thread Oliver Fromme

Pete Carah [EMAIL PROTECTED] wrote:
  Maybe I need an NMI button (or does that work?)

You can generate NMIs by shorting the first two pins of
an ISA slot with a screwdriver (the two pins close to the
back where the ISA slot covers are).  This can also be done
with PCI slots, if that board doesn't have an ISA slot
anymore, but I don't know which pins (it's _not_ the first
two pins), and it's a lot more difficult because the PCI
pins are much smaller.

Disclaimer:  Don't sue me if you toast your board.  :-)
Do it at your own risk.  Read the docs first.  Check the
pin assignment.  Make your last will and testament first,
etc.

Regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

All that we see or seem is just a dream within a dream (E. A. Poe)

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: SMP system hangs on current, not stable

2002-01-02 Thread Matthew Dillon

:I have a system using a fairly new Supermicro MB, with 2 P3-1GHZ, and 512mb
:ram.  Running stable works fine at least a day or so with LOTS of activity.
:Running current it hangs (with no output of any kind, and apparently all 
:interrupts disabled) so DDB does me no good...  This requires a fair amount 
:of activity (usually will hang in make -j3 world with 2 copies of 
:setiathome -nice 19)  Time to hang varies from a half-hour to a couple 
:of days; hardly ever longer.
:
:Maybe I need an NMI button (or does that work?)

This could be a priority inversion issue.  Try running setiathome at
nice -10 (or not running it at all), and see if you can still crash
the box.

:acpi0: <VIA694 AWRDACPI> on motherboard
:acpi0: power button is handled as a fixed feature programming model.
:Timecounter ACPI  frequency 3579545 Hz
:acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
:acpi_cpu0: <CPU> on acpi0
:acpi_cpu1: <CPU> on acpi0
:acpi_tz0: <thermal zone> on acpi0
:acpi_button0: <Power Button> on acpi0
:acpi_pcib0: <Host-PCI bridge> port 0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0

Try turning off ACPI.

-Matt





RE: System hangs with -current ...

2001-03-02 Thread The Hermit Hacker

On Thu, 1 Mar 2001, John Baldwin wrote:


 On 01-Mar-01 The Hermit Hacker wrote:
 
  any comments on this?  any way of doing this without a serial console?
 
  thanks ...

 The data is too much to make a normal console feasible, although you
 could try cranking up the console to the highest res (80x60 or 132x60,
 etc.) you can and let it freeze and then write down those 60 lines and
 maybe that will be enough to figure it out.  However, if it's looping
 this won't work. :( I've no idea atm why the serial console isn't
 working for you.

Inability to actually find a NULL modem cable, actually :(  Checked two
local shops, and neither of them carry one ... just hijacked one from work
for the weekend, so will hit this tonight and report anything I can come
up with ...

On Wed, 28 Feb 2001, The Hermit Hacker wrote:
 
 
  Yup, definitely doesn't like me using the console ... just tried it again,
  and it's as if it can't scroll up the screen to send more data or
  something?
 
  I just rebooted, and then ssh'd in from remote ... type'd the two sysctl
  commands, and got:
 
  cpu1 ../../i386/i386/trap.c.181 GOT (spin) sched lock [0xc0320f20] r=0 at
  ../../i386/i386/trap.c:181
  cpcsocp/../i386/i386/trap.c.217 REL (spin) sched l
 
  on my screen ... type'd exactly as seen ... and that's it ... console is
  now locked again ...
 
  On Tue, 27 Feb 2001, The Hermit Hacker wrote:
 
  
   Okay, can't seem to find a 9pin-9pin NULL modem cable in this 'pit of the
   earth' town, so figured I'd do the sysctl commands on my console and use
   an ssh connection into the machine to run the 'hanging sequence' ... the
   console flashed a bunch of 'debugging info' and then hung solid ... I
   could still login remotely and whatnot, type commands, just nothing was
   happening on the console, couldn't change vty's, nothing ...
  
   is it supposed to do that? *raised eyebrow*
  
   On Thu, 22 Feb 2001, John Baldwin wrote:
  
   
On 23-Feb-01 The Hermit Hacker wrote:
 On Thu, 22 Feb 2001, John Baldwin wrote:


 On 22-Feb-01 The Hermit Hacker wrote:
 
  Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
  finally ...
 
  The various KTR_ that you mention below, these are kernel settings that I
  compile into the kernel?

 Yes.  You want this:

 options KTR
 options KTR_EXTEND
 options KTR_COMPILE=0x1208

 okay, just so that I understand ... I compile my kernel with these
 options, and then run the two sysctl commands you list below?  the
 KTR_COMPILE arg looks similar to the ktr_mask one below, which is why I'm
 confirming ...
   
Yes. KTR_COMPILE controls what KTR tracepoints are actually compiled into
the kernel.  The ktr_mask sysctl controls a runtime mask that lets you choose
which of the compiled in masks you want to enable.  I have manpages for this
stuff, but they are waiting for doc guys to review them.
   
 The mtx_quiet.patch is old and won't apply to current now I'm afraid.

  On Tue, 2 Jan 2001, John Baldwin wrote:
 
 
  On 02-Jan-01 The Hermit Hacker wrote:
  
   Over the past several months, as others have reported, I've been getting
   system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
   ctl-alt-esc doesn't break me to the debugger ...
  
   I'm not complaining about the hangs, if I was overly concerned, I'd run
   -STABLE, but I'm wondering how one goes about providing debug information
   on them other than through DDB?
 
  Not easily. :(  If you can make the problem easily repeatable, then you can try
  turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
  a serial console that you log the output of, create a shell script that runs
  the following commands:
 
  #!/bin/sh
 
  # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
  sysctl -w debug.ktr_mask=0x1208
  sysctl -w debug.ktr_verbose=2
 
  run_magic_command_that_hangs_my_machine
 
  and run the script.  You probably want to run it over a tty or remote login so
  that the serial console output is just the logging (warning, it will be very
  verbose!).  Also, you probably want to use
  http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
  irrelevant and cluttery mutex trace messages.  Note that having this much
  logging on will probably slow the machine to a crawl as well, so you may have
  to just start this up and go off and do something else until it hangs. :-/
  Another alternative is to rig up an NMI debouncer and use it to break into the
  debugger.  Then you can start poking around to see who owns

RE: System hangs with -current ...

2001-03-01 Thread The Hermit Hacker


any comments on this?  any way of doing this without a serial console?

thanks ...

On Wed, 28 Feb 2001, The Hermit Hacker wrote:


 Yup, definitely doesn't like me using the console ... just tried it again,
 and it's as if it can't scroll up the screen to send more data or
 something?

 I just rebooted, and then ssh'd in from remote ... type'd the two sysctl
 commands, and got:

 cpu1 ../../i386/i386/trap.c.181 GOT (spin) sched lock [0xc0320f20] r=0 at 
../../i386/i386/trap.c:181
 cpcsocp/../i386/i386/trap.c.217 REL (spin) sched l

 on my screen ... type'd exactly as seen ... and that's it ... console is
 now locked again ...

 On Tue, 27 Feb 2001, The Hermit Hacker wrote:

 
  Okay, can't seem to find a 9pin-9pin NULL modem cable in this 'pit of the
  earth' town, so figured I'd do the sysctl commands on my console and use
  an ssh connection into the machine to run the 'hanging sequence' ... the
  console flashed a bunch of 'debugging info' and then hung solid ... I
  could still login remotely and whatnot, type commands, just nothing was
  happening on the console, couldn't change vty's, nothing ...
 
  is it supposed to do that? *raised eyebrow*
 
  On Thu, 22 Feb 2001, John Baldwin wrote:
 
  
   On 23-Feb-01 The Hermit Hacker wrote:
On Thu, 22 Feb 2001, John Baldwin wrote:
   
   
On 22-Feb-01 The Hermit Hacker wrote:

 Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
 finally ...

 The various KTR_ that you mention below, these are kernel settings that I
 compile into the kernel?
   
Yes.  You want this:
   
options KTR
options KTR_EXTEND
options KTR_COMPILE=0x1208
   
okay, just so that I understand ... I compile my kernel with these
options, and then run the two sysctl commands you list below?  the
KTR_COMPILE arg looks similar to the ktr_mask one below, which is why I'm
confirming ...
  
   Yes. KTR_COMPILE controls what KTR tracepoints are actually compiled into
   the kernel.  The ktr_mask sysctl controls a runtime mask that lets you choose
   which of the compiled in masks you want to enable.  I have manpages for this
   stuff, but they are waiting for doc guys to review them.
  
The mtx_quiet.patch is old and won't apply to current now I'm afraid.
   
 On Tue, 2 Jan 2001, John Baldwin wrote:


 On 02-Jan-01 The Hermit Hacker wrote:
 
  Over the past several months, as others have reported, I've been getting
  system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
  ctl-alt-esc doesn't break me to the debugger ...
 
  I'm not complaining about the hangs, if I was overly concerned, I'd run
  -STABLE, but I'm wondering how one goes about providing debug information
  on them other than through DDB?

 Not easily. :(  If you can make the problem easily repeatable, then you can try
 turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
 a serial console that you log the output of, create a shell script that runs
 the following commands:

 #!/bin/sh

 # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
 sysctl -w debug.ktr_mask=0x1208
 sysctl -w debug.ktr_verbose=2

 run_magic_command_that_hangs_my_machine

 and run the script.  You probably want to run it over a tty or remote login so
 that the serial console output is just the logging (warning, it will be very
 verbose!).  Also, you probably want to use
 http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
 irrelevant and cluttery mutex trace messages.  Note that having this much
 logging on will probably slow the machine to a crawl as well, so you may have
 to just start this up and go off and do something else until it hangs. :-/
 Another alternative is to rig up an NMI debouncer and use it to break into the
 debugger.  Then you can start poking around to see who owns sched_lock, etc.

  Thanks ...
  
   --
  
   John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
   PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
   "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
  
 
  Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
  Systems Administrator @ hub.org
  primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org
 
 
 

 Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
 Systems Administrator @ hub.org
 primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org





Marc G. 

RE: System hangs with -current ...

2001-03-01 Thread John Baldwin


On 01-Mar-01 The Hermit Hacker wrote:
 
 any comments on this?  any way of doing this without a serial console?
 
 thanks ...

The data is too much to make a normal console feasible, although you could try
cranking up the console to the highest res (80x60 or 132x60, etc.) you can and
let it freeze and then write down those 60 lines and maybe that will be enough
to figure it out.  However, if it's looping this won't work. :(  I've no idea
atm why the serial console isn't working for you.
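
On a syscons console, the larger text modes mentioned above can be tried
with vidcontrol(1).  What follows is only a minimal sketch, assuming the
video hardware and loaded drivers actually offer these modes (132x60 may
need VESA support).  The VIDCONTROL override and the function name are
hypothetical, added so the loop can be dry-run without touching a real
console.

```sh
#!/bin/sh
# Hypothetical override so the loop can be exercised without a console,
# e.g. VIDCONTROL=true for a dry run.
VIDCONTROL=${VIDCONTROL:-vidcontrol}

# Try the tallest modes named in the thread, widest first.
# Prints the first mode that vidcontrol accepts; fails if none is accepted.
set_big_console() {
    for mode in 132x60 80x60; do
        if "$VIDCONTROL" "$mode" 2>/dev/null; then
            echo "$mode"
            return 0
        fi
    done
    return 1
}
```

Calling set_big_console from the console itself before starting the
workload leaves up to 60 lines of trace on screen when the machine freezes.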

 On Wed, 28 Feb 2001, The Hermit Hacker wrote:
 

 Yup, definitely doesn't like me using the console ... just tried it again,
 and it's as if it can't scroll up the screen to send more data or
 something?

 I just rebooted, and then ssh'd in from remote ... type'd the two sysctl
 commands, and got:

 cpu1 ../../i386/i386/trap.c.181 GOT (spin) sched lock [0xc0320f20] r=0 at
 ../../i386/i386/trap.c:181
 cpcsocp/../i386/i386/trap.c.217 REL (spin) sched l

 on my screen ... type'd exactly as seen ... and that's it ... console is
 now locked again ...

 On Tue, 27 Feb 2001, The Hermit Hacker wrote:

 
  Okay, can't seem to find a 9pin-9pin NULL modem cable in this 'pit of the
  earth' town, so figured I'd do the sysctl commands on my console and use
  an ssh connection into the machine to run the 'hanging sequence' ... the
  console flashed a bunch of 'debugging info' and then hung solid ... I
  could still login remotely and whatnot, type commands, just nothing was
  happening on the console, couldn't change vty's, nothing ...
 
  is it supposed to do that? *raised eyebrow*
 
  On Thu, 22 Feb 2001, John Baldwin wrote:
 
  
   On 23-Feb-01 The Hermit Hacker wrote:
On Thu, 22 Feb 2001, John Baldwin wrote:
   
   
On 22-Feb-01 The Hermit Hacker wrote:

 Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
 finally ...

 The various KTR_ that you mention below, these are kernel settings that I
 compile into the kernel?
   
Yes.  You want this:
   
options KTR
options KTR_EXTEND
options KTR_COMPILE=0x1208
   
okay, just so that I understand ... I compile my kernel with these
options, and then run the two sysctl commands you list below?  the
KTR_COMPILE arg looks similar to the ktr_mask one below, which is why I'm
confirming ...
  
   Yes. KTR_COMPILE controls what KTR tracepoints are actually compiled into
   the kernel.  The ktr_mask sysctl controls a runtime mask that lets you choose
   which of the compiled in masks you want to enable.  I have manpages for this
   stuff, but they are waiting for doc guys to review them.
  
The mtx_quiet.patch is old and won't apply to current now I'm afraid.
   
 On Tue, 2 Jan 2001, John Baldwin wrote:


 On 02-Jan-01 The Hermit Hacker wrote:
 
  Over the past several months, as others have reported, I've been getting
  system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
  ctl-alt-esc doesn't break me to the debugger ...
 
  I'm not complaining about the hangs, if I was overly concerned, I'd run
  -STABLE, but I'm wondering how one goes about providing debug information
  on them other than through DDB?

 Not easily. :(  If you can make the problem easily repeatable, then you can try
 turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
 a serial console that you log the output of, create a shell script that runs
 the following commands:

 #!/bin/sh

 # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
 sysctl -w debug.ktr_mask=0x1208
 sysctl -w debug.ktr_verbose=2

 run_magic_command_that_hangs_my_machine

 and run the script.  You probably want to run it over a tty or remote login so
 that the serial console output is just the logging (warning, it will be very
 verbose!).  Also, you probably want to use
 http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
 irrelevant and cluttery mutex trace messages.  Note that having this much
 logging on will probably slow the machine to a crawl as well, so you may have
 to just start this up and go off and do something else until it hangs. :-/
 Another alternative is to rig up an NMI debouncer and use it to break into the
 debugger.  Then you can start poking around to see who owns sched_lock, etc.

  Thanks ...
  
   --
  
   John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
   PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
   "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
  
 
  Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
  Systems Administrator @ hub.org
  primary: [EMAIL PROTECTED]  

RE: System hangs with -current ...

2001-02-27 Thread The Hermit Hacker


Okay, can't seem to find a 9pin-9pin NULL modem cable in this 'pit of the
earth' town, so figured I'd do the sysctl commands on my console and use
an ssh connection into the machine to run the 'hanging sequence' ... the
console flashed a bunch of 'debugging info' and then hung solid ... I
could still login remotely and whatnot, type commands, just nothing was
happening on the console, couldn't change vty's, nothing ...

is it supposed to do that? *raised eyebrow*

On Thu, 22 Feb 2001, John Baldwin wrote:


 On 23-Feb-01 The Hermit Hacker wrote:
  On Thu, 22 Feb 2001, John Baldwin wrote:
 
 
  On 22-Feb-01 The Hermit Hacker wrote:
  
   Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
   finally ...
  
   The various KTR_ that you mention below, these are kernel settings that I
   compile into the kernel?
 
  Yes.  You want this:
 
  options KTR
  options KTR_EXTEND
  options KTR_COMPILE=0x1208
 
  okay, just so that I understand ... I compile my kernel with these
  options, and then run the two sysctl commands you list below?  the
  KTR_COMPILE arg looks similar to the ktr_mask one below, which is why I'm
  confirming ...

 Yes. KTR_COMPILE controls what KTR tracepoints are actually compiled into
 the kernel.  The ktr_mask sysctl controls a runtime mask that lets you choose
 which of the compiled in masks you want to enable.  I have manpages for this
 stuff, but they are waiting for doc guys to review them.

  The mtx_quiet.patch is old and won't apply to current now I'm afraid.
 
   On Tue, 2 Jan 2001, John Baldwin wrote:
  
  
   On 02-Jan-01 The Hermit Hacker wrote:
   
Over the past several months, as others have reported, I've been getting
system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
ctl-alt-esc doesn't break me to the debugger ...
   
I'm not complaining about the hangs, if I was overly concerned, I'd run
-STABLE, but I'm wondering how one goes about providing debug information
on them other than through DDB?
  
   Not easily. :(  If you can make the problem easily repeatable, then you can try
   turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
   a serial console that you log the output of, create a shell script that runs
   the following commands:
  
   #!/bin/sh
  
   # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
   sysctl -w debug.ktr_mask=0x1208
   sysctl -w debug.ktr_verbose=2
  
   run_magic_command_that_hangs_my_machine
  
   and run the script.  You probably want to run it over a tty or remote login so
   that the serial console output is just the logging (warning, it will be very
   verbose!).  Also, you probably want to use
   http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
   irrelevant and cluttery mutex trace messages.  Note that having this much
   logging on will probably slow the machine to a crawl as well, so you may have
   to just start this up and go off and do something else until it hangs. :-/
   Another alternative is to rig up an NMI debouncer and use it to break into the
   debugger.  Then you can start poking around to see who owns sched_lock, etc.
  
Thanks ...

 --

 John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
 PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
 "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org





RE: System hangs with -current ...

2001-02-27 Thread The Hermit Hacker


Yup, definitely doesn't like me using the console ... just tried it again,
and it's as if it can't scroll up the screen to send more data or
something?

I just rebooted, and then ssh'd in from remote ... type'd the two sysctl
commands, and got:

cpu1 ../../i386/i386/trap.c.181 GOT (spin) sched lock [0xc0320f20] r=0 at 
../../i386/i386/trap.c:181
cpcsocp/../i386/i386/trap.c.217 REL (spin) sched l

on my screen ... type'd exactly as seen ... and that's it ... console is
now locked again ...

On Tue, 27 Feb 2001, The Hermit Hacker wrote:


 Okay, can't seem to find a 9pin-9pin NULL modem cable in this 'pit of the
 earth' town, so figured I'd do the sysctl commands on my console and use
 an ssh connection into the machine to run the 'hanging sequence' ... the
 console flashed a bunch of 'debugging info' and then hung solid ... I
 could still login remotely and whatnot, type commands, just nothing was
 happening on the console, couldn't change vty's, nothing ...

 is it supposed to do that? *raised eyebrow*

 On Thu, 22 Feb 2001, John Baldwin wrote:

 
  On 23-Feb-01 The Hermit Hacker wrote:
   On Thu, 22 Feb 2001, John Baldwin wrote:
  
  
   On 22-Feb-01 The Hermit Hacker wrote:
   
Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
finally ...
   
The various KTR_ that you mention below, these are kernel settings that I
compile into the kernel?
  
   Yes.  You want this:
  
   options KTR
   options KTR_EXTEND
   options KTR_COMPILE=0x1208
  
   okay, just so that I understand ... I compile my kernel with these
   options, and then run the two sysctl commands you list below?  the
   KTR_COMPILE arg looks similar to the ktr_mask one below, which is why I'm
   confirming ...
 
  Yes. KTR_COMPILE controls what KTR tracepoints are actually compiled into
  the kernel.  The ktr_mask sysctl controls a runtime mask that lets you choose
  which of the compiled in masks you want to enable.  I have manpages for this
  stuff, but they are waiting for doc guys to review them.
 
   The mtx_quiet.patch is old and won't apply to current now I'm afraid.
  
On Tue, 2 Jan 2001, John Baldwin wrote:
   
   
On 02-Jan-01 The Hermit Hacker wrote:

 Over the past several months, as others have reported, I've been getting
 system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
 ctl-alt-esc doesn't break me to the debugger ...

 I'm not complaining about the hangs, if I was overly concerned, I'd run
 -STABLE, but I'm wondering how one goes about providing debug information
 on them other than through DDB?
   
Not easily. :(  If you can make the problem easily repeatable, then you can try
turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
a serial console that you log the output of, create a shell script that runs
the following commands:
   
#!/bin/sh
   
# Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
sysctl -w debug.ktr_mask=0x1208
sysctl -w debug.ktr_verbose=2
   
run_magic_command_that_hangs_my_machine
   
and run the script.  You probably want to run it over a tty or remote login so
that the serial console output is just the logging (warning, it will be very
verbose!).  Also, you probably want to use
http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
irrelevant and cluttery mutex trace messages.  Note that having this much
logging on will probably slow the machine to a crawl as well, so you may have
to just start this up and go off and do something else until it hangs. :-/
Another alternative is to rig up an NMI debouncer and use it to break into the
debugger.  Then you can start poking around to see who owns sched_lock, etc.
   
 Thanks ...
 
  --
 
  John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
  PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
  "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
 

 Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
 Systems Administrator @ hub.org
 primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org




Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org






RE: System hangs with -current ...

2001-02-22 Thread The Hermit Hacker


Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
finally ...

The various KTR_ that you mention below, these are kernel settings that I
compile into the kernel?

On Tue, 2 Jan 2001, John Baldwin wrote:


 On 02-Jan-01 The Hermit Hacker wrote:
 
  Over the past several months, as others have reported, I've been getting
  system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
  ctl-alt-esc doesn't break me to the debugger ...
 
  I'm not complaining about the hangs, if I was overly concerned, I'd run
  -STABLE, but I'm wondering how one goes about providing debug information
  on them other than through DDB?

 Not easily. :(  If you can make the problem easily repeatable, then you can try
 turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
 a serial console that you log the output of, create a shell script that runs
 the following commands:

 #!/bin/sh

 # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
 sysctl -w debug.ktr_mask=0x1208
 sysctl -w debug.ktr_verbose=2

 run_magic_command_that_hangs_my_machine

 and run the script.  You probably want to run it over a tty or remote login so
 that the serial console output is just the logging (warning, it will be very
 verbose!).  Also, you probably want to use
 http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
 irrelevant and cluttery mutex trace messages.  Note that having this much
 logging on will probably slow the machine to a crawl as well, so you may have
 to just start this up and go off and do something else until it hangs. :-/
 Another alternative is to rig up an NMI debouncer and use it to break into the
 debugger.  Then you can start poking around to see who owns sched_lock, etc.
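
The script quoted above could be written so that the workload is passed on
the command line rather than edited in.  This is only a sketch of that idea,
not John's script: the SYSCTL and KTR_MASK overrides and the ktr_enable name
are hypothetical, and the two sysctl names are simply the ones from the
thread.

```sh
#!/bin/sh
# Hypothetical overrides: SYSCTL=echo gives a dry run, KTR_MASK picks
# a different set of trace classes.
SYSCTL=${SYSCTL:-sysctl}
KTR_MASK=${KTR_MASK:-0x1208}

# Enable the KTR classes and verbose output, as in the thread's script.
ktr_enable() {
    "$SYSCTL" -w debug.ktr_mask="$KTR_MASK" &&
    "$SYSCTL" -w debug.ktr_verbose=2
}

# Only act when a workload is given, e.g.:
#   ./ktr-hang.sh make -j3 world
if [ "$#" -gt 0 ]; then
    ktr_enable && exec "$@"
fi
```

Run it from a remote login so that the serial console carries nothing but
the trace output.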

  Thanks ...

 --

 John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
 PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
 "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org





RE: System hangs with -current ...

2001-02-22 Thread John Baldwin


On 22-Feb-01 The Hermit Hacker wrote:
 
 Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
 finally ...
 
 The various KTR_ that you mention below, these are kernel settings that I
 compile into the kernel?

Yes.  You want this:

options KTR
options KTR_EXTEND
options KTR_COMPILE=0x1208
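
As a rough cross-check of the 0x1208 value: the script quoted further down
describes it as KTR_INTR, KTR_PROC, and KTR_LOCK.  The sketch below rebuilds
the mask from individual bits.  The three bit values are an assumption
recalled from sys/ktr.h of that era and are not stated in this thread, so
verify them against your own tree.

```sh
#!/bin/sh
# Assumed bit values (not given in this thread); check sys/ktr.h.
KTR_LOCK=0x0008
KTR_INTR=0x0200
KTR_PROC=0x1000

# OR the classes together to get the value used for KTR_COMPILE
# and debug.ktr_mask.
mask=$(( $KTR_LOCK | $KTR_INTR | $KTR_PROC ))
printf '0x%04x\n' "$mask"

# A class is traced only if it is both compiled in and enabled at
# runtime, so the effective mask is the AND of the two.
compiled=0x1208
runtime=$KTR_INTR
effective=$(( $compiled & $runtime ))
printf '0x%04x\n' "$effective"
```

With these assumed values the first line printed is 0x1208, and a runtime
mask of KTR_INTR alone leaves only interrupt tracing active.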

The mtx_quiet.patch is old and won't apply to current now I'm afraid.

 On Tue, 2 Jan 2001, John Baldwin wrote:
 

 On 02-Jan-01 The Hermit Hacker wrote:
 
  Over the past several months, as others have reported, I've been getting
  system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
  ctl-alt-esc doesn't break me to the debugger ...
 
  I'm not complaining about the hangs, if I was overly concerned, I'd run
  -STABLE, but I'm wondering how one goes about providing debug information
  on them other than through DDB?

 Not easily. :(  If you can make the problem easily repeatable, then you can try
 turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
 a serial console whose output you log, and creating a shell script that runs
 the following commands:

 #!/bin/sh

 # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
 sysctl -w debug.ktr_mask=0x1208
 sysctl -w debug.ktr_verbose=2

 run_magic_command_that_hangs_my_machine

 and run the script.  You probably want to run it over a tty or remote login so
 that the serial console output is just the logging (warning, it will be very
 verbose!).  Also, you probably want to use
 http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
 irrelevant and cluttery mutex trace messages.  Note that having this much
 logging on will probably slow the machine to a crawl as well, so you may have
 to just start this up and go off and do something else until it hangs. :-/
 Another alternative is to rig up an NMI debouncer and use it to break into the
 debugger.  Then you can start poking around to see who owns sched_lock, etc.

  Thanks ...

 --

 John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
 PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
 "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

 
 Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
 Systems Administrator @ hub.org
 primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org
 

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




RE: System hangs with -current ...

2001-02-22 Thread The Hermit Hacker

On Thu, 22 Feb 2001, John Baldwin wrote:


 On 22-Feb-01 The Hermit Hacker wrote:
 
  Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
  finally ...
 
  The various KTR_ that you mention below, these are kernel settings that I
  compile into the kernel?

 Yes.  You want this:

 options KTR
 options KTR_EXTEND
 options KTR_COMPILE=0x1208

okay, just so that I understand ... I compile my kernel with these
options, and then run the two sysctl commands you list below?  the
KTR_COMPILE arg looks similar to the ktr_mask one below, which is why I'm
confirming ...


 The mtx_quiet.patch is old and won't apply to current now I'm afraid.

  On Tue, 2 Jan 2001, John Baldwin wrote:
 
 
  On 02-Jan-01 The Hermit Hacker wrote:
  
   Over the past several months, as others have reported, I've been getting
   system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
   ctl-alt-esc doesn't break me to the debugger ...
  
   I'm not complaining about the hangs, if I was overly concerned, I'd run
   -STABLE, but I'm wondering how one goes about providing debug information
   on them other than through DDB?
 
  Not easily. :(  If you can make the problem easily repeatable, then you can try
  turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
  a serial console whose output you log, and creating a shell script that runs
  the following commands:
 
  #!/bin/sh
 
  # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
  sysctl -w debug.ktr_mask=0x1208
  sysctl -w debug.ktr_verbose=2
 
  run_magic_command_that_hangs_my_machine
 
  and run the script.  You probably want to run it over a tty or remote login so
  that the serial console output is just the logging (warning, it will be very
  verbose!).  Also, you probably want to use
  http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
  irrelevant and cluttery mutex trace messages.  Note that having this much
  logging on will probably slow the machine to a crawl as well, so you may have
  to just start this up and go off and do something else until it hangs. :-/
  Another alternative is to rig up an NMI debouncer and use it to break into the
  debugger.  Then you can start poking around to see who owns sched_lock, etc.
 
   Thanks ...
 
  --
 
  John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
  PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
  "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
 
 
  Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
  Systems Administrator @ hub.org
  primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org
 

 --

 John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
 PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
 "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org





RE: System hangs with -current ...

2001-02-22 Thread John Baldwin


On 23-Feb-01 The Hermit Hacker wrote:
 On Thu, 22 Feb 2001, John Baldwin wrote:
 

 On 22-Feb-01 The Hermit Hacker wrote:
 
  Okay, I have to pick up a NULL modem cable tomorrow and dive into this ...
  finally ...
 
  The various KTR_ that you mention below, these are kernel settings that I
  compile into the kernel?

 Yes.  You want this:

 options KTR
 options KTR_EXTEND
 options KTR_COMPILE=0x1208
 
 okay, just so that I understand ... I compile my kernel with these
 options, and then run the two sysctl commands you list below?  the
 KTR_COMPILE arg looks similar to the ktr_mask one below, which is why I'm
 confirming ...

Yes. KTR_COMPILE controls which KTR tracepoints are actually compiled into
the kernel.  The ktr_mask sysctl sets a runtime mask that lets you choose
which of the compiled-in trace classes you want to enable.  I have manpages
for this stuff, but they are waiting for doc guys to review them.
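
As a rough illustration of the compile-time versus runtime distinction: a
trace class is only logged if its bit is set in both masks, so the effective
mask can be thought of as the bitwise AND of the two.  The helper name
effective_mask below is hypothetical and exists only for this sketch; it does
not touch the kernel or any sysctl.

```shell
#!/bin/sh
# Sketch only: models how the compile-time mask (KTR_COMPILE) and the
# runtime mask (debug.ktr_mask) combine.  effective_mask is a made-up
# helper for illustration, not a FreeBSD tool.

# effective_mask COMPILED RUNTIME -> prints the bitwise AND in hex
effective_mask() {
    printf '0x%x\n' "$(( $1 & $2 ))"
}

compiled=0x1208   # from "options KTR_COMPILE=0x1208"
runtime=0x1208    # from "sysctl -w debug.ktr_mask=0x1208"

# Both masks agree, so every compiled-in class is enabled.
effective_mask "$compiled" "$runtime"

# Asking at runtime for classes that were never compiled in changes
# nothing: only the compiled-in bits can survive the AND.
effective_mask "$compiled" 0xffff
```

This is why the two values look alike in the instructions above: setting
ktr_mask to the same value as KTR_COMPILE simply turns on everything that was
built in.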

 The mtx_quiet.patch is old and won't apply to current now I'm afraid.

  On Tue, 2 Jan 2001, John Baldwin wrote:
 
 
  On 02-Jan-01 The Hermit Hacker wrote:
  
   Over the past several months, as others have reported, I've been getting
   system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
   ctl-alt-esc doesn't break me to the debugger ...
  
   I'm not complaining about the hangs, if I was overly concerned, I'd run
   -STABLE, but I'm wondering how one goes about providing debug information
   on them other than through DDB?
 
  Not easily. :(  If you can make the problem easily repeatable, then you can try
  turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
  a serial console whose output you log, and creating a shell script that runs
  the following commands:
 
  #!/bin/sh
 
  # Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
  sysctl -w debug.ktr_mask=0x1208
  sysctl -w debug.ktr_verbose=2
 
  run_magic_command_that_hangs_my_machine
 
  and run the script.  You probably want to run it over a tty or remote login so
  that the serial console output is just the logging (warning, it will be very
  verbose!).  Also, you probably want to use
  http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
  irrelevant and cluttery mutex trace messages.  Note that having this much
  logging on will probably slow the machine to a crawl as well, so you may have
  to just start this up and go off and do something else until it hangs. :-/
  Another alternative is to rig up an NMI debouncer and use it to break into the
  debugger.  Then you can start poking around to see who owns sched_lock, etc.
 
   Thanks ...

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




System hangs with -current ...

2001-01-02 Thread The Hermit Hacker


Over the past several months, as others have reported, I've been getting
system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
ctl-alt-esc doesn't break me to the debugger ...

I'm not complaining about the hangs, if I was overly concerned, I'd run
-STABLE, but I'm wondering how one goes about providing debug information
on them other than through DDB?

Thanks ...

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org






RE: System hangs with -current ...

2001-01-02 Thread John Baldwin


On 02-Jan-01 The Hermit Hacker wrote:
 
 Over the past several months, as others have reported, I've been getting
 system hangs using 5.0-CURRENT w/ SMP ... I've got DDB enabled, but
 ctl-alt-esc doesn't break me to the debugger ...
 
 I'm not complaining about the hangs, if I was overly concerned, I'd run
 -STABLE, but I'm wondering how one goes about providing debug information
 on them other than through DDB?

Not easily. :(  If you can make the problem easily repeatable, then you can try
turning on KTR in your kernel (see NOTES, you will need KTR_EXTEND), setting up
a serial console whose output you log, and creating a shell script that runs
the following commands:

#!/bin/sh

# Turn on KTR_INTR, KTR_PROC, and KTR_LOCK
sysctl -w debug.ktr_mask=0x1208
sysctl -w debug.ktr_verbose=2

run_magic_command_that_hangs_my_machine

and run the script.  You probably want to run it over a tty or remote login so
that the serial console output is just the logging (warning, it will be very
verbose!).  Also, you probably want to use
http://www.FreeBSD.org/~jhb/patches/mtx_quiet.patch to shut up most of the
irrelevant and cluttery mutex trace messages.  Note that having this much
logging on will probably slow the machine to a crawl as well, so you may have
to just start this up and go off and do something else until it hangs. :-/
Another alternative is to rig up an NMI debouncer and use it to break into the
debugger.  Then you can start poking around to see who owns sched_lock, etc.
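
For anyone wondering where 0x1208 comes from, here is a small sketch of how
the mask could be put together from the three classes named in the script
comment.  The individual bit values are an assumption taken from memory of
sys/ktr.h and are not stated anywhere in this thread, so check the header in
your own tree before relying on them.  compose_mask is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only.  The bit values below are ASSUMED (believed to match
# sys/ktr.h of this era); verify them against your own source tree.
KTR_LOCK=0x0008
KTR_INTR=0x0200
KTR_PROC=0x1000

# compose_mask BIT... -> prints the bitwise OR of all arguments in hex
compose_mask() {
    m=0
    for bit in "$@"; do
        m=$(( $m | $bit ))
    done
    printf '0x%x\n' "$m"
}

mask=$(compose_mask "$KTR_INTR" "$KTR_PROC" "$KTR_LOCK")
echo "sysctl -w debug.ktr_mask=$mask"
```

The script only prints the sysctl command rather than running it, so it is
safe to try on any machine.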

 Thanks ...

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

