Re: -current kernel hangs machine solid ...

2000-12-06 Thread Michael Harnois

Just checking in ... I haven't had one of these random hangs in the
last week or so. Anyone else?

-- 
Michael D. Harnois, Redeemer Lutheran Church, Washburn, IA 
[EMAIL PROTECTED]  [EMAIL PROTECTED] 
 No man knows how bad he is 
 till he has tried very hard to be good. -- C.S. Lewis


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: -current kernel hangs machine solid ...

2000-12-06 Thread Steve Ames

Actually... I just noticed that my hanging machine had been
up for 5 days. I started it doing 'make -j8 world' in a
continual loop a couple hours ago before leaving work... I'll
see tomorrow.

-Steve

On Wed, Dec 06, 2000 at 05:37:14PM -0600, Michael Harnois wrote:
> Just checking in ... I haven't had one of these random hangs in the
> last week or so. Anyone else?
 



Re: -current kernel hangs machine solid ...

2000-12-06 Thread Greg Lehey

On Wednesday,  6 December 2000 at 17:37:14 -0600, Michael Harnois wrote:
> Just checking in ... I haven't had one of these random hangs in the
> last week or so. Anyone else?

Yup.  My freshly installed machine has hung up again.  Completely
dead, apparently during a make world.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





RE: -current kernel hangs machine solid ...

2000-11-28 Thread John Baldwin


On 28-Nov-00 The Hermit Hacker wrote:
 
> Just tried to build a kernel based on sources from today, to enable
> BREAK_TO_DEBUGGER so that I can try and get in and see where it's hanging
> ... the compile hung the machine solid.  Even hitting the
> 'numlock'/'capslock' on my keyboard generated no results ...

It is spinning with interrupts disabled, probably due to holding a spinlock for
far too long.  Debugging this is not all that fun.  :-P  If you can rig up an
NMI switch, you can use that to drop into ddb and then use 'x' to see who owns
various mutexes (sched_lock and callout_mtx being the primary spin mutexes of
concern).  If you compile your kernel with WITNESS and MUTEX_DEBUG, then you
can use 'x' to look at the sched_lock and callout_mtx mutex structures, find
the pointer to the mtx_debug structure, and examine that to find the mtxd_file
and mtxd_line members.  Then you can look at those (x/s to look at the filename
as a string) to find the filename and line number when the mutex was last
acquired.  Grr, except that this is broken for spin mutexes.  If you are
patient, you can try rigging up a serial console and compiling KTR into your
kernel like so:

options KTR
options KTR_EXTEND
options KTR_COMPILE=(KTR_LOCK|KTR_PROC|KTR_INTR)

Then when the machine has booted, log in via ssh or a tty other than the serial
console and type the following:

# sysctl -w debug.ktr_mask=0x1208
# sysctl -w debug.ktr_verbose=2
# while (1)
 make -j 16 buildworld
 end

Unfortunately, there is a chance the machine will die before it hangs due to
exceeding the stack space.  In that case, you can _try_ bumping UPAGES, but
that didn't help on my test machines. :-/  However, if your machine doesn't
blow up and die, then when it hangs, the KTR output dumped to the serial console
(which you should probably log to a file via script or somesuch) will show which
mutex is causing the hang and where it was acquired.
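
For what it's worth, the 0x1208 passed to debug.ktr_mask above looks like
nothing more than the three KTR_COMPILE classes OR'ed together.  A minimal
sketch, assuming the bit values KTR_LOCK=0x8, KTR_INTR=0x200 and
KTR_PROC=0x1000 (these are not stated anywhere in this thread; check
sys/ktr.h in your own tree before relying on them):

```shell
#!/bin/sh
# ASSUMED bit values for the three trace classes; they are not given in the
# thread and may differ in your source tree (see sys/ktr.h).
KTR_LOCK=0x0008
KTR_INTR=0x0200
KTR_PROC=0x1000

# OR the classes together, mirroring the KTR_COMPILE option line.
mask=$(( $KTR_LOCK | $KTR_INTR | $KTR_PROC ))

# Print the result in the hex form passed to sysctl; with the values
# assumed above this prints 0x1208.
printf '0x%x\n' "$mask"
```

If the values in your tree differ, recompute the mask rather than reusing
0x1208 blindly.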

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/





RE: -current kernel hangs machine solid ...

2000-11-28 Thread The Hermit Hacker


gah ... okay, first "problem" is coming up with a serial console, as I
only have the one machine ... but, am going to search one out and save
this email ... will come back to it once I get it set up that far ...

thanks ...


Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org 






-current kernel hangs machine solid ...

2000-11-27 Thread The Hermit Hacker


Just tried to build a kernel based on sources from today, to enable
BREAK_TO_DEBUGGER so that I can try and get in and see where it's hanging
... the compile hung the machine solid.  Even hitting the
'numlock'/'capslock' on my keyboard generated no results ...

going to try and get a clean compile by booting into my older kernel, and
see what happens ...

Marc G. Fournier   ICQ#7615664   IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: [EMAIL PROTECTED]   secondary: scrappy@{freebsd|postgresql}.org 


