Re: -current kernel hangs machine solid ...
On Wednesday, 6 December 2000 at 17:37:14 -0600, Michael Harnois wrote:
> Just checking in ... I haven't had one of these random hangs in the
> last week or so. Anyone else?

Yup. My freshly installed machine has hung up again. Completely dead, apparently during a make world.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
Re: -current kernel hangs machine solid ...
Actually... I just noticed that my hanging machine had been up for 5 days. I started it doing 'make -j8 world' in a continual loop a couple hours ago before leaving work... I'll see tomorrow.

-Steve

On Wed, Dec 06, 2000 at 05:37:14PM -0600, Michael Harnois wrote:
> Just checking in ... I haven't had one of these random hangs in the
> last week or so. Anyone else?
Re: -current kernel hangs machine solid ...
Just checking in ... I haven't had one of these random hangs in the last week or so. Anyone else?

--
Michael D. Harnois, Redeemer Lutheran Church, Washburn, IA
[EMAIL PROTECTED] [EMAIL PROTECTED]
No man knows how bad he is till he has tried very hard to be good. -- C.S. Lewis
RE: -current kernel hangs machine solid ...
gah ... okay, first "problem" is coming up with a serial console, as I only have the one machine ... but, am going to search one out and save this email ... will come back to it once I get it set up that far ... thanks ...

On Tue, 28 Nov 2000, John Baldwin wrote:

> On 28-Nov-00 The Hermit Hacker wrote:
> >
> > Just tried to build a kernel based on sources from today, to enable
> > BREAK_TO_DEBUGGER so that I can try and get in and see where it's hanging
> > ... the compile hung the machine solid. Even hitting the
> > 'numlock'/'capslock' on my keyboard generated no results ...
>
> It is spinning with interrupts disabled, probably due to holding a spinlock for
> far too long. Debugging this is not all that fun. :-P If you can rig up an
> NMI switch, you can use that to drop into ddb and then use 'x' to see who owns
> various mutexes (sched_lock and callout_mtx being the primary spin mutexes of
> concern). If you compile your kernel with WITNESS and MUTEX_DEBUG, then you
> can use 'x' to look at the sched_lock and callout_mtx mutex structures, find
> the pointer to the mtx_debug structure, and examine that to find the mtxd_file
> and mtxd_line members. Then you can look at those (x/s to look at the filename
> as a string) to find the filename and line number where the mutex was last
> acquired. Grr, except that this is broken for spin mutexes. If you are
> patient, you can try rigging up a serial console and compiling KTR into your
> kernel like so:
>
>     options KTR
>     options KTR_EXTEND
>     options KTR_COMPILE=(KTR_LOCK|KTR_PROC|KTR_INTR)
>
> Then when the machine has booted, log in via ssh or a tty other than the serial
> console and type the following:
>
>     # sysctl -w debug.ktr_mask=0x1208
>     # sysctl -w debug.ktr_verbose=2
>     # while (1)
>     > make -j 16 buildworld
>     > end
>
> Unfortunately, there is a chance the machine will die before it hangs due to
> exceeding the stack space. In that case, you can _try_ bumping UPAGES, but
> that didn't help on my test machines. :-/ However, if your machine doesn't
> blow up and die, then when it hangs, the KTR output dumped to the serial console
> (which you should probably log to a file via script or somesuch) will show what
> mutex was acquired and where it was acquired that is causing the hang.

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED] secondary: scrappy@{freebsd|postgresql}.org
RE: -current kernel hangs machine solid ...
On 28-Nov-00 The Hermit Hacker wrote:
>
> Just tried to build a kernel based on sources from today, to enable
> BREAK_TO_DEBUGGER so that I can try and get in and see where it's hanging
> ... the compile hung the machine solid. Even hitting the
> 'numlock'/'capslock' on my keyboard generated no results ...

It is spinning with interrupts disabled, probably due to holding a spinlock for far too long. Debugging this is not all that fun. :-P

If you can rig up an NMI switch, you can use that to drop into ddb and then use 'x' to see who owns various mutexes (sched_lock and callout_mtx being the primary spin mutexes of concern). If you compile your kernel with WITNESS and MUTEX_DEBUG, then you can use 'x' to look at the sched_lock and callout_mtx mutex structures, find the pointer to the mtx_debug structure, and examine that to find the mtxd_file and mtxd_line members. Then you can look at those (x/s to look at the filename as a string) to find the filename and line number where the mutex was last acquired. Grr, except that this is broken for spin mutexes.

If you are patient, you can try rigging up a serial console and compiling KTR into your kernel like so:

    options KTR
    options KTR_EXTEND
    options KTR_COMPILE=(KTR_LOCK|KTR_PROC|KTR_INTR)

Then when the machine has booted, log in via ssh or a tty other than the serial console and type the following:

    # sysctl -w debug.ktr_mask=0x1208
    # sysctl -w debug.ktr_verbose=2
    # while (1)
    > make -j 16 buildworld
    > end

Unfortunately, there is a chance the machine will die before it hangs due to exceeding the stack space. In that case, you can _try_ bumping UPAGES, but that didn't help on my test machines. :-/ However, if your machine doesn't blow up and die, then when it hangs, the KTR output dumped to the serial console (which you should probably log to a file via script or somesuch) will show what mutex was acquired and where it was acquired that is causing the hang.

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
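A side note on the debug.ktr_mask value in the message above: 0x1208 looks like the bitwise OR of the same three classes named in KTR_COMPILE. The sketch below shows the arithmetic. The bit values are an assumption recalled from sys/ktr.h in -current of roughly that period, not something stated in this thread, so check them against your own tree before relying on them.

```shell
# Assumed KTR class bits (not from the thread); verify in sys/ktr.h.
KTR_LOCK=0x8
KTR_INTR=0x200
KTR_PROC=0x1000

# OR the three classes together and format the result as hex,
# giving the value to pass to debug.ktr_mask.
ktr_mask=$(printf '0x%x' $(( $KTR_LOCK | $KTR_INTR | $KTR_PROC )))
echo "debug.ktr_mask=${ktr_mask}"
```

If the bit values in your tree differ, the same arithmetic gives the mask to use in place of 0x1208.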
-current kernel hangs machine solid ...
Just tried to build a kernel based on sources from today, to enable BREAK_TO_DEBUGGER so that I can try and get in and see where it's hanging ... the compile hung the machine solid. Even hitting the 'numlock'/'capslock' on my keyboard generated no results ...

going to try and get a clean compile by booting into my older kernel, and see what happens ...

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: [EMAIL PROTECTED] secondary: scrappy@{freebsd|postgresql}.org