Re: somewhat random mostly-lockups in 5.0

2000-04-11 Thread Nik Clayton

On Mon, Apr 10, 2000 at 07:48:45AM -0400, Brian Fundakowski Feldman wrote:
 On Sun, 9 Apr 2000, Alfred Perlstein wrote:
  In other words, unplug your palm pilot and attach a console.
 
 I'm going to get my friend to get a traceback and whatever else is
 possible.  He has a laptop and "null" serial cable to use, and he
 experiences these problems as much as I do; I'll just convince him
 to keep running the latest -CURRENT and get the serial console working.

"ptelnet" for the Palm will do a serial connection over the Hotsync cradle.
I have it on reasonable authority from a friend that they've booted FreeBSD
this way, interacting with the boot loader via the Palm as they go.

Not that I'm recommending this for day to day use, or anything, but if all
you have is a Palm. . .

N
-- 
Internet connection, $19.95 a month.  Computer, $799.95.  Modem, $149.95.
Telephone line, $24.95 a month.  Software, free.  USENET transmission,
hundreds if not thousands of dollars.  Thinking before posting, priceless.
Somethings in life you can't buy.  For everything else, there's MasterCard.
  -- Graham Reed, in the Scary Devil Monastery


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: somewhat random mostly-lockups in 5.0

2000-04-10 Thread Brian Somers

 Well, it seems that -CURRENT likes locking up nowadays.  It started happening
 very recently, and I (as well as jlemon) do suspect that it's a problem
 with some of the changes that were made to the syscall mechanisms on
 3/28/2000.
 
 Keep in mind that this problem is completely corroborated by a friend
 whose machine behaved exactly the same starting at the same time.  I
 hadn't noticed it until now because it seems to occur under rare
 circumstances, which are unknown till now.  The circumstances are
 trivial things like compiling things and playing mp3's, normally
 quite mediocre stuff.
 
 The symptoms are that the machine locks up.  Hard.  But there's a catch:
 it _can_ be pinged.  In fact, TCP connections can be made.  In my
 case, SSH connected, but the remote end never sent/received any data
 (or, that is, showed signs).  In my friend's case, telnet connected,
 but yet no data was received or acknowledged.  According to jlemon,
 whose diagnosis makes sense, the problem is that for whatever reason
 the kernel is not returning to user mode.  That explains why sshd
 doesn't work, telnetd doesn't work, XFree86 and apps don't respond.
 The question is, why?

FWIW, I can confirm that my laptop has been doing exactly this.  
Pings work, everything else is dead.  Mouse movement in X works for a 
while after the machine goes AWOL, but eventually that locks up too.  
I suspected vmware to be the culprit (or one of its klds), but I know 
now that's not the case because it sometimes happens as I shut down

This *may* have started happening when I did this:

Filesystem  1K-blocks  Used  Avail  Capacity  Mounted on
.
linprocfs           4     4      0      100%  /usr/compat/linux/proc


 --
  Brian Fundakowski Feldman   \  FreeBSD: The Power to Serve!  /
  [EMAIL PROTECTED]`--'

-- 
Brian [EMAIL PROTECTED]brian@[uk.]FreeBSD.org
  http://www.Awfulhak.org   brian@[uk.]OpenBSD.org
Don't _EVER_ lose your sense of humour !







Re: somewhat random mostly-lockups in 5.0

2000-04-10 Thread Brian Fundakowski Feldman

On Sun, 9 Apr 2000, Alfred Perlstein wrote:

 This can happen when the kernel is stuck in an infinite loop
 somewhere; the machine is still responding to interrupts, but is
 otherwise stuck.
 
 FYI:
 ~ % uname -a
 FreeBSD thumper 5.0-CURRENT FreeBSD 5.0-CURRENT #1: Sun Apr  2 16:29:20 PDT 2000 bright@thumper:/home/src/sys/compile/thumper  i386
 ~ % uptime
 10:58PM  up 7 days,  4:06, 21 users, load averages: 0.01, 0.02, 0.00
 
 I've been building world, playing mp3s, using fxtv and xmradio.

Like I said, it doesn't really have anything to do with what you're
doing, it just seems that it has to do with the fact the machine is
running at all...

 My setup is fine, perhaps you can furnish us with a traceback?  These
 kinds of lockups are very easy to fix with a traceback because they
 just mean that most likely the kernel is stuck in an infinite loop
 somewhere.
 
 In other words, unplug your palm pilot and attach a console.

I'm going to get my friend to get a traceback and whatever else is
possible.  He has a laptop and "null" serial cable to use, and he
experiences these problems as much as I do; I'll just convince him
to keep running the latest -CURRENT and get the serial console working.

 thanks,
 -- 
 -Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
 "I have the heart of a child; I keep it in a jar on my desk."

--
 Brian Fundakowski Feldman   \  FreeBSD: The Power to Serve!  /
 [EMAIL PROTECTED]`--'






Re: somewhat random mostly-lockups in 5.0

2000-04-10 Thread Jordan K. Hubbard

 The symptoms are that the machine locks up.  Hard.  But there's a catch:

Erm, Brian, You *know* nobody can debug a problem like this without
hard information.  It's like calling a mechanic on the phone and
saying "My car won't go.  It just doesn't move at all!  Tell me what's
wrong!"

Compile in the kernel debugger and start hunting around when the
system "locks up" next time.  Just figuring out which wait address
processes are stuck on would be a BIG HELP.  Saying your machine locks
up but is still pingable narrows it down to only several thousand
lines of code.  Even jlemon's "diagnosis" is of only marginal help
without actually having access to the failing machine.

- Jordan

P.S. My -current box from April 6th has yet to do anything like this.





Re: somewhat random mostly-lockups in 5.0

2000-04-10 Thread Brian Fundakowski Feldman

On Mon, 10 Apr 2000, Jordan K. Hubbard wrote:

  The symptoms are that the machine locks up.  Hard.  But there's a catch:
 
 Erm, Brian, You *know* nobody can debug a problem like this without
 hard information.  It's like calling a mechanic on the phone and
 saying "My car won't go.  It just doesn't move at all!  Tell me what's
 wrong!"
 [...]

I'm not really expecting someone to be able to explain why it's happening.
I'm wondering if anyone else notices the same problem.  My friend down
here who also has this problem is going to get DDB set up to work with
the serial console, which means when it happens to him next, he'll have
all the info necessary to figure this out.  I was thinking that perhaps
I was not the only one to notice this yet, and if someone else did they
could find out more.

I'm not looking for a psychic; I'm trying to find the problem by letting
other people know that when it happens to them, they aren't the only ones,
and shouldn't brush it off if possible...

 - Jordan
 
 P.S. My -current box from April 6th has yet to do anything like this.

It's occurred on UP machines only that I know of, and I know only of
these two specific reports.  There's more in common, such as use of
softupdates, invariants, ATA, and other kernel options.

--
 Brian Fundakowski Feldman   \  FreeBSD: The Power to Serve!  /
 [EMAIL PROTECTED]`--'






Re: somewhat random mostly-lockups in 5.0

2000-04-09 Thread Alfred Perlstein

* Brian Fundakowski Feldman [EMAIL PROTECTED] [000409 18:30] wrote:
 Well, it seems that -CURRENT likes locking up nowadays.  It started happening
 very recently, and I (as well as jlemon) do suspect that it's a problem
 with some of the changes that were made to the syscall mechanisms on
 3/28/2000.
 
 Keep in mind that this problem is completely corroborated by a friend
 whose machine behaved exactly the same starting at the same time.  I
 hadn't noticed it until now because it seems to occur under rare
 circumstances, which are unknown till now.  The circumstances are
 trivial things like compiling things and playing mp3's, normally
 quite mediocre stuff.
 
 The symptoms are that the machine locks up.  Hard.  But there's a catch:
 it _can_ be pinged.  In fact, TCP connections can be made.  In my
 case, SSH connected, but the remote end never sent/received any data
 (or, that is, showed signs).  In my friend's case, telnet connected,
 but yet no data was received or acknowledged.  According to jlemon,
 whose diagnosis makes sense, the problem is that for whatever reason
 the kernel is not returning to user mode.  That explains why sshd
 doesn't work, telnetd doesn't work, XFree86 and apps don't respond.
 The question is, why?

This can happen when the kernel is stuck in an infinite loop
somewhere; the machine is still responding to interrupts, but is
otherwise stuck.
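
A minimal sketch of how this state might be checked from another machine,
assuming the service normally speaks first once the connection opens (as
sshd does with its version banner).  The kernel completes the TCP handshake
by itself, so connect() succeeds even if the daemon never gets to run again;
what goes missing is the first byte of application data.  The names probe()
and demo() and the timeout values are illustrative assumptions, not anything
from this thread.  demo() imitates the hang locally with a listening socket
that is never accept()ed.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP service by whether userland ever answers.

    Returns one of:
      "unreachable"        connect() failed or timed out
      "connected-no-data"  handshake completed, but no byte arrived in time
      "closed"             peer closed the connection without sending data
      "responding"         at least one byte of application data arrived
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refusals, routing failures, and connect timeouts.
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(1)
        except socket.timeout:
            # The handshake was done in the kernel; nothing in user mode
            # ever wrote to the socket.
            return "connected-no-data"
        except OSError:
            return "unreachable"
        return "responding" if data else "closed"


def demo():
    """Imitate a hung daemon: listen, but never call accept()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    try:
        return probe("127.0.0.1", port, timeout=0.5)
    finally:
        listener.close()


if __name__ == "__main__":
    print(demo())
```

A result of "connected-no-data" from a host that still answers pings matches
the symptom described above, but it cannot say where the kernel is stuck;
that still needs a traceback from the console.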

FYI:
~ % uname -a
FreeBSD thumper 5.0-CURRENT FreeBSD 5.0-CURRENT #1: Sun Apr  2 16:29:20 PDT 2000 bright@thumper:/home/src/sys/compile/thumper  i386
~ % uptime
10:58PM  up 7 days,  4:06, 21 users, load averages: 0.01, 0.02, 0.00

I've been building world, playing mp3s, using fxtv and xmradio.

My setup is fine, perhaps you can furnish us with a traceback?  These
kinds of lockups are very easy to fix with a traceback because they
just mean that most likely the kernel is stuck in an infinite loop
somewhere.

In other words, unplug your palm pilot and attach a console.

thanks,
-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."

