Re: Current stalls...(now also panic)

2001-01-02 Thread Edwin Culp

Mark,

Thanks a lot.  I read Marc's mail first so I'm trying the -k solution first
but I will also take USB support out of the kernels that have it.

Thanks,

ed

Mark Hittinger wrote:

  Maybe this is related, maybe not... I upgraded to the latest CURRENT
  available this morning and now I also see occasional (albeit short) hangs
  sometimes, although the machine seems to be responsive otherwise.

 I see this also.  If you aren't using USB but have it compiled into your
 kernel try commenting out all the "device" entries after "# USB support"
 in the config file.  A kernel built without USB may not show the hangs.

 I think there is a buglet that has crept into the USB code within the
 last couple of weeks.

 Later

 Mark Hittinger
 Earthlink
 [EMAIL PROTECTED]

 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with "unsubscribe freebsd-current" in the body of the message






Re: Current stalls...(now also panic)

2001-01-02 Thread Edwin Culp

Szilveszter,

I've got some other weird problems with cores and panics, but vi is the only
one that I can define well enough.  I'm also taking all USB support out of the
kernels that have it to see if my other strange problems disappear, such as
access to mysql databases with php4 cvs and some strange library problems that
I am having with my new XFree4.0.2_3 with kde-2.0.1 and LinuxNetscape, which I
just installed on Dec. 31. (Poor timing, but sorting these things out is what
makes life interesting :-)

Thanks,

ed

Szilveszter Adam wrote:

 Hi!

 Maybe this is related, maybe not... I upgraded to the latest CURRENT
 available this morning and now I also see occasional (albeit short) hangs
 sometimes, although the machine seems to be responsive otherwise.

 But, not only that, I also got a cool panic while trying to do some stuff
 (like compiling the docs) and pressing Ctrl-Z in another tty. No X, no
 nothing. I was dropped into the debugger, but since I am not a ddb artist,
 I tried to avoid typing the whole trace by hand and in the process managed
 to reboot the box... oh well. Next time I will know.

 But, the panic message was this:

 panic: blockable mtx_enter() of lockmgr interlock when not legal @
 ../../kern/kern_lock.c:247

 Maybe this rings a bell with someone. The error was apparently caught by
 WITNESS, which I also have enabled in my kernel (albeit without
 MUTEX_DEBUG, because *that* really made it impossible to do anything
 sensible on the machine...)

 Hardware-wise, nothing fancy here... UP PII-233, 128M non-ECC RAM, all-IDE,
 two disks, one CD-ROM. Next time, I promise to write down all the
 details... but just wanted to chime in quickly since this might be related
 to the hangs other people are seeing, but maybe they don't panic() because
 they don't have WITNESS enabled (just speculating)

 --
 Regards:

 Szilveszter ADAM
 Szeged University
 Szeged Hungary




Re: Current stalls...(now also panic)

2000-12-30 Thread Alex Kapranoff

On Fri, Dec 29, 2000 at 10:57:10PM +0100, Szilveszter Adam wrote:
 Hi!
 
 Maybe this is related, maybe not... I upgraded to the latest CURRENT
 available this morning and now I also see occasional (albeit short) hangs
 sometimes, although the machine seems to be responsive otherwise. 
 
 But, not only that, I also got a cool panic while trying to do some stuff
 (like compiling the docs) and pressing Ctrl-Z in another tty. No X, no
 nothing. I was dropped into the debugger, but since I am not a ddb artist,
 I tried to avoid typing the whole trace by hand and in the process managed
 to reboot the box... oh well. Next time I will know.
 
 But, the panic message was this:
 
 panic: blockable mtx_enter() of lockmgr interlock when not legal @
 ../../kern/kern_lock.c:247
 
 Maybe this rings a bell with someone. The error was apparently caught by
 WITNESS, which I also have enabled in my kernel (albeit without
 MUTEX_DEBUG, because *that* really made it impossible to do anything
 sensible on the machine...)

  I see the same panic. I managed to collect some info in kern/23935.
http://www.freebsd.org/cgi/query-pr.cgi?pr=23935
I can reliably reproduce it.

-- 
Alex Kapranoff,  Voice: +7(0832)791845
36 hours before the brand new millennium...





Re: Current stalls...

2000-12-29 Thread Alexander Langer

Thus spake Poul-Henning Kamp ([EMAIL PROTECTED]):

   cd /usr/src
   cvs -q update -P -d -A
 on either of my two -current systems.
 The systems stall as described in my email yesterday.

Maybe this is related:
I had two complete hangs today on my
FreeBSD cichlids.cichlids.com 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Wed Dec 27 13:05:38 CET 2000 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/cichlids  i386
while I had *much* i/o on my xl0 NIC and on my hdd (UDMA33).

This is the first time in months that I have stressed my xl0 that
hard, so I don't know if this is related or not.

Alex
-- 
cat: /home/alex/.sig: No such file or directory





Re: Current stalls...(now also panic)

2000-12-29 Thread Szilveszter Adam

Hi!

Maybe this is related, maybe not... I upgraded to the latest CURRENT
available this morning and now I also see occasional (albeit short) hangs
sometimes, although the machine seems to be responsive otherwise. 

But, not only that, I also got a cool panic while trying to do some stuff
(like compiling the docs) and pressing Ctrl-Z in another tty. No X, no
nothing. I was dropped into the debugger, but since I am not a ddb artist,
I tried to avoid typing the whole trace by hand and in the process managed
to reboot the box... oh well. Next time I will know.

But, the panic message was this:

panic: blockable mtx_enter() of lockmgr interlock when not legal @
../../kern/kern_lock.c:247

Maybe this rings a bell with someone. The error was apparently caught by
WITNESS, which I also have enabled in my kernel (albeit without
MUTEX_DEBUG, because *that* really made it impossible to do anything
sensible on the machine...)

Hardware-wise, nothing fancy here... UP PII-233, 128M non-ECC RAM, all-IDE,
two disks, one CD-ROM. Next time, I promise to write down all the
details... but just wanted to chime in quickly since this might be related
to the hangs other people are seeing, but maybe they don't panic() because
they don't have WITNESS enabled (just speculating)

-- 
Regards:

Szilveszter ADAM
Szeged University
Szeged Hungary





Re: Current stalls...(now also panic)

2000-12-29 Thread Mark Hittinger


 Maybe this is related, maybe not... I upgraded to the latest CURRENT
 available this morning and now I also see occasional (albeit short) hangs
 sometimes, although the machine seems to be responsive otherwise. 

I see this also.  If you aren't using USB but have it compiled into your
kernel try commenting out all the "device" entries after "# USB support"
in the config file.  A kernel built without USB may not show the hangs.
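
As an illustration of the edit described above, here is a minimal Python
sketch that comments out every "device" line following a "# USB support"
marker in a kernel config file. The function name comment_out_usb and the
sample config text are made up for this example; check the entries against
your own config file before rebuilding the kernel.

```python
def comment_out_usb(config_text, marker="# USB support"):
    """Return config_text with every 'device' line after the marker commented out."""
    out = []
    seen_marker = False
    for line in config_text.splitlines():
        if line.strip() == marker:
            seen_marker = True
        elif seen_marker and line.split()[:1] == ["device"]:
            # Only lines whose first word is 'device' are touched.
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative sample only; the entries in a real config file may differ.
SAMPLE = """\
device\t\tsio0
# USB support
device\t\tuhci
device\t\tusb
"""

if __name__ == "__main__":
    print(comment_out_usb(SAMPLE), end="")
```

Lines before the marker are left alone, so non-USB devices stay in the
kernel.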

I think there is a buglet that has crept into the USB code within the
last couple of weeks.

Later

Mark Hittinger
Earthlink
[EMAIL PROTECTED]





Re: Current stalls...(now also panic)

2000-12-29 Thread Thomas D. Dean

I have a -current stall or hang, also.

I do NOT have USB support in my kernel.

Running gdb on hello.c, the standard "hello, world" program, will cause the
stall or hang.  Keyboard input is echoed, but nothing happens.

tomdean





RE: Current stalls...

2000-12-29 Thread John Baldwin


On 29-Dec-00 Poul-Henning Kamp wrote:
 
 I am totally unable to complete a 
   cd /usr/src
   cvs -q update -P -d -A
 on either of my two -current systems.
 
 The systems stall as described in my email yesterday.
 
 CCD is now out of the equation.

I'm getting these hangs on my laptop as well, but only in the last few days.
An installworld from a previously built world hangs as well.

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/





RE: Current stalls...

2000-12-29 Thread Matthew Jacob


I'm getting these on NFS for loopback.


 
 On 29-Dec-00 Poul-Henning Kamp wrote:
  
  I am totally unable to complete a 
cd /usr/src
cvs -q update -P -d -A
  on either of my two -current systems.
  
  The systems stall as described in my email yesterday.
  
  CCD is now out of the equation.
 
 I'm getting these hangs on my laptop as well, but only in the last few days.
 An installworld from a previously built world hangs as well.
 
 






Re: RE: Current stalls...

2000-12-29 Thread Matt Dillon


:
:I'm getting these on NFS for loopback.

Can you verify that it's the same as Poul's by breaking into DDB
and doing a trace ?

I have very little time, but what I think may be going on is that
current may be exposing a bug in the specfs fsync code related to
flushing dirty buffers which already have a background bitmap write
in progress.

I think what is going on is that interrupt threads are not able to
run in -current, whereas interrupts do run in -stable, and an interrupt
completing the write on a buffer is what breaks us out of the 
infinite loop.  I noticed in Poul's 'ps' output that a number of
interrupt threads were runnable, but not getting any cpu to run.

The way to test this hypothesis is to give up the cpu for a tick in the
specfs fsync loop, allowing interrupt threads to run.  Maybe do a
tsleep for hz/10 every 500 iterations or something like that.
If this fixes the problem, then we have confirmation. 
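
The hypothesis above can be pictured with a toy user-space model in Python.
This is not kernel code: the ToyScheduler class, the fsync_loop function and
the iteration cap are invented for illustration, and yield_cpu() merely
stands in for a short kernel sleep such as the suggested tsleep of hz/10.
The model assumes the pending write can only complete once the spinning
loop gives up the CPU.

```python
class ToyScheduler:
    """Toy model: an 'interrupt thread' that only runs when the busy loop yields."""

    def __init__(self):
        self.write_in_progress = True    # buffer has a background write pending
        self.interrupt_runnable = True   # completion handler is ready but starved

    def yield_cpu(self):
        # Stand-in for a short sleep in the kernel loop: giving up the CPU
        # lets the starved interrupt thread run and complete the write.
        if self.interrupt_runnable:
            self.write_in_progress = False
            self.interrupt_runnable = False


def fsync_loop(sched, yield_every=None, max_iterations=10_000):
    """Spin until the pending write completes.

    Returns the iteration at which completion was observed, or None if the
    loop hit max_iterations without ever seeing the write finish.
    """
    for i in range(1, max_iterations + 1):
        if not sched.write_in_progress:
            return i
        if yield_every and i % yield_every == 0:
            sched.yield_cpu()
    return None


if __name__ == "__main__":
    print("no yield:       ", fsync_loop(ToyScheduler()))
    print("yield every 500:", fsync_loop(ToyScheduler(), yield_every=500))
```

In this model the loop without a yield never sees the write finish, while
the loop that yields every 500 iterations gets out shortly after the first
yield. That is the behaviour the proposed test is meant to confirm or rule
out on a real system.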

Alternatively someone can work up a simple MARK/SCAN for specfs's fsync,
ala what we do in FFS's fsync, and try that.  I think MARK/SCAN may
be the ultimate solution but we should home in on the problem before
trying it out.
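
For readers unfamiliar with the idea, a MARK/SCAN pass can be sketched in
user-space Python as follows. This is only a toy model under stated
assumptions, not the FFS or specfs code: the Buf class, start_write() and
fsync_mark_scan() are hypothetical names, and start_write() deliberately
leaves the buffer dirty to mimic a buffer whose background write is still
in progress.

```python
from dataclasses import dataclass


@dataclass
class Buf:
    name: str
    dirty: bool = True
    scanned: bool = False


def start_write(buf):
    # Toy behaviour: the buffer stays dirty, as if a background write were
    # still in progress when the flush was requested.
    buf.dirty = True


def fsync_mark_scan(bufs):
    """Flush each dirty buffer at most once; return the names written, in order."""
    # MARK: clear the scanned flag on every buffer before starting.
    for b in bufs:
        b.scanned = False

    written = []
    # SCAN: keep passing over the list, but skip anything already scanned,
    # so a buffer that stays dirty cannot keep the loop alive forever.
    progress = True
    while progress:
        progress = False
        for b in bufs:
            if b.dirty and not b.scanned:
                b.scanned = True
                start_write(b)
                written.append(b.name)
                progress = True
    return written


if __name__ == "__main__":
    bufs = [Buf("a"), Buf("b"), Buf("c", dirty=False)]
    print(fsync_mark_scan(bufs))
```

A naive "loop while any buffer is dirty" would spin forever on these
buffers; with the scanned flag, each buffer is handled once and the
function returns even though the buffers are still marked dirty.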

-Matt






Re: RE: Current stalls...

2000-12-29 Thread Matthew Jacob


I'll try- lots on plate to do.

There's a lot of iffy stuff with ithreads on alpha.  But this theory of yours
doesn't match the situation where I can then still log in and ping, but that
the NFS loopback mount is still hosed. 

I went back to building across NFS and that worked mucho better. 

 
 :
 :I'm getting these on NFS for loopback.
 
 Can you verify that it's the same as Poul's by breaking into DDB
 and doing a trace ?
 
 I have very little time, but what I think may be going on is that
 current may be exposing a bug in the specfs fsync code related to
 flushing dirty buffers which already have a background bitmap write
 in progress.
 
 I think what is going on is that interrupt threads are not able to
 run in -current, whereas interrupts do run in -stable, and an interrupt
 completing the write on a buffer is what breaks us out of the 
 infinite loop.  I noticed in Poul's 'ps' output that a number of
 interrupt threads were runnable, but not getting any cpu to run.
 
 The way to test this hypothesis is to give up the cpu for a tick in the
 specfs fsync loop, allowing interrupt threads to run.  Maybe do a
 tsleep for hz/10 every 500 iterations or something like that.
 If this fixes the problem, then we have confirmation. 
 
 Alternatively someone can work up a simple MARK/SCAN for specfs's fsync,
 ala what we do in FFS's fsync, and try that.  I think MARK/SCAN may
 be the ultimate solution but we should home in on the problem before
 trying it out.
 
   -Matt
 







Re: RE: Current stalls...

2000-12-29 Thread Matt Dillon


:
:
:I'll try- lots on plate to do.
:
:There's a lot of iffy stuff with ithreads on alpha.  But this theory of yours
:doesn't match the situation where I can then still log in and ping, but that
:the NFS loopback mount is still hosed. 
:
:I went back to building across NFS and that worked mucho better. 

I'm kinda shooting in the dark here, at least where -current is
concerned.   It's very fragile.  The source of this particular problem
could be anything.

Maybe if we froze new development in -current and concentrated on 
stabilizing it for a month we could bring it back up to snuff.

-Matt






Re: Current stalls...(now also panic)

2000-12-29 Thread Szilveszter Adam

On Fri, Dec 29, 2000 at 04:02:08PM -0800, Thomas D. Dean wrote:
 I have a -current stall or hang, also.
 
 I do NOT have USB support in my kernel.
 
 Running gdb on hello.c, the standard "hello, world" program, will cause the
 stall or hang.  Keyboard input is echoed, but nothing happens.
 
 tomdean

No, no USB devices (nor USB support in the kernel.)

Also, I was not trying to use gdb (I tried that two days ago, with
strikingly similar results though:-) This panic is somewhat elusive: I
cannot readily trigger it, it seems. Will stress system to see what gives.

BTW: I do not know if it was intentional but I see a *lot* fewer pcm0:
hwptr went backwards messages with very recent -CURRENT and indeed sound
output is a lot more continuous, even under load. Congrats to whoever did
it! (probably cg)

-- 
Regards:

Szilveszter ADAM
Szeged University
Szeged Hungary





Re: RE: Current stalls...

2000-12-29 Thread Matthew Jacob


 :
 :
 :I'll try- lots on plate to do.
 :
 :There's a lot of iffy stuff with ithreads on alpha.  But this theory of yours
 :doesn't match the situation where I can then still log in and ping, but that
 :the NFS loopback mount is still hosed. 
 :
 :I went back to building across NFS and that worked mucho better. 
 
 I'm kinda shooting in the dark here, at least where -current is
 concerned.   It's very fragile.  The source of this particular problem
 could be anything.
 
 Maybe if we froze new development in -current and concentrated on 
 stabilizing it for a month we could bring it back up to snuff.

Nope, I doubt it. It's about what I would expect it to be. Karnak predicts
that things will be miserable for about two more months and then get a lot
better. Freezing development (as if you could *really* get folks to do that) will
make that three months or maybe four. The time to have done a rational plan to
stage all of this was last May, not now. I'm actually amazed that things work
as well as they do.

What we really need is a few more subsystems cleaned up lockwise (I have my
eye on Justin's latest CAM patches and how that will play with locking) and
somebody (that might be me when and if I get time for it- my current clients
don't give a doodly about FreeBSD so it's hard to break loose time in a work
context) to clean up the alpha ithreads stuff (I like what Doug has just
passed out- I need to thank him and try it (Yes, Doug, if you're reading this,
the proc holding Giant should inherit the hardware int priority)) and people
to just beat on things and have things get shaken out. Given our very loose
confederation, and given the downright absolute dislike some of us have for
each other, this is probably the best of all possible software development
worlds.

-matt



