Re: panic: found dirty cache page 0xf046f1c0

1999-01-24 Thread Wilko Bulte
As Matthew Dillon wrote... I've committed one bug fix to the 'found dirty cache page' bug -- turns out vm_map_split() was the culprit, renaming pages without removing them from PQ_CACHE. The bug was introduced in -3.0, and hit the KASSERT() I put in -4.x. I've committed

Re: panic: found dirty cache page 0xf046f1c0

1999-01-24 Thread Matthew Dillon
:FYI: a buildworld of -current including the above on FreeBSD/axp completed :without any incidents. : :Wilko :... :... ( other reports ) We are looking good, I've got half a dozen positive reports! On general principles, I think it is possible to make the FreeBSD VM system

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
:It's definitely happening still, sorry. :-( I recompiled a 100% static :kernel and have had three more explosions, usually after starting exmh. :(exmh takes 10 to 15MB of ram on this system due to my mailbox folder :sizes). : :However, a clue.. The SMP box that is doing fine is a P6, an NFS

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Peter Wemm
Matthew Dillon wrote: :It's definitely happening still, sorry. :-( I recompiled a 100% static :kernel and have had three more explosions, usually after starting exmh. :(exmh takes 10 to 15MB of ram on this system due to my mailbox folder :sizes). : :However, a clue.. The SMP box that

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Doug Rabson
On Sat, 23 Jan 1999, Peter Wemm wrote: Matthew Dillon wrote: :It's definitely happening still, sorry. :-( I recompiled a 100% static :kernel and have had three more explosions, usually after starting exmh. :(exmh takes 10 to 15MB of ram on this system due to my mailbox folder

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Doug Rabson
On Sat, 23 Jan 1999, Doug Rabson wrote: I just had one of these on one of my alphas. The machine is UP (obviously), no MFS, no dynamically loaded stuff. It was doing an installworld with NFSv3 mounted source, local obj. All filesystems were using softupdates. I made it happen again by

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
:I made it happen again by doing the same installworld but this time I :caught it in the debugger. I'll leave the machine up for a while in case :someone has some idea of how to debug it. The stacktrace looks like this: : :#0 Debugger () at ../../alpha/alpha/db_interface.c:260 :#1

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Doug Rabson
On Sat, 23 Jan 1999, Matthew Dillon wrote: :I made it happen again by doing the same installworld but this time I :caught it in the debugger. I'll leave the machine up for a while in case :someone has some idea of how to debug it. The stacktrace looks like this: : :#0 Debugger () at

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Peter Wemm
Doug Rabson wrote: On Sat, 23 Jan 1999, Matthew Dillon wrote: :I made it happen again by doing the same installworld but this time I :caught it in the debugger. I'll leave the machine up for a while in case :someone has some idea of how to debug it. The stacktrace looks like this:

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Peter Wemm
Peter Wemm wrote: Matthew Dillon wrote: [..] Try changing the panic in vm/vm_page.c to a printf() ( I'll do that. BTW; what are the dangers of this? lost disk writes or corruption? Can we (as a workaround) push the page that we found back onto a dirty queue and try again after some

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Doug Rabson
On Sun, 24 Jan 1999, Peter Wemm wrote: Doug Rabson wrote: On Sat, 23 Jan 1999, Matthew Dillon wrote: :I made it happen again by doing the same installworld but this time I :caught it in the debugger. I'll leave the machine up for a while in case :someone has some idea of how

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
:[..] : Try changing the panic in vm/vm_page.c to a printf() ( : : I'll do that. : :BTW; what are the dangers of this? lost disk writes or corruption? Can :we (as a workaround) push the page that we found back onto a dirty queue :and try again after some diagnostics? That's ok,

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Peter Wemm
Matthew Dillon wrote: [..] :Oh, one other thing that occurred to me.. Under 4.0-current, I regularly :(ie: within 30 seconds of boot) get if_de transmitter underflows. My :console corruption was happening at the instant that de0 was being :configured with ifconfig. exmh is running to a

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Doug Rabson
On Sun, 24 Jan 1999, Peter Wemm wrote: Oh, one other thing that occurred to me.. Under 4.0-current, I regularly (ie: within 30 seconds of boot) get if_de transmitter underflows. My console corruption was happening at the instant that de0 was being configured with ifconfig. exmh is

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread N
On Sun, 24 Jan 1999, Peter Wemm wrote: [..] Oh, one other thing that occurred to me.. Under 4.0-current, I regularly (ie: within 30 seconds of boot) get if_de transmitter underflows. My console corruption was happening at the instant that de0 was being configured with ifconfig. exmh is

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
Yes, we're working on it in a sub-group. Since the panic message is a new one -- it's one I added that never existed in -3.x, it is possible that the bug is not related to my VM stuff but related to something else going on. I've found a number of other bugs in the greater VM

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
:Here too... pretty quickly after boot on a SMP machine (current as of Jan :12) that pushes quite a bit of traffic, the following messages appear: : :de0: abnormal interrupt: transmit underflow (raising TX threshold to 96|256) :de0: abnormal interrupt: transmit underflow (raising TX threshold to

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread N
Here too... pretty quickly after boot on a SMP machine (current as of Jan 12) that pushes quite a bit of traffic, the following messages appear: de0: abnormal interrupt: transmit underflow (raising TX threshold to 96|256) [..] Three people getting these panics, three people with DEC

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
:But that was a week ago, and it's a *busy* news server (that's not hitting :swap), I was just curious about the error messages from the de driver. : : -- Niels. The transmit underflow messages: de0: abnormal interrupt: transmit underflow (raising TX threshold to 96|256) de0: abnormal

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Brian Feldman
On Sat, 23 Jan 1999, Matthew Dillon wrote: Yes, we're working on it in a sub-group. Since the panic message is a new one -- it's one I added that never existed in -3.x, it is possible that the bug is not related to my VM stuff but related to something else going on.

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
: : I am also committing some very strict KASSERT checking to try to catch : the problem earlier. Everyone running 4.x kernels should add the following : :Ahem, would you kindly define 'everyone'? Anyone, everyone, everybody, all ... any individual using the -4.x kernels needs

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Peter Wemm
Matthew Dillon wrote: :But that was a week ago, and it's a *busy* news server (that's not hitting :swap), I was just curious about the error messages from the de driver. : : -- Niels. The transmit underflow messages: de0: abnormal interrupt: transmit underflow (raising TX

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
:On my system I can understand it, it's a 2xP5 with a shared L2 cache on a :Neptune chipset - something that isn't known for speed. Once you get two :processors hammering the system bus, *plus* mix in an EISA scsi :controller, I could well imagine the memory bus getting thrashed. When we

Re: panic: found dirty cache page 0xf046f1c0

1999-01-23 Thread Matthew Dillon
I've committed one bug fix to the 'found dirty cache page' bug -- turns out vm_map_split() was the culprit, renaming pages without removing them from PQ_CACHE. The bug was introduced in -3.0, and hit the KASSERT() I put in -4.x. I've committed a general inlining of 'changing

Re: panic: found dirty cache page 0xf046f1c0

1999-01-22 Thread Manfred Antar
At 10:34 AM 1/23/99 +0800, Peter Wemm wrote: Dual p5-90 w/ 48M ram, doing a major cvs update/merge (which mostly got lost): panic: found dirty cache page 0xf046f1c0 mp_lock = 0101; cpuid = 1; lapic.id = 0100 Debugger(panic) Stopped at Debugger+0x37: movl $0,in_Debugger db trace

Re: panic: found dirty cache page 0xf046f1c0

1999-01-22 Thread Peter Wemm
Peter Wemm wrote: Dual p5-90 w/ 48M ram, doing a major cvs update/merge (which mostly got lost): panic: found dirty cache page 0xf046f1c0 mp_lock = 0101; cpuid = 1; lapic.id = 0100 Debugger(panic) Stopped at Debugger+0x37: movl $0,in_Debugger db trace This is possibly a

Re: panic: found dirty cache page 0xf046f1c0

1999-01-22 Thread Matthew Dillon
:Peter Wemm wrote: : Dual p5-90 w/ 48M ram, doing a major cvs update/merge (which mostly got : lost): : : panic: found dirty cache page 0xf046f1c0 :... : :This is possibly a false alarm.. Something weird was happening. I cleaned :out the kernel and reconfigured with NFS static (it was being

Re: panic: found dirty cache page 0xf046f1c0

1999-01-22 Thread Matthew Dillon
:At 10:34 AM 1/23/99 +0800, Peter Wemm wrote: :Dual p5-90 w/ 48M ram, doing a major cvs update/merge (which mostly got :lost): : :panic: found dirty cache page 0xf046f1c0 :mp_lock = 0101; cpuid = 1; lapic.id = 0100 :... :I just got the same thing doing a make -j8 world :Machine is a

Re: panic: found dirty cache page 0xf046f1c0

1999-01-22 Thread Brian Feldman
On Fri, 22 Jan 1999, Manfred Antar wrote: At 10:34 AM 1/23/99 +0800, Peter Wemm wrote: Dual p5-90 w/ 48M ram, doing a major cvs update/merge (which mostly got lost): panic: found dirty cache page 0xf046f1c0 mp_lock = 0101; cpuid = 1; lapic.id = 0100 Debugger(panic) Stopped at

Re: panic: found dirty cache page 0xf046f1c0

1999-01-22 Thread Peter Wemm
Matthew Dillon wrote: :Peter Wemm wrote: : Dual p5-90 w/ 48M ram, doing a major cvs update/merge (which mostly got : lost): : : panic: found dirty cache page 0xf046f1c0 :... : :This is possibly a false alarm.. Something weird was happening. I cleaned :out the kernel and reconfigured