Bruce Evans writes:
> On Sat, 6 Jul 2002, Andrew Gallatin wrote:
> > Julian Elischer writes:
> > > On Sat, 6 Jul 2002, Andrew Gallatin wrote:
> > > > OK, current is really confusing me. When we are panic'ing and syncing
> > > > disks, how are we supposed to come back to the current thread which
> > > > caused the dump after we do an mi_switch() to allow an interrupt
> > > > thread to run?
> > >
> > > It depends.
> > >
> > > the previous thread should have been put back onto the run queue
> > > before the interrupt thread was scheduled.
> > Could it have anything to do with interrupt preemption being disabled on
> > alpha & enabled on i386?
> Very likely.
Unfortunately, that wasn't it.
After reverting all my local hacks, I see that the system ends up here:
siointr1() at siointr1+0x198
siointr() at siointr+0x40
isa_handle_fast_intr() at isa_handle_fast_intr+0x24
alpha_dispatch_intr() at alpha_dispatch_intr+0xd0
interrupt() at interrupt+0x110
XentInt() at XentInt+0x28
--- interrupt (from ipl 0) ---
_mtx_unlock_flags() at _mtx_unlock_flags+0x8c
kthread_suspend_check() at kthread_suspend_check+0xbc
buf_daemon() at buf_daemon+0x80
fork_exit() at fork_exit+0xe0
exception_return() at exception_return
--- root of call graph ---
I think that the buf_daemon just happened to wake up at the wrong
time, and the panicstr hacks in msleep prevent it from ever going back
to sleep again once it is awake. Now that I realize this, I suspect
the same thing happened with the random_kthread that I was talking about.
Perhaps there is something about alpha's hz being 1024 which is making
it more likely to lose whatever race is won on i386.
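To make the buf_daemon theory concrete, here is a toy userland model of
it. This is only a sketch, not the kernel's msleep(): sim_panicstr,
sim_msleep(), sim_daemon_iteration() and struct sim_thread are all
made-up names, and the only behavior modeled is the assumption stated
above, that a sleep request returns immediately once panicstr is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel global; non-NULL after a panic. */
static const char *sim_panicstr = NULL;

/* Hypothetical, minimal thread state for the model. */
struct sim_thread {
    bool asleep;          /* true while the thread is blocked */
    unsigned long spins;  /* loop iterations that failed to block */
};

/*
 * Simplified model of the panicstr short-circuit: after a panic the
 * call returns at once instead of blocking the caller.
 */
static int sim_msleep(struct sim_thread *td)
{
    assert(td != NULL);
    if (sim_panicstr != NULL)
        return 0;         /* assumed behavior: never actually sleeps */
    td->asleep = true;
    return 0;
}

/* One pass of a daemon-style loop: wake up, do work, try to sleep. */
static void sim_daemon_iteration(struct sim_thread *td)
{
    assert(td != NULL);
    td->asleep = false;   /* the daemon was woken up */
    /* ... the daemon's real work would go here ... */
    sim_msleep(td);
    if (!td->asleep)
        td->spins++;      /* could not block, so it keeps running */
}
```

In this model, a daemon that wakes up before sim_panicstr is set goes
back to sleep normally; one that wakes up afterwards stays runnable on
every iteration, which matches the backtrace above.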
Humorously enough, if I clear panicstr in panic(), then crashes work
(for a loose definition of work, who knows what they mean!), with the
added "benefit" of marking the filesystems clean:
panic: vm_page_wakeup: page not busy!!!
Stopped at Debugger+0x34: zapnot v0,#0xf,v0 <v0=0x0>
Waiting (max 60 seconds) for system process `vnlru' to stop...stopped
Waiting (max 60 seconds) for system process `bufdaemon' to stop...stopped
Waiting (max 60 seconds) for system process `syncer' to stop...stopped
syncing disks... 1 1
Dumping 509 MB
pid 569 (scp), uid 1387: exited on signal 4 (core dumped)
pid 539 (tcsh), uid 1387: exited on signal 4 (core dumped)
pid 538 (sshd), uid 1387: exited on signal 4
pid 536 (sshd), uid 0: exited on signal 4
pid 481 (sshd), uid 0: exited on signal 4
pid 442 (ntpd), uid 0: exited on signal 4
16 32 48 64 80 96 112pid 530 (cron), uid 0: exited on signal 4
128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384
400 416 432 448 464 480 496
Automatic reboot in 15 seconds - press a key on the console to abort
Maybe we need to strengthen the panicstr hacks and only allow the
thread which caused the crash and interrupt threads to be
scheduled once a panic occurs.
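A rough sketch of what such a filter might look like, written as a
self-contained userland model rather than a patch. Everything here is
hypothetical: toy_panic(), toy_may_run(), toy_choose(), struct
toy_thread and the array-based run queue are illustrative names and do
not correspond to the real scheduler interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thread classes for the model. */
enum toy_kind { TOY_NORMAL, TOY_ITHREAD };

struct toy_thread {
    int id;
    enum toy_kind kind;
};

/* Stand-ins for panic state: the message and the thread that panicked. */
static const char *toy_panicstr = NULL;
static const struct toy_thread *toy_panic_thread = NULL;

/* Record the first panic and remember which thread caused it. */
static void toy_panic(const char *msg, const struct toy_thread *td)
{
    assert(msg != NULL && td != NULL);
    if (toy_panicstr == NULL) {
        toy_panicstr = msg;
        toy_panic_thread = td;
    }
}

/*
 * Proposed rule: before a panic anything may run; afterwards only the
 * panicking thread and interrupt threads are eligible.
 */
static bool toy_may_run(const struct toy_thread *td)
{
    assert(td != NULL);
    if (toy_panicstr == NULL)
        return true;
    return td == toy_panic_thread || td->kind == TOY_ITHREAD;
}

/* Pick the first eligible thread from a simple array-based run queue. */
static const struct toy_thread *
toy_choose(const struct toy_thread *const *runq, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (toy_may_run(runq[i]))
            return runq[i];
    }
    return NULL;          /* nothing eligible to run */
}
```

With a rule like this, a bufdaemon that happened to be runnable at
panic time would simply be skipped, while interrupt threads could
still run so that disk I/O for the dump completes.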