Now that we have mutexes in our I/O path (SCSI, mfi, etc.), the
vfs_shutdown codepath is no longer safe: it still doesn't disable
process scheduling and relies on tsleep, and now msleep, not to
end up in mi_switch by accident. Unfortunately msleep doesn't
provide such a guarantee yet.

Here's a diff to remedy this. It is the same chunk as in tsleep,
adapted to the semantics of msleep. The IPL dance is there to
negate the IPL-changing effect of mtx_enter/mtx_leave, so that the
splx(safepri) operation actually changes the IPL from HIGH to
"safepri" and then back to the mutex IPL.

This fixes the hangs seen in vfs_shutdown (sys_sync) on panics.
Tested with SP and MP kernels on amd64. OK?

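For anyone who wants to trace the IPL dance described above without booting a kernel, here is a rough user-space model. It is only a sketch under assumptions: the IPL values, the PNORELOCK value, the simplified struct mutex, and the names cur_ipl, lowest_ipl_seen, model_reset and msleep_cold_path are all made up for illustration and are not the kernel's definitions. Only the body of msleep_cold_path mirrors the new chunk in the diff.

```c
#include <assert.h>

/* Hypothetical IPL values for this model; the real ones are MD. */
enum { IPL_NONE = 0, IPL_SOFT = 1, IPL_BIO = 5, IPL_HIGH = 15 };

#define PNORELOCK	0x200	/* model value, assumed for illustration */

/* Simplified stand-in for the kernel mutex: just the IPL bookkeeping. */
struct mutex {
	int mtx_wantipl;	/* IPL the mutex raises to */
	int mtx_oldipl;		/* IPL saved by mtx_enter */
	int mtx_held;
};
#define MUTEX_OLDIPL(m)	((m)->mtx_oldipl)

static int cur_ipl;			/* model of the CPU's current IPL */
static int lowest_ipl_seen;		/* lowest IPL set via splx() */
static const int safepri = IPL_NONE;

static void
model_reset(int ipl)
{
	cur_ipl = lowest_ipl_seen = ipl;
}

/* Raise the IPL (never lower it) and return the previous level. */
static int
splraise(int ipl)
{
	int old = cur_ipl;

	if (ipl > cur_ipl)
		cur_ipl = ipl;
	return (old);
}

static int
splhigh(void)
{
	return (splraise(IPL_HIGH));
}

static void
splx(int ipl)
{
	cur_ipl = ipl;
	if (ipl < lowest_ipl_seen)
		lowest_ipl_seen = ipl;
}

static void
mtx_enter(struct mutex *m)
{
	assert(!m->mtx_held);
	m->mtx_oldipl = splraise(m->mtx_wantipl);
	m->mtx_held = 1;
}

static void
mtx_leave(struct mutex *m)
{
	assert(m->mtx_held);
	m->mtx_held = 0;
	splx(m->mtx_oldipl);
}

/* Same statements as the cold/panicstr branch added to msleep. */
static void
msleep_cold_path(struct mutex *mtx, int priority)
{
	int spl;

	spl = MUTEX_OLDIPL(mtx);	/* remember the caller's IPL */
	MUTEX_OLDIPL(mtx) = splhigh();	/* keep mtx_leave from dropping it */
	mtx_leave(mtx);

	splx(safepri);			/* give interrupts a chance */

	if ((priority & PNORELOCK) == 0) {
		mtx_enter(mtx);
		MUTEX_OLDIPL(mtx) = spl;	/* restore the saved IPL */
	} else
		splx(spl);
}
```

In this model, a caller that took the mutex at IPL_SOFT comes back holding it at the mutex IPL with the saved old IPL intact, and the IPL has dropped to safepri once in between. With PNORELOCK the caller comes back without the mutex, at its original IPL.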
diff --git sys/kern/kern_synch.c sys/kern/kern_synch.c
index fbf7684..6fe2b61 100644
--- sys/kern/kern_synch.c
+++ sys/kern/kern_synch.c
@@ -142,7 +142,7 @@ tsleep(const volatile void *ident, int priority, const char *wmesg, int timo)
}
/*
- * Same as tsleep, but if we have a mutex provided, then once we've
+ * Same as tsleep, but if we have a mutex provided, then once we've
* entered the sleep queue we drop the mutex. After sleeping we re-lock.
*/
int
@@ -155,12 +155,33 @@ msleep(const volatile void *ident, struct mutex *mtx, int priority,
 	KASSERT((priority & ~(PRIMASK | PCATCH | PNORELOCK)) == 0);
 	KASSERT(mtx != NULL);
+	if (cold || panicstr) {
+		/*
+		 * After a panic, or during autoconfiguration,
+		 * just give interrupts a chance, then just return;
+		 * don't run any other procs or panic below,
+		 * in case this is the idle process and already asleep.
+		 */
+		spl = MUTEX_OLDIPL(mtx);
+		MUTEX_OLDIPL(mtx) = splhigh();
+		mtx_leave(mtx);
+
+		splx(safepri);
+
+		if ((priority & PNORELOCK) == 0) {
+			mtx_enter(mtx);
+			MUTEX_OLDIPL(mtx) = spl;
+		} else
+			splx(spl);
+		return (0);
+	}
+
 	sleep_setup(&sls, ident, priority, wmesg);
 	sleep_setup_timeout(&sls, timo);
 	sleep_setup_signal(&sls, priority);
 	/* XXX - We need to make sure that the mutex doesn't
-	 * unblock splsched. This can be made a bit more
+	 * unblock splsched. This can be made a bit more
 	 * correct when the sched_lock is a mutex.
 	 */
 	spl = MUTEX_OLDIPL(mtx);