I just noticed this morning that msleep() will not behave correctly on some platforms when passed a mutex associated with the IPL_NONE level (i.e. a plain mutex, without any associated spl semantics).

msleep() needs to run at splsched(). It fiddles with the mutex fields to make sure mtx_leave() will not lower spl below IPL_SCHED, and fiddles again later on, after taking the mutex back, to make the next mtx_leave() return to the level the first mtx_enter() was run at.

This works on amd64, i386, powerpc and sparc64, because the mutex implementation on these platforms always performs an spl update.

All other platforms have a slightly different implementation which skips the splraise()/splx() calls if the mutex is associated with IPL_NONE. This was done mainly for speed, since some of these platforms have somewhat expensive spl calls (in comparison to the overhead of the comparison + branch).

When invoking msleep() on an IPL_NONE mutex on these platforms, msleep() will raise to IPL_SCHED, but the next mtx_leave() call, after msleep() returns, will not alter the current level. Things continue running at splsched() until the next sleep or preemption.

There are two ways to fix this:
- make all mutex implementations perform spl operations, even if they are associated with IPL_NONE. Easy to do.
- make msleep() aware that IPL_NONE mutexes may not change spl, and avoid using the mutex bowels to store the temporary splsched() sequence. This is easy to do as well, but requires an extra macro to be defined, to let msleep() know the ipl the mutex is associated with, something like
    #define MUTEX_WANTIPL(mtx) (mtx)->mtx_wantipl

Or is there any better option?

(Note that the hidden message here is ``shall we define precisely the exact semantics of IPL_NONE mutexes once and for all, and fix all non-compliant implementations?'')

Miod
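
To illustrate the problem, here is a minimal userland sketch of the two mutex flavours and of the msleep() spl juggling. It is a model under stated assumptions, not the kernel code: the numeric IPL values, cur_ipl, struct mtx_ops, msleep_model() and run_scenario() are all made up for the example, and the actual sleep and the scheduler lock are left out.

```c
/* Illustrative levels; the numeric values are made up for this model. */
enum { IPL_NONE = 0, IPL_SCHED = 12 };

static int cur_ipl = IPL_NONE;	/* stand-in for the current spl */

static int
splraise(int ipl)
{
	int old = cur_ipl;

	if (ipl > cur_ipl)	/* splraise() never lowers the level */
		cur_ipl = ipl;
	return old;
}

static void
splx(int ipl)
{
	cur_ipl = ipl;
}

struct mutex {
	int mtx_wantipl;	/* level the mutex is associated with */
	int mtx_oldipl;		/* level saved by mtx_enter() */
};

/* Hypothetical indirection so both flavours can share one msleep model. */
struct mtx_ops {
	void (*enter)(struct mutex *);
	void (*leave)(struct mutex *);
};

/* Flavour 1: always performs the spl update. */
static void enter_always(struct mutex *m) { m->mtx_oldipl = splraise(m->mtx_wantipl); }
static void leave_always(struct mutex *m) { splx(m->mtx_oldipl); }

/* Flavour 2: skips splraise()/splx() for IPL_NONE mutexes. */
static void
enter_skip(struct mutex *m)
{
	if (m->mtx_wantipl != IPL_NONE)
		m->mtx_oldipl = splraise(m->mtx_wantipl);
}

static void
leave_skip(struct mutex *m)
{
	if (m->mtx_wantipl != IPL_NONE)
		splx(m->mtx_oldipl);
}

static const struct mtx_ops ops_always = { enter_always, leave_always };
static const struct mtx_ops ops_skip = { enter_skip, leave_skip };

/* Simplified model of the spl handling described above. */
static void
msleep_model(struct mutex *m, const struct mtx_ops *ops)
{
	int spl;

	splraise(IPL_SCHED);		/* sleeping must happen at splsched() */
	spl = m->mtx_oldipl;		/* remember the caller's level */
	m->mtx_oldipl = IPL_SCHED;	/* mtx_leave() must not drop below it */
	ops->leave(m);
	/* ... the actual sleep would take place here ... */
	ops->enter(m);
	m->mtx_oldipl = spl;		/* next mtx_leave() restores the caller's level */
}

/* mtx_enter(), msleep(), mtx_leave() on an IPL_NONE mutex; returns final level. */
int
run_scenario(const struct mtx_ops *ops)
{
	struct mutex m = { IPL_NONE, IPL_NONE };

	cur_ipl = IPL_NONE;
	ops->enter(&m);
	msleep_model(&m, ops);
	ops->leave(&m);
	return cur_ipl;
}
```

In this model, run_scenario(&ops_always) ends back at IPL_NONE, while run_scenario(&ops_skip) ends at IPL_SCHED: the final mtx_leave() never touches the level, which is the behaviour described above. The first fix option amounts to making every platform behave like the "always" flavour.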
