I just noticed this morning that msleep() will not behave correctly on
some platforms when passed a mutex associated with the IPL_NONE level
(i.e. a plain mutex, without any associated spl semantics).

msleep() needs to run at splsched(), and fiddles with the mutex fields
to make sure mtx_leave() will not lower spl below IPL_SCHED, and fiddles
later on, after taking the mutex back, to make the next mtx_leave()
return to the level the first mtx_enter() was run at.

This works on amd64, i386, powerpc and sparc64, because the mutex
implementation on these platforms always performs an spl update.

All other platforms have a slightly different implementation which skips
the splraise()/splx() calls if the mutex is associated with IPL_NONE.
This was done mainly for speed, since some of these platforms have
somewhat expensive spl calls (compared to the cost of a comparison and
a branch).
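
To make the difference concrete, here is a small userland sketch of the
two styles. It is only a toy model: the numeric IPL values, the
spl_calls counter and the _always/_skip function names are made up for
illustration and do not come from any platform's mutex code.

```c
/* Userland toy model; values and helper bodies are illustrative only. */
#define IPL_NONE	0
#define IPL_SCHED	12	/* hypothetical numeric value */

static int cur_ipl = IPL_NONE;	/* stands in for the CPU's current spl */
static int spl_calls;		/* hypothetical: counts spl updates */

static int
splraise(int ipl)
{
	int old = cur_ipl;

	spl_calls++;
	if (ipl > cur_ipl)
		cur_ipl = ipl;
	return old;
}

static void
splx(int ipl)
{
	spl_calls++;
	cur_ipl = ipl;
}

struct mutex {
	int mtx_wantipl;	/* level the mutex is associated with */
	int mtx_oldipl;		/* level to return to in mtx_leave() */
};

/* Style 1: always perform an spl update, even for IPL_NONE. */
static void
mtx_enter_always(struct mutex *m)
{
	m->mtx_oldipl = splraise(m->mtx_wantipl);
}

static void
mtx_leave_always(struct mutex *m)
{
	splx(m->mtx_oldipl);
}

/* Style 2: skip splraise()/splx() when the mutex is IPL_NONE. */
static void
mtx_enter_skip(struct mutex *m)
{
	if (m->mtx_wantipl != IPL_NONE)
		m->mtx_oldipl = splraise(m->mtx_wantipl);
}

static void
mtx_leave_skip(struct mutex *m)
{
	if (m->mtx_wantipl != IPL_NONE)
		splx(m->mtx_oldipl);
}
```

In this model an enter/leave pair on an IPL_NONE mutex costs two spl
updates with the first style and none with the second, which is the
speed gain mentioned above.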

When invoking msleep() on an IPL_NONE mutex on these platforms, msleep()
will raise to IPL_SCHED, but the next mtx_leave() call, after msleep()
returns, will not alter the current level. Things continue running at
splsched() until the next sleep or preemption.
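
The sequence can be reproduced with a simplified userland model. This
is a sketch of the behaviour as described above, not the kernel code:
the skip_none switch, msleep_model() and sleep_cycle() are hypothetical
names, the IPL values are arbitrary, and the actual sleep is left out.

```c
/* Userland toy model; values and helper bodies are illustrative only. */
#define IPL_NONE	0
#define IPL_SCHED	12	/* hypothetical numeric value */

struct mutex {
	int mtx_wantipl;	/* level the mutex is associated with */
	int mtx_oldipl;		/* level to return to in mtx_leave() */
};

static int cur_ipl;	/* stands in for the CPU's current spl */
static int skip_none;	/* hypothetical: 1 = skip spl work on IPL_NONE */

static int
splraise(int ipl)
{
	int old = cur_ipl;

	if (ipl > cur_ipl)
		cur_ipl = ipl;
	return old;
}

static void
splx(int ipl)
{
	cur_ipl = ipl;
}

static void
mtx_enter(struct mutex *m)
{
	if (skip_none && m->mtx_wantipl == IPL_NONE)
		return;
	m->mtx_oldipl = splraise(m->mtx_wantipl);
}

static void
mtx_leave(struct mutex *m)
{
	if (skip_none && m->mtx_wantipl == IPL_NONE)
		return;
	splx(m->mtx_oldipl);
}

/* Only the spl juggling described above; no real sleep happens. */
static void
msleep_model(struct mutex *m)
{
	int saved = m->mtx_oldipl;

	splraise(IPL_SCHED);
	m->mtx_oldipl = IPL_SCHED;	/* mtx_leave() must stay at IPL_SCHED */
	mtx_leave(m);
	/* ... the thread would sleep and be woken up here ... */
	mtx_enter(m);
	m->mtx_oldipl = saved;		/* next mtx_leave() restores the level */
}

/* Returns the level the caller is left at after enter/msleep/leave. */
static int
sleep_cycle(int skip, int wantipl)
{
	struct mutex m = { wantipl, IPL_NONE };

	cur_ipl = IPL_NONE;
	skip_none = skip;
	mtx_enter(&m);
	msleep_model(&m);
	mtx_leave(&m);
	return cur_ipl;
}
```

In this model, sleep_cycle(0, IPL_NONE) ends back at IPL_NONE, while
sleep_cycle(1, IPL_NONE) ends at IPL_SCHED: the final mtx_leave() never
calls splx(), so the value msleep put back in the mutex is never used.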

There are two ways to fix this:
- make all mutex implementations perform spl operations, even if they
  are associated with IPL_NONE. Easy to do.
- make msleep() aware that IPL_NONE mutexes may not change spl, and
  avoid using the mutex bowels to store the temporary splsched()
  sequence. This is easy to do as well, but requires an extra macro to
  be defined, to let msleep() know the ipl the mutex is associated with
  (something like #define MUTEX_WANTIPL(mtx) (mtx)->mtx_wantipl)
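
As a rough illustration of the second option, here is one possible
shape in the same kind of userland toy model. Everything except the
MUTEX_WANTIPL() idea is an assumption made for the sketch: the
skip_none switch, msleep_fixed(), sleep_cycle() and the IPL values are
hypothetical, the sleep itself is omitted, and the ordering against
sleep queue setup (wakeup races) is not modelled at all.

```c
/* Userland toy model; values and helper bodies are illustrative only. */
#define IPL_NONE	0
#define IPL_SCHED	12	/* hypothetical numeric value */

struct mutex {
	int mtx_wantipl;	/* level the mutex is associated with */
	int mtx_oldipl;		/* level to return to in mtx_leave() */
};

/* The macro suggested above, with extra parentheses. */
#define MUTEX_WANTIPL(mtx)	((mtx)->mtx_wantipl)

static int cur_ipl;	/* stands in for the CPU's current spl */
static int skip_none;	/* hypothetical: 1 = skip spl work on IPL_NONE */

static int
splraise(int ipl)
{
	int old = cur_ipl;

	if (ipl > cur_ipl)
		cur_ipl = ipl;
	return old;
}

static void
splx(int ipl)
{
	cur_ipl = ipl;
}

static void
mtx_enter(struct mutex *m)
{
	if (skip_none && m->mtx_wantipl == IPL_NONE)
		return;
	m->mtx_oldipl = splraise(m->mtx_wantipl);
}

static void
mtx_leave(struct mutex *m)
{
	if (skip_none && m->mtx_wantipl == IPL_NONE)
		return;
	splx(m->mtx_oldipl);
}

static void
msleep_fixed(struct mutex *m)
{
	int s, saved;

	if (MUTEX_WANTIPL(m) == IPL_NONE) {
		/* Keep the previous level in a local, not in the mutex. */
		mtx_leave(m);
		s = splraise(IPL_SCHED);
		/* ... the thread would sleep and be woken up here ... */
		splx(s);
		mtx_enter(m);
		return;
	}
	/* Other levels: keep storing the temporary level in the mutex. */
	saved = m->mtx_oldipl;
	splraise(IPL_SCHED);
	m->mtx_oldipl = IPL_SCHED;
	mtx_leave(m);
	/* ... the thread would sleep and be woken up here ... */
	mtx_enter(m);
	m->mtx_oldipl = saved;
}

/* Returns the level the caller is left at after enter/msleep/leave. */
static int
sleep_cycle(int skip, int wantipl)
{
	struct mutex m = { wantipl, IPL_NONE };

	cur_ipl = IPL_NONE;
	skip_none = skip;
	mtx_enter(&m);
	msleep_fixed(&m);
	mtx_leave(&m);
	return cur_ipl;
}
```

With this variant the model returns to IPL_NONE for an IPL_NONE mutex
whether or not the mutex code skips the spl calls, since the IPL_NONE
path no longer depends on mtx_leave() calling splx().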

Or is there any better option?

(Note that the hidden message here is ``shall we define precisely the
exact semantics of IPL_NONE mutexes once and for all, and fix all
non-compliant implementations?'')

Miod
