>>       * the kprocdev framework.  all i/o into devip, devfs, and devdraw
>>         is marshalled and handed off to a kproc running in a different
>>         pthread, so that blocking i/o won't block the cpu0 pthread,
>>         which is the only one that can run vx32.  this means that
>>         all i/o gets copied one extra time inside the kernel.
>
> why can only one thread run vx32?

9vx requires that the page fault handler run on an alternate stack
during vx32 ("user") execution and on the kernel stack during
kernel execution.  That choice--whether or not to run the signal
handler on the pthread's alternate signal stack--is a single bit
(SA_ONSTACK) in the struct sigaction defining the signal-handling
behavior, which is shared by all pthreads in the process, not
per-pthread.  9vx arranges that the global bit is correct for all
pthreads by only allowing one of the
pthreads--cpu0--to page fault.  The others, which run supporting
kprocs, arrange never to fault.  When user i/o is moved off cpu0
to the supporting kprocs, the i/o has to be done into fault-free
kernel buffers and then copied back into user space on cpu0.

This is essentially a failure of vision in the pthreads interface:
sigaltstack is per-pthread, but sigaction is not.  Linux does make
it possible to have different sigactions per pthread, but you'd
have to hack up your own thread library.  If the Linux guys had
really understood the wisdom of rfork, one could just do
rfork(RFSIGHAND) at the start of each new pthread instead of
having to drag in a whole new library.  FreeBSD didn't get
this right either, for what it's worth.  In both cases you could
work around this by linking with a modified pthread library.
Ironically, OS X doesn't have this problem because its signal
handling is so bad that 9vx has to reimplement it from scratch
in terms of Mach exceptions (see 9vx/osx/signal.c).

There are other simplifying assumptions, like having just one
address range for the "user address space", but they could be
removed if necessary.  The real difficulty is the sigaction
SA_ONSTACK bit.

None of this is terribly important for performance: right now there
are plenty of inefficiencies not related to having just one cpu on
which to run user code.

Russ
