> - splhi -- it's not a true splhi in some sense; is it possible that
> some code is sneaking in and running even when you splhi()? Could this
> explain it?

The error Philippe has found is only indirectly related to splhi().
It's a race between a process in sleep() returning to the scheduler on
cpu A, and the same process being readied and rescheduled on cpu B
after the wakeup.

On native Plan 9, A always wins the race because it runs splhi() and
the code path from sleep to schedinit (where up->state==Running is
checked) is shorter than the code path from runproc to the point in
sched where up->state is set to Running.  But the fact that this works
is timing-dependent: if cpu A for some reason ran slower than cpu B,
it could lose the race even without being interrupted.

As Philippe explained, in 9vx the cpus are being simulated by
threads.  Because these threads are being scheduled by the host
operating system, the virtual cpus can appear to be running at
different speeds or to pause at awkward moments.  Even without
any "preemption" at the Plan 9 level of abstraction, the timing
assumption that prevents the sleep/reschedule race is no longer
guaranteed.
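
To make the timing argument concrete, here is a toy model in C.  It is
not kernel code: the Vcpu struct, finish() and awins() are names made
up for illustration, and it assumes each cpu's code path can be
summarised as a step count, a per-step cost, and a pause imposed by
the host.  A tie is counted as a loss for A.

```c
#include <assert.h>

/*
 * Toy model of the race, not kernel code; every name is hypothetical.
 * Time 0 is the moment the wakeup makes the process runnable.
 */
typedef struct Vcpu Vcpu;
struct Vcpu {
	long steps;	/* instructions on this cpu's code path */
	long cost;	/* time units per instruction (larger = slower cpu) */
	long pause;	/* time the host keeps this cpu's thread descheduled */
};

/* Time at which the cpu reaches the end of its code path. */
long
finish(Vcpu c)
{
	assert(c.steps > 0 && c.cost > 0 && c.pause >= 0);
	return c.steps * c.cost + c.pause;
}

/*
 * a: path from sleep to the up->state check in schedinit.
 * b: path from runproc to the store of Running in sched.
 * Returns 1 if A checks before B stores (the safe outcome), else 0.
 */
int
awins(Vcpu a, Vcpu b)
{
	return finish(a) < finish(b);
}
```

With equal speeds and a shorter path, A wins: awins((Vcpu){3,1,0},
(Vcpu){5,1,0}) is 1.  Make A twice as slow, or let the host pause its
thread, and the same call returns 0 with no interrupt involved.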
