I think this is probably the correct explanation of why nanosleep
slept for too long:
Priority inheritance for in-kernel spinlocks and semaphores
Realtime programmers are often concerned about priority inversion,
which can happen as follows:
1. Low-priority task A acquires a resource, for example, a lock.
2. Medium-priority task B starts executing CPU-bound, preempting
   low-priority task A.
3. High-priority task C attempts to acquire the lock held by
   low-priority task A and blocks. Task C cannot proceed until task A
   releases the lock, but task A cannot run because medium-priority
   task B has preempted it.
Such priority inversion can indefinitely delay a high-priority task.
There are two main ways to address this problem: (1) suppressing
preemption and (2) priority inheritance. In the first case, since
there is no preemption, task B cannot preempt task A, preventing
priority inversion from occurring. This approach is used by PREEMPT
kernels for spinlocks, but not for semaphores. It does not make sense
to suppress preemption for semaphores, since it is legal to block
while holding one, which could result in priority inversion even in
the absence of preemption. For some realtime workloads, preemption
cannot be suppressed even for spinlocks, because of the impact on
scheduling latencies.
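To make the inversion scenario and the first remedy concrete, here is a
toy tick-based model in Python. It is only a sketch under stated
assumptions: the Task fields, the arrival times, the one-tick
granularity, and the simulate() and demo_tasks() helpers are all
invented for illustration and do not correspond to anything in the
kernel. Larger numbers mean higher priority.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    priority: int     # larger number = higher priority
    arrival: int      # tick at which the task becomes runnable
    work: int         # ticks of CPU time needed
    needs_lock: bool  # True if all of its work runs under the one lock

def demo_tasks():
    # Hypothetical workload matching tasks A, B and C in the text.
    return [
        Task("A", 1, 0, 3, True),
        Task("B", 2, 1, 5, False),
        Task("C", 3, 2, 1, True),
    ]

def simulate(tasks, suppress_preemption):
    """Return {task name: completion tick} on a single CPU."""
    remaining = {t.name: t.work for t in tasks}
    finished = {}
    owner = None  # task currently holding the lock, if any
    tick = 0
    while len(finished) < len(tasks):
        ready = [t for t in tasks
                 if t.arrival <= tick and remaining[t.name] > 0]
        # A task that needs the lock is blocked while someone else owns it.
        runnable = [t for t in ready
                    if not (t.needs_lock and owner is not None
                            and owner is not t)]
        if suppress_preemption and owner is not None:
            current = owner  # the lock holder cannot be preempted
        elif runnable:
            current = max(runnable, key=lambda t: t.priority)
        else:
            current = None   # idle tick
        if current is not None:
            if current.needs_lock:
                owner = current
            remaining[current.name] -= 1
            if remaining[current.name] == 0:
                finished[current.name] = tick + 1
                if owner is current:
                    owner = None  # lock is dropped when the work is done
        tick += 1
    return finished

if __name__ == "__main__":
    print(simulate(demo_tasks(), suppress_preemption=False))
    print(simulate(demo_tasks(), suppress_preemption=True))
```

In this model, with preemption allowed, task C finishes at tick 9,
after task B. With preemption suppressed while the lock is held, task C
finishes at tick 4, but task B, which never touches the lock, is held
off until the lock is released. That is the scheduling-latency cost
mentioned above.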
Priority inheritance can be used in cases where suppressing preemption
does not make sense. The idea here is that high-priority tasks
temporarily donate their high priority to lower-priority tasks that
are holding critical locks. This priority inheritance is transitive:
in the example above, if an even higher priority task D attempted to
acquire a second lock that high-priority task C was already holding,
then both tasks C and A would be temporarily boosted to the
priority of task D. The duration of the priority boost is also sharply
limited: as soon as low-priority task A releases the lock, it will
immediately lose its temporarily boosted priority, handing the lock to
(and being preempted by) task C.
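The transitive boost and its removal on release can be sketched with a
small Python model. This is not how the kernel implements it; the Task
and Lock classes and the acquire() and release() helpers are
hypothetical names chosen for illustration, and the model hands the
lock straight to the highest-priority waiter.

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.base = priority       # assigned priority (larger = higher)
        self.effective = priority  # priority including any inherited boost
        self.waiting_on = None     # lock this task is blocked on, if any
        self.held = []             # locks this task currently owns

class Lock:
    def __init__(self, name):
        self.name = name
        self.owner = None
        self.waiters = []

def acquire(task, lock):
    """Take the lock if free; otherwise block and boost the holder chain."""
    if lock.owner is None:
        lock.owner = task
        task.held.append(lock)
        return True
    lock.waiters.append(task)
    task.waiting_on = lock
    # Walk holder -> lock it waits on -> that lock's holder, and so on.
    holder = lock.owner
    while holder is not None and holder.effective < task.effective:
        holder.effective = task.effective
        nxt = holder.waiting_on
        holder = nxt.owner if nxt is not None else None
    return False

def release(task, lock):
    """Drop the lock, shed any boost it justified, and hand the lock on."""
    task.held.remove(lock)
    lock.owner = None
    # Keep only the boost still owed to waiters on locks we still hold.
    task.effective = max(
        [task.base] + [w.effective for l in task.held for w in l.waiters])
    if not lock.waiters:
        return None
    heir = max(lock.waiters, key=lambda w: w.effective)
    lock.waiters.remove(heir)
    heir.waiting_on = None
    lock.owner = heir
    heir.held.append(lock)
    return heir

if __name__ == "__main__":
    a, c, d = Task("A", 1), Task("C", 3), Task("D", 4)
    l1, l2 = Lock("first"), Lock("second")
    acquire(a, l1)
    acquire(c, l2)
    acquire(c, l1)   # C blocks on A: A is boosted to C's priority
    acquire(d, l2)   # D blocks on C: both C and A are boosted to D's
    print(a.effective, c.effective)
    release(a, l1)   # A falls back to its base priority, C gets the lock
    print(a.effective, c.effective)
```

The walk in acquire() is what makes the boost transitive: it follows the
chain from each lock holder to the lock that holder is itself waiting
for.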
However, it may take some time for task C to run, and it is quite
possible that another higher-priority task E will try to acquire the
lock in the meantime. If this happens, task E will "steal" the lock
from task C, which is legal because task C has not yet run, and has
therefore not actually acquired the lock. On the other hand, if task C
gets to run before task E tries to acquire the lock, then task E will
be unable to "steal" the lock, and must instead wait for task C to
release it, possibly boosting task C's priority in order to expedite
matters.
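The "stealing" rule can be modelled by distinguishing a pending owner
(chosen at unlock time but not yet run) from an actual owner. The
following Python sketch assumes that split; HandoffLock, its pending
field, and the Task tuple are illustrative names, not kernel
structures, and calling lock() stands in for "the task gets to run and
tries to take the lock".

```python
from collections import namedtuple

Task = namedtuple("Task", "name priority")  # larger priority = higher

class HandoffLock:
    def __init__(self):
        self.owner = None    # task that has actually acquired the lock
        self.pending = None  # task the lock was handed to but that has not run
        self.waiters = []

    def lock(self, task):
        """Return True if task now owns the lock, False if it must wait."""
        if self.owner is None:
            can_take = (self.pending is None
                        or self.pending == task
                        or task.priority > self.pending.priority)
            if can_take:
                if self.pending is not None and self.pending != task:
                    # Stolen: the pending owner goes back to waiting.
                    self.waiters.append(self.pending)
                self.pending = None
                self.owner = task
                return True
        if task not in self.waiters:
            self.waiters.append(task)
        return False

    def unlock(self):
        """Release the lock and nominate the top waiter as pending owner."""
        self.owner = None
        if self.waiters:
            self.pending = max(self.waiters, key=lambda t: t.priority)
            self.waiters.remove(self.pending)

if __name__ == "__main__":
    a, c, e = Task("A", 1), Task("C", 3), Task("E", 5)
    lk = HandoffLock()
    lk.lock(a)
    lk.lock(c)         # C waits behind A
    lk.unlock()        # C becomes pending owner but has not run yet
    print(lk.lock(e))  # E arrives first and steals the lock
    print(lk.lock(c))  # C must wait again
```

If task C calls lock() before task E does, C becomes the real owner and
E's attempt returns False, which corresponds to the second case in the
paragraph above.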
In addition, there are some cases where locks are held for extended
periods. A number of these have been modified to add "preemption
points" so that the lock holder will drop the lock if some other task
needs it. The JBD journaling layer contains a couple of examples of
this.
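The general shape of a preemption point is a check, inside a long
lock-held loop, for whether anyone else wants the lock. A rough
user-space sketch in Python follows. It is not the JBD code;
process_batch(), someone_waiting, and handle are hypothetical names,
and a real implementation would ask the lock itself about contention
rather than take a callback.

```python
import threading

def process_batch(items, lock, someone_waiting, handle):
    """Process items under lock, briefly dropping it at preemption points.

    Returns the number of times the lock was dropped.
    """
    drops = 0
    lock.acquire()
    try:
        for item in items:
            handle(item)
            if someone_waiting():
                # Preemption point: let the other task in, then resume.
                lock.release()
                drops += 1
                lock.acquire()
        return drops
    finally:
        lock.release()

if __name__ == "__main__":
    lock = threading.Lock()
    seen = []
    # Pretend a contender shows up after the second item only.
    answers = iter([False, True, False])
    drops = process_batch([1, 2, 3], lock, lambda: next(answers), seen.append)
    print(drops, seen)
```

Any state protected by the lock may have changed while it was dropped,
so code written this way has to revalidate whatever it relies on after
reacquiring.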
It turns out that writer-to-reader priority inheritance is particularly
problematic, so PREEMPT_RT simplifies the problem by permitting only
one task at a time to read-hold a reader-writer lock or semaphore,
though that task is permitted to recursively acquire it. This makes
priority inheritance doable, though it can limit scalability.
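The single-reader rule can be sketched as a non-blocking Python model.
The class and method names below are made up for illustration and are
not the PREEMPT_RT API. The point is only that with one read-holder
there is exactly one task for a waiting writer to boost.

```python
class SingleReaderRWLock:
    def __init__(self):
        self.reader = None  # the one task allowed to read-hold the lock
        self.depth = 0      # how many times that task has read-acquired it
        self.writer = None

    def try_read_lock(self, task):
        if self.writer is not None:
            return False
        if self.reader is None or self.reader == task:
            self.reader = task
            self.depth += 1  # recursive acquisition by the same task is fine
            return True
        return False         # a second reader has to wait

    def read_unlock(self, task):
        if self.reader != task or self.depth == 0:
            raise RuntimeError("read_unlock by a task that is not the reader")
        self.depth -= 1
        if self.depth == 0:
            self.reader = None

    def try_write_lock(self, task):
        if self.reader is None and self.writer is None:
            self.writer = task
            return True
        return False

    def write_unlock(self, task):
        if self.writer != task:
            raise RuntimeError("write_unlock by a task that is not the writer")
        self.writer = None

if __name__ == "__main__":
    rw = SingleReaderRWLock()
    print(rw.try_read_lock("r1"))  # first reader gets in
    print(rw.try_read_lock("r1"))  # and may recurse
    print(rw.try_read_lock("r2"))  # a second reader may not
```

The scalability cost is visible in the last call: a second reader is
turned away even though no writer is present.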
------------
Thanks to Bill for the tip.