On Tue, Dec 03, 2013 at 08:55:06AM +0700, Robert Elz wrote: > Date: Mon, 2 Dec 2013 12:29:13 +0100 > From: Manuel Bouyer <[email protected]> > Message-ID: <[email protected]> > > | > evtchn_do_event: handler 0xffffffff801f9c7f didn't lower ipl 8 7 > > | Yes, is has been there for a long time, but we failed to find where is > | problem is exactly. > > I didn't get quite as close as I'd hoped, and I probably haven't > discovered anything that wasn't already known, but ... > > For me at least, that message is triggered from statclock() called from > hardclock() - when statclock() starts the ipl is 7, when it exits, it is 8. > > But it's not statclock's fault ... the ipl is raised as a side effect or > mutex_spin_enter - on the other hand, mutex_spin_exit only restores the > ipl when the mutex being released is the last one held by the CPU.
Ha, good catch. I was looking for a missing splx() but didn't think about spin mutexes, which also changes the IPL. > > When statclock() - and hardclock() before it - is (or are) called, the > cpu (apparently) already holds a (spin) mutex (the ci_mtx_count field of > the cpu_info struct is -1). Given that, and the way spin mutexes work, > statclock() (and then hardclock()) must return with the ipl higher. > > What I haven't found yet is where that extra spin mutex is grabbed - but > it looks as if either there must be one, or the spin_mutex code has bugs > (and the latter doesn't seem likely - it would need to be some kind of > race against an interrupt handler, as this is a XEN Dom0 I am testing using, > and consequently, acts as if there is just one cpu). In any case, things > are far too repeatable, for it to be some bizarre random timing event with > an interrupt just at the wrong nanosecond... > > Unless someone else suddenly realises what's up - even if this is really > just SOP, that is, if there's some mutex that's meant to be held, and the > problem is the diagnostic objecting to normal expected conditions, then > the next time I get some time to look into this, I'll probably add some > instrumentation to the spin mutex functions, to record from where, and > which, mutexes are being locked, so when the (seemingly) spurious one is > detected, we can find out where it came from. > > At the minute I also can't explain how there is apparently a spin mutex > still held, yet the IPL is not IPL_HIGH ... that also suggests some kind > of problem I believe - when the first spin mutex is grabbed, the ipl > should be being raised to IPL, and it should stay there until no more spin > locks are held. Yet I (repeatedly, and reliably) observed statclock() > (and hardclock(), and other) functions being called with ipl lower than > IPL_HIGH (7 rather than 8) while ci_mtx_count is indicating a mutex is > held. Weird. I think what happens is: - the kernel is running in a code portion protected by a spin mutex registered at IPL_FOO, with IPL_FOO < IPL_SCHED - a clock interrupt comes in, it' serviced as we're running < IPL_SCHED. We still hold the lock at IPL_FOO from the interruped code. Then we take a spin lock registered at IPL_HIGH. This raises the IPL to IPL_HIGH, and when we mutex_exit() this spin mutex, the IPL is not restored to IPL_SCHED because we're holding another mutex. Now I guess it's done this way because it's allowed to release mutexes in any order: we could grab mutex1 at e.g. IPL_VM grab mutex2 at IPL_HIGH release mutex1 release mutex2 if "release mutex1" did restore IPL, we would be running at IPL_VM or lower, while we would expext to still be at IPL_HIGH because we're still holding mutex2. At first glance I don't see an easy way to fix this, if it has to be fixed. I can see 2 problems with this - minor performance issue, because interrupts are blocked more than necessery - in this specific case, the kernel stays at IPL_HIGH when it should not. This means IPIs are blocked when they should not be. Could this cause a deadlock ? Note that in the specific case of an interrupt handler returning at IPL_HIGH, interrupt exit should lower the IPL anyway. -- Manuel Bouyer <[email protected]> NetBSD: 26 ans d'experience feront toujours la difference --
