Summing up my current observations and theory: in the dump I took, I had
8 VCPUs (0-7). With regard to spinlocks I saw the following:

CPU#0 claimed to be waiting on the spinlock for runqueue#0 (its own), though
the actual lock was free (this could mean it had just been woken out of the
wait but had not yet claimed the lock).
CPU#1 was waiting on the spinlock of a wait_queue_head that belonged to a task
which seemed to be on CPU#7. CPU#1 was the one reporting the soft lockup.
CPUs #2 to #5 seemed to be idle and not on the slow spinlock acquisition path.
CPU#6 was waiting on a spinlock that might be for a block IO queue (though I am
not sure about this one).
CPU#7 was also waiting on the spinlock for runqueue#0 (it was in the process of
pulling a task over from it for idle balancing). It looked like the task
running there was doing some ext4 operations and had just gone into io_schedule().

One issue I thought I had found in the xen paravirt spinlock
implementation was that, when unlocking a spinlock, it only sends a
notification to the first online cpu it finds waiting for that lock
(always starting with the lowest number). This sounded like high
contention might put higher-numbered cpus at a serious disadvantage.
Unfortunately a test kernel I created with a potential fix there just
locked up as well. So either the theory is wrong or the attempted fix
does not do the right thing...
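
To make the suspected starvation a bit more concrete, here is a small
user-space sketch of the unlock behaviour described above (plain C, not
the kernel code itself; the CPU count, the bookkeeping arrays and the
round count are made up for illustration): the releaser scans CPUs in
ascending order and wakes only the first waiter it finds, so under
continuous contention the wakeups may never reach the higher-numbered
VCPUs.

#include <stdio.h>

#define NR_CPUS 8

static int waiting[NR_CPUS];   /* 1 = spinning on the contended lock */
static long acquired[NR_CPUS]; /* how often each CPU got the lock    */

/* The unlock behaviour in question: scan CPUs in ascending order and
 * wake only the first one found waiting; everyone else keeps spinning. */
static int unlock_kick(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (waiting[cpu]) {
			waiting[cpu] = 0;
			return cpu;   /* this CPU now owns the lock */
		}
	}
	return -1;
}

int main(void)
{
	int holder = NR_CPUS - 1;  /* say CPU#7 holds the lock initially */

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		waiting[cpu] = (cpu != holder);

	for (long round = 0; round < 1000000; round++) {
		int next = unlock_kick();  /* holder releases, kick goes out  */
		waiting[holder] = 1;       /* old holder immediately contends */
		holder = next;             /* for the lock again              */
		acquired[holder]++;
	}

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		printf("CPU#%d got the lock %ld times\n", cpu, acquired[cpu]);
	return 0;
}

Run like this, the lock just ping-pongs between CPU#0 and CPU#1 while
CPUs #2-#7 never get it, which is the kind of imbalance the theory
would predict under sustained contention.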

