[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-13 Thread Stefan Bader
At first glance it looks like all the hangs occur because the hypercall
used to poll for the PV interrupts never returns. Has there been any change to
the hypervisor on the affected instances? Or maybe some correlation with a
specific type of hardware running those?
Can we get the full dmesg of one of the hanging guests after a reboot (which I
assume keeps it on the same host)? Did the instances just idle, or were they
used for some task that you can tell us about? Thanks.

** Changed in: linux-ec2 (Ubuntu)
   Importance: Undecided => High

** Changed in: linux-ec2 (Ubuntu)
   Status: New => Incomplete

** Changed in: linux-ec2 (Ubuntu)
 Assignee: (unassigned) => Stefan Bader (stefan-bader-canonical)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.2xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-13 Thread Matt Wilson
I also suspect something going sideways in the PV spinlock code, but
nothing has changed in the underlying hardware or hypervisor in this
area. There have been bugs in the PV spinlock code in the past,
including using barrier() instead of mb() in the unlock path, which
could cause the VCPU holding a lock to trigger a kick on the VCPU
waiting before the memory write is complete. I looked at the 10.04
kernel, and this particular bug is already addressed in the PV spinlock
code.

These instances are under load when they hang. Here's the uptime and
/proc/interrupts output from one instance before it hung, but after it
was operational:

Linux ip-10-94-81-231 2.6.32-341-ec2 #42-Ubuntu SMP Tue Dec 6 14:56:13 UTC 2011 x86_64 GNU/Linux
16:10:54 up 16 days, 19:52,  0 users,  load average: 9.86, 5.01, 3.41
           CPU0       CPU1       CPU2       CPU3
 16:  186872780  170473347  170447163  170493692   Dynamic-percpu  timer
 17:  191775644  350788322  357828130  357481319   Dynamic-percpu  resched
 18:      67019      74008      66602      66485   Dynamic-percpu  callfunc
 19:     189590     193987     188670     181119   Dynamic-percpu  call1func
 20:          0          0          0          0   Dynamic-percpu  reboot
 21:  165290618  177938588  177538577  177157514   Dynamic-percpu  spinlock
 22:        410          0          0          0   Dynamic-level   xenbus
 23:          0          0          0          0   Dynamic-level   suspend
 24: 341  0 74180   Dynamic-level   xencons
 25:     392339     664199     899350     700455   Dynamic-level   blkif
 26:   19953668   46164431   58214738   57029478   Dynamic-level   blkif
 27: 1483445834          0          0          0   Dynamic-level   eth0
NMI:          0          0          0          0   Non-maskable interrupts
RES:  191775644  350788323  357828131  357481320   Rescheduling interrupts
CAL:     256609     267995     255272     247604   Function call interrupts
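
For anyone who wants to compare these counters across instances, here is
a minimal sketch in Python of tallying /proc/interrupts-style output per
interrupt source. The function name parse_interrupts and the SAMPLE
string are illustrative assumptions; the three sample rows are copied
from the listing above, and the parser assumes that the leading integer
fields after the colon are the per-CPU counts.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text.

    Returns (cpus, table) where table maps the row label (e.g. "21")
    to a (per_cpu_counts, description) tuple.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        # Leading integer fields are per-CPU counters; stop at the
        # first non-integer token, which starts the description.
        for tok in fields[:len(cpus)]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        desc = " ".join(fields[len(counts):])
        table[label.strip()] = (counts, desc)
    return cpus, table


# Sample rows taken from the output above (assumed column layout).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 17:  191775644  350788322  357828130  357481319   Dynamic-percpu  resched
 21:  165290618  177938588  177538577  177157514   Dynamic-percpu  spinlock
 27: 1483445834          0          0          0   Dynamic-level   eth0
"""

cpus, table = parse_interrupts(SAMPLE)

# Total events per source, summed over all CPUs.
totals = {desc: sum(counts) for counts, desc in table.values()}

for desc, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{desc:28s} {total}")
```

Running the same tally on output captured at two points in time would
give a rough rate of spinlock kicks relative to reschedule events, which
may help when comparing a healthy instance with one that later hangs.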

Over the weekend, m2.4xlarge instances hung as well. I'll work on
getting dmesg output.


** Summary changed:

- Kernel deadlock in scheduler on m2.2xlarge EC2 instance
+ Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-10 Thread Scott Moser
** Visibility changed to: Public



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-10 Thread Matt Wilson
Overnight an instance running 2.6.32-316 locked up. The stack traces are
attached.

** Attachment added: stack traces from instance running 2.6.32-316
   https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2730182/+files/i-804475e2.txt



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson
** Attachment added: stack traces from instance running 2.6.32-341 (1/2)
   https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2728701/+files/ubuntu-deadlock-2.6.32-341-1.txt



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson


[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson
** Attachment added: stack traces from instance running 2.6.32-341 (2/2)
   https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2728702/+files/ubuntu-deadlock-2.6.32-341-2.txt



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson
** Attachment added: stack traces from instance running 2.6.32-342 (1/1)
   https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2728703/+files/ubuntu-deadlock-2.6.32-342-1.txt

** Visibility changed to: Private
