Mike, I've been successfully running this configuration:

hi1.4xlarge instance in VPC
ami-eafa5883, official Ubuntu 12 PV instance-store AMI
linux 3.2.0-31-virtual (picked up on upgrade), booting with grub kernel option 
"noautogroup"
These are running a write-heavy mysql replica load to XFS/MD/LVM on the SSDs.

This configuration has been stable across 10 instances for the past 34
hours.


Prior to running with "noautogroup" these instances would fail at approximately 
20%/day, so there was roughly a 10% chance that they'd survive one day without 
at least one crashing. The failures I saw were mostly complete failures, aside 
from a couple where processes were in a state where they'd partially work. The 
console logs  had "INFO: rcu_sched detected stalls on CPUs/tasks [...]" 
Curiously, during the failures, Cloudwatch would report a 6% cpu usage, which 
is one core on these instances.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1011792

Title:
  Kernel lockup running 3.0.0 and 3.2.0 on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1011792/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to