[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-07-23 Thread Launchpad Bug Tracker
This bug was fixed in the package linux-ec2 - 2.6.32-346.51

---
linux-ec2 (2.6.32-346.51) lucid-proposed; urgency=low

  [ Stefan Bader ]

  * SAUCE: Update spinlock handling code
    - LP: #929941
  * SAUCE: Use ticket locks for Xen 3.0.2+
    - LP: #929941
  * Rebased to Ubuntu-2.6.32-41.93
  * Release Tracking Bug
    - LP: #1021084

  [ Ubuntu: 2.6.32-41.93 ]

  * No change upload to fix .ddeb generation in the PPA.

  [ Ubuntu: 2.6.32-41.92 ]

  * drm/i915: Move Pineview CxSR and watermark code into update_wm hook.
    - LP: #1004707
  * drm/i915: Add CxSR support on Pineview DDR3
    - LP: #1004707
 -- Stefan Bader stefan.ba...@canonical.com   Mon, 25 Jun 2012 11:20:40 +0200

** Changed in: linux-ec2 (Ubuntu)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-07-16 Thread Fabio Kung
We started with 30 instances running with the -proposed ec2 kernel, in
our production environment. We plan to gradually boot more and update
this ticket once we collect enough instance hours to be confident that
this bug is not present in that version.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-07-10 Thread Stefan Bader
I would mark this as verified, as the intended change has been running in
test kernels for some time and, as Ilan said, it would take longer than
the verification period to hit it.

** Tags removed: verification-needed-lucid
** Tags added: verification-done-lucid



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-07-10 Thread Fabio Kung
We are very confident that this bug is not present in the v1 kernel: we
have been running only instances with that kernel for some months now
and have not seen these issues since. Both noav and I were among the
original reporters of this bug.

We can help testing the kernel currently in -proposed, but as others
have already commented, one week would not be enough to collect enough
instance hours. Two weeks would give us much more confidence.
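
As a rough illustration of why one week versus two matters, the sketch
below applies the statistical "rule of three": if no hang is seen over T
instance-hours, the approximate 95% upper bound on the hang rate is 3/T.
This assumes hangs occur independently at a constant rate, which nobody in
this thread has established. The function names are hypothetical, and the
fleet size of 30 is only borrowed from the test fleet mentioned elsewhere
in this thread.

```c
#include <assert.h>

/* Total exposure of a fleet, in instance-hours. */
static inline double instance_hours(unsigned instances, unsigned days)
{
    return (double)instances * 24.0 * (double)days;
}

/*
 * Rule of three: after `hours_without_hang` instance-hours with zero
 * hangs, the hang rate is below roughly 3 / hours_without_hang with
 * about 95% confidence. ASSUMPTION: hangs are independent and occur at
 * a constant rate per instance-hour.
 */
static inline double rate_upper_bound_95(double hours_without_hang)
{
    assert(hours_without_hang > 0.0);
    return 3.0 / hours_without_hang;
}
```

With 30 instances, two hang-free weeks give 10080 instance-hours and a
bound of roughly 3.0e-4 hangs per instance-hour. One week gives half the
exposure and therefore a bound twice as loose.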



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-07-09 Thread Luis Henriques
This bug is awaiting verification that the kernel for lucid in -proposed
solves the problem (2.6.32-346.51). Please test the kernel and update
this bug with the results. If the problem is solved, change the tag
'verification-needed-lucid' to 'verification-done-lucid'.

If verification is not done by one week from today, this fix will be
dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-lucid



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-07-09 Thread Ilan
Given the sporadic nature of this bug, it would take at least 2 weeks of
testing before we could say with even a slight bit of confidence that a
given kernel has stopped the crashes we were experiencing.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-07-07 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/lucid-proposed/linux-ec2



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-06-25 Thread Stefan Bader
** Changed in: linux-ec2 (Ubuntu)
   Status: Incomplete => In Progress



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-06-25 Thread Stefan Bader
** Description changed:

+ SRU Justification:
+ 
+ Impact: The version of the Xen patches we currently use for the ec2
+ kernel has a serious flaw in the handling of nested spinlocks. This can
+ result in a complete deadlock under certain workloads.
+ 
+ Fix: The spinlock handling code has been substantially restructured in
+ later versions of the patchset. The changes backport this but also
+ enable the use of ticket-spinlocks (as we do now) when compiling with
+ the compatibility level we use.
+ 
+ Testcase: Not easy to reproduce. But feedback with the patchset applied
+ (see comment #32) looks good.
+ 
+ --
+ 
  After running for some indeterminate period of time, the 2.6.32-341-ec2
  and 2.6.32-342-ec2 kernels stop responding when running on m2.2xlarge
  EC2 instances. No console output is emitted. Stack dumps gathered by
  examining CPU context information show that all VCPUs are stuck waiting
  on spinlocks. This could be a deadlock in the scheduling code.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 10.04
  Package: linux-image-2.6.32-341-ec2 2.6.32-341.42
  ProcVersionSignature: User Name 2.6.32-341.42-ec2 2.6.32.49+drm33.21
  Uname: Linux 2.6.32-341-ec2 x86_64
  Architecture: amd64
  Date: Fri Feb 10 01:56:17 2012
  Ec2AMI: ami-55dc0b3c
  Ec2AMIManifest: (unknown)
  Ec2AvailabilityZone: us-east-1c
  Ec2InstanceType: m1.xlarge
  Ec2Kernel: aki-427d952b
  Ec2Ramdisk: unavailable
  ProcEnviron:
-  LANG=en_US.UTF-8
-  SHELL=/bin/bash
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
  SourcePackage: linux-ec2


[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-06-25 Thread Stefan Bader
** Changed in: linux-ec2 (Ubuntu)
   Status: In Progress => Fix Committed



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-06-23 Thread Ilan
Still seeing the crash with the most recent kernel update in lucid:
2.6.32-345-ec2 #49-Ubuntu SMP

Haven't yet seen a crash on
2.6.32-345-ec2_2.6.32-345.47+lp929941v3_amd64.deb.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-06-17 Thread Ilan
We've had a few instances running on 2.6.32-345-ec2 #47+lp929941v3
linked from this ticket since 2012-05-09. So far those instances have
been stable, but it is not possible for us to determine whether the
crash has been resolved or whether the subset of instances we upgraded
was just lucky enough not to trigger this bug.

As mentioned before, the crashes do not appear to be consistently
reproducible, and they hit our instances at random.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-06-05 Thread Matt Wilson
We've had a customer report a very similar looking lockup on
3.0.0-20-virtual. Full version info, 3.0.0-20-virtual (buildd@yellow)
(gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1) ) #34~lucid1-Ubuntu SMP Wed
May 2 17:24:41 UTC 2012 (Ubuntu 3.0.0-20.34~lucid1-virtual 3.0.30)



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-05-09 Thread Ilan
We believe we are experiencing this bug as well. The most frequently
impacted instance type in our environment seems to be c1.xlarge.
However, the bug appears to strike almost entirely at random and rarely
hits the same instance twice.

We're currently testing the new kernels on a very limited set of
machines.  We will report back with our experiences.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-04-02 Thread Noah Zoschke
Thank you for the information. We will begin limited testing of the
latest kernels provided.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-30 Thread Stefan Bader
Noah,

thanks for testing and reporting the results. The first thing to do now is to
decide whether v1 or v3 should be the goal. v1 could be considered well tested
by now. The downside I see is that, to avoid some problems on certain older
hypervisor code, it uses real spinning spinlocks. This means that while
waiting for a lock, the virtual CPU busy-waits (which could have some impact
on the cloud host's CPU usage). It also provides no queuing, which means that
getting the lock can be unfair in contended situations.
The v3 kernel would in principle use the same implementation, which could
theoretically be the wrong thing on older hypervisor versions (though the
chance of having an instance launched on such an older host version is likely
to get smaller every day). At least it is the same risk as we have now, and
the lockups happened on newer hypervisor versions. So I would tend towards the
v3 solution, but for that it would be good to have more hours of testing with
v3 to see that it is not showing other problems that might be related to this
change.
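
For readers unfamiliar with the two locking styles, here is a minimal
user-space sketch in C11. It is not the kernel or Xen patchset code, and
all names in it are hypothetical. The first lock simply spins with no
queuing, so whichever CPU wins the race takes the lock. The second is a
ticket lock, which serves waiters strictly in arrival order. Both
busy-wait in this sketch. What a paravirtualized kernel does while
waiting is outside its scope.

```c
#include <assert.h>
#include <stdatomic.h>

/* Plain spinning lock: no queue, so acquisition order is arbitrary. */
typedef struct {
    atomic_flag held;
} tas_lock;

static inline void tas_lock_init(tas_lock *l)
{
    atomic_flag_clear(&l->held);
}

static inline void tas_lock_acquire(tas_lock *l)
{
    /* Spin until the flag was previously clear, i.e. we set it. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static inline void tas_lock_release(tas_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Ticket lock: each waiter draws a number and is served in FIFO order. */
typedef struct {
    atomic_uint next_ticket;  /* next number to hand out */
    atomic_uint now_serving;  /* number currently allowed to hold the lock */
} ticket_lock;

static inline void ticket_lock_init(ticket_lock *l)
{
    atomic_init(&l->next_ticket, 0u);
    atomic_init(&l->now_serving, 0u);
}

/* Draw a ticket; this fixes the caller's place in the queue. */
static inline unsigned ticket_lock_take(ticket_lock *l)
{
    return atomic_fetch_add_explicit(&l->next_ticket, 1u, memory_order_relaxed);
}

/* Spin until the caller's ticket comes up. */
static inline void ticket_lock_wait(ticket_lock *l, unsigned ticket)
{
    assert(ticket - atomic_load(&l->now_serving) < 0x80000000u);
    while (atomic_load_explicit(&l->now_serving, memory_order_acquire) != ticket)
        ;
}

static inline void ticket_lock_acquire(ticket_lock *l)
{
    ticket_lock_wait(l, ticket_lock_take(l));
}

/* Only the current holder calls this; it admits the next ticket. */
static inline void ticket_lock_release(ticket_lock *l)
{
    atomic_fetch_add_explicit(&l->now_serving, 1u, memory_order_release);
}
```

Splitting acquisition into take and wait is only there to make the
ordering visible: a caller that drew ticket 1 cannot get past the wait
until the holder of ticket 0 has released.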

Normally, getting a change into an official kernel means proposing it
for SRU (stable release update). I will propose the patches for
inclusion, and once accepted they go into a proposed kernel. Normally
those kernels are prepared and made available, and verification then has
to be done within a week, which does not work for a bug like this. But
if there is reasonable confidence that a test kernel has been running
on your busy instances without the original issue or new stability
problems, this should be a good argument.

Since the time I built the current v3 kernels there have been other
updates, too, so I will go ahead and prepare a new set. I will post
here when those are ready. It would help if you could then start
migrating your instances to them and report back here once you feel
confident about the stability. I would then start the steps required to
integrate the changes into the official kernels.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-30 Thread Stefan Bader
Newer kernels have been uploaded to same place as before.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-29 Thread Noah Zoschke
Stefan,

We've collected enough instance hours on the v1 kernel to feel confident
that it is not suffering the deadlock issue. We are continuing to roll
over our affected production instances to it.

We have done basic testing on v3 but we haven't collected enough
production data on it yet to report anything.

Can you help me understand the trajectory of these patches for our long
term planning?

Is there any indication of when v1 or v3 would land in an official
linux-ec2 release?

What can we do to help the most here? Collect significant instance hours
on v3?



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-13 Thread Stefan Bader
Matt, any progress in testing the latest (v3) kernels that I provided?



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-13 Thread Matt Wilson
I've never been able to reproduce the problem with synthetic workloads.
I've asked customers that experience the lockup regularly to test the v3
builds in an environment that won't cause production problems, but
haven't received results.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-13 Thread Stefan Bader
Ah, ok. Thanks. We'll have to wait then.



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-12 Thread Matt Wilson
This has also been observed on c1.xlarge; adjusting the summary.

** Summary changed:

- Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance
+ Kernel deadlock in scheduler on multiple EC2 instance types
