[Bug 666211] Re: maverick on ec2 64bit ext4 deadlock

2011-12-28 Thread Brandon Black
As a matter of practicality, given this and other problematic bugs w/ EC2 on the stock Maverick and Lucid kernels (there are several that range from annoying to flat-out unreliable or un-(re)-bootable), I had been running my Maverick-based instances with the Karmic kernels pretty successfully

Re: [Bug 666348] Re: Reboot notification should not require X

2011-11-16 Thread Brandon Black
Auto-closing bugs without reading them because you never got around to fixing them is an *awesome* QA plan. The bug isn't missing any data :P -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/666348

[Bug 636474] Re: xtables-addons-source will not compile

2011-05-04 Thread Brandon Black
Maverick still doesn't have this commit released, any plans to do so? -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/636474 Title: xtables-addons-source will not compile -- ubuntu-bugs mailing

[Bug 727814] Re: [Maverick] Reboot of linux-virtual hangs on EC2

2011-04-12 Thread Brandon Black
Stefan, your workaround would almost be acceptable if this were the only bug in play. However, for those of us booting Maverick AMIs for PV- grub, and then using cloud-init to auto-downgrade the kernel to Karmic's or auto-upgrade to Natty's (because let's face it, so far Lucid and Maverick have

[Bug 727814] Re: [Maverick] Reboot of linux-virtual hangs on EC2

2011-03-15 Thread Brandon Black
Anyone have a userland workaround (other than ec2-reboot-instances of own instance-id, which requires auth keys on the node...) for getting an ec2 node on these kernels to reboot itself successfully? Also: note that once we have a release kernel w/ the fix, new Maverick AMIs will have to go out

[Bug 669096] Re: Kernel divide by zero panic in sched.c:update_sd_lb_stats

2011-01-06 Thread Brandon Black
This looks very likely to be related upstream: https://bugzilla.kernel.org/show_bug.cgi?id=16991 ** Bug watch added: Linux Kernel Bug Tracker #16991 http://bugzilla.kernel.org/show_bug.cgi?id=16991 -- You received this bug notification because you are a member of Ubuntu Bugs, which is

[Bug 669096] Re: Kernel divide by zero panic in sched.c:update_sd_lb_stats

2011-01-06 Thread Brandon Black
Ah, and from there I see a link to a different LP bug which has more details as well, should probably close this as a dupe of that one: https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/614853 -- You received this bug notification because you are a member of Ubuntu Bugs, which is

[Bug 666348] Re: Reboot notification should not require X

2010-12-01 Thread Brandon Black
If you haven't fixed it, then yes, it's still an issue. The bug was found in Maverick. I don't have the time or inclination to test later versions of the distribution, but either way you need to fix it in Maverick, and perhaps look at your other kernel packages to see if they need fixing as

[Bug 651370] Re: ec2 kernel crash invalid opcode 0000 [#1]

2010-11-02 Thread Brandon Black
Stefan: the ~32 vs ~64GB memory issue is very likely orthogonal and has a separate bug now (bug 667796). This issue is solely about intel_idle vs certain CPU types under Amazon's EC2 (Xen) environment. m2.4xlarge in us-east reproduces the crash on boot readily (and also happens to exhibit the

[Bug 667796] Re: kernel only recognizes 32G of memory

2010-11-02 Thread Brandon Black
My experience so far has been that the Lucid kernels do not have this memory size bug, only the Maverick ones. -- kernel only recognizes 32G of memory https://bugs.launchpad.net/bugs/667796 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to

[Bug 669096] [NEW] Kernel divide by zero panic in sched.c:update_sd_lb_stats

2010-10-31 Thread Brandon Black
Public bug reported: host was an m2.2xlarge instance in us-east-1 (with EBS root and data volumes) running stock lucid and serving primarily as a MySQL server. ProblemType: Bug DistroRelease: Ubuntu 10.04 Package: linux-image-2.6.32-308-ec2 2.6.32-308.16 ProcVersionSignature: Ubuntu

[Bug 669096] Re: Kernel divide by zero panic in sched.c:update_sd_lb_stats

2010-10-31 Thread Brandon Black
** Attachment added: crash output from ec2-get-console-output https://bugs.launchpad.net/bugs/669096/+attachment/1718175/+files/Console.txt -- Kernel divide by zero panic in sched.c:update_sd_lb_stats https://bugs.launchpad.net/bugs/669096 You received this bug notification because you are a

[Bug 669096] Re: Kernel divide by zero panic in sched.c:update_sd_lb_stats

2010-10-31 Thread Brandon Black
The part we crashed in is the code that calculates cpu scheduler stats, with a divide by zero. I've done some searching around on git.kernel.org and lxr.linux.no, and I don't see where any similar existing problem has been caught and fixed yet upstream through 2.6.32, but it could be

[Bug 668434] [NEW] ec2-bundle-vol should copy filesystem label

2010-10-29 Thread Brandon Black
Public bug reported: Binary package hint: ec2-ami-tools ec2-bundle-vol needs to copy the filesystem label from the source system's rootfs to the rootfs image, so that things like LABEL=euc- rootfs in /etc/fstab work (which is used in Maverick, which makes ec2 -bundle-vol fail to create bootable

[Bug 668434] [NEW] ec2-bundle-vol should copy filesystem label

2010-10-29 Thread Brandon Black
Public bug reported: Binary package hint: ec2-ami-tools ec2-bundle-vol needs to copy the filesystem label from the source system's rootfs to the rootfs image, so that things like LABEL=euc- rootfs in /etc/fstab work (which is used in Maverick, which makes ec2 -bundle-vol fail to create bootable

[Bug 667793] Re: euca-bundle-vol should copy filesystem label

2010-10-28 Thread Brandon Black
Did you check on this bug in euca-bundle-vol? The broken AMI I referenced in the other thread was actually made with ec2-bundle-vol. -- euca-bundle-vol should copy filesystem label https://bugs.launchpad.net/bugs/667793 You received this bug notification because you are a member of Ubuntu

[Bug 667793] Re: euca-bundle-vol should copy filesystem label

2010-10-28 Thread Brandon Black
Did you mean here or with Amazon? I've opened a feature-request case with Amazon as a customer in any case. BTW, the simple commandline way to get the label is: /sbin/blkid -s LABEL -o value /dev/sda1 -- euca-bundle-vol should copy filesystem label https://bugs.launchpad.net/bugs/667793 You

[Bug 667793] Re: euca-bundle-vol should copy filesystem label

2010-10-28 Thread Brandon Black
Did you check on this bug in euca-bundle-vol? The broken AMI I referenced in the other thread was actually made with ec2-bundle-vol. -- euca-bundle-vol should copy filesystem label https://bugs.launchpad.net/bugs/667793 You received this bug notification because you are a member of Ubuntu

[Bug 667793] Re: euca-bundle-vol should copy filesystem label

2010-10-28 Thread Brandon Black
Did you mean here or with Amazon? I've opened a feature-request case with Amazon as a customer in any case. BTW, the simple commandline way to get the label is: /sbin/blkid -s LABEL -o value /dev/sda1 -- euca-bundle-vol should copy filesystem label https://bugs.launchpad.net/bugs/667793 You

[Bug 651370] Re: ec2 kernel crash invalid opcode 0000 [#1]

2010-10-27 Thread Brandon Black
I wasn't able to boot on ami-d258acbb on m2.4xlarge. It seemed to come up without the special kernel options: [0.00] Linux version 2.6.35-22-virtual (bui...@allspice) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu4) ) #33-Ubuntu SMP Sun Sep 19 21:05:42 UTC 2010 (Ubuntu

[Bug 651370] Re: ec2 kernel crash invalid opcode 0000 [#1]

2010-10-27 Thread Brandon Black
What's the method for making the S3 AMIs by the way? When I tried before, I tried just doing standard ec2-bundle-vol stuff inside of a fixed Maverick, but my first attempts failed because of the root device not having LABEL=euc-rootfs in the newly-launched instances, and the second generation

[Bug 651370] Re: ec2 kernel crash invalid opcode 0000 [#1]

2010-10-27 Thread Brandon Black
Well, I had a hunch this morning that perhaps my test AMI was faulty (perhaps some stupid issue related to block-device mapping, etc, which varies between the variations on c1.xlarge), since it wasn't packaged by the same methods/tools as the official one. It seems this may be the case.

[Bug 651370] Re: ec2 kernel crash invalid opcode 0000 [#1]

2010-10-26 Thread Brandon Black
I tried to look in more detail at the crash this evening, because it's really causing me a lot of headache now. The most recent time I tried to boot a new c1.xlarge in us-east-1 this evening, I had to cycle through the crash/terminate/relaunch cycle 7 times before I got a working instance. I

[Bug 651370] Re: ec2 kernel crash invalid opcode 0000 [#1]

2010-10-26 Thread Brandon Black
I forgot to add above: on the E5410 c1.xlarge's that do boot successfully, the kernel output contains: Oct 26 07:37:55 ip-10-243-51-207 kernel: [0.210255] intel_idle: MWAIT substates: 0x2220 Oct 26 07:37:55 ip-10-243-51-207 kernel: [0.210257] intel_idle: does not run on family 6 model

[Bug 651370] Re: ec2 kernel crash invalid opcode 0000 [#1]

2010-10-26 Thread Brandon Black
So far my test instances with one or both of the MWAIT-related kernel flags have given even worse results than the original: They boot showing intel_idle disabled on E5410 nodes only, but the (assumed) E5506 nodes just terminate themselves quickly with no console log output at all (even after

[Bug 666348] [NEW] Reboot notification should not require X

2010-10-25 Thread Brandon Black
Public bug reported: The postinst script for Maverick (and probably other releases') kernels ultimately signal that a reboot is required by checking for and executing (if exists) /usr/share/update-notifier/notify-reboot- required. That script in turn generates /var/run/reboot-required, which

[Bug 666348] Re: Reboot notification should not require X

2010-10-25 Thread Brandon Black
[ s/exit/exist/ in near the end of the second paragraph above, typo ] Also, as a temporary workaround, I've found that the kernel postinst script *does* directly create /var/run/do-not-hibernate when a kernel is upgraded, so that can be used to determine needs reboot as well for now. **

[Bug 651370] Re: ec2 kernel crash invalid opcode 0000 [#1]

2010-10-25 Thread Brandon Black
Having the same issue on c1.xlarge in us-east-1 (kernel crash on boot related to intel_idle). I've booted the Maverick release AMI several times on m1.large instances fine, but I seem to have a 50%+ failure rate getting it to initially boot without crashing on c1.xlarge. You're going to need to

[Bug 289087] Re: Missing linux-image-debug packages and metapackages since Intrepid

2010-01-24 Thread Brandon Black
As above, the ddebs link is useless for me. I'm running the current latest release of Ubuntu server, fully updated. My kernel is 2.6.31-17-server. There is no vmlinux shipped, there doesn't appear to be one available via apt, and even the ddebs link noted above only contains packages for