[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-02-04 Thread Stefan Bader
Any news here? Also, it would help me figure out the exact code level of any crashing kernel if you could post the "uname -a" output after booting into it on a likely affected machine. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-27 Thread Stefan Bader
Hm, one note, and sorry, this is probably not very trusting... have you ensured (uname -a) that you actually booted into the right older kernel? It's just easy to go wrong, as normally only the kernel with the highest version number is used. One has to fiddle manually with /etc/default/grub (for HVM
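
The check suggested above can be scripted. What follows is only a minimal sketch: the helper name `booted_into` is hypothetical, and the expected release string is the 3.19 mainline version quoted elsewhere in this thread. `platform.release()` returns the same string as `uname -r`.

```python
import platform


def booted_into(expected_release, running_release=None):
    """Return True if the running kernel release equals the expected one.

    running_release defaults to platform.release(), which is the same
    string that `uname -r` prints.
    """
    if running_release is None:
        running_release = platform.release()
    return running_release == expected_release


if __name__ == "__main__":
    # Release string taken from the thread; adjust for the kernel under test.
    expected = "3.19.8-031908-generic"
    print("running kernel:", platform.release())
    print("matches expected:", booted_into(expected))
```

Running this after every reboot on an affected machine would show whether the bootloader really picked the downgraded kernel or fell back to the highest installed version.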

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-26 Thread Stefan Bader
It would be great if you were able to get it working again, at least to see whether the crash happens in the same area (timers). At the moment this sounds like something in user-space changed in a way that allows it to mess badly with the kernel. That sounds bad. And if it's really something getting

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-25 Thread Will Buckner
I have now downgraded all of these systems to 3.19.8-031908-generic (Mainline). We'll know in a couple of days if this fixed it (or as soon as a few hours if it didn't fix it, possibly). I'll update when I know anything else; good so far! 4.0.9 definitely didn't fix it completely, but MAY have

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-25 Thread Will Buckner
3.19 crashed as well, just now. This is surprising. I don't have a trace, as the kernel downgrades messed up netconsole somehow, but I'll try to get it working again.

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-22 Thread Will Buckner
And I should also point out that ifquery is crashing on boot most of the time on these systems:
xxx.log:[3.909680] ifquery[378]: segfault at 1 ip 00403187 sp 7fff7078d8c0 error 4 in ifup[40+d000]
xxx.log:[3.008003] ifquery[380]: segfault at 1 ip 00403187 sp

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-22 Thread Stefan Bader
The ifquery segfaults should be safe to ignore. I see those a lot, but they don't seem to have any impact (yeah, they probably should be fixed at some point ... if there were not always more pressing matters). As for your in-house utility, it is hard to say without knowing more about what causes the segfault

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-22 Thread Will Buckner
Hey Stefan, Just wanted to update you that I've installed the Mainline 4.0.9 kernel (4.0.9-040009-generic #201507212131) on all affected machines, finishing about an hour ago, from https://wiki.ubuntu.com/Kernel/MainlineBuilds. I've still had two crashes in the hour since the downgrade. I don't

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-22 Thread Stefan Bader
Oh hm, another crazy thought...maybe it would also be worth trying a 3.19 kernel... If the guests then crash as well then maybe something in the area of user-space (as in the whole release) is causing issues.

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-21 Thread Stefan Bader
More braindump: One thing that seems sure is that the crash occurs in __run_timers after optionally cascading pending timers into higher level lists. This picks a list of timers from the root vector array (tv1) into a temporary location and then removes timers from that (and calls actions)

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-20 Thread Stefan Bader
Having written that down, it feels quite dangerous to use LIST_POISON in combination with __hlist_del, as the latter only protects against the next pointer being NULL (indicating the last list entry). That definitely breaks when trying to detach the same timer twice. Need to think about whether
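
The hazard described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel's code: `Slot`, `Node`, `PoisonFault`, `hlist_unlink` and `_deref` are hypothetical names, the poison markers are plain Python sentinels rather than real addresses, and the model follows the generic hlist_del pattern (unlink, then poison both pointers), which may differ in detail from what the timer code does.

```python
class PoisonFault(Exception):
    """Stands in for the fault taken when a poison pointer is dereferenced."""


class _Poison:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


LIST_POISON1 = _Poison("LIST_POISON1")
LIST_POISON2 = _Poison("LIST_POISON2")


class Slot:
    """A pointer variable that pprev can point at (head.first or node.next)."""

    def __init__(self):
        self.ptr = None  # None plays the role of NULL


class Node:
    def __init__(self, name):
        self.name = name
        self.next = Slot()  # holds the following node, or None
        self.pprev = None   # the Slot that currently points at this node


def _deref(ptr):
    """Fault on poison, the way the hardware would on a bogus address."""
    if isinstance(ptr, _Poison):
        raise PoisonFault(f"dereferenced {ptr!r}")
    return ptr


def hlist_add_head(node, head):
    first = head.ptr
    node.next.ptr = first
    if first is not None:
        first.pprev = node.next
    head.ptr = node
    node.pprev = head


def hlist_unlink(node):
    """Models __hlist_del: the only guard is 'next is NULL'."""
    nxt = node.next.ptr
    pprev = node.pprev
    _deref(pprev).ptr = nxt        # *pprev = next
    if nxt is not None:            # poison is not NULL, so it passes this test
        _deref(nxt).pprev = pprev  # next->pprev = pprev


def hlist_del(node):
    hlist_unlink(node)
    node.next.ptr = LIST_POISON1
    node.pprev = LIST_POISON2


if __name__ == "__main__":
    head, a, b = Slot(), Node("a"), Node("b")
    hlist_add_head(b, head)
    hlist_add_head(a, head)
    hlist_del(a)          # first detach works
    try:
        hlist_del(a)      # second detach touches poisoned pointers
    except PoisonFault as exc:
        print("second delete faulted:", exc)
```

In this model the first delete leaves the list consistent, while the second one faults as soon as a poisoned pointer is used, because nothing in the unlink step distinguishes a poisoned pointer from a valid one.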

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-20 Thread Stefan Bader
Note that the following is not a final statement, just some thoughts shared with whoever else is looking at this report (and for me to remember). So while I found nothing that really looked odd in the xen-netfront code, I saw there was some change to the generic timer code: commit

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-20 Thread Stefan Bader
Looking at both traces, this looks to consistently happen inside run_timer_softirq(), and from the offset I would guess we are in the inlined __run_timers. Another noteworthy part is the value of RAX. This is the value of LIST_POISON2 which is used to mark an invalid pointer of a (hlist_node
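
Spotting a poison value in a register dump can be automated. The sketch below is hedged in several ways: the two constants are an assumption about x86_64 kernels of roughly this era (they depend on kernel version and configuration and should be checked against include/linux/poison.h in the tree under test), the function name `find_poisoned_registers` is hypothetical, and the sample register line is made up for illustration, not copied from the traces in this bug.

```python
import re

# ASSUMPTION: list poison values for x86_64 kernels of roughly this era.
# Verify against include/linux/poison.h and the kernel config in use.
POISON_VALUES = {
    0xDEAD000000100100: "LIST_POISON1",
    0xDEAD000000200200: "LIST_POISON2",
}

# Matches pairs such as "RAX: dead000000200200" in an oops register dump.
REGISTER_RE = re.compile(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b")


def find_poisoned_registers(oops_text):
    """Return {register: poison name} for registers holding a known poison value."""
    hits = {}
    for name, value in REGISTER_RE.findall(oops_text):
        label = POISON_VALUES.get(int(value, 16))
        if label is not None:
            hits[name] = label
    return hits


if __name__ == "__main__":
    # Made-up sample line, for illustration only.
    sample = "RAX: dead000000200200 RBX: ffff880000000000 RCX: 0000000000000000"
    print(find_poisoned_registers(sample))
```

Feeding a captured netconsole log through such a filter would flag every trace in which a register holds a list poison value, which helps when comparing many crashes.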

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-20 Thread Will Buckner
Thanks for looking into this, Stefan! We were completely fine with 15.04 and 3.19. If it won't break anything terribly, I can try to put 3.19, 4.0, and 4.1 on these machines, but each one crashes every 24-48 hours, so it might take me several days. Which kernel would you recommend starting with,

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-20 Thread Stefan Bader
Hi Will and thanks for volunteering. That backtrace at least looks to confirm the educated guess about being related to network. *If* it actually is related, then I could imagine that processing incoming network traffic in softirq context of cpu#1 might cancel a timer which was set to wait for

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-19 Thread Robert C Jennings
Leann, can the kernel team take a look at this bug? -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1534345 Title: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-18 Thread Alberto Salvia Novella
** Changed in: linux (Ubuntu) Assignee: (unassigned) => Alberto Salvia Novella (es20490446e) ** Changed in: linux (Ubuntu) Status: Confirmed => In Progress

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-18 Thread Alberto Salvia Novella
** Changed in: linux (Ubuntu) Status: In Progress => Triaged ** Changed in: linux (Ubuntu) Assignee: Alberto Salvia Novella (es20490446e) => (unassigned)

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-18 Thread Na3iL
** Changed in: linux (Ubuntu) Assignee: (unassigned) => Na3iL (naeilzoueidi)

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-15 Thread Will Buckner
Do you guys need a vmcore? I'm working on getting one from AWS.

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-15 Thread Alberto Salvia Novella
** Changed in: linux (Ubuntu) Importance: Undecided => Critical

[Kernel-packages] [Bug 1534345] Re: Ubuntu 15.10 Crashing Frequently on EC2 Instances w/ Enhanced Networking

2016-01-14 Thread Will Buckner
And now we've got a second trace from the same machine with a bit more info: And one more:
[14032.676085] general protection fault: [#1] SMP
[14032.678409] Modules linked in: isofs xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4