[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2019-07-24 Thread Brad Figg
** Tags added: cscc -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1497428 Title: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968 To manage notifications about this bug go to:

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2019-06-11 Thread Dan Streetman
** Changed in: linux (Ubuntu Trusty) Status: Triaged => Won't Fix ** Changed in: linux (Ubuntu) Status: Triaged => Fix Released

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2017-10-13 Thread Dan Streetman
** Changed in: linux (Ubuntu Trusty) Assignee: Dan Streetman (ddstreet) => (unassigned) ** Changed in: linux (Ubuntu) Assignee: Dan Streetman (ddstreet) => (unassigned) ** Changed in: linux (Ubuntu Trusty) Status: In Progress => Triaged ** Changed in: linux (Ubuntu)

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2017-03-13 Thread Craig Watcham
m4.10xl also looks good:
[7399124.202570] lp1497428: module verification failed: signature and/or required key missing - tainting kernel
[7399124.223762] pageblock_nr_pages 0x200
[7399124.226007] node 0 zone 0 info:
[7399124.227849] node 0 zone 0 provided page pfn 0xfff valid 1 present 1

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2017-03-13 Thread Craig Watcham
Looks like this has been corrected, from a recent c4.8xl launch:
[0.00] Linux version 3.13.0-71-generic (buildd@lgw01-09) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #114-Ubuntu SMP Tue Dec 1 02:34:22 UTC 2015 (Ubuntu 3.13.0-71.114-generic 3.13.11-ckt29)
[0.00] Command line:

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2016-09-09 Thread Dan Streetman
That is very definitely not the same bug as this bug.

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2016-08-23 Thread Matt W
I can't be sure that we ran into the exact same bug, but Amazon seems to think we may have. I can't find the beginning of the console log, but here's a mid-point that shows the hang:
Host Type: Amazon EC2 r3.8xlarge
OS: Ubuntu 14.04.5
Kernel: 3.13.0-93-generic
Networking: Intel Enhanced Networking

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2016-02-02 Thread Joseph Salisbury
** Tags removed: kernel-key ** Tags added: kernel-da-key

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2016-01-27 Thread Dan Streetman
Matt, I think it's fine that upstream has demoted the BUG_ON, as I haven't heard anyone report this with a kernel later than 3.13; I assume whatever is causing it is fixed in later kernels. At this point there's not much more I can do, as I can't reproduce it and don't have much debug info on

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2016-01-11 Thread Matt Wilson
Dan, This BUG_ON has been demoted to only trigger when DEBUG_VM is set in upstream: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=97ee4ba7cbd30f1858f0d16911e042737c53f2ef I'm looking into why there's a one page difference between the E820 tables and SRAT. You're

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-18 Thread Dan Streetman
kernel module to add debug for this mm BUG(). This module is for kernel 3.13.0-71-generic only. ** Attachment added: "lp1497428.ko" https://bugs.launchpad.net/ubuntu/trusty/+source/linux/+bug/1497428/+attachment/4537000/+files/lp1497428.ko

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-18 Thread Dan Streetman
Can anyone seeing this problem, if you're on the 3.13.0-71-generic kernel, please load the above attached module? It will initially check the node/zone start/end locations for validity, and will also check every time move_freepages is called; if it detects that the BUG() will be hit, it prints out

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-18 Thread Dan Streetman
BTW, I've only seen this situation - with a node end pfn not on a pageblock boundary - happen with the AWS flavors "c4.8xlarge" and "m4.10xlarge". If anyone else sees this bug anywhere besides those Amazon AWS instances, please let me know.

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-10 Thread Dan Streetman
> I won't pretend to know how numactl interleaves the memory across the nodes,
> but I can't help but think high memory usage on these nodes combined with
> forced interleaving might be why we hit this issue?
The numactl interleaving just causes memory to be allocated from all nodes on a

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-10 Thread dave.muysson
Dan, Not sure if this will help or not, but of the 8+ servers we have using the r3.large instance type, the only two that have encountered the issue were running MongoDB on them, launched using the numactl tool with the --interleave=all option set. Here's the exact launch command used: exec

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-09 Thread Dan Streetman
I booted a c4.8xlarge flavor AWS instance and got the same memory/numa layout as comment 16. To clarify though, the /proc/iomem output isn't representative of the actual memory layout; specifically it is:
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-07 Thread Chris J Arges
** Tags added: kernel-key

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-07 Thread Joseph Salisbury
** Changed in: linux (Ubuntu) Importance: Low => High ** Changed in: linux (Ubuntu Trusty) Importance: Undecided => High

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-04 Thread Dan Streetman
To clarify the bug, a bit of background is needed first (specific numbers apply only to this situation). The kernel refers to all pages under a single PMD (midlevel page table) as a "pageblock". It's the same size as a hugepage, 2M. In the function triggering the BUG(), it's expecting that the
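The pageblock size quoted above can be checked with a little arithmetic. This is only a sketch: the 4 KiB base page size is an assumption (typical for x86-64), the 0x200 value is the pageblock_nr_pages figure printed by the debug module elsewhere in this thread, and the helper names are made up for illustration.

```python
# Assumption: 4 KiB base pages, as is usual on x86-64.
PAGE_SIZE = 4096
# Value reported by the lp1497428 debug module in this thread.
PAGEBLOCK_NR_PAGES = 0x200


def pageblock_size_bytes():
    """Size of one pageblock in bytes (pages per block times page size)."""
    return PAGEBLOCK_NR_PAGES * PAGE_SIZE


def is_pageblock_aligned(pfn):
    """True if the page frame number sits on a pageblock boundary."""
    return pfn % PAGEBLOCK_NR_PAGES == 0


def ends_on_2m_multiple(end_addr):
    """True if a physical end address (exclusive) is a multiple of the pageblock size."""
    return end_addr % pageblock_size_bytes() == 0
```

With these numbers a pageblock is 0x200 * 4096 bytes, i.e. 2M, which matches the hugepage size mentioned above. A memory range whose end address fails ends_on_2m_multiple() leaves a partial pageblock at its tail.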

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-04 Thread Dan Streetman
The newer kernel may have some change/fix that prevents this bug, as I haven't seen any reports of this (from google, at least) on any other kernel. Plus, there is the unusual requirement that the memory end at an address that is not a multiple of 2M.
> But the “Node 0 Normal” zone, judging from (start_pfn,

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-04 Thread Nelson Elhage
** Attachment added: "/proc/zoneinfo from the same machine" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1497428/+attachment/4529727/+files/zoneinfo.txt

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-04 Thread Nelson Elhage
Hi @ddstreet, thanks for the update. We unfortunately weren't able to reproduce this on your test kernel, and have since moved to a newer kernel version for other reasons. However, I can confirm that on the affected machine types, and only the affected machine type, we see a memory range in

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-01 Thread Dan Streetman
For reference, here's a pasted sample of the Oops (taken from Diego's log above):
[415478.493013] [ cut here ]
[415478.496056] kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968!
[415478.496056] invalid opcode: [#1] SMP
[415478.496056] Modules linked in:

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-01 Thread Dan Streetman
Diego, thanks, although the log doesn't provide any new info, and it's doubtful this is related to hugepages.

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-11-30 Thread Diego Andres
Also, here's some system config that might have some influence on the crash, in particular Transparent Huge Pages (I cannot attach more than one file):
/etc/rc.local:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
exit 0
** Attachment added: "sysctl.conf"
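To confirm that an rc.local setting like the one above actually took effect, the same sysfs file can be read back. The kernel lists every mode and puts the active one in square brackets, for example "always madvise [never]". The sketch below parses that format; the function names are illustrative only and not part of any tool mentioned in this thread.

```python
import re

# Path written to by the rc.local snippet above.
THP_ENABLED_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def parse_thp_setting(text):
    """Return the bracketed (active) mode from the sysfs file contents, or None."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_thp_setting(path=THP_ENABLED_PATH):
    """Read the sysfs file and return the active THP mode (Linux only)."""
    with open(path) as handle:
        return parse_thp_setting(handle.read())
```

On a machine where the rc.local line ran, read_thp_setting() would be expected to return "never".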

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-11-30 Thread Diego Andres
Hi, I recently had the same issue on an AWS EC2 r3.large instance. The system log is attached. Hope that helps! ** Attachment added: "AWS System Log" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1497428/+attachment/4527455/+files/kernel-crash.txt

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-11-06 Thread Nelson Elhage
Hey, we're also seeing this issue on a production system, and have been hitting it about once a week for a while now. We may be able to boot that test kernel for experimentation purposes – would that still be useful?

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-11-06 Thread Dan Streetman
Yep, it would definitely be useful to see a repro with the test/debug kernel, thanks!

Re: [Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-10-13 Thread dave.muysson
Dan, I haven’t tried to directly reproduce the bug, but I have a few ideas. If I can free up some time I’ll see if I can reproduce it.

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-10-13 Thread Dan Streetman
Hi Dave, are you able to reproduce the bug? The trace by itself isn't terribly helpful; all it really says is that the pageblock spans zones, which means the move_freepages_block() logic for detecting that failed for some reason. I have a debug kernel ppa here: pad.lv/ppa/ddstreet/lp1497428 that
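For readers unfamiliar with the check being discussed, here is a simplified model of "does this page's pageblock stay inside its zone". It is an illustration only, not the kernel's code: the zone is reduced to a half-open pfn range, the 0x200 pageblock size is the value reported elsewhere in this thread, and the example zone boundaries are invented.

```python
# Pages per pageblock, as reported by the debug module in this thread.
PAGEBLOCK_NR_PAGES = 0x200


def pageblock_range(pfn):
    """Return (first_pfn, last_pfn) of the pageblock containing pfn."""
    start = pfn & ~(PAGEBLOCK_NR_PAGES - 1)
    return start, start + PAGEBLOCK_NR_PAGES - 1


def zone_spans_pfn(zone_start, zone_end, pfn):
    """True if pfn lies in the zone, modelled as [zone_start, zone_end)."""
    return zone_start <= pfn < zone_end


def block_fits_in_zone(zone_start, zone_end, pfn):
    """True if both ends of pfn's pageblock fall inside the zone."""
    first, last = pageblock_range(pfn)
    return (zone_spans_pfn(zone_start, zone_end, first)
            and zone_spans_pfn(zone_start, zone_end, last))


# Hypothetical zone whose end (0x1f00) is not on a pageblock boundary.
print(block_fits_in_zone(0x1000, 0x1f00, 0x1234))  # block lies fully inside
print(block_fits_in_zone(0x1000, 0x1f00, 0x1e80))  # block runs past the zone end
```

In this model, a zone that ends partway through a pageblock is exactly the case where the last block cannot be treated as a whole unit. That matches the "spans zones" condition described above.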

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-10-08 Thread dave.muysson
Dan, I have run into this issue 4 times over the past few months, on two separate servers running 3.13. I captured the kernel trace output of each occurrence and can post them here if it would help. I have attached the latest one, but there are 3 others I can provide as well. Environment: AWS EC2

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-22 Thread Christopher M. Penalver
Dan Streetman, ah, never heard of STS so my bad on zapping the tag. Would it be possible to perform an apport-collect on a reference computer this is reproducible with? Otherwise, nobody can really contribute to this given the current level of detail provided.

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-22 Thread Dan Streetman
No, I can't add any details just yet, I don't have direct access to the failing system, but I'm working with the reporter to debug it. This bug is currently just a placeholder so I can provide a debug ppa, pad.lv/ppa/ddstreet/lp1497428. It's okay that nobody else can help debug yet, because I'm

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-22 Thread Louis Bouchard
** Also affects: linux (Ubuntu Trusty) Importance: Undecided Status: New

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-22 Thread Dan Streetman
** Changed in: linux (Ubuntu Trusty) Assignee: (unassigned) => Dan Streetman (ddstreet) ** Changed in: linux (Ubuntu Trusty) Status: New => In Progress

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-21 Thread Christopher M. Penalver
** Tags removed: sts ** Tags added: needs-apport-collect

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-21 Thread Dan Streetman
Chris, this bug is for a Canonical STS issue I'm debugging. I'll add more details as I get them. ** Tags removed: needs-apport-collect ** Tags added: sts

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-21 Thread Dan Streetman
** Changed in: linux (Ubuntu) Status: Incomplete => In Progress ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Dan Streetman (ddstreet)

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-19 Thread Christopher M. Penalver
** Changed in: linux (Ubuntu) Importance: Undecided => Low ** Changed in: linux (Ubuntu) Assignee: Dan Streetman (ddstreet) => (unassigned)

[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-09-18 Thread Dan Streetman
** Tags added: trusty