[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-12-10 Thread dave.muysson
Dan,

  Not sure if this will help or not, but of the 8+ servers we have using
the r3.large instance type, the only two that have encountered the issue
were running MongoDB, launched through the numactl tool with the
--interleave=all option set.

Here's the exact launch command used:

exec start-stop-daemon --start --quiet --chuid mongodb --make-pidfile \
    --pidfile /var/run/mongodb.pid --exec /usr/bin/numactl -- \
    --interleave=all /usr/bin/mongod --config /etc/mongodb.conf

  I won't pretend to know how numactl interleaves the memory across the
nodes, but I can't help thinking that high memory usage on these nodes
combined with forced interleaving might be why we hit this issue.
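
For anyone wanting to see what forced interleaving does to a running mongod, the kernel exposes the per-mapping memory policy in /proc/<pid>/numa_maps, where the second field of each line is the policy (for example default or interleave:0-1). The sketch below tallies mappings by policy. numa_policy_summary is a made-up helper name, and the pidfile path in the usage comment is simply the one from the launch command above. It only reports policy, so on its own it can neither confirm nor rule out the theory.

```shell
#!/bin/sh
# Summarise the NUMA memory policies recorded in a numa_maps file.
# Each line of /proc/<pid>/numa_maps is "<address> <policy> <details...>",
# so the second field is the policy in effect for that mapping.
# numa_policy_summary is a hypothetical helper, not part of numactl.
numa_policy_summary() {
    awk '{ count[$2]++ }
         END { for (p in count) print p, count[p] }' "$1" | sort
}

# Possible usage on the affected host (path assumed from the launch command):
#   numa_policy_summary "/proc/$(cat /var/run/mongodb.pid)/numa_maps"
```

Running numactl --hardware beforehand shows how many NUMA nodes the instance exposes, which bounds how much interleaving can take place at all.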

  After weeks of stress testing with your custom kernel, I have yet to
hit this issue again. The synthetic environment I'm using probably isn't
enough to hit this bug. Hopefully your testing with the c4.8xlarge is
more helpful.
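
Since the debug kernel is meant to log the condition rather than trigger BUG(), a quiet system is not proof that the problem stayed away. One way to confirm that a stress run was clean is to count occurrences of the marker string Dan quoted in his instructions. This is a minimal sketch: count_zone_mismatches is a hypothetical helper, and /var/log/kern.log in the usage comment is an assumption about where kernel messages end up on the host.

```shell
#!/bin/sh
# Marker string as quoted in Dan's message (spacing copied verbatim).
MARKER='page_zone(start_page) !=page_zone(end_page)'

# count_zone_mismatches FILE: print how many lines of FILE contain the marker.
# -F matches the text literally, so the parentheses are not regex syntax.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_zone_mismatches() {
    grep -cF -- "$MARKER" "$1" || true
}

# Possible usage (log path is an assumption):
#   count_zone_mismatches /var/log/kern.log
```

A result of 0 after a long run suggests the workload never reached the failing path, which fits the suspicion that the synthetic environment is not enough to hit the bug.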

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1497428

Title:
  kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1497428/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


Re: [Bug 1497428] kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-11-04 Thread dave.muysson
Dan,

  Just checking in with a status update. Our main system experiencing
the issue is a production system, so loading the custom kernel wasn’t an
option at the time. I have since created a clone of our production
server and am trying to reproduce the issue now.

  I will let you know once the issue reoccurs on our cloned environment.

Dave Muysson | Cloud Architect
dave.muys...@360pi.com | (613) 562-2525 x 510 | 360pi.com



Re: [Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-10-13 Thread dave.muysson
Dan,

  I haven’t tried to directly reproduce the bug, but I have a few ideas.
If I can free up some time I’ll see if I can reproduce it.


Dave Muysson | Cloud Architect
dave.muys...@360pi.com | (613) 562-2525 x 510 | 360pi.com


> On Oct 13, 2015, at 9:20 AM, Dan Streetman wrote:
> 
> Hi Dave,
> 
> are you able to reproduce the bug?  The trace by itself isn't terribly
> helpful; all it really says is that the pageblock spans zones, which means
> the move_freepages_block() logic for detecting that failed for some
> reason.  I have a debug kernel ppa here:
> pad.lv/ppa/ddstreet/lp1497428
> 
> that includes additional debug if the problem happens (it also should
> prevent the BUG()).  If you can use that kernel to trigger this and send
> the resulting debug output it would help very much :-)
> 
> when the problem reproduces, in the system log you should see:
> page_zone(start_page) !=page_zone(end_page)
> 
> and more debug following that.  It should not trigger BUG() though, so
> you may need to check the logs periodically.
> 
> Thanks!
> 
> -- 
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1497428
> 
> Title:
>  kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968
> 
> Status in linux package in Ubuntu:
>  In Progress
> Status in linux source package in Trusty:
>  In Progress
> 
> Bug description:
>  The kernel triggers a BUG when it finds it is in move_freepages() but
>  the start and end pfns for the move are in different zones.
> 
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1497428/+subscriptions
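
The condition in the bug description, start and end pfns of a move landing in different zones, can be illustrated with plain arithmetic. The sketch below is a userspace illustration only, not the kernel's code: the pageblock size of 512 pages, the zone boundaries, and both helper names are assumptions chosen for the example. It shows that a pageblock found by masking off the low bits of a pfn can begin before a zone's first pfn when that zone boundary is not pageblock-aligned.

```shell
#!/bin/sh
# Userspace illustration only; every value here is made up for the example.
PAGEBLOCK_NR_PAGES=512   # assumed pageblock size in pages (a power of two)

# pageblock_range PFN: print the first and last pfn of the pageblock
# containing PFN, found by masking off the low bits of PFN.
pageblock_range() {
    start=$(( $1 & ~(PAGEBLOCK_NR_PAGES - 1) ))
    end=$(( start + PAGEBLOCK_NR_PAGES - 1 ))
    echo "$start $end"
}

# block_within_zone PFN ZONE_START ZONE_END: succeed only if the whole
# pageblock around PFN lies inside the zone [ZONE_START, ZONE_END).
block_within_zone() {
    range=$(pageblock_range "$1")
    first=${range% *}
    last=${range#* }
    [ "$first" -ge "$2" ] && [ "$last" -lt "$3" ]
}

# Possible usage: a hypothetical zone starting at pfn 768 is not
# pageblock-aligned, so the block around pfn 1000 starts outside it:
#   block_within_zone 1000 768 4096 || echo "pageblock spans zones"
```

Any code that moves pages across a whole pageblock needs a check of this kind first; according to Dan's note, the trace indicates that the equivalent detection in move_freepages_block() failed for some reason.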


[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2015-10-08 Thread dave.muysson
Dan, I have run into this issue 4 times over the past few months, on two
separate servers running 3.13. I captured the kernel trace output of
each occurrence and can post them here if it would help. I have attached
the latest one, but there are 3 others I can provide as well.

Environment:
AWS EC2 Virtual Instance: r3.large
Ubuntu lts-trusty 3.13.0-53-generic and 3.13.0-45-generic.


** Attachment added: "ServerB-ubuntu-lts-trusty-3.13.0-53.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1497428/+attachment/4488905/+files/ServerB-ubuntu-lts-trusty-3.13.0-53.txt
