Re: Yum random crashes in XO4 f20 images
G'day Peter,

Thanks for any ideas you may have.

The problem also reproduces on the OLPC Fedora 20 image for XO-4:

http://build.laptop.org/14.1.0/os1/xo-4/41001o4.zd (552 MB)

*** Error in `/usr/bin/python': free(): invalid pointer: 0x047c79ae ***
=== Backtrace: =
/lib/libc.so.6(+0x6c8b4)[0xb6c828b4]
/lib/libc.so.6(+0x754e8)[0xb6c8b4e8]
=== Memory map:
[...]

The error varies in detail, but always suggests corruption of heap or
pointers to heap.

The triggering conditions are interactive use of yum, yum update, or
yum used by olpc-os-builder. The latter is a simple reproducer for me.

I'm reproducing it on an XO-4, with 2GB of RAM, no swap, 8 GB eMMC,
8 GB USB flash drive.

While memory demand by yum is large by comparison to other programs,
the available memory at the time of failure is ample. There are no
kernel out of memory (OOM) events. It seems more likely to occur when
the filesystem cache is under heavy demand.

The method to recreate the problem was:

1. install the system image 41001o4.zd using fs-update and then boot,

2. configure wireless network,

3. "yum install -y git olpc-os-builder"

4. clone the master branch of
   git://dev.laptop.org/projects/olpc-os-builder (last verified with
   b87e6ee)

5. run "./osbuilder.py examples/olpc-os-14.1.0-xo4.ini" repeatedly
   until the error occurs (usually within about five attempts).

I've also tried running under valgrind, but that causes an illegal
instruction error. It is quite likely I'm not using valgrind correctly.

http://dev.laptop.org/~quozl/z/1XRYtO.txt

The workaround at the moment is to build our Fedora 20 images on
Fedora 18. Fedora 18 shows no sign of the problem.

I'm worried that a low probability heap corruptor may cause
instability of applications in the field.

The exact same kernel is being used for Fedora 18 and Fedora 20.

On Tue, Sep 09, 2014 at 03:55:24PM +0100, Peter Robinson wrote:
> What version of OOB are you using, and what config files? I can try
> and recreate the problem here on other devices.
-- James Cameron http://quozl.linux.org.au/ ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
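A minimal sketch of step 5 of the reproduction recipe in the message above: rerun a command until the glibc error banner shows up in its output. This is not part of olpc-os-builder. The names `find_heap_error` and `run_until_heap_error` are hypothetical, the script assumes Python 3.5 or later (which may need to be installed separately on the image), and it assumes that setting `LIBC_FATAL_STDERR_` makes glibc write the banner to stderr rather than to the controlling terminal.

```python
import os
import re
import subprocess

# Shape of the glibc abort banner quoted in the report, for example:
# *** Error in `/usr/bin/python': free(): invalid pointer: 0x047c79ae ***
GLIBC_ERROR = re.compile(r"\*\*\* Error in `[^']*': .* \*\*\*")


def find_heap_error(output):
    """Return the first glibc error banner found in output, or None."""
    match = GLIBC_ERROR.search(output)
    return match.group(0) if match else None


def run_until_heap_error(command, max_attempts=20):
    """Run command up to max_attempts times.

    Returns (attempt_number, banner) for the first run whose output
    contains a glibc error banner, or None if no run produced one.
    """
    # Assumption: with LIBC_FATAL_STDERR_ set, glibc reports the error on
    # stderr instead of the terminal, so it can be captured here.
    env = dict(os.environ, LIBC_FATAL_STDERR_="1")
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into captured output
            env=env,
            universal_newlines=True,
        )
        banner = find_heap_error(proc.stdout)
        if banner is not None:
            return attempt, banner
    return None
```

For the recipe above, the call would be `run_until_heap_error(["./osbuilder.py", "examples/olpc-os-14.1.0-xo4.ini"])`, run from the olpc-os-builder checkout. The exit status is deliberately ignored, since the crash may happen in a child process of the build; only the banner text is treated as a hit.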
Re: Yum random crashes in XO4 f20 images
Thanks for the quick response,

On Tue, Sep 9, 2014 at 10:55 AM, Peter Robinson wrote:

> Hi Martin,
>
> > I have been building a Fedora 20 image for the XO4, and I am seeing
> > memory corruption problems while running yum in these images
> > (please check the logs [1,2]).
>
> The logs don't mean anything to me. What device are you running it on,
> how much memory, do you have swap enabled?

It is running on an XO4, with 999 MB RAM and no swap. Maybe James can
explain better what the other specs are for the XO4.

> > To give you some context, we are using olpc-os-builder (master [3])
> > with fc20 repositories plus a few hand-crafted packages such as the
> > kernel, systemd and xorg, taken from previous Daniel Narvaez
> > efforts [4,5].
> >
> > These crashes happen randomly when yum is running, by calling yum
> > update or when olpc-os-builder is installing system packages.
>
> Yum isn't the most memory friendly, and some of the post update
> scripts aren't either, you need to make sure there's enough
> memory/swap.

I am not so sure this is related to memory usage really, as it happens
even when I am installing just a few packages (located in FS) via
"yum update /packages/*.rpm".

> > James Cameron, who spent some time researching this issue,
> > speculates that this problem could be caused by: (a) using an older
> > kernel that was compiled (possibly) with different options compared
> > to f20's, or (b) a faulty glibc library.
> >
> > I was wondering if this could be related to something else,
> > something more specific to yum or Python ARM binaries (?).
>
> Unlikely, I've not seen issues elsewhere. But I need more details.

What kind of details would you need? I can try reproducing and send
what you need.

> > I would sincerely appreciate any guidance you can provide to start
> > discarding possibilities and try to debug this issue.
> >
> > Thanks in advance for any help you can provide!
>
> What version of OOB are you using, and what config files? I can try
> and recreate the problem here on other devices.

This problem occurs in all f20 images for the XO4 that we or others
have created. The latest images were created using:

* OLPC's OOB master branch:
  git://dev.laptop.org/projects/olpc-os-builder
* OLPC's .ini file:
  http://dev.laptop.org/git/projects/olpc-os-builder/tree/examples/olpc-os-14.1.0-xo4.ini

In case you have an XO4 with you, and have the time to try this out,
you can download an image from
http://system.one-education.org/au2a/images/testing/40002au4/

> Peter
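The memory and swap figures asked for in this exchange can be read from /proc/meminfo on the device. The following is only a sketch, assuming Python 3; the helper names `parse_meminfo` and `memory_summary` are hypothetical, and the sample values in the comment are made up for illustration rather than taken from an XO-4.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of field name -> integer.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts and carry no unit.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary(fields):
    """Return total RAM and swap in MB, plus whether any swap is configured."""
    swap_kb = fields.get("SwapTotal", 0)
    return {
        "ram_mb": fields.get("MemTotal", 0) // 1024,
        "swap_mb": swap_kb // 1024,
        "swap_enabled": swap_kb > 0,
    }


# Illustrative input only (made-up numbers):
#   MemTotal:        1022976 kB
#   SwapTotal:             0 kB
```

On the device itself, the input would come from `open("/proc/meminfo").read()`. Sampling `MemFree` and `Cached` from the same dict shortly before a failure would give some evidence on whether memory really is ample when the crash occurs.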
Re: Yum random crashes in XO4 f20 images
Hi Martin,

> I have been building a Fedora 20 image for the XO4, and I am seeing
> memory corruption problems while running yum in these images (please
> check the logs [1,2]).

The logs don't mean anything to me. What device are you running it on,
how much memory, do you have swap enabled?

> To give you some context, we are using olpc-os-builder (master [3])
> with fc20 repositories plus a few hand-crafted packages such as the
> kernel, systemd and xorg, taken from previous Daniel Narvaez
> efforts [4,5].
>
> These crashes happen randomly when yum is running, by calling yum
> update or when olpc-os-builder is installing system packages.

Yum isn't the most memory friendly, and some of the post update
scripts aren't either, you need to make sure there's enough
memory/swap.

> James Cameron, who spent some time researching this issue, speculates
> that this problem could be caused by: (a) using an older kernel that
> was compiled (possibly) with different options compared to f20's, or
> (b) a faulty glibc library.
>
> I was wondering if this could be related to something else, something
> more specific to yum or Python ARM binaries (?).

Unlikely, I've not seen issues elsewhere. But I need more details.

> I would sincerely appreciate any guidance you can provide to start
> discarding possibilities and try to debug this issue.
>
> Thanks in advance for any help you can provide!

What version of OOB are you using, and what config files? I can try
and recreate the problem here on other devices.

Peter