Re: Stability and Memory Pressure in 8.2
On Tue, Sep 30, 2008 at 1:50 PM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
> On Tue, Sep 30, 2008 at 4:13 AM, S Page <[EMAIL PROTECTED]> wrote:
>> On September 8th, Michael Stone wrote:
>>> Kim, Greg, and I have concluded that the instability we experience under
>>> memory-pressure in 8.2-759 and similar is the single "hard" issue that
>>> we wish to _attempt_ to address before releasing 8.2 on current
>>> timeframes.
>>
>> How did it go?
>>
>> I was going through my journal in 8.2-763. Browse and Paint open,
>> accidentally started Read, suddenly the cursor stopped moving and XO
>> completely unresponsive. I assume it's memory, but we never learned how
>> to tell.
>>
>> Over two minutes later the first page of the PDF appeared and *then*
>> immediately Sugar restarted.
>>
>> Just one datapoint.
>
> Thanks, Read has serious memory problems because renders whole pages
> into memory, regardless of what is the viewed area. Any chance the
> first pages of the PDF you opened contained big images?
>

I noticed some similar-sounding freezes with Read, and it appeared to
me that the initial fit-to-screen-width takes up a significant amount of
resources. Of course, I'm not sure that this is the reason, but given
the problems we are having with zoom, it may be one of the contributing
factors (apart from the usual overhead of opening the file and the
initial rendering of the pages).

Thanks,
Sayamindu

--
Sayamindu Dasgupta
[http://sayamindu.randomink.org/ramblings]
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Wed, Oct 1, 2008 at 12:50 PM, S Page <[EMAIL PROTECTED]> wrote:
> Tomeu Vizoso wrote:
>
>> Read has serious memory problems because renders whole pages
>> into memory, regardless of what is the viewed area. Any chance the
>> first pages of the PDF you opened contained big images?
>
> Nope, it's a saved Project Gutenberg PDF
> Coradella_Collegiate_Bookshelf_Collection_austen-persuasion.pdf, looks like
> it has one small image.
> Of course when I reproduce everything works fine.
>
> Is there *anything* testers can run before or after their XO goes funny? My
> fantasy would be a section in http://wiki.laptop.org/go/Friends_in_testing :
>
> "Run log_mem.py from a console as you start Sugar. This snapshots memory
> every 10 seconds to a rotating set of files in tmpfs until it detects that
> the machine has serious memory problems, then it runs a detailed
> /sys/procmem dump. It also runs strace on the OOMKiller thread. If your XO
> locks up, zip this directory and attach it to bug 4321."
>
> Without something like that, it'll be hard to get anything more from testers
> than anecdotes.

Agreed and I think that the steps you outlined before could work great
here. Can you take over this task? Or someone else that feels more
capable?

Thanks,

Tomeu
Re: Stability and Memory Pressure in 8.2
Tomeu Vizoso wrote:

> Read has serious memory problems because renders whole pages
> into memory, regardless of what is the viewed area. Any chance the
> first pages of the PDF you opened contained big images?

Nope, it's a saved Project Gutenberg PDF
Coradella_Collegiate_Bookshelf_Collection_austen-persuasion.pdf, looks like
it has one small image.
Of course when I reproduce everything works fine.

Is there *anything* testers can run before or after their XO goes funny? My
fantasy would be a section in http://wiki.laptop.org/go/Friends_in_testing :

"Run log_mem.py from a console as you start Sugar. This snapshots memory
every 10 seconds to a rotating set of files in tmpfs until it detects that
the machine has serious memory problems, then it runs a detailed
/sys/procmem dump. It also runs strace on the OOMKiller thread. If your XO
locks up, zip this directory and attach it to bug 4321."

Without something like that, it'll be hard to get anything more from testers
than anecdotes.

Sincerely,
--
=S
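The log_mem.py wished for in the message above does not exist in this thread. As a rough illustration only, the snapshot-and-rotate part could be sketched as below. Everything here is an assumption: the function names, the `meminfo.N` file naming, the ten-slot rotation, and the "MemFree + Buffers + Cached below a threshold" heuristic are all hypothetical, and the detailed dump and strace steps are not attempted. It is written for a current Python rather than the 2.5 shipped on the XO.

```python
"""Hypothetical sketch of the log_mem.py idea: rotating /proc/meminfo snapshots."""
import os
import time


def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def snapshot_path(directory, index, keep):
    """Path of the snapshot slot for this index; slots are reused round-robin."""
    return os.path.join(directory, "meminfo.%d" % (index % keep))


def under_pressure(info, min_free_kb):
    """Assumed heuristic: MemFree + Buffers + Cached has dropped below a threshold."""
    available = (info.get("MemFree", 0) + info.get("Buffers", 0)
                 + info.get("Cached", 0))
    return available < min_free_kb


def run(directory, interval=10, keep=10, min_free_kb=8192,
        source="/proc/meminfo"):
    """Snapshot memory every `interval` seconds until pressure is detected.

    Returns the index of the snapshot at which pressure was first seen.
    """
    index = 0
    while True:
        with open(source) as f:
            text = f.read()
        with open(snapshot_path(directory, index, keep), "w") as f:
            f.write(text)
        if under_pressure(parse_meminfo(text), min_free_kb):
            return index
        index += 1
        time.sleep(interval)
```

A tester would call something like `run("/some/tmpfs/dir")` from a console before starting Sugar; the directory must already exist, and the 8192 kB threshold is a placeholder that would need tuning on real hardware.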
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 30, 2008 at 4:13 AM, S Page <[EMAIL PROTECTED]> wrote:
> On September 8th, Michael Stone wrote:
>> Kim, Greg, and I have concluded that the instability we experience under
>> memory-pressure in 8.2-759 and similar is the single "hard" issue that
>> we wish to _attempt_ to address before releasing 8.2 on current
>> timeframes.
>
> How did it go?
>
> I was going through my journal in 8.2-763. Browse and Paint open,
> accidentally started Read, suddenly the cursor stopped moving and XO
> completely unresponsive. I assume it's memory, but we never learned how
> to tell.
>
> Over two minutes later the first page of the PDF appeared and *then*
> immediately Sugar restarted.
>
> Just one datapoint.

Thanks, Read has serious memory problems because it renders whole pages
into memory, regardless of which area is being viewed. Any chance the
first pages of the PDF you opened contained big images?

Regards,

Tomeu
Re: Stability and Memory Pressure in 8.2
On September 8th, Michael Stone wrote:
> Kim, Greg, and I have concluded that the instability we experience under
> memory-pressure in 8.2-759 and similar is the single "hard" issue that
> we wish to _attempt_ to address before releasing 8.2 on current
> timeframes.

How did it go?

I was going through my journal in 8.2-763. Browse and Paint open,
accidentally started Read, suddenly the cursor stopped moving and XO
completely unresponsive. I assume it's memory, but we never learned how
to tell.

Over two minutes later the first page of the PDF appeared and *then*
immediately Sugar restarted.

Just one datapoint.

Yours sincerely,
--
=S Page
Re: Stability and Memory Pressure in 8.2
On Sun, Sep 14, 2008 at 6:42 AM, James Cameron <[EMAIL PROTECTED]> wrote:
> I recall someone noticed that the animated activity icon was redrawing
> the whole screen. I think it got fixed. Since it got fixed, I haven't
> seen as many OOMs during olpc-update.

It was not fixed.

http://dev.laptop.org/ticket/8000

Actually, we really should try to fix this one for 8.2.

Marco
Re: Stability and Memory Pressure in 8.2
On Thu, Sep 11, 2008 at 6:51 PM, C. Scott Ananian <[EMAIL PROTECTED]> wrote:
> On Thu, Sep 11, 2008 at 12:16 PM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
>> On Thu, Sep 11, 2008 at 5:30 PM, C. Scott Ananian <[EMAIL PROTECTED]> wrote:
>>> On Wed, Sep 10, 2008 at 8:13 AM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
>>>> On Wed, Sep 10, 2008 at 2:05 PM, James Cameron <[EMAIL PROTECTED]> wrote:
>>>>> But I did notice one odd thing that I wasn't fully aware of until now
>>>>> ... the byte-code of the built-in modules was present, complete with doc
>>>>> strings ... for example;
>>>>
>>>> Yes, we are aware of this one and have a fix on the line:
>>>>
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=460334
>>>>
>>>> There has been a thread recently on devel or sugar ml about it.
>>>>
>>>> If you could help us quantify how much this could help, it would be
>>>> much appreciated.
>>>
>>> Here's a quick reference to that previous thread:
>>> http://lists.laptop.org/pipermail/sugar/2008-August/007969.html
>>>
>>> I guess I meant to turn on -OO on joyride, but didn't quite get around
>>> to it; it would require patching/forking our numpy and python, and
>>> then tweaking the sugar-shell startup to use -OO. It looked like this
>>> would save ~6M, but I don't know yet how much extra NAND space it
>>> would take for the .pyo files. I might be able to experiment and make
>>> a build or two on the faster branch to quantify this.
>>
>> Would be great if you could look into it. I guess we could drop the
>> .pyc files and use the .pyo instead.
>
> http://dev.laptop.org/ticket/8431 now tracks the issue.
>
> I've started by putting appropriately patched versions of python and
> numpy into joyride, so you can experiment with -OO on a joyride image
> without having to worry about these particular bugs. I've confirmed
> that python 2.5.2 and numpy 1.2.0 already have/will have the relevant
> patches, so we probably won't need the fork by our next major release.
> --scott
>
> p.s. does anyone know why fedora isn't using python 2.5.2 yet? It was
> released in February '08; I'm surprised that it's not in F9 or F10.

I think the plan is to get one more 2.5.1 into rawhide with some new
patches (including the -OO fix) and then do a 2.5.2 rpm. If 2.5.2
brings more trouble than can be solved for F10, then we can go back
to 2.5.1+patches.

Regards,

Tomeu
Re: Stability and Memory Pressure in 8.2
On Thu, Sep 11, 2008 at 12:16 PM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
> On Thu, Sep 11, 2008 at 5:30 PM, C. Scott Ananian <[EMAIL PROTECTED]> wrote:
>> On Wed, Sep 10, 2008 at 8:13 AM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
>>> On Wed, Sep 10, 2008 at 2:05 PM, James Cameron <[EMAIL PROTECTED]> wrote:
>>>> But I did notice one odd thing that I wasn't fully aware of until now
>>>> ... the byte-code of the built-in modules was present, complete with doc
>>>> strings ... for example;
>>>
>>> Yes, we are aware of this one and have a fix on the line:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=460334
>>>
>>> There has been a thread recently on devel or sugar ml about it.
>>>
>>> If you could help us quantify how much this could help, it would be
>>> much appreciated.
>>
>> Here's a quick reference to that previous thread:
>> http://lists.laptop.org/pipermail/sugar/2008-August/007969.html
>>
>> I guess I meant to turn on -OO on joyride, but didn't quite get around
>> to it; it would require patching/forking our numpy and python, and
>> then tweaking the sugar-shell startup to use -OO. It looked like this
>> would save ~6M, but I don't know yet how much extra NAND space it
>> would take for the .pyo files. I might be able to experiment and make
>> a build or two on the faster branch to quantify this.
>
> Would be great if you could look into it. I guess we could drop the
> .pyc files and use the .pyo instead.

http://dev.laptop.org/ticket/8431 now tracks the issue.

I've started by putting appropriately patched versions of python and
numpy into joyride, so you can experiment with -OO on a joyride image
without having to worry about these particular bugs. I've confirmed
that python 2.5.2 and numpy 1.2.0 already have/will have the relevant
patches, so we probably won't need the fork by our next major release.
 --scott

p.s. does anyone know why fedora isn't using python 2.5.2 yet? It was
released in February '08; I'm surprised that it's not in F9 or F10.

--
 ( http://cscott.net/ )
Re: Stability and Memory Pressure in 8.2
On Thu, Sep 11, 2008 at 5:30 PM, C. Scott Ananian <[EMAIL PROTECTED]> wrote:
> On Wed, Sep 10, 2008 at 8:13 AM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
>> On Wed, Sep 10, 2008 at 2:05 PM, James Cameron <[EMAIL PROTECTED]> wrote:
>>> But I did notice one odd thing that I wasn't fully aware of until now
>>> ... the byte-code of the built-in modules was present, complete with doc
>>> strings ... for example;
>>
>> Yes, we are aware of this one and have a fix on the line:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=460334
>>
>> There has been a thread recently on devel or sugar ml about it.
>>
>> If you could help us quantify how much this could help, it would be
>> much appreciated.
>
> Here's a quick reference to that previous thread:
> http://lists.laptop.org/pipermail/sugar/2008-August/007969.html
>
> I guess I meant to turn on -OO on joyride, but didn't quite get around
> to it; it would require patching/forking our numpy and python, and
> then tweaking the sugar-shell startup to use -OO. It looked like this
> would save ~6M, but I don't know yet how much extra NAND space it
> would take for the .pyo files. I might be able to experiment and make
> a build or two on the faster branch to quantify this.

Would be great if you could look into it. I guess we could drop the
.pyc files and use the .pyo instead.

Thanks,

Tomeu
Re: Stability and Memory Pressure in 8.2
On Wed, Sep 10, 2008 at 8:13 AM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
> On Wed, Sep 10, 2008 at 2:05 PM, James Cameron <[EMAIL PROTECTED]> wrote:
>> But I did notice one odd thing that I wasn't fully aware of until now
>> ... the byte-code of the built-in modules was present, complete with doc
>> strings ... for example;
>
> Yes, we are aware of this one and have a fix on the line:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=460334
>
> There has been a thread recently on devel or sugar ml about it.
>
> If you could help us quantify how much this could help, it would be
> much appreciated.

Here's a quick reference to that previous thread:
http://lists.laptop.org/pipermail/sugar/2008-August/007969.html

I guess I meant to turn on -OO on joyride, but didn't quite get around
to it; it would require patching/forking our numpy and python, and
then tweaking the sugar-shell startup to use -OO. It looked like this
would save ~6M, but I don't know yet how much extra NAND space it
would take for the .pyo files. I might be able to experiment and make
a build or two on the faster branch to quantify this.
 --scott

--
 ( http://cscott.net/ )
Re: Stability and Memory Pressure in 8.2
On 10 Sep 2008, at 21:02, Michael Stone wrote:

> A more accurate test would be to disable the preloading itself rather
> than disabling isolation but leaving rainbow loading the libraries. :)

:-)

> To do that, see lines 31-32 of
>
> /usr/lib/python2.5/site-packages/rainbow/service.py
>
> You want to set self.preloader_hint = False and comment out the call
> to
> self.preload_common_modules() by putting '#' at the beginning.

OK, retested build 759 with above change, otherwise same test
procedure as before:

All five Activities launched, free buffers/cache reports 186Mb, an
increase of 15Mb relative to when the rainbow fork trick is active,
and 7Mb less than when /etc/olpc-security is just roughly removed.
Write-57 and Record-57 are still better off without the rainbow trick,
with Calculate-23, Paint-22, and Moon-4 gaining a slight memory
benefit from the fork trick.

(By the way the % figures shown are the %MEM as reported by top)

Write-57
        trick    -> 15.5% (RES=35m, SHR=13m, DATA=20m)
        no trick -> 13.5% (RES=31m, SHR=11m, DATA=18m)

Record-57
        trick    -> 14.2% (RES=32m, SHR=14m, DATA=64m)
        no trick -> 11.4% (RES=26m, SHR=12m, DATA=62m)

Calculate-23
        trick    -> 10.6% (RES=24m, SHR=8m, DATA=15m)
        no trick -> 11% (RES=25m, SHR=11m, DATA=13m)

Paint-22
        trick    -> 10.1% (RES=23m, SHR=8m, DATA=14m)
        no trick -> 10.4% (RES=23m, SHR=11m, DATA=11m)

Moon-4
        trick    -> 9.7% (RES=22m, SHR=8m, DATA=13m)
        no trick -> 10.4% (RES=23m, SHR=11m, DATA=11m)

/usr/bin/X
        trick    -> 7.5% (RES=17m, SHR=13m, DATA=9m)
        no trick -> 5.6% (RES=12m, SHR=9m, DATA=9m)

Any more for any more?

BTW: I still haven't spotted where the overall (~15Mb in this case)
savings of the rainbow fork trick are coming from.

--Gary
Re: Stability and Memory Pressure in 8.2
A more accurate test would be to disable the preloading itself rather
than disabling isolation but leaving rainbow loading the libraries. :)

To do that, see lines 31-32 of

  /usr/lib/python2.5/site-packages/rainbow/service.py

You want to set self.preloader_hint = False and comment out the call to
self.preload_common_modules() by putting '#' at the beginning.

Regards,

Michael
Re: Stability and Memory Pressure in 8.2
On 10 Sep 2008, at 08:27, Marco Pesenti Gritti wrote:

> On Wed, Sep 10, 2008 at 4:29 AM, Gary C Martin
> <[EMAIL PROTECTED]> wrote:
>> Well, I was hoping to see the numbers go the other way with the
>> rainbow fork trick sharing more module code between Activities. Could
>> be worse I guess – I should also test opening N instances of the same
>> Activity and see which way memory usage has moved in that scenario.
>
> Now that's worrying. Could you try to disable security (remove
> /etc/olpc-security)? That will kill the rainbow trick and comparing
> data should tell us if it's helping memory at all.

Sure, retested build 759 with and without /etc/olpc-security, same
test procedure as before:

With /etc/olpc-security removed, a reboot, and all five Activities
launched, free buffers/cache reports 21Mb more is being used (up to
192Mb from 171Mb). Though looking at each Activity's footprint shows a
less clear signal where Write-57 and Record-57 actually have a
considerably smaller footprint; and Calculate-23, Paint-22, and Moon-4
have a slightly larger footprint (and shared memory is actually
reported as having increased).

Write-57
        with    -> 15.5% (RES=35m, SHR=13m, DATA=20m)
        without -> 13.8% (RES=31m, SHR=12m, DATA=18m)

Record-57
        with    -> 14.2% (RES=32m, SHR=14m, DATA=64m)
        without -> 11.5% (RES=26m, SHR=11m, DATA=61m)

Calculate-23
        with    -> 10.6% (RES=24m, SHR=8m, DATA=15m)
        without -> 11.3% (RES=25m, SHR=11m, DATA=13m)

Paint-22
        with    -> 10.1% (RES=23m, SHR=8m, DATA=14m)
        without -> 10.6% (RES=24m, SHR=11m, DATA=12m)

Moon-4
        with    -> 9.7% (RES=22m, SHR=8m, DATA=13m)
        without -> 10.3% (RES=23m, SHR=11m, DATA=11m)

Also I noticed, for some reason, X uses 6-8Mb less resident memory
with /etc/olpc-security removed. That was unexpected enough for me to
re-check the results several times:

/usr/bin/X
        with    -> 7.5% (RES=17m, SHR=13m, DATA=9m)
        with    -> 7.5% (RES=17m, SHR=13m, DATA=9m)
        without -> 5.1% (RES=11m, SHR=8m, DATA=9m)
        without -> 4.3% (RES=9m, SHR=7m, DATA=9m)

Hmm, so what actually took up the extra 21Mb in total that the rainbow
trick does appear to be saving us (considering most of the above items
all add up as memory savings when disabling the rainbow trick)?

I seem to be generating more questions than answers here!

--Gary
Re: Stability and Memory Pressure in 8.2
My layman's understanding is that you can't execute in place from the
NAND flash on the XO. XIP requires NOR flash, which is more expensive
than NAND but has faster read speeds. This is mentioned briefly in the
axfs FAQ.

Storing some executables and libraries in a separate uncompressed
partition seems more plausible, but I can't speculate on the system
performance impact.

Thanks,
Nate

On Wed, Sep 10, 2008 at 9:38 AM, <[EMAIL PROTECTED]> wrote:

> On Wed, 10 Sep 2008, Tomeu Vizoso wrote:
>
> > On Wed, Sep 10, 2008 at 11:53 AM, John Gilmore <[EMAIL PROTECTED]> wrote:
> >>
> >> It may be possible and useful to store some commonly used executables
> >> and shared libraries as uncompressed files in jffs2, making them much
> >> faster to page back in from Flash. Nobody has tried doing this, as
> >> far as I know.
> >
> > Please, I would love to see this as well...
>
> not for this release, but would the axfs be an option in the future with
> it's execute in place capability for key files? or is the performance
> difference compared to ram such that you wouldn't want it in any case?
>
> either way, the profiling it does of which pages are used (and how much)
> could be useful in figuring out what binaries should be stored
> uncompressed.
>
> David Lang
Re: Stability and Memory Pressure in 8.2
On Wed, Sep 10, 2008 at 02:13:24PM +0200, Tomeu Vizoso wrote:
>Not from outside python, but from inside we are using heapy:
>
>http://guppy-pe.sourceforge.net/

Tomeu already published some guppy RPMs but here is a git repo with
packaging instructions (Makefiles) should you wish to make any changes.

  http://dev.laptop.org/git/users/mstone/heapy

(Look for the $(TARBALL) rule in Makefile.fedora if you want to change
anything; that Makefile is configured to pull directly from PyPI...)

Michael

Here are the resulting RPMs:

  http://dev.laptop.org/~mstone/releases/RPMS/guppy-0.1.8-1.fc9.i386.rpm
  http://dev.laptop.org/~mstone/releases/RPMS/guppy-debuginfo-0.1.8-1.fc9.i386.rpm
  http://dev.laptop.org/~mstone/releases/SRPMS/guppy-0.1.8-1.fc9.src.rpm
Re: Stability and Memory Pressure in 8.2
tomeu wrote:
 > On Wed, Sep 10, 2008 at 2:37 PM, riccardo <[EMAIL PROTECTED]> wrote:
 > > Paul,
 > >
 > > On Wed, 2008-09-10 at 08:18 -0400, [EMAIL PROTECTED] wrote:
 > >> i started down that path yesterday afternoon, and realized that it
 > >> wasn't clear to me how i needed to invoke it. it seems to want
 > >> to be imported before you start the rest of your program, which
 > >> sort of forces you into interactive mode. is that your understanding?
 > >> i had been hoping to be able to "attach" to the sugar shell process,
 > >> in the way one might do with gdb. perhaps that's not possible.
 > >>
 > >
 > > There is kick-start tutorial on how to use heapy's remote monitor at the
 > > 56th page of http://guppy-pe.sourceforge.net/heapy-thesis.pdf
 > >
 > > For the shell I use to put `import guppy.heapy.RM' before any other
 > > import statement in main.py.
 >
 > Another pointer:
 >
 > http://guppy-pe.sourceforge.net/heapy_Use.html#heapykinds.Use.monitor
 >
 > Other ways of using guppy are logging out periodically the heap with
 > gobject.timeout_add or patching keyhandler.py to print the heap (or a
 > diff of it) when a key combination is pressed.

thank you for both of those pointers.

paul
=-
 paul fox, [EMAIL PROTECTED]
Re: Stability and Memory Pressure in 8.2
On Wed, 10 Sep 2008, Tomeu Vizoso wrote:

> On Wed, Sep 10, 2008 at 11:53 AM, John Gilmore <[EMAIL PROTECTED]> wrote:
>>
>> It may be possible and useful to store some commonly used executables
>> and shared libraries as uncompressed files in jffs2, making them much
>> faster to page back in from Flash. Nobody has tried doing this, as
>> far as I know.
>
> Please, I would love to see this as well...

not for this release, but would axfs be an option in the future with
its execute-in-place capability for key files? or is the performance
difference compared to ram such that you wouldn't want it in any case?

either way, the profiling it does of which pages are used (and how much)
could be useful in figuring out what binaries should be stored
uncompressed.

David Lang
Re: Stability and Memory Pressure in 8.2
On Wed, Sep 10, 2008 at 2:37 PM, riccardo <[EMAIL PROTECTED]> wrote:
> Paul,
>
> On Wed, 2008-09-10 at 08:18 -0400, [EMAIL PROTECTED] wrote:
>> tomeu wrote:
>> > On Wed, Sep 10, 2008 at 2:05 PM, James Cameron <[EMAIL PROTECTED]> wrote:
>> >
>> > > Has anyone got an idea of how to measure the heap by usage?
>> >
>> > Not from outside python, but from inside we are using heapy:
>> >
>> > http://guppy-pe.sourceforge.net/
>>
>> i started down that path yesterday afternoon, and realized that it
>> wasn't clear to me how i needed to invoke it. it seems to want
>> to be imported before you start the rest of your program, which
>> sort of forces you into interactive mode. is that your understanding?
>> i had been hoping to be able to "attach" to the sugar shell process,
>> in the way one might do with gdb. perhaps that's not possible.
>>
>
> There is kick-start tutorial on how to use heapy's remote monitor at the
> 56th page of http://guppy-pe.sourceforge.net/heapy-thesis.pdf
>
> For the shell I use to put `import guppy.heapy.RM' before any other
> import statement in main.py.

Another pointer:

http://guppy-pe.sourceforge.net/heapy_Use.html#heapykinds.Use.monitor

Other ways of using guppy are periodically logging the heap with
gobject.timeout_add, or patching keyhandler.py to print the heap (or a
diff of it) when a key combination is pressed.

Regards,

Tomeu
Re: Stability and Memory Pressure in 8.2
Paul,

On Wed, 2008-09-10 at 08:18 -0400, [EMAIL PROTECTED] wrote:
> tomeu wrote:
> > On Wed, Sep 10, 2008 at 2:05 PM, James Cameron <[EMAIL PROTECTED]> wrote:
> >
> > > Has anyone got an idea of how to measure the heap by usage?
> >
> > Not from outside python, but from inside we are using heapy:
> >
> > http://guppy-pe.sourceforge.net/
>
> i started down that path yesterday afternoon, and realized that it
> wasn't clear to me how i needed to invoke it. it seems to want
> to be imported before you start the rest of your program, which
> sort of forces you into interactive mode. is that your understanding?
> i had been hoping to be able to "attach" to the sugar shell process,
> in the way one might do with gdb. perhaps that's not possible.
>

There is a kick-start tutorial on how to use heapy's remote monitor on
the 56th page of http://guppy-pe.sourceforge.net/heapy-thesis.pdf

For the shell, I usually put `import guppy.heapy.RM' before any other
import statement in main.py.

riccardo
Re: Stability and Memory Pressure in 8.2
tomeu wrote:
 > On Wed, Sep 10, 2008 at 2:05 PM, James Cameron <[EMAIL PROTECTED]> wrote:
 >
 > > Has anyone got an idea of how to measure the heap by usage?
 >
 > Not from outside python, but from inside we are using heapy:
 >
 > http://guppy-pe.sourceforge.net/

i started down that path yesterday afternoon, and realized that it
wasn't clear to me how i needed to invoke it. it seems to want
to be imported before you start the rest of your program, which
sort of forces you into interactive mode. is that your understanding?
i had been hoping to be able to "attach" to the sugar shell process,
in the way one might do with gdb. perhaps that's not possible.

btw, i continued doing monitoring of the machines i had running:
i need to look again after they've been running overnight when i get
to the office, but the growth i was seeing may be network related, as
tomeu suggested yesterday. (i had at least one case of no growth at
all when i had disabled the wireless.)

paul
=-
 paul fox, [EMAIL PROTECTED]
Re: Stability and Memory Pressure in 8.2
On Wed, Sep 10, 2008 at 2:05 PM, James Cameron <[EMAIL PROTECTED]> wrote:
> I had a few hours look at the second largest process, the journal
> activity, on Joyride 2412.
>
> Then I used gdb to generate-core-file and wander through the heap memory
> to get an idea of what it might contain. I did not make a complete
> analysis. I need to learn more about the heap structures before I do
> so.

Wonder if this is what people familiar with python internals do?
Haven't tried myself.

> But I did notice one odd thing that I wasn't fully aware of until now
> ... the byte-code of the built-in modules was present, complete with doc
> strings ... for example;

Yes, we are aware of this one and have a fix on the line:

https://bugzilla.redhat.com/show_bug.cgi?id=460334

There has been a thread recently on devel or sugar ml about it.

If you could help us quantify how much this could help, it would be
much appreciated.

> Has anyone got an idea of how to measure the heap by usage?

Not from outside python, but from inside we are using heapy:

http://guppy-pe.sourceforge.net/

Thanks,

Tomeu
Re: Stability and Memory Pressure in 8.2
I had a few hours look at the second largest process, the journal
activity, on Joyride 2412.

    VmPeak:    40440 kB
    VmSize:    40436 kB
    VmLck:         0 kB
    VmHWM:     28824 kB
    VmRSS:     28824 kB
    VmData:    11632 kB
    VmStk:       172 kB
    VmExe:         4 kB
    VmLib:     21992 kB
    VmPTE:        48 kB

so it costs 29Mb or so of RSS, most of which is presumably shared.
This is confirmed by smaps, which showed 9Mb or so used by heap. That
was the main memory cost, so I concentrated on it. It was the largest
Private_Dirty.

    0824f000-08be9000 rw-p 0824f000 00:00 0          [heap]
    Size:               9832 kB
    Rss:                9608 kB
    Pss:                9608 kB
    Shared_Clean:          0 kB
    Shared_Dirty:          0 kB
    Private_Clean:         0 kB
    Private_Dirty:      9608 kB
    Referenced:         9608 kB

For a while I tried working with /proc/$PID/mem until I figured it just
would not work, always got ESRCH on read(2), and mem_read showed I could
only do it to processes that are children of my process. Odd, so
abandoned that method.

Then I used gdb to generate-core-file and wander through the heap memory
to get an idea of what it might contain. I did not make a complete
analysis. I need to learn more about the heap structures before I do
so.

But I did notice one odd thing that I wasn't fully aware of until now
... the byte-code of the built-in modules was present, complete with doc
strings ... for example;

    (gdb) x/4bs 0x824f78c
    0x824f78c: "int(x[, base]) -> integer\n\nConvert a string or number to an integer, if possible. A floating point\nargument will be truncated towards zero (this does not include a string\nrepresentation of a floating"...
    0x824f854: " point number!) When converting a string, use\nthe optional base. It is an error to supply a base when converting a\nnon-string. If the argument is outside the integer range a long object\nwill be retu"...
    0x824f91c: "rned instead."
    0x824f92a: ""

and not just once, twice:

    $ strings journal.core |grep "supply a base when converting"
    the optional base. It is an error to supply a base when converting a
    the optional base. It is an error to supply a base when converting a

Has anyone got an idea of how to measure the heap by usage?

--
James Cameron    mailto:[EMAIL PROTECTED]     http://quozl.netrek.org/
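The "largest Private_Dirty" step described in the message above was done by eye. A minimal sketch of automating it is shown below, assuming smaps text in the layout quoted in this thread. The helper names `parse_smaps` and `largest_private_dirty` are made up for illustration, and `SAMPLE` is trimmed from figures that appear in the thread (the heap entry is from the journal process, the libcairo entry from a different process) purely to have two mappings to compare.

```python
import re

# An smaps mapping header starts with "start-end" in hex; field lines do not.
ADDRESS_RANGE = re.compile(r"^[0-9a-f]+-[0-9a-f]+$")

# Illustrative input only, trimmed from figures quoted in this thread.
SAMPLE = """\
0824f000-08be9000 rw-p 0824f000 00:00 0 [heap]
Size: 9832 kB
Rss: 9608 kB
Private_Dirty: 9608 kB
b6101000-b6103000 rw-p 0006d000 1f:00 3391 /usr/lib/libcairo.so.2.17.5
Size: 8 kB
Rss: 8 kB
Private_Dirty: 8 kB
"""


def parse_smaps(text):
    """Return one dict per mapping, holding its name, perms and kB fields."""
    mappings = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        if ADDRESS_RANGE.match(tokens[0]):
            # Header: range perms offset dev inode [path or tag such as [heap]]
            name = " ".join(tokens[5:]) or "[anon]"
            mappings.append({"range": tokens[0], "perms": tokens[1],
                             "name": name})
        elif mappings and tokens[0].endswith(":") and tokens[1].isdigit():
            # Field line such as "Private_Dirty: 9608 kB"
            mappings[-1][tokens[0][:-1]] = int(tokens[1])
    return mappings


def largest_private_dirty(mappings):
    """Pick the mapping with the most private, modified memory."""
    return max(mappings, key=lambda m: m.get("Private_Dirty", 0))
```

On a live system the input would come from reading `/proc/<pid>/smaps` for the process of interest, which generally requires permission to inspect that process.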
Re: Stability and Memory Pressure in 8.2
On Wed, 2008-09-10 at 09:15 +0200, Marco Pesenti Gritti wrote:
> On Wed, Sep 10, 2008 at 4:29 AM, Gary C Martin <[EMAIL PROTECTED]> wrote:
> >
> > OK, news is not great on the Activity front...
> >
> > SUMMARY: 759 vs 711 each Activity instance in 759 consumes an average
> > of 1Mb more memory than the same Activity running in 711, with
> > Write-57 reportedly taking significantly more than that (perhaps ~7Mb).
> >
> > Is top and/or ps memory usage calculated in the same way between these
> > builds? Could make collecting real data pretty painful.
>
> Unfortunately not, there has been changes in the kernel. My
> understanding is that private memory will be the same, while
> calculation of shared memory has changed. Riccardo has a new kernel
> somewhere with instructions on how to install it on 711. That should
> make the memory usage comparable.
>
> Marco

I used this newer kernel (as it accounts also for pss) for measurements
with ps_mem on build 703:

http://dev.laptop.org/~rlucchese/utils/703/kernel-2.6.25-20080501.3.olpc.231c7b715f4a8d0.i586.rpm

It can be installed on the xo with:

$ rpm -ivh kernel-rpm
$ cp -a /boot/* /versions/boot/current/boot/

You will also have to update the ram disk image; you can follow the
instructions at the bottom of http://wiki.laptop.org/go/Kernel_Building

You may also want to try this patched ps_mem (shows pids and doesn't
group entries by process name):

http://dev.laptop.org/~rlucchese/utils/ps_mem.py

riccardo
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 9, 2008 at 6:10 AM, Michael Stone <[EMAIL PROTECTED]> wrote:
>
> * We need to check carefully for memory-leaks. Three mechanisms which
>   occur to me include:

Looks like we have regressed on http://dev.laptop.org/ticket/5532 .

Just entered http://dev.laptop.org/ticket/8394 because most of the
details on the older ticket aren't relevant any more. It contains a
fix.

This means we leak 20KB per buddy, so 10 buddies appearing per hour
would match Paul's observations.

We should set up automated tests and check whether we still leak. How
could we resource this?

Regards,

Tomeu
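To put the leak rate mentioned above in perspective, here is the arithmetic as a tiny sketch. The 20 KB per buddy and 10 buddies per hour come from the message; the function name and the 24-hour horizon are assumptions chosen only for illustration.

```python
def leaked_kb(kb_per_buddy, buddies_per_hour, hours):
    """Total memory leaked, in kB, if every new buddy leaks a fixed amount."""
    return kb_per_buddy * buddies_per_hour * hours


per_hour = leaked_kb(20, 10, 1)    # 200 kB leaked each hour
per_day = leaked_kb(20, 10, 24)    # 4800 kB, a little under 5 MB per day
```

On a machine where a few tens of megabytes separate normal operation from memory pressure, growth of that order over a day or two is not negligible.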
Re: Stability and Memory Pressure in 8.2
On Wed, Sep 10, 2008 at 11:53 AM, John Gilmore <[EMAIL PROTECTED]> wrote: > > It may be possible and useful to store some commonly used executables > and shared libraries as uncompressed files in jffs2, making them much > faster to page back in from Flash. Nobody has tried doing this, as > far as I know. Please, I would love to see this as well... Thanks, Tomeu ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
When measuring memory usage, "cat /proc/XXX/smaps" provides the most accurate info available (as far as I know), and produces directly comparable results in all OLPC software releases. XXX is the process number you're examining (first column of "ps" output).

The smaps file also tells you how many of the pages in each allocation are actually *resident* in memory at this instant; how many are uniquely used by this process (versus shared with other processes); and how many of them are *dirty* (written by the process). It also includes the info that the "pmap" command produces (from /proc/XXX/maps).

E.g. part of the output for the sugar-shell process includes:

b6094000-b6101000 r-xp 1f:00 3391 /usr/lib/libcairo.so.2.17.5
Size:            436 kB
Rss:             244 kB
Pss:              70 kB
Shared_Clean:    244 kB
Shared_Dirty:      0 kB
Private_Clean:     0 kB
Private_Dirty:     0 kB
Referenced:      244 kB
b6101000-b6103000 rw-p 0006d000 1f:00 3391 /usr/lib/libcairo.so.2.17.5
Size:              8 kB
Rss:               8 kB
Pss:               8 kB
Shared_Clean:      0 kB
Shared_Dirty:      0 kB
Private_Clean:     0 kB
Private_Dirty:     8 kB
Referenced:        8 kB

This says that there are two parts of the libcairo shared library mapped into the process's address space. One is 436 kB and is readable and executable (it's the code segment). Of that 436k, 244k is currently resident in memory (Rss), all of that 244k is shared with other processes and is clean (not written to). The second part of libcairo is only 8k bytes long, it's read/write and not executable (it's the data segment and the BSS segment), all 8k has been read into RAM, all 8k is private (not shared with other processes) and dirty (has been modified by the process).

There's a lot more info in there too, such as what virtual addresses these things are mapped into, and from what offset within the file. The "size", "nm", and "objdump -h" commands from "yum install binutils" will help you compare these offsets back into the compiler output and thus into the source code.

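The per-mapping fields in smaps can be totalled for a whole process with a short script. What follows is a minimal sketch, assuming the "Name: N kB" field layout shown in the excerpt above; the function names are hypothetical and this is not an existing OLPC tool.

```python
import re
from collections import defaultdict

# Matches field lines such as "Private_Dirty:     8 kB".
# Mapping header lines (address ranges) do not match and are skipped.
FIELD_RE = re.compile(r"^(\w+):\s*(\d+) kB$")


def sum_smaps(text):
    """Return a dict mapping each smaps field name to its total in kB."""
    totals = defaultdict(int)
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            totals[match.group(1)] += int(match.group(2))
    return dict(totals)


def smaps_totals_for_pid(pid):
    """Read /proc/PID/smaps for a running process and total its fields."""
    with open("/proc/%s/smaps" % pid) as f:
        return sum_smaps(f.read())
```

Comparing the Private_Dirty total between builds is probably the most useful number here, since those are the pages the XO cannot discard.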
Dirty memory is particularly pernicious on the XO, since the XO has nowhere to keep it except in RAM. On normal Linux systems, when dirty pages are not expected to be used again soon, they can be paged out to the "swap space" on disk. On the XO, which has no swap space, those pages burn up RAM permanently, even if the process goes to sleep for a year and never wakes up again. The dirty memory is released only when the process exits (or when the process explicitly unmaps it, which seldom happens). These long-term dirty pages produce more memory pressure (less available memory) for all the other processes that are actually active and getting work done for the user. When a Linux system gets memory pressure, it tosses out whatever it can to swap space (nothing, on the XO) and then it starts throwing away pages that it knows it can later re-read from the file system. The resident, clean pages that are mapped from files are what get thrown away (like the first segment of libcairo above). When the XO runs low on memory, this means that it throws away a lot of pages containing executable code. If the code on those pages is subsequently executed, those pages will be read back in from the file system. Note that reading in a 4k page from the JFFS2 compressed filesystem is not a cheap operation; a lot of system CPU time goes into decompressing it (compared to an ordinary Linux system with a hard disk and an ext3 filesystem). Throwing away code pages and then immediately reading them back in again, over and over, "thrashing", may be why the XO gets very slow when memory is tight. It may be possible and useful to store some commonly used executables and shared libraries as uncompressed files in jffs2, making them much faster to page back in from Flash. Nobody has tried doing this, as far as I know. I don't know how to instrument the kernel virtual memory subsystem to gain visibility into which pages are being discarded when, and which are being read in later. 
I think that info would be extremely useful for debugging 8.2's low-memory hangs. Thrashing would become obvious if you see the same pages being read in over and over. Of course, when the machine is thrashing, it's hard to see any output from its kernel... John ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
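One coarse way to look for thrashing from user space, without instrumenting the kernel, is to watch the system-wide major page fault counter. The sketch below assumes the kernel exposes a pgmajfault line in /proc/vmstat; it shows only how often pages are being read back in, not which pages, so it does not replace the instrumentation John describes. The function names are hypothetical.

```python
import time


def parse_vmstat(text):
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def major_fault_rate(before, after, seconds):
    """Major page faults per second between two parsed snapshots."""
    return (after["pgmajfault"] - before["pgmajfault"]) / float(seconds)


def watch(interval=10):
    """Print the major fault rate every `interval` seconds until killed."""
    with open("/proc/vmstat") as f:
        previous = parse_vmstat(f.read())
    while True:
        time.sleep(interval)
        with open("/proc/vmstat") as f:
            current = parse_vmstat(f.read())
        print("%.1f major faults/s" % major_fault_rate(previous, current, interval))
        previous = current
```

A sustained high rate while free memory is near zero would be consistent with code pages being discarded and re-read, though it would not by itself prove it.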
Re: Stability and Memory Pressure in 8.2
On Wed, Sep 10, 2008 at 4:29 AM, Gary C Martin <[EMAIL PROTECTED]> wrote: > Well, I was hoping to see the numbers go the other way with the > rainbow fork trick sharing more module code between Activities. Could > be worse I guess – I should also test opening N instances of the same > Activity and see which way memory usage has moved in that scenario. Now that's worrying. Could you try to disable security (remove /etc/olpc-security)? That will kill the rainbow trick and comparing data should tell us if it's helping memory at all. Marco ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Wed, Sep 10, 2008 at 4:29 AM, Gary C Martin <[EMAIL PROTECTED]> wrote: > > OK, news is not great on the Activity front... > > SUMMARY: 759 vs 711 each Activity instance in 759 consumes an average > of 1Mb more memory than the same Activity running in 711, with > Write-57 reportedly taking significantly more than that (perhaps ~7Mb). > > Is top and/or ps memory usage calculated in the same way between these > builds? Could make collecting real data pretty painful. Unfortunately not, there has been changes in the kernel. My understanding is that private memory will be the same, while calculation of shared memory has changed. Riccardo has a new kernel somewhere with instructions on how to install it on 711. That should make the memory usage comparable. Marco ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On 10 Sep 2008, at 00:11, Gary C Martin wrote:
> SUMMARY: 759 vs 711 is only eating an extra ~16Mb of ram after a clean
> boot (no running Activities)
>
>
> **I'll try and test several Activity versions that can run on both
> builds and see how their individual resources have changed, will post
> later.

OK, news is not great on the Activity front...

SUMMARY: 759 vs 711 each Activity instance in 759 consumes an average of 1Mb more memory than the same Activity running in 711, with Write-57 reportedly taking significantly more than that (perhaps ~7Mb).

Is top and/or ps memory usage calculated in the same way between these builds? Could make collecting real data pretty painful.

Tests were taken after clean reboots and allowing things to settle (~5min); five activities were launched in order Moon-4, Write-57, Record-57, Paint-20, and Calculate-23; Journal was made the current Activity and the view was switched to home; data collected via a remote ssh session.

Wanted to test Browse as it's a known memory eater (well most browsers are), but will need to dig out the most recent version that works with 711 for a reasonable comparison.

With all five Activities launched, free buffers/cache reported 5m more memory was being used under 759. Looking at each Activity's foot print shows 759 all having less shared memory, and more resident and data memory.

Write-57
    759 -> 15.5% (RES=35m, SHR=13m, DATA=20m)
    711 -> 12.4% (RES=28m, SHR=15m, DATA=11m)

Record-57
    759 -> 14.2% (RES=32m, SHR=14m, DATA=64m)
    711 -> 13.1% (RES=30m, SHR=16m, DATA=61m)

Calculate-23
    759 -> 10.6% (RES=24m, SHR=8m, DATA=15m)
    711 -> 10.1% (RES=23m, SHR=10m, DATA=11m)

Paint-20
    759 -> 10.1% (RES=23m, SHR=8m, DATA=14m)
    711 ->  9.6% (RES=22m, SHR=10m, DATA=10m)

Moon-4
    759 ->  9.7% (RES=22m, SHR=8m, DATA=13m)
    711 ->  9.2% (RES=21m, SHR=11m, DATA=10m)

Well, I was hoping to see the numbers go the other way with the rainbow fork trick sharing more module code between Activities.
Could be worse I guess. I should also test opening N instances of the same Activity and see which way memory usage has moved in that scenario.

--Gary

P.S. Nobody spotted my intentional 771 mistake in the last email; it was obviously meant to be 711 :)
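The per-Activity differences in Gary's figures can be worked out mechanically. The sketch below simply copies the RES/SHR/DATA values (in MB, as reported by top) from the message above and subtracts build 711 from build 759; the names MEASUREMENTS, deltas and mean are made up for this illustration.

```python
# (RES, SHR, DATA) in MB as reported by top, copied from the message above.
MEASUREMENTS = {
    "Write-57":     {"759": (35, 13, 20), "711": (28, 15, 11)},
    "Record-57":    {"759": (32, 14, 64), "711": (30, 16, 61)},
    "Calculate-23": {"759": (24, 8, 15),  "711": (23, 10, 11)},
    "Paint-20":     {"759": (23, 8, 14),  "711": (22, 10, 10)},
    "Moon-4":       {"759": (22, 8, 13),  "711": (21, 11, 10)},
}


def deltas(measurements, new="759", old="711"):
    """Return {activity: (dRES, dSHR, dDATA)} computed as new minus old."""
    result = {}
    for activity, builds in measurements.items():
        result[activity] = tuple(n - o for n, o in zip(builds[new], builds[old]))
    return result


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    values = list(values)
    return sum(values) / float(len(values))


for name, (d_res, d_shr, d_data) in sorted(deltas(MEASUREMENTS).items()):
    print("%-13s RES %+dm  SHR %+dm  DATA %+dm" % (name, d_res, d_shr, d_data))
```

On these figures the mean RES increase is 2.4 MB across all five Activities and 1.25 MB when Write-57 is left out, while SHR drops by 2 to 3 MB in every case. top rounds to whole megabytes, so the small differences should be read with that in mind.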
Re: Stability and Memory Pressure in 8.2
On 9 Sep 2008, at 05:10, Michael Stone wrote: > * We need to determine why we encounter low-memory and out-of-memory >situations more frequently than in previous releases. > >- This means that we need to measure how our memory consumption > profile has changed since our previous releases. > > (cscott observes that we were unable to attack the F-9 image size > issues until we were able to quantify the effect of changes we > had > made or were considering making. Consequently, he suggests that > we > will be unable to attack our current space consumption problems > until we are able to generate good numbers (and displays).) > >- We need to think carefully about (or measure) whether our > memory-consumption patterns have changed. SUMMARY: 759 vs 711 is only eating an extra ~16Mb of ram after a clean boot (no running Activities) Just some very quick general observations between build 771 and build 759 running on XO hardware. Tests were taken after clean reboots; allowing things to settle (~5min before collecting stats); with no Activities or UI use (data collected via a remote ssh session); jabber server was set to an unreachable name and no local salute buddies; net connection was to an AP, with about ~4 other APs visible in my neighbourhood. Using free, the reported buffers/cache is generally the more interesting value. After a clean boot 759 is now using an extra 16Mb (up to 115Mb). The reported total has gone up 80k, so I guess the kernel is a little smaller :-) The reported mem free is down by 8Mb (down to 47Mb) indicating better use of available memory (caches went up by that same ~8Mb, plus extra some buffers by 100K). As far as processes are concerned "/usr/bin/sugar-shell" is initially the most hungry, 711 it starts out at 12.2% of total used (RES=28m, SHR =12m, DATA=15m). For 759 it's gone up to 14.3% (RES=33m, SHR=14m, DATA=18m). Working down the list journal is next, 711 starts out at 8.7% of total used (RES=20m, SHR=10m, DATA=1m). 
For 759 it's gone up to 10.1% (RES=23m, SHR=11m, DATA=11m). These figures are with an empty journal due to the break in compatibility when switching between these builds :-( Next for 759 is more interesting as it reflects the changes to rainbow and (I assume) the pre-loading of commonly used modules for Activity efficiency (I need to test Activity usage changes separately from this email**). So "/usr/sbin/rainbow-daemon" for 711 is just 3.1% (RES=7m, SHR=1m, DATA=6m), while 759 is up at 9.6% (RES=22m, SHR=10m, DATA=11m). Other processes such as "/usr/bin/datastore-service", and "/usr/bin/ sugar-presence-service", have grown slightly by small amounts, "/usr/ bin/sugar-shell-service" has shrunk slightly - nothing exciting. FWIW: The pmap tool seems like it might show interesting data for comparisons (lists where a specific PIDs memory is going at a library level). Most of the interesting stuff is hidden in [ anon ] blocks, but knowing all the libs referenced and their size should be of use. Have been experimenting a little with a script to collect and compare data for all processes between builds - need to find a clear way to visualise the results in a useful (not an 'oh my god spiders with pens are attacking') way. **I'll try and test several Activity versions that can run on both builds and see how their individual resources have changed, will post later. --Gary ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
numpy will still work fine for activities. Marco On Tue, Sep 9, 2008 at 10:52 PM, Mikus Grinbergs <[EMAIL PROTECTED]> wrote: >> Remove numpy usage from the shell > > I have not been following this thread - but: > > There were several Activities (not just Measure) which used > 'numeric'. Then 'numeric' was removed from the builds. I don't > know what those Activities are using now. My concern is that if > they happened to switch to using 'numpy' in place of using 'numeric' > then "no numpy" might also cause ripples. > > mikus > > ___ > Devel mailing list > Devel@lists.laptop.org > http://lists.laptop.org/listinfo/devel > ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
> Remove numpy usage from the shell I have not been following this thread - but: There were several Activities (not just Measure) which used 'numeric'. Then 'numeric' was removed from the builds. I don't know what those Activities are using now. My concern is that if they happened to switch to using 'numpy' in place of using 'numeric' then "no numpy" might also cause ripples. mikus ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 09, 2008 at 05:10:41PM +, Deepak Saxena wrote:
> > * We need to find out why the oom-killer is not killing things fast
> >   enough. Based on our results, we might consider configuring
> >   /proc/$pid/oom_adj to preferentially kill some processes (e.g., the
> >   foreground [or background?] activities.)
>
> In the cases I've been playing with, browse is the only activity that
> is running. Will try bumping its oom_adj to see if this improves OOM
> kill latency.

Did 'echo 15 > /proc/pid/oom_adj' and this does not help much. The system starts getting laggy at the point we reach about 3M remaining memory (according to top), but the OOM killer does not actually kick in until we fail an allocation, which happens some time later. We need to capture what is happening at the kernel level during this window, though I don't think that fixing this at the OOM killer layer is doable for 8.2.

> I have yet to see an actual deadlock. What I saw when trying to
> reproduce #3816 is that the OOM killer just takes a very very long
> time to kick in.
>
> > - whether our kernel "overcommits" when allocation requests are made?
>
> By default vm.overcommit_memory is set to 0 which will refuse "Obvious
> overcommits of address space". I will try setting this to 3 along with
> vm.overcommit_ratio to 0 to force no overcommit at all and see how the
> system reacts.

This didn't quite do what I expected as I misread the docs. If we set overcommit_ratio=100 and overcommit_memory=3, the kernel will not overcommit memory, and we end up either with Browse crashing "gracefully" w/o bogging down the whole system, or with Browse just "gracefully" ignoring any user input in the address bar, probably due to a failed allocation of some sort when creating a new webpage instance.

~Deepak
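To see how close a machine is to the no-overcommit limit, the limit can be estimated from /proc/meminfo. This sketch assumes the formula the kernel documentation gives for strict accounting (which it describes as vm.overcommit_memory=2): commit limit = swap + RAM * overcommit_ratio / 100, ignoring huge pages. With no swap and a ratio of 100 the limit is simply the size of RAM. The function names are hypothetical, and any figures fed to them in testing are illustrative, not measurements from an XO.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def commit_limit_kb(meminfo, overcommit_ratio):
    """Estimated strict-accounting commit limit in kB."""
    ram_share = meminfo["MemTotal"] * overcommit_ratio // 100
    return meminfo.get("SwapTotal", 0) + ram_share


def commit_headroom_kb(meminfo, overcommit_ratio):
    """kB of address space that can still be committed before allocations fail."""
    return commit_limit_kb(meminfo, overcommit_ratio) - meminfo["Committed_AS"]
```

Logging the headroom alongside free memory while reproducing the Browse failures would show whether the refused allocations line up with the limit being reached.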
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 9, 2008 at 4:01 PM, Marco Pesenti Gritti <[EMAIL PROTECTED]> wrote: > A couple of low risk fixes which could save ~6 mb at startup: > > Remove numpy usage from the shell > http://dev.laptop.org/ticket/8372 > (has patch) > > gst usage in the shell wastes 2.6mb > http://dev.laptop.org/ticket/8375 These seem obvious and low risk. +1 from me. (We should be careful to test the numpy removal w/ differing locales, to ensure #5559 doesn't regress.) --scott -- ( http://cscott.net/ ) ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
A couple of low risk fixes which could save ~6 mb at startup: Remove numpy usage from the shell http://dev.laptop.org/ticket/8372 (has patch) gst usage in the shell wastes 2.6mb http://dev.laptop.org/ticket/8375 Marco ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 09, 2008 at 03:13:28PM -0400, [EMAIL PROTECTED] wrote:
> [759 sugar shell leak] seems more like 4.5 MB/hour.

joyride-2399 sitting back at home with no activities, doing nothing all day:

-bash-3.2# uptime
 18:14:19 up 20:46, 8 users, load average: 0.15, 0.09, 0.12
-bash-3.2# /home/olpc/bin/ps_mem.py | grep python
 70.1 MiB + 6.6 MiB = 76.7 MiB python (5)

[...time passes...]

-bash-3.2# uptime
 19:52:08 up 22:24, 8 users, load average: 0.08, 0.07, 0.01
-bash-3.2# /home/olpc/bin/ps_mem.py | grep python
 70.3 MiB + 6.6 MiB = 76.8 MiB python (5)

> paul

Martin
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 9, 2008 at 9:24 PM, <[EMAIL PROTECTED]> wrote: > tomeu wrote: > > On Tue, Sep 9, 2008 at 9:13 PM, <[EMAIL PROTECTED]> wrote: > > > (there are a lot of variables in play here -- the main thing is > > > that something's certainly leaking.) > > > > The shell shouldn't be doing anything while idle, so checking if the > > trigger is activity network would help here. > > point of reference: on irc you mentioned the buddy list had > been an issue in the past. does the sugar shell maintain that > even when that screen isn't visible? Some history: http://dev.laptop.org/ticket/5532 Info about buddies is permanently stored in the presence service, in the PS wrapper in the sugar shell and in the view. None of this data gets released due to the user switching views. Regards, Tomeu ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
tomeu wrote: > On Tue, Sep 9, 2008 at 9:13 PM, <[EMAIL PROTECTED]> wrote: > > (there are a lot of variables in play here -- the main thing is > > that something's certainly leaking.) > > The shell shouldn't be doing anything while idle, so checking if the > trigger is activity network would help here. point of reference: on irc you mentioned the buddy list had been an issue in the past. does the sugar shell maintain that even when that screen isn't visible? paul =- paul fox, [EMAIL PROTECTED] ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 9, 2008 at 9:13 PM, <[EMAIL PROTECTED]> wrote:
> i wrote:
> >
> > i think there's definitely a sugar shell leak. here's some
> > partial data, gathered from a few machines on my desk right now.
> >
> > (be careful with the column headings -- i rearranged partway through
> > to get separate CODE and DATA columns.)
> >
> > (also, don't do an absolute compare between the 708 build and the
> > 759 build -- the latter is chock full of activites, the former
> > has none at all.)
> >
> >
> > build 708:
> > top - 17:45:17 up 59 min, 3 users, load average: 0.03, 0.05, 0.01
> >   PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
> >  1741 olpc  15  0 53128  27m  13m    4  14m 12.2 python
> >
> > same build 708, roughly twenty minutes later:
> > top - 18:03:16 up 1:17, 3 users, load average: 0.01, 0.01, 0.00
> >   PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
> >  1741 olpc  15  0 53308  28m  13m    4  14m 12.3 python
>
> another hour later on 708:
> top - 19:06:19 up 2:21, 3 users, load average: 0.00, 0.00, 0.00
>   PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
>  1741 olpc  15  0 53576  28m  13m    4  15m 12.3 python
>
> call it 200 KB/hour?
>
> >
> > build 759:
> > top - 12:20:00 up 39 min, 4 users, load average: 0.00, 0.06, 0.11
> >   PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
> >  1461 olpc  20  0 60576  33m  14m S  0.3 14.5 0:48.38 python
> >
> > same build 759, almost two hours later:
> > top - 14:04:11 up 2:23, 3 users, load average: 0.04, 0.06, 0.08
> >   PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
> >  1461 olpc  20  0 65964  38m  14m    4  23m 16.7 python
>
> and another hour on 759:
>
> top - 15:07:25 up 3:27, 3 users, load average: 0.00, 0.04, 0.02
>   PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
>  1461 olpc  20  0 70468  42m  14m    4  28m 18.6 python
>
> seems more like 4.5 MB/hour.
>
> (there are a lot of variables in play here -- the main thing is
> that something's certainly leaking.)

The shell shouldn't be doing anything while idle, so checking whether network activity is the trigger would help here.

Thanks,

Tomeu
Re: Stability and Memory Pressure in 8.2
i wrote:
>
> i think there's definitely a sugar shell leak. here's some
> partial data, gathered from a few machines on my desk right now.
>
> (be careful with the column headings -- i rearranged partway through
> to get separate CODE and DATA columns.)
>
> (also, don't do an absolute compare between the 708 build and the
> 759 build -- the latter is chock full of activites, the former
> has none at all.)
>
>
> build 708:
> top - 17:45:17 up 59 min, 3 users, load average: 0.03, 0.05, 0.01
>   PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
>  1741 olpc  15  0 53128  27m  13m    4  14m 12.2 python
>
> same build 708, roughly twenty minutes later:
> top - 18:03:16 up 1:17, 3 users, load average: 0.01, 0.01, 0.00
>   PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
>  1741 olpc  15  0 53308  28m  13m    4  14m 12.3 python

another hour later on 708:
top - 19:06:19 up 2:21, 3 users, load average: 0.00, 0.00, 0.00
  PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
 1741 olpc  15  0 53576  28m  13m    4  15m 12.3 python

call it 200 KB/hour?

>
> build 759:
> top - 12:20:00 up 39 min, 4 users, load average: 0.00, 0.06, 0.11
>   PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
>  1461 olpc  20  0 60576  33m  14m S  0.3 14.5 0:48.38 python
>
> same build 759, almost two hours later:
> top - 14:04:11 up 2:23, 3 users, load average: 0.04, 0.06, 0.08
>   PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
>  1461 olpc  20  0 65964  38m  14m    4  23m 16.7 python

and another hour on 759:

top - 15:07:25 up 3:27, 3 users, load average: 0.00, 0.04, 0.02
  PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
 1461 olpc  20  0 70468  42m  14m    4  28m 18.6 python

seems more like 4.5 MB/hour.

(there are a lot of variables in play here -- the main thing is that something's certainly leaking.)

paul
=-
 paul fox, [EMAIL PROTECTED]
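The growth rates can be estimated directly from the top samples in this message. The sketch below takes the uptime (converted to minutes) and the VIRT column (in KB) for the sugar shell on each build and divides the total growth by the elapsed time; the helper name and the choice of VIRT over RES or DATA are assumptions made for this illustration.

```python
def growth_kb_per_hour(samples):
    """Average growth between the first and last sample.

    samples: list of (uptime_minutes, virt_kb) tuples, oldest first.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) * 60.0 / (t1 - t0)


# (uptime in minutes, VIRT in KB) for the python (sugar shell) process,
# copied from the top output above.
BUILD_708 = [(59, 53128), (77, 53308), (141, 53576)]
BUILD_759 = [(39, 60576), (143, 65964), (207, 70468)]

for name, samples in (("708", BUILD_708), ("759", BUILD_759)):
    print("build %s: %.0f KB/hour" % (name, growth_kb_per_hour(samples)))
```

These end-to-end averages are rough and need not equal the by-eye estimates in the message, which look at individual intervals. The point is the roughly tenfold gap between the two builds.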
Re: Stability and Memory Pressure in 8.2
c. scott ananian wrote:
> On Tue, Sep 9, 2008 at 7:02 AM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
> > stability issue? AFAIK, we haven't seen OOM conditions without any
> > activity open.
>
> Yes, we have. In particular, if you update your system and then leave
> it for a while, and later click the software update control panel, you
> end up OOMing in the control panel. Sugar restarts and reports are
> that software update "works fine the second time". So this might well
> be a sugar leak; killing 'sugar' is not good for stability.

i think there's definitely a sugar shell leak. here's some partial data, gathered from a few machines on my desk right now.

(be careful with the column headings -- i rearranged partway through to get separate CODE and DATA columns.)

(also, don't do an absolute compare between the 708 build and the 759 build -- the latter is chock full of activities, the former has none at all.)

build 708:
top - 17:45:17 up 59 min, 3 users, load average: 0.03, 0.05, 0.01
  PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
 1741 olpc  15  0 53128  27m  13m    4  14m 12.2 python

same build 708, roughly twenty minutes later:
top - 18:03:16 up 1:17, 3 users, load average: 0.01, 0.01, 0.00
  PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
 1741 olpc  15  0 53308  28m  13m    4  14m 12.3 python

build 759:
top - 12:20:00 up 39 min, 4 users, load average: 0.00, 0.06, 0.11
  PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 1461 olpc  20  0 60576  33m  14m S  0.3 14.5 0:48.38 python

same build 759, almost two hours later:
top - 14:04:11 up 2:23, 3 users, load average: 0.04, 0.06, 0.08
  PID USER  PR NI  VIRT  RES  SHR CODE DATA %MEM COMMAND
 1461 olpc  20  0 65964  38m  14m    4  23m 16.7 python

finally, i have a joyride-2263, which has been up for 6 days. i don't have copy/paste access to it, but the sugar shell is currently taking 99.6m VIRT, 64m RES, 14m SHR, and is using 28% of system memory.

paul

p.s. in addition, i think a lot of system processes have grown somewhat.
for instance, "login" now has 100k more DATA space in 759 than it had in 708. others (e.g., xinit) haven't grown at all. (also "measured" with top.) =- paul fox, [EMAIL PROTECTED] ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 09, 2008 at 01:10:57PM -0400, Daniel Drake wrote: >On Tue, 2008-09-09 at 00:10 -0400, Michael Stone wrote: >> - whether we can get Browse to behave intelligently when it receives >> BadAlloc errors from X? > >I have no doubt that Browse/xulrunner has room for improvement with >memory usage but this is not where you should be looking. These BadAlloc >messages are true errors generated when the application requests pixmaps >outside of the coordinate range accepted by X (this is well >documented). > >This is a real bug in the code, not a memory pressure issue. Fine. How does the X server report failures to allocate memory on behalf of clients? How does Browse respond? >Such requests should never be generated, and the application crashing >is probably the behaviour we want. I'll grant that it may be helpful for finding the issue in the first place, but I would much rather that we ship a Browse which displayed what it can display without crashing. Michael ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Tue, 2008-09-09 at 13:10 -0400, Daniel Drake wrote:
> On Tue, 2008-09-09 at 00:10 -0400, Michael Stone wrote:
> > - whether we can get Browse to behave intelligently when it receives
> >   BadAlloc errors from X?
>
> I have no doubt that Browse/xulrunner has room for improvement with
> memory usage but this is not where you should be looking. These BadAlloc
> messages are true errors generated when the application requests pixmaps
> outside of the coordinate range accepted by X (this is well
> documented).
>
> This is a real bug in the code, not a memory pressure issue. Such
> requests should never be generated, and the application crashing is
> probably the behaviour we want.

For the specific BadAlloc on the page in our wiki, the cause is not a coordinate out of range. The images on that page are so huge that X gets an allocation failure from the OS, and that gets reflected back to the client. Otherwise we'd have gotten a BadValue error.

At one point, X11 was pretty carefully checked to work in the face of failures to allocate memory (dunno how true that is today).

Whether Firefox should be so silly as to even be asking (the images are huge), and asking the X server to rescale them (also very questionable), is something that can/should be taken up with the Firefox folks, but it is not something we're going to be able to fix on our own. The embedded mozilla folks (there are such people at long last) are the logical people to own this headache.

- Jim

--
Jim Gettys <[EMAIL PROTECTED]>
One Laptop Per Child
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 9, 2008 at 7:39 PM, C. Scott Ananian <[EMAIL PROTECTED]> wrote:
> On Tue, Sep 9, 2008 at 7:02 AM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote:
>> On Tue, Sep 9, 2008 at 6:10 AM, Michael Stone <[EMAIL PROTECTED]> wrote:
>>>
>>> * We need to find out why the oom-killer is not killing things fast
>>>   enough. Based on our results, we might consider configuring
>>>   /proc/$pid/oom_adj to preferentially kill some processes (e.g., the
>>>   foreground [or background?] activities.)
>>
>> Any reason why killing first activities' processes wouldn't solve the
>> stability issue? AFAIK, we haven't seen OOM conditions without any
>> activity open.
>
> Yes, we have. In particular, if you update your system and then leave
> it for a while, and later click the software update control panel, you
> end up OOMing in the control panel. Sugar restarts and reports are
> that software update "works fine the second time". So this might well
> be a sugar leak; killing 'sugar' is not good for stability.

That sounds pretty awful. Do we have a ticket with precise instructions on how to reproduce it? Approximately how long does one need to wait after updating sugar?

Regards,

Tomeu
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 9, 2008 at 7:02 AM, Tomeu Vizoso <[EMAIL PROTECTED]> wrote: > On Tue, Sep 9, 2008 at 6:10 AM, Michael Stone <[EMAIL PROTECTED]> wrote: >> >> * We need to find out why the oom-killer is not killing things fast >>enough. Based on our results, we might consider configuring >>/proc/$pid/oom_adj to preferentially kill some processes (e.g., the >>foreground [or background?] activities.) > > Any reason why killing first activities' processes wouldn't solve the > stability issue? AFAIK, we haven't seen OOM conditions without any > activity open. Yes, we have. In particular, if you update your system and then leave it for a while, and later click the software update control panel, you end up OOMing in the control panel. Sugar restarts and reports are that software update "works fine the second time". So this might well be a sugar leak; killing 'sugar' is not good for stability. --scott -- ( http://cscott.net/ ) ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 9, 2008 at 11:34 AM, <[EMAIL PROTECTED]> wrote: > On Tue, 2008-09-09 at 00:10 -0400, Michael Stone wrote: >> >> - This means that we need to measure how our memory consumption >> profile has changed since our previous releases. >> >> (cscott observes that we were unable to attack the F-9 image size >> issues until we were able to quantify the effect of changes we had >> made or were considering making. Consequently, he suggests that we >> will be unable to attack our current space consumption problems >> until we are able to generate good numbers (and displays).) > > what's the baseline "previous" release for this comparison? update.1 (703, 708, 713 or your choice) You could also try using the "pre-F-9 merge" joyrides for comparison, but that presupposes that our memory problems are a side-effect of the F-9 merge, and I don't think that we have any evidence of this yet. --scott -- ( http://cscott.net/ ) ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
On Tue, 2008-09-09 at 00:10 -0400, Michael Stone wrote: > - whether we can get Browse to behave intelligently when it receives > BadAlloc errors from X? I have no doubt that Browse/xulrunner has room for improvement with memory usage but this is not where you should be looking. These BadAlloc messages are true errors generated when the application requests pixmaps outside of the coordinate range accepted by X (this is well documented). This is a real bug in the code, not a memory pressure issue. Such requests should never be generated, and the application crashing is probably the behaviour we want. Daniel ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2
> * We need to find out why the oom-killer is not killing things fast > enough. Based on our results, we might consider configuring > /proc/$pid/oom_adj to preferentially kill some processes (e.g., the > foreground [or background?] activities.) In the cases I've been playing with, browse is the only activity that is running. Will try bumping its oom_adj to see if this improves OOM kill latency. > * We need to determine whether the oom-killer is killing the right > processes. (sysctl's vm.oom_dump_tasks can be set to 1 in order to > get more verbosity from the oom-killer when it fires). >From watching top, it appears that we're killing the correct process. For example, when running the test case from #8316, OOM killer does not kill browse, but just kills the gnash instance which is chewing up RAM. > - the warnings in the ramfs and tmpfs code about the deadlocks that > tmpfsen can generate under low- or no-memory conditions. I have yet to see an actual deadlock. What I saw when trying to reproduce #3816 is that the OOM killer just takes a very very long time to kick in. > - whether our kernel "overcommits" when allocation requests are made? By default vm.overcommit_memory is set to 0 which will refuse "Obvious overcommits of address space". I will try setting this to 3 along with vm.overcommit_ratio to 0 to force no overcommit at all and see how the system reacts. ~Deepak ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Stability and Memory Pressure in 8.2 ([EMAIL PROTECTED])
Hi All,

I recommend build 708 as the baseline.

Thanks,
Greg S

> Date: Tue, 09 Sep 2008 11:34:50 -0400
> From: [EMAIL PROTECTED]
> Subject: Re: Stability and Memory Pressure in 8.2
> To: devel@lists.laptop.org
>
> On Tue, 2008-09-09 at 00:10 -0400, Michael Stone wrote:
>>
>>   - This means that we need to measure how our memory consumption
>>     profile has changed since our previous releases.
>>
>>     (cscott observes that we were unable to attack the F-9 image size
>>     issues until we were able to quantify the effect of changes we had
>>     made or were considering making. Consequently, he suggests that we
>>     will be unable to attack our current space consumption problems
>>     until we are able to generate good numbers (and displays).)
>
> what's the baseline "previous" release for this comparison?
>
> paul
> paul fox, [EMAIL PROTECTED]
Re: Stability and Memory Pressure in 8.2
On Tue, 2008-09-09 at 00:10 -0400, Michael Stone wrote:
>
>   - This means that we need to measure how our memory consumption
>     profile has changed since our previous releases.
>
>     (cscott observes that we were unable to attack the F-9 image size
>     issues until we were able to quantify the effect of changes we had
>     made or were considering making. Consequently, he suggests that we
>     will be unable to attack our current space consumption problems
>     until we are able to generate good numbers (and displays).)

what's the baseline "previous" release for this comparison?

paul
paul fox, [EMAIL PROTECTED]
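Whichever build ends up as the baseline, the comparison itself could be as simple as diffing two `/proc/meminfo` snapshots taken at the same point in a test run. A minimal Python sketch, assuming the usual `Name:   value kB` layout of that file; the function names are hypothetical and this is not an existing OLPC tool:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: kilobytes}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name: <number> ..." lines.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Take a snapshot of the running system."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


def meminfo_delta(baseline, current):
    """Return current minus baseline for every field present in both."""
    return {name: current[name] - baseline[name]
            for name in baseline if name in current}
```

Saving the output of `read_meminfo()` on the baseline build and on a candidate build, then passing both to `meminfo_delta`, gives per-field differences that can be tabulated or graphed.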
Re: Stability and Memory Pressure in 8.2
On Tue, 2008-09-09 at 00:10 -0400, Michael Stone wrote:
> Dear devel@,
>
> Kim, Greg, and I have concluded that the instability we experience under
> memory-pressure in 8.2-759 and similar is the single "hard" issue that
> we wish to _attempt_ to address before releasing 8.2 on current
> timeframes. (We recognize that there are several other issues marked
> as blocking the release but we are confident that they will be resolved
> satisfactorily or are, in a few cases, beyond help.)
>
> Since most other aspects of the release seem to be running smoothly, Kim
> asked me to take a more direct role in organizing our efforts to produce a
> release which avoids memory pressure when possible and which is
> better-behaved when it strikes.
>
> To that end, I would like to ask for your assistance with the following
> questions and tasks:
>
> * We need to determine why we encounter low-memory and out-of-memory
>   situations more frequently than in previous releases.
>
>   - This means that we need to measure how our memory consumption
>     profile has changed since our previous releases.
>
>     (cscott observes that we were unable to attack the F-9 image size
>     issues until we were able to quantify the effect of changes we had
>     made or were considering making. Consequently, he suggests that we
>     will be unable to attack our current space consumption problems
>     until we are able to generate good numbers (and displays).)
>
>   - We need to think carefully about (or measure) whether our
>     memory-consumption patterns have changed. I am particularly
>     skeptical of our widespread use of tmpfsen since the pages consumed
>     by files stored on tmpfsen are permanently dirty (and are perhaps
>     accounted for differently than pages mapped into processes' address
>     spaces?)
>
>   - We need to check the configuration of applications like Browse
>     which have configurable caching behavior. (Search for "cache" or
>     "capacity" in about:config; check for important compile-time
>     configuration flags.)
>
>   - We need to test in a variety of different network configurations
>     in order to determine to what extent the network/presence
>     environment affects memory consumption.
>
> * We need to check carefully for memory-leaks. Three mechanisms which
>   occur to me include:
>
>   1) running the system for a period of time, then scanning for
>      anomalies either manually or in some automated fashion from
>      userland, kernel-land, or OFW (via SysRq or SMM).
>
>   2) setting rlimits on various processes and noting what dies
>
>   3) using debugging tools like the python garbage collection
>      module, guppy/heapy, gdb+macros, valgrind, efence, purify, etc.
>      looking for trouble.
>
> * We need to find out why the oom-killer is not killing things fast
>   enough. Based on our results, we might consider configuring
>   /proc/$pid/oom_adj to preferentially kill some processes (e.g., the
>   foreground [or background?] activities.)
>
> * We need to determine whether the oom-killer is killing the right
>   processes. (sysctl's vm.oom_dump_tasks can be set to 1 in order to
>   get more verbosity from the oom-killer when it fires).
>
> * We ought to ponder whether there are any additional "dirty hacks" we
>   can experiment with in order to reduce memory consumption; for
>   example, running the Shell and Journal (and DS?) in one process or
>   making use of the compressed-caching code published on this list some
>   months ago.
>
> * Random other stuff to think about:
>
>   - rlimits, cgroups, and the memory resource controller
>
>   - the warnings in the ramfs and tmpfs code about the deadlocks that
>     tmpfsen can generate under low- or no-memory conditions.
>
>   - whether our kernel "overcommits" when allocation requests are made?
>
>   - whether we can get Browse to behave intelligently when it receives
>     BadAlloc errors from X?
>
>   - how to run bootchart on the XO
>
>   - how to generate decent statistics and graphics (preferably in an
>     automated fashion) concerning memory usage as part of our test
>     suite
>
>   - system-tap's kmalloc2.stp example
>
> In conclusion, more to come once I have some actual data; _please_ feel
> free to assist in collecting it! (though be aware that I may 'volunteer'
> you if I need your help. (That means you, Tomeu, Riccardo, Deepak,
> ...)).
>
> Regards,
>
> Michael

There are some (trivial) tools that may interest you; I've written and used them, among others, to attack/study these issues:

* picker [1]
  For me it was handier to use than bootchart; it will also show per-process mem usage.

* imports timings and alloc statistics [2]
  Patch to Python that prints timings and mem usage diffs for every imported module. The original timings patch is from Tomeu.

* py
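The "set rlimits and note what dies" idea from the quoted task list can be tried on a single command without touching the rest of the system. A minimal Python 3 sketch, assuming a Linux system where `RLIMIT_AS` is honoured; `run_with_memory_limit` is a hypothetical helper, not an existing Sugar or Rainbow function:

```python
import resource
import subprocess
import sys


def _address_space_limiter(max_bytes):
    """Build a function that caps RLIMIT_AS; it runs in the child before exec."""
    def apply_limit():
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        # Never ask for more than the existing hard limit allows.
        if hard == resource.RLIM_INFINITY:
            new_soft = max_bytes
        else:
            new_soft = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))
    return apply_limit


def run_with_memory_limit(argv, max_bytes):
    """Run argv with its address space capped and return the CompletedProcess."""
    return subprocess.run(
        argv,
        preexec_fn=_address_space_limiter(max_bytes),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Give a child interpreter 256 MiB, then have it request 1 GiB at once.
    result = run_with_memory_limit(
        [sys.executable, "-c", "bytearray(1024 ** 3)"], 256 * 1024 ** 2)
    print("child exit status:", result.returncode)
```

Only the soft limit is lowered, so no extra privileges are needed. A non-zero exit status from the child is the "what dies" signal to record.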
Re: Stability and Memory Pressure in 8.2
There are four classes of things we can/should/could do:

1) understand where our memory is being used. Individual bugs can have a large effect. Something stupid could be hurting us badly, and we won't know unless we look. What is more, we need to invest in tools that allow us to monitor this.

2) there are some band-aids that have been discussed, such as rlimits, which we can experiment with, and that *might* improve the situation without the real solutions the next two items go into.

3) the oom killer's default algorithms are pretty terrible, taking little into account in the choice of what gets killed. Between Sugar/Rainbow and the knowledge that the window manager has, one could do much better.

4) we provide no end user feedback on memory usage, either. We should investigate revisiting our previous attempt to give such feedback, now that Linux can provide much better information than it could when we abandoned our previous donut attempt. The users could really help, if only we let them know a bit about what was going on...

In terms of priority: immediately examining what is going on with memory usage in case we have a bad leak is clearly worthwhile (1). We need to budget immediately for tool-building to monitor the situation going forward.

2) and *possibly* (a beginning on) 3) may be 8.2.1 fodder, but without feedback from more users, we won't know if this isn't just keys under the lamppost (e.g. our multiple bug reports about Browse ooming because of our amazingly stupid hardware wiki page, which is one of the most egregious pages I've seen in recent memory).

Doing 3) pretty well, I suspect, is 9.1 fodder, but only if we start very soon. My gut tells me it's some man-months of work. We might get lucky and should investigate if any of the embedded folks have something we can use. Unfortunately, the Nokia folks I had thought might have something didn't, when I last checked a year ago. But we can/should check a bit first before diving in; it's a year later.
http://dev.laptop.org/ticket/1995

I urge we investigate quickly whether 4) is, in fact, feasible, so that it can go on the Sugar roadmap in time to be done for 9.1.

- Jim

On Tue, 2008-09-09 at 13:02 +0200, Tomeu Vizoso wrote:
> On Tue, Sep 9, 2008 at 6:10 AM, Michael Stone <[EMAIL PROTECTED]> wrote:
> >
> > * We need to find out why the oom-killer is not killing things fast
> >   enough. Based on our results, we might consider configuring
> >   /proc/$pid/oom_adj to preferentially kill some processes (e.g., the
> >   foreground [or background?] activities.)
>
> Any reason why killing activities' processes first wouldn't solve the
> stability issue? AFAIK, we haven't seen OOM conditions without any
> activity open.
>
> Just in case, I'm not saying that it isn't worth doing any of the other
> things on your list.
>
> Regards,
>
> Tomeu

--
Jim Gettys <[EMAIL PROTECTED]>
One Laptop Per Child
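To make item 3) and Tomeu's question concrete, here is a minimal Python sketch of a victim-selection policy that uses what Sugar and the window manager could know. The policy itself (activities before system processes, background before foreground, larger resident size first) is an assumption for discussion, not something agreed in this thread, and the `Candidate` record with its fields is hypothetical.

```python
from collections import namedtuple

# Hypothetical record; in practice the fields would come from Sugar,
# Rainbow, and the window manager.
Candidate = namedtuple(
    "Candidate", "pid name rss_kb is_activity is_foreground")


def rank_victims(candidates):
    """Order candidates from most to least preferred to kill.

    Assumed policy: activities go before system processes, background
    activities before the foreground one, and larger memory users first.
    """
    def sort_key(candidate):
        return (
            not candidate.is_activity,  # False sorts first: activities lead
            candidate.is_foreground,    # False sorts first: background leads
            -candidate.rss_kb,          # bigger resident set sorts first
        )
    return sorted(candidates, key=sort_key)
```

The resulting order could then be translated into `oom_adj` values, highest for the first entry, which would leave the actual killing to the kernel.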
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 09, 2008 at 12:10:53AM -0400, Michael Stone wrote:
> Dear devel@,
>
> Kim, Greg, and I have concluded that the instability we experience under
> memory-pressure in 8.2-759 and similar is the single "hard" issue that
> we wish to _attempt_ to address before releasing 8.2 on current
> timeframes.
[...]
> * We ought to ponder whether there are any additional "dirty hacks" we
>   can experiment with in order to reduce memory consumption; for
>   example, running the Shell and Journal (and DS?) in one process or
>   making use of the compressed-caching code published on this list some
>   months ago.

Compcache has been working well enough for me for the last six months to suggest that wider testing wouldn't be a disaster.

-bash-3.2# cat /boot/olpc_build
joyride 2399
-bash-3.2# free
             total       used       free     shared    buffers     cached
Mem:        235716     230356       5360          0       1628      65448
-/+ buffers/cache:     163280      72436
Swap:        58924       2736      56188
-bash-3.2# swapon -s
Filename                                Type            Size    Used    Priority
/dev/ramzswap0                          partition       58924   2736    100

The trac ticket is http://dev.laptop.org/ticket/28

> Regards,
>
> Michael

Martin
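For wider compcache testing it would help to collect the swap figures automatically. A minimal Python 3 sketch that parses `swapon -s` style output (the same layout as `/proc/swaps`) and reports how much of the compressed swap is in use; the function names are hypothetical, and the sample text is the output from Martin's message.

```python
SAMPLE = """\
Filename                                Type            Size    Used    Priority
/dev/ramzswap0                          partition       58924   2736    100
"""


def parse_swaps(text):
    """Parse `swapon -s` style text into a list of dicts (sizes in kB)."""
    entries = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed rows
        entries.append({
            "filename": parts[0],
            "type": parts[1],
            "size_kb": int(parts[2]),
            "used_kb": int(parts[3]),
            "priority": int(parts[4]),
        })
    return entries


def swap_used_fraction(entries):
    """Return used/total across all swap areas, or 0.0 if there is no swap."""
    total = sum(entry["size_kb"] for entry in entries)
    used = sum(entry["used_kb"] for entry in entries)
    return used / total if total else 0.0


if __name__ == "__main__":
    print("swap in use: %.1f%%" % (100 * swap_used_fraction(parse_swaps(SAMPLE))))
```

With the figures above, 2736 kB of 58924 kB is in use, which is a little under 5% of the ramzswap device.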
Re: Stability and Memory Pressure in 8.2
On Tue, Sep 9, 2008 at 6:10 AM, Michael Stone <[EMAIL PROTECTED]> wrote:
>
> * We need to find out why the oom-killer is not killing things fast
>   enough. Based on our results, we might consider configuring
>   /proc/$pid/oom_adj to preferentially kill some processes (e.g., the
>   foreground [or background?] activities.)

Any reason why killing activities' processes first wouldn't solve the stability issue? AFAIK, we haven't seen OOM conditions without any activity open.

Just in case, I'm not saying that it isn't worth doing any of the other things on your list.

Regards,

Tomeu