Re: [Sugar-devel] poor man's mmap "sliding window" on Python 2.5.x

2009-07-08 Thread Daniel Drake
On Wed, 2009-07-08 at 15:23 +1200, Martin Langhoff wrote:
> Had some time to retest this on the plane, and I think it was
> mis-diagnosis. The original code I was testing is lost. In re-testing
> this I find that the problem is more nuanced, and I may have been
> wrong: looking at 'top', the kernel does not appear very eager to
> discard old mapped pages.

You can probably influence this by using madvise() to mark the ranges
that you're done with.
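
A rough sketch of what that could look like. The mmap module in Python 2.5 has no madvise(); this assumes Python 3.8 or later, where mmap objects have a madvise() method, and a platform that defines MADV_DONTNEED. The function name, the window size and the byte-summing "work" are made up for illustration. It is not tested on an XO-1.

```python
import mmap

def scan_with_madvise(fpath, window=4 * 1024 * 1024):
    """Read a file linearly through mmap, releasing each finished window."""
    # Assumption: the window is a multiple of the page size, so every
    # range start handed to madvise() is page-aligned.
    assert window % mmap.PAGESIZE == 0
    # Not every platform (or Python version) has this constant.
    advice = getattr(mmap, 'MADV_DONTNEED', None)
    total = 0
    with open(fpath, 'rb') as fh:
        # Note: mmap refuses to map an empty file.
        mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            size = len(mm)
            for start in range(0, size, window):
                length = min(window, size - start)
                # Stand-in for the real work done on this window.
                total += sum(mm[start:start + length])
                if advice is not None:
                    # Tell the kernel this range is no longer needed.
                    mm.madvise(advice, start, length)
        finally:
            mm.close()
    return total
```

Whether this changes what 'top' reports would still need to be measured on the target machine.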

Daniel


___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] poor man's mmap "sliding window" on Python 2.5.x

2009-07-07 Thread Martin Langhoff
On Tue, Jul 7, 2009 at 9:23 AM, Martin
Langhoff wrote:
> Thanks for looking into this. I'll post the (trivial, really) repro
> code I have later (I'm on a gruelling >35hr trip at the moment).

Had some time to retest this on the plane, and I think it was
mis-diagnosis. The original code I was testing is lost. In re-testing
this I find that the problem is more nuanced, and I may have been
wrong: looking at 'top', the kernel does not appear very eager to
discard old mapped pages.

The process is doing a linear read through the file, and is slow
enough that it appears only to grow. But if I run another process that
allocates a lot of memory, then the kernel does discard pages.

A good way of monitoring this seems to be:

   watch --differences grep -A8 <mapped-file> /proc/<pid>/smaps
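
The same number can be pulled out from Python. This is a minimal sketch, assuming the usual Linux smaps layout (a header line per mapping that starts with an address range, followed by "Key: value kB" lines). The function name rss_kb and its arguments are hypothetical.

```python
def rss_kb(smaps_text, needle):
    """Sum the Rss (in kB) of every mapping whose header line contains needle."""
    total = 0
    in_match = False
    for line in smaps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Header lines start with an address range like "7f00-7f40";
        # detail lines start with a key that ends in a colon.
        if '-' in fields[0] and not fields[0].endswith(':'):
            in_match = needle in line
        elif in_match and fields[0] == 'Rss:':
            total += int(fields[1])
    return total
```

It could be fed the contents of the smaps file for the process being watched, with the name of the mapped file as needle.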

So the mmap does the right thing. ACCESS_READ doesn't seem to make any
difference.

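#!/usr/bin/python

import mmap
import sys

def mmap_to_death(fpath):
    # 'r+' is kept from the original test; 'rb' is enough for ACCESS_READ.
    fh = open(fpath, 'r+')
    # Length 0 maps the whole file.
    mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)

    l = len(mm)
    c = 0
    buf = ''

    # Touch every byte in order, so each page gets faulted in once.
    while c < l:
        buf = mm[c]
        c = c+1

    mm.close()
    fh.close()

mmap_to_death(sys.argv[1])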



m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


Re: [Sugar-devel] poor man's mmap "sliding window" on Python 2.5.x

2009-07-06 Thread Martin Langhoff
On Tue, Jul 7, 2009 at 7:16 AM, Benjamin M.
Schwartz wrote:
> huh.  I looked through python's mmap implementation [1] and there doesn't
> seem to be any caching or funny business going on.
>
> I wonder if it could be over-aggressive caching somewhere in jffs2, in an
> attempt to avoid repeatedly decompressing the same block.

Thanks for looking into this. I'll post the (trivial, really) repro
code I have later (I'm on a gruelling >35hr trip at the moment).

So far, tested on 8.2.x _on vfat partitions mounted from a USB device_
and on the local disk (ext3) on Ubuntu Hardy.

Have not tested it (yet) with data file on jffs2 as that's not the use
case I'm gunning for.



m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


Re: [Sugar-devel] poor man's mmap "sliding window" on Python 2.5.x

2009-07-06 Thread Benjamin M. Schwartz
Martin Langhoff wrote:
> On Tue, Jul 7, 2009 at 2:06 AM, Benjamin M.
> Schwartz wrote:
>> Is this (a) a kernel bug, (b) Python layering extra caching over mmap, or
>> (c) a misunderstanding of mmap on my part?
> 
> My money is on (b).

huh.  I looked through python's mmap implementation [1] and there doesn't
seem to be any caching or funny business going on.

I wonder if it could be over-aggressive caching somewhere in jffs2, in an
attempt to avoid repeatedly decompressing the same block.

--Ben

[1] http://svn.python.org/projects/python/trunk/Modules/mmapmodule.c

P.S. JFFS2 appears to support read-only mmap, which I presume is what
you're using.





Re: [Sugar-devel] poor man's mmap "sliding window" on Python 2.5.x

2009-07-06 Thread Martin Langhoff
On Tue, Jul 7, 2009 at 2:06 AM, Benjamin M.
Schwartz wrote:
> Is this (a) a kernel bug, (b) Python layering extra caching over mmap, or
> (c) a misunderstanding of mmap on my part?

My money is on (b).




m
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


Re: [Sugar-devel] poor man's mmap "sliding window" on Python 2.5.x

2009-07-06 Thread Benjamin M. Schwartz
Martin Langhoff wrote:
> Along the way, found that Python 2.5.x doesn't support an offset to
> mmap(), which at first blush makes re-mapping with a sliding window
> problematic.

Why is an explicit sliding window necessary?  Isn't the point of mmap that
you can access as you like, and the kernel will clear old caches if
there's memory pressure?

> On the XO-1, it's the difference of "churning through it" and slowing
> the whole OS to a crawl, and then inching towards a big OOM zap.

Is this (a) a kernel bug, (b) Python layering extra caching over mmap, or
(c) a misunderstanding of mmap on my part?

--Ben





[Sugar-devel] poor man's mmap "sliding window" on Python 2.5.x

2009-07-03 Thread Martin Langhoff
Still working on reading and validating Canonical JSON files that are
larger than available memory.

Along the way, found that Python 2.5.x doesn't support an offset to
mmap(), which at first blush makes re-mapping with a sliding window
problematic. Well, almost. If you mmap.close(), re-create the mmap and
start reading at an offset (m[myoffset]), python knows how to DTRT.

So every N reads (random or linear), close and re-mmap the
fh. If the reads are short, the memory used by N reads will be roughly

   N * mmap.PAGESIZE

where the page size is usually 4KB. So re-mapping every 4MB, for example,
keeps the whole process under 6MB while working through a file that is
183MB.
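
A minimal sketch of the close-and-remap loop described above, written for current Python 3 rather than 2.5. The function name, the remap_every and step parameters and the byte counting are assumptions for illustration, not the actual validator code.

```python
import mmap

def read_with_remap(fpath, remap_every=4 * 1024 * 1024, step=4096):
    """Linear read that re-creates the mapping every remap_every bytes."""
    count = 0
    with open(fpath, 'rb') as fh:
        mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            size = len(mm)
            offset = 0
            since_remap = 0
            while offset < size:
                # A short read at the current offset.
                chunk = mm[offset:offset + step]
                count += len(chunk)
                offset += len(chunk)
                since_remap += len(chunk)
                if since_remap >= remap_every:
                    # Closing unmaps the pages touched so far; the new
                    # mapping covers the whole file again and reading
                    # simply continues at the same offset.
                    mm.close()
                    mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
                    since_remap = 0
        finally:
            mm.close()
    return count
```

With 4KB reads and remap_every set to 4MB, at most about 1024 pages are touched between remaps, which matches the N * mmap.PAGESIZE estimate.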

On the XO-1, it's the difference of "churning through it" and slowing
the whole OS to a crawl, and then inching towards a big OOM zap.

cheers,



martin
-- 
 martin.langh...@gmail.com
 mar...@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff