Still working on reading and validating Canonical JSON files that are larger than available memory.
Along the way, found that Python 2.5.x doesn't support an offset to mmap(), which at first blush makes re-mapping with a sliding window problematic.

Well, almost. If you close() the mmap, re-create it and start reading at an offset (m[myoffset]), Python knows how to DTRT. So every N reads (random or linear), close and re-mmap the fh.

If the reads are short, the memory used by N reads will be roughly N * mmap.PAGESIZE, where PAGESIZE is usually 4KB. So re-mapping every 4MB, for example, keeps the whole process under 6MB while working through a file that is 183MB.

On the XO-1, it's the difference between "churning through it" and slowing the whole OS to a crawl, then inching towards a big OOM zap.

cheers,

martin
--
 [email protected]
 [email protected] -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
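
PS: the close-and-re-mmap trick described above can be sketched roughly as follows. This is only an illustration, written for a current Python 3 rather than 2.5; read_chunks, chunk_size and remap_every are made-up names, and the 4MB default is just the figure mentioned above. It maps the whole file each time and indexes into it at the current position, dropping the old mapping every remap_every bytes. It does not check whether the memory figures quoted above are reproduced on a given system.

```python
import mmap
import os


def read_chunks(path, chunk_size=4096, remap_every=4 * 1024 * 1024):
    """Yield the file's contents in order, chunk_size bytes at a time.

    Hypothetical helper: the mmap is closed and re-created after roughly
    remap_every bytes have been read, so pages touched earlier can be
    released instead of piling up in the process.
    """
    with open(path, "rb") as fh:
        size = os.fstat(fh.fileno()).st_size
        if size == 0:
            return  # an empty file cannot be mmapped
        # Length 0 maps the whole file; no offset argument is used.
        m = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
        pos = 0
        since_remap = 0
        try:
            while pos < size:
                chunk = m[pos:pos + chunk_size]  # read at an offset into the map
                yield chunk
                pos += len(chunk)
                since_remap += len(chunk)
                if since_remap >= remap_every:
                    # Drop the old mapping and start over with a fresh one;
                    # reading simply resumes at the same position.
                    m.close()
                    m = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
                    since_remap = 0
        finally:
            m.close()
```

A validator would consume it with something like "for chunk in read_chunks(path): ..." and feed each chunk to its parser. Random reads work the same way: count the bytes touched and re-map once the count passes the threshold.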
