Jeff, I've run Shapely's tests under valgrind in the past, but not with a very recent GEOS. The one place I did see leaks was in the GEOS WKT and WKB readers and writers. I can't rule out new leaks in the more recent GEOS but I think they are unlikely.
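
If you want a rough check for that kind of leak without a full valgrind run, one option is to call the suspect operation in a loop and watch the process's peak resident memory. What follows is only a sketch, not something taken from the converter script or from Shapely: `peak_rss_kb` and `rss_growth` are names made up for illustration, it relies on the Unix-only `resource` module, and the bytearray workload is a stand-in. With Shapely installed you could pass something like `lambda: shapely.wkb.loads(data)` instead, where `data` is WKB bytes you already have.

```python
import gc
import resource


def peak_rss_kb():
    """Return this process's peak resident set size.

    On Linux ru_maxrss is reported in kilobytes (macOS reports bytes).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(workload, rounds=5, calls_per_round=1000):
    """Call workload() repeatedly and sample peak RSS after each round.

    `workload` is any zero-argument callable; its return value is discarded.
    """
    samples = []
    for _ in range(rounds):
        for _ in range(calls_per_round):
            workload()
        # Free collectable Python garbage so it is less likely to be
        # mistaken for memory held by the C library.
        gc.collect()
        samples.append(peak_rss_kb())
    return samples


if __name__ == "__main__":
    # Stand-in workload (an assumption for this sketch): allocate and drop 1 KB.
    print(rss_growth(lambda: bytearray(1024)))
```

Because `ru_maxrss` is a high-water mark, the samples never decrease. Numbers that level off after the first round or two suggest the operation is not leaking, while numbers that keep climbing round after round would be worth a closer look with valgrind.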
Since you're already using osgeo.ogr, you could remove Shapely from the script and just use the OGR geometry methods to see if that helps. I don't have a better idea at the moment.

On Wed, Aug 22, 2012 at 4:14 PM, Jeff Cook <[email protected]> wrote:
> Thanks for your reply Sean.
>
> I am using GEOS 3.3.4 and Shapely 1.2.15 on Python 2.7.3 on Arch Linux
> x86_64 3.4.8. Machine is an i7-2600K with 16GB of RAM.
>
> As I said, I still have the massif output file if it would be of any
> value (and of course, could generate a new one).
>
> On Mon, Aug 20, 2012 at 3:23 PM, Sean Gillies <[email protected]> wrote:
>> Hi Jeff,
>>
>> Just back from vacation. I've never used that converter script, and am
>> not sure exactly how it works, but the way it builds up large lists of
>> data before writing out the paths seems unlikely to scale.
>>
>> What versions of Shapely and GEOS are you using?
>>
>> On Mon, Aug 13, 2012 at 10:18 PM, Jeff Cook <[email protected]>
>> wrote:
>>> Hello all
>>>
>>> I am using jVectorMap's converter.py script (
>>> https://github.com/bjornd/jvectormap/blob/db22821449ea6e1939f3f91070c2f6280ae99b51/converter/converter.py
>>> ) to process an 85MB Shapefile that includes all telephone area codes
>>> in the United States. After a short while, memory usage hovers around
>>> 8G, 50% of my system memory. Once the script attempts to write to
>>> disk, usage jumps to 14G+ and causes my system to start swapping out.
>>>
>>> I am a relative newbie when it comes to GIS data, and I have never
>>> used any Python libraries to deal with such data, so please forgive my
>>> ignorance.
>>>
>>> I am interested in making this run faster if there's a way to do so
>>> reasonably. I am currently doing an extensive massif run to get a
>>> reasonable memory profile, but initial runs seem to indicate most
>>> memory is being consumed by C objects in libgeos. This is consistent
>>> with the results from heapy, which pretty consistently show Python
>>> objects only taking 10-12 MB of space in the program's early stages.
>>>
>>> I was wondering if there was something relatively simple that could be
>>> done to make the program release memory more reasonably. Some of my
>>> reading leads me to believe this issue may lie deeper than the
>>> surface-level Python code. I still have more investigation to do, but
>>> I thought I should get a message posted here quickly since the list
>>> will likely have better ideas than I.
>>>
>>> Thanks
>>> Jeff
>>> _______________________________________________
>>> Community mailing list
>>> [email protected]
>>> http://lists.gispython.org/mailman/listinfo/community
>>
>>
>>
>> --
>> Sean Gillies

--
Sean Gillies
