Hi Pascal, thanks for tracking this down. Would you be willing to  
download r1225 from the Shapely 1.0 branch? I've got a fix for the  
various binary op/topology descriptors that keeps the reference counts  
down to 2.

http://svn.gispython.org/svn/Shapely/branches/1.0

On Apr 5, 2009, at 7:22 AM, Pascal Leroux wrote:

> Hi Sean,
>
> I think I've found what's going wrong with the references.
>
> (the line numbers below correspond to the official release 1.0.11)
>
> I added in shapely/geometry/base.py, after line 266 (in __del__  
> method of the BaseGeometry)
> print "__del__ %x" % id(self)
> to see when a geometry instance is actually freed
>
>
> >>> from sys import getrefcount as grc
> >>> from shapely.geometry import Point
> >>> p=Point((0,0))
> >>> grc(p)
> 2
>
> Python documentation says that's ok to find +1 for the result that is
> evaluated inside the function/method. So reference count is 1
>
> >>> "%x" % id(p)
> 'b7e72e8c'
> >>> del p
> __del__ b7e72e8c
>
> As soon as the reference count equals 0, __del__ is called
>
> >>> p=Point((0,0))
> >>> "%x" % id(p)
> 'b7e72e8c'
> >>> p.intersection
> <shapely.topology.BinaryTopologicalOp object at 0xb7d34cac>
> >>> grc(p)
> 3
> >>> id(p)-id(p.intersection.context)
> 0L
>
> In the "__get__" method of class BinaryTopologicalOp (line 25,  
> topology.c),
> self.context that was None, now equals my point p
> There's one more reference and the last statement above shows that  
> the new
> reference is p.intersection.context
>
> >>> res = p.intersection(Point((9,9)))
> __del__ 94e428c
> >>> grc(p)
> 4
> >>> id(p)-id(p.intersection.context)
> 0L
>
> A new "call" to "__get__" decrements AND re-increments : reference  
> count doesn't change
>
> >>> id(p)-id(res.__p__)
> 0L
> >>> del p.intersection.context
> >>> grc(p)
> 3
>
> When the function geom_factory is called, a new reference to the  
> point is done
> "ob.__p__ = parent" (base.py, line 45) with "parent" that is my  
> initial "p"
>
> >>> del res.__p__
> >>> grc(p)
> 2
> >>> del p
> __del__ b7e72e8c
> >>>
>
> When I explictely call "del res.__p__", the "p" reference count is  
> OK (2-1)
> __del__ is now called after the last "del p"
>
>
> I think that this problem occurs for all the geometries (they are  
> all (?) derived from BaseGeometry).
> And that explains the issue I posted last week and those of yesterday.
>
> Do you think it's correct ?
>
> Pascal
>
>
>
>
> 2009/4/4 Pascal Leroux <[email protected]>
> Hi Sean,
>
> 2009/4/4 Sean Gillies <[email protected]>
>
> Hi Pascal,
>
> This cries out for a cascaded union, I think. Maybe we should have a
> 1.0.12 release. Still, the leaks trouble me. Statements like
>
>   P = P.union(p)
>
> have caused me problems in the past. Probably a matter of held
> references? Any interest in digging into it?
>
> Of course I'm interested in finding what's going wrong. But I've  
> using Python (and Shapely) for less than a year (my first mail to  
> the mailing list was posted in last July) and I think my knowledge  
> in Python is not big enough, especially for Python/C interface, to  
> fix the issue.
>
> But I'm trying to get into your code and learn how Shapely works. I  
> would be so sad to have to work again with C/libgeos. Once again,  
> Shapely saves me time (thanks !) and I've taught my colleagues  
> (Institut Géographique National, France) to use Python and Shapely .
>
> If I find something, I tell you.
>
> Pascal
>
>
>
>
> Sean
>
> On Apr 3, 2009, at 3:02 AM, Pascal Leroux wrote:
>
> > Even with very simple geometries :
> >
> > >>> P = Polygon(((0,0),(0,9),(9,9),(9,0)))
> > >>> p = Polygon(((1,1),(1,5),(5,5),(5,1)))
> > >>> count = 1
> > >>> while 1:
> > ...     if count % 10000 == 0: print count
> > ...     P = P.union(p)
> > ...     count += 1
> >
> >
> > after 25 minutes :
> >
> >
> > 7020000
> > 7030000
> > Python(4969) malloc: *** mmap(size=2097152) failed (error code=12)
> > *** error: can't allocate region
> > *** set a breakpoint in malloc_error_break to debug
> > ...
> > Python(4969) malloc: *** mmap(size=2097152) failed (error code=12)
> > *** error: can't allocate region
> > *** set a breakpoint in malloc_error_break to debug
> > MemoryError
> > Python(4969) malloc: *** mmap(size=2097152) failed (error code=12)
> > *** error: can't allocate region
> > *** set a breakpoint in malloc_error_break to debug
> > Traceback (most recent call last):
> >   File "<stdin>", line 3, in <module>
> >   File "/System/Library/Frameworks/Python.framework/Versions/2.5/ 
> lib/
> > python2.5/site-packages/Shapely-1.0.11-py2.5.egg/shapely/
> > topology.py", line 34, in __call__
> >     if not self.context.is_valid:
> > MemoryError
> > >>>
> >
> >
> > This code doesn't make sense but it highlights the issue I talked
> > about last week.
> > I used a very big polygon in my last email but, here, I cannot use
> > simpler ones.
> >
> > My final purpose is to build a full topology with several shapefiles
> > located on a 1°x1° area and I often work with more than 200,000
> > features at a time. Their geometries, even if they are not as
> > complicated as the polygon I used last week, are not as simple as
> > those I used above.
> >
> > 7,000,000 calls to union with the two simple polygons crashed the
> > script because of a lack of memory. That corresponds (in the better
> > case), with 200,000 features, to an average of 35 calls per initial
> > feature. Obviously, it's not enough to build a full topology.
> >
> > I'm trying to split my area into several parts, to build topology
> > for each part and then merge the parts. It will be more complicated.
> > Will it be enough ? Will I have to come back to C/libgeos ? aaargh !
> >
> > Pascal
> >
> >
> >
> > _______________________________________________
> > Community mailing list
> > [email protected]
> > http://lists.gispython.org/mailman/listinfo/community
>
> _______________________________________________
> Community mailing list
> [email protected]
> http://lists.gispython.org/mailman/listinfo/community
>
>
> _______________________________________________
> Community mailing list
> [email protected]
> http://lists.gispython.org/mailman/listinfo/community

_______________________________________________
Community mailing list
[email protected]
http://lists.gispython.org/mailman/listinfo/community

Reply via email to