Hi GisPython I am new to using shapley.
I have a few polygons and I am interested in finding out about which of some millions of points intersect those polygons. I only need to create the polygons once - there are dozens at most, but creating millions of point objects is killing my performance big time! Here is some cprofile output from a test run: 159616551 function calls (159601695 primitive calls) in 307.189 seconds Ordered by: internal time List reduced from 1844 to 20 due to restriction <20> ncalls tottime percall cumtime percall filename:lineno(function) 7518198 135.451 0.000 185.103 0.000 predicates.py:8(__call__) 1253033 22.242 0.000 51.047 0.000 point.py:170(geos_point_from_py) 7518198 16.698 0.000 16.698 0.000 base.py:24(geometry_type_name) 8011 16.586 0.002 302.903 0.038 shapely_intersects.py:26(shape_function) 7518198 14.399 0.000 201.687 0.000 base.py:447(intersects) 8771627 11.716 0.000 38.436 0.000 {hasattr} 15036396 11.607 0.000 45.146 0.000 topology.py:14(_validate) 8771241 7.834 0.000 11.235 0.000 collections.py:119(itervalues) 3759104 7.515 0.000 10.693 0.000 base.py:131(empty) 1253033 7.092 0.000 12.961 0.000 numeric.py:446(require) 1253033 5.791 0.000 63.158 0.000 point.py:105(_set_coords) 37590990 5.679 0.000 5.679 0.000 base.py:162(_get_geom) 7518198 5.659 0.000 23.459 0.000 base.py:242(geometryType) 1253033 4.999 0.000 4.999 0.000 __init__.py:501(cast) 8771283 3.401 0.000 3.401 0.000 collections.py:72(__iter__) 7518198 3.260 0.000 26.719 0.000 base.py:245(type) 3759104 3.178 0.000 3.178 0.000 base.py:124(_is_empty) 1253033 2.874 0.000 66.251 0.000 point.py:38(__init__) 1253033 2.835 0.000 23.231 0.000 coords.py:17(required) 1253036 2.689 0.000 2.689 0.000 {method 'copy' of 'numpy.ndarray' objects} Would it be possible to pool the point objects and just change out the lat/lon location of the point object before every call to intersects? Here is a code except - blen = len(block) particle_position = block['loc'] # a (n,2) array of lat/lon spillets_in_shapes = numpy.zeros((blen, slen),dtype='bool') for i, pos in enumerate(particle_position): p = Point(pos) for j,shape in enumerate(shapes.itervalues()): if shape.intersects(p): spillets_in_shapes[i,j] = True This code is called many times for each block of particles that I have to process. It seems most of my time is spent in initializing point objects and in something called predicates.py? Any suggestions for optimization? David
_______________________________________________ Community mailing list Community@lists.gispython.org http://lists.gispython.org/mailman/listinfo/community