On Jul 22, 2009, at 5:08 AM, Sean Gillies wrote: > I was expressing concern to Howard Butler about his plans to add the > capacity to store Python objects to our Rtree class. After sleeping on > the idea, it seems to be after all not unlike the OOBTree or IOBTree > flavors of Zope's BTrees. I'm +1 on the idea. While on the subject of > BTrees, I think we could do worse than to implement its interface when > enhancing Rtree: > > http://docs.zope.org/zope3/Code/BTrees/Interfaces/index.html > http://docs.zope.org/zope3/Code/BTrees/OOBTree/OOBTree.1 >
I guess I will use this as an opportunity to give folks an update of what I've been doing. I've needed a C API of libspatialindex for libLAS <http://liblas.org>, and my experience with Shapely and libLAS told me that a ctypes-based API plus doctests would be an excellent way to test it. Additionally, performance optimization of a library like libspatialindex is rather tedious from C/C++, and having a Python implementation that I can use to twiddle the knobs can speed investigation of performance issues. These two things have resulted in a complete refactoring of Rtree. The first iteration of Rtree had the following deficiencies: - It was a convoluted Python extension in C that wrapped some C++ code. It was quite hard to maintain in order to do anything other than fix small bugs. - It only supported changing the pagesize parameter. libspatialindex has nearly 20 properties that can be set/tweaked. - It didn't allow storage of objects in the tree (ie, clustered index), only ids. - It only supported the RTree index variant of libspatialindex, not the MVRtree (multi-versioned) or TPRTree (time-parameterized) variants - It only supported 2D indexes - It only allowed you to store boxes, not just points (which would increase the storage efficiency by nearly a factor of four in most cases) With these in mind, I started off implementing a C API for libspatialindex. It currently lives in the Rtree repository <http://svn.gispython.org/svn/gispy/Rtree/trunk/libsidx/ >, but it will probably move to libspatialindex at some point. The C API for libspatialindex hides a number of the details of working with libspatialindex, and it currently provides functional parity (plus object storage) with what was needed for the previous versions of Rtree. I then reimplemented Rtree as a ctypes wrapper around the C API. The current SVN head of Rtree contains this code. I have disabled the old C/C++ wrappers, though they are still in the tree so I can test performance changes. The new code currently provides functional parity and no API changes from the previous version of Rtree. Usage of this new version looks like <http://svn.gispython.org/svn/gispy/Rtree/trunk/tests/index.txt >. Query results from the index now come in the form of a C API object that also contains a string of data for that record along with the id. Because of this, I have wrapped the intersection and nearest methods to only return ids from these query results, and a user would need to use the intersection_obj or nearest_obj to fetch the actual query results. If we are willing to break the API here, we can change that. The following items have yet to be done in both the C API and Python implementations: - TPR/MVR trees. A bit of work needs to be done, but the basics are there - Point storage. insert will determine that the min and max values are the same, and we will insert a SpatialIndex::Point instead of a SpatialIndex::Region - There's a performance issue causing memory and disk indexes to query at the same speed. I'm sure its something stupid I'm doing... - Implementation of a query to find the bounds of the entire index (quickly). - Bulk insertion I plan to continue working on these items throughout the week. I would be interested in folks testing what's there if they have a chance, but I will warn you that it requires using a SVN head version of libspatialindex <http://svn.gispython.org/svn/spatialindex/spatialindex/trunk/ >. I have fixed a number of bugs along the way while implementing this, and one of them was a show-stopper. As far as Sean's proposal to implement Zope's indexing interface atop of Rtree, it sounds neat to me, but I'm not signing up to do it :) Howard _______________________________________________ Community mailing list [email protected] http://lists.gispython.org/mailman/listinfo/community
