On Jul 22, 2009, at 5:08 AM, Sean Gillies wrote:

> I was expressing concern to Howard Butler about his plans to add the
> capacity to store Python objects to our Rtree class. After sleeping on
> the idea, it seems to be after all not unlike the OOBTree or IOBTree
> flavors of Zope's BTrees. I'm +1 on the idea. While on the subject of
> BTrees, I think we could do worse than to implement its interface when
> enhancing Rtree:
>
> http://docs.zope.org/zope3/Code/BTrees/Interfaces/index.html
> http://docs.zope.org/zope3/Code/BTrees/OOBTree/OOBTree.1
>

I guess I will use this as an opportunity to give folks an update of  
what I've been doing.  I've needed a C API of libspatialindex for  
libLAS <http://liblas.org>, and my experience with Shapely and libLAS  
told me that a ctypes-based API plus doctests would be an excellent  
way to test it.  Additionally, performance optimization of a library  
like libspatialindex is rather tedious from C/C++, and having a Python  
implementation that I can use to twiddle the knobs can speed  
investigation of performance issues.

These two things have resulted in a complete refactoring of Rtree.   
The first iteration of Rtree had the following deficiencies:

- It was a convoluted Python extension in C that wrapped some C++  
code.  It was quite hard to maintain in order to do anything other  
than fix small bugs.
- It only supported changing the pagesize parameter.  libspatialindex  
has nearly 20 properties that can be set/tweaked.
- It didn't allow storage of objects in the tree (ie, clustered  
index), only ids.
- It only supported the RTree index variant of libspatialindex, not  
the MVRtree (multi-versioned) or TPRTree (time-parameterized) variants
- It only supported 2D indexes
- It only allowed you to store boxes, not just points (which would  
increase the storage efficiency by nearly a factor of four in most  
cases)

With these in mind, I started off implementing a C API for  
libspatialindex.  It currently lives in the Rtree repository 
<http://svn.gispython.org/svn/gispy/Rtree/trunk/libsidx/ 
 >, but it will probably move to libspatialindex at some point.  The C  
API for libspatialindex hides a number of the details of working with  
libspatialindex, and it currently provides functional parity (plus  
object storage) with what was needed for the previous versions of  
Rtree.  I then reimplemented Rtree as a ctypes wrapper around the C  
API.  The current SVN head of Rtree contains this code.  I have  
disabled the old C/C++ wrappers, though they are still in the tree so  
I can test performance changes.  The new code currently provides  
functional parity and no API changes from the previous version of Rtree.

Usage of this new version looks like 
<http://svn.gispython.org/svn/gispy/Rtree/trunk/tests/index.txt 
 >.  Query results from the index now come in the form of a C API  
object that also contains a string of data for that record along with  
the id.  Because of this, I have wrapped the intersection and nearest  
methods to only return ids from these query results, and a user would  
need to use the intersection_obj or nearest_obj to fetch the actual  
query results.  If we are willing to break the API here, we can change  
that.

The following items have yet to be done in both the C API and Python  
implementations:
- TPR/MVR trees.  A bit of work needs to be done, but the basics are  
there
- Point storage.  insert will determine that the min and max values  
are the same, and we will insert a SpatialIndex::Point instead of a  
SpatialIndex::Region
- There's a performance issue causing memory and disk indexes to query  
at the same speed.  I'm sure its something stupid I'm doing...
- Implementation of a query to find the bounds of the entire index  
(quickly).
- Bulk insertion

I plan to continue working on these items throughout the week.  I  
would be interested in folks testing what's there if they have a  
chance, but I will warn you that it requires using a SVN head version  
of libspatialindex 
<http://svn.gispython.org/svn/spatialindex/spatialindex/trunk/ 
 >.  I have fixed a number of bugs along the way while implementing  
this, and one of them was a show-stopper.

As far as Sean's proposal to implement Zope's indexing interface atop  
of Rtree, it sounds neat to me, but I'm not signing up to do it :)

Howard


_______________________________________________
Community mailing list
[email protected]
http://lists.gispython.org/mailman/listinfo/community

Reply via email to