Re: [Community] Designing new types of Rtree indexes

Howard Butler Fri, 24 Jul 2009 08:49:31 -0700

On Jul 22, 2009, at 10:18 AM, Howard Butler wrote:
> The following items have yet to be done in both the C API and Python  
> implementations:
> - TPR/MVR trees.  A bit of work needs to be done, but the basics are  
> there


After some investigation, these tree types turn out to require some new
object types in the C/Python APIs.  I have no need for them right now, so
I am going to table this.  If someone wants or needs them, I can help
implement them.

> - Point storage.  insert will determine that the min and max values  
> are the same, and we will insert a SpatialIndex::Point instead of a  
> SpatialIndex::Region

implemented.
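
A minimal sketch of the check described above, in plain Python.  The Point
and Region classes and the make_shape helper are made-up stand-ins for
illustration, not the real SpatialIndex types or the actual insert code:

```python
class Point:
    """Hypothetical stand-in for SpatialIndex::Point."""
    def __init__(self, coords):
        self.coords = tuple(coords)


class Region:
    """Hypothetical stand-in for SpatialIndex::Region."""
    def __init__(self, mins, maxs):
        self.mins = tuple(mins)
        self.maxs = tuple(maxs)


def make_shape(mins, maxs):
    """Return a Point when min and max coincide, otherwise a Region."""
    mins, maxs = tuple(mins), tuple(maxs)
    if len(mins) != len(maxs):
        raise ValueError("mins and maxs must have the same dimension")
    if any(lo > hi for lo, hi in zip(mins, maxs)):
        raise ValueError("each min must be <= the matching max")
    if mins == maxs:
        # Degenerate box: store the cheaper point representation.
        return Point(mins)
    return Region(mins, maxs)
```
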

> - There's a performance issue causing memory and disk indexes to  
> query at the same speed.  I'm sure its something stupid I'm doing...

fixed.

> - Implementation of a query to find the bounds of the entire index  
> (quickly).

done.
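
One reason such a query can be quick: an R-tree's root node carries a
bounding box that covers everything beneath it, so the overall bounds can
be read without visiting the leaves.  A toy sketch of that idea follows.
ToyIndex and union are made up for illustration, and this is not
necessarily how the real query is implemented:

```python
def union(a, b):
    """Smallest (minx, miny, maxx, maxy) box covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


class ToyIndex:
    """Hypothetical flat index that tracks its overall bounds as it grows."""

    def __init__(self):
        self.entries = []
        self.bounds = None  # None until the first insert

    def insert(self, id, box):
        self.entries.append((id, box))
        # Grow the cached bounds; no scan of existing entries is needed.
        self.bounds = box if self.bounds is None else union(self.bounds, box)
```
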

> - Bulk insertion

How do people typically use Rtree right now?  Is the typical usage to load
a large amount of data up front and then query it?  Or is it to insert
records into the index incrementally over time?

I'd be interested to hear opinions, but I think bulk loading of an index
would happen at instantiation time.  You would give it an iterator that
dereferences to objects containing your points/boxes to insert (use
__geo_interface__ here?), and it would use the BulkLoading strategy to
create a new RTree and load your data.  This is easily an order of
magnitude faster than incremental insertion, so it is worth it for large
chunks of data.  Only the RTree variant supports bulk loading, not MVR or
TPR trees.  Would this be a sensible approach, or should bulk loading be
made more "special"?
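
A rough sketch of what the iterator side might look like, assuming each
object exposes a GeoJSON-style geometry mapping (with a "coordinates" key)
through __geo_interface__.  The names Shape, iter_coords, bounds_of and
stream are hypothetical, and nothing here is the Rtree API; it only shows
the kind of (id, bounds, object) items a bulk loader could consume:

```python
def iter_coords(coords):
    """Yield (x, y) pairs from arbitrarily nested coordinate sequences."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_coords(part)


def bounds_of(obj):
    """Return (minx, miny, maxx, maxy) for an object with __geo_interface__."""
    geom = obj.__geo_interface__
    xs, ys = zip(*iter_coords(geom["coordinates"]))
    return (min(xs), min(ys), max(xs), max(ys))


def stream(objects):
    """Lazily yield (id, bounds, object) items for a bulk loader to consume."""
    for i, obj in enumerate(objects):
        yield i, bounds_of(obj), obj


class Shape:
    """Hypothetical object exposing a geometry via __geo_interface__."""

    def __init__(self, kind, coordinates):
        self.kind = kind
        self.coordinates = coordinates

    @property
    def __geo_interface__(self):
        return {"type": self.kind, "coordinates": self.coordinates}
```

Because stream is a generator, the loader can pull items one at a time
instead of needing the whole dataset in memory first.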

Howard

_______________________________________________
Community mailing list
[email protected]
http://lists.gispython.org/mailman/listinfo/community
