On Mar 26, 2013, at 11:02 PM, David Fawcett <david.fawc...@gmail.com> wrote:

> Obviously something went very wrong on the last two. They use the RTree, but
> are an order of magnitude slower than the most basic RTree example. I am
> definitely curious if my code is bad or if it has to do with the way that
> geometries and properties are stored within the RTree.
A few points about using clustered indexes (where the data is stored in the index itself) with Rtree.

First, a clustered index is very sensitive to the page size parameter that is set when the index is created. The default page size is quite low because the base example uses Rtree in a non-clustered configuration and just stores ids. Too large a page size bloats the index and its on-disk footprint. Too small a page size means reallocating and moving tons of bytes around on each insert to make the GeoJSON/WKB/WKT of the geometry fit in the index.

Secondly, you can override/control the serialization that happens when items are inserted into or read back out of the index. For shapely geoms, the default probably i/o's through GeoJSON (didn't look), but you could make it i/o through WKB, and it would likely be faster and more compact. Rtree just stores pickles. The faster your pickles are, the faster serialization/deserialization into and out of the index will be.

Finally, clustered index storage is a lazy man's not-so-fast spatial database. There's some threshold of (number of searches x number of items) where it crosses into the realm of useful and performant enough, but it's quite sensitive to how a number of things are configured.

Hope this helps,

Howard

_______________________________________________
Community mailing list
Community@lists.gispython.org
http://lists.gispython.org/mailman/listinfo/community
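
P.S. A rough sketch of the serialization point, in case it helps. It only uses the standard library and compares the size of a GeoJSON-style text encoding with a compact WKB-style binary encoding of the same polygon. The function names (polygon_to_wkb, wkb_to_polygon) and the coordinates are made up for illustration, and the encoder only handles a single-ring polygon, so treat it as a sketch rather than a real WKB implementation. To use something like this with Rtree, my understanding is that you would subclass index.Index and override its dumps/loads methods, and set the page size through index.Property() before creating the index, but check the Rtree docs before relying on that.

```python
import json
import struct


def polygon_to_wkb(ring):
    """Encode one exterior ring [(x, y), ...] as little-endian WKB polygon bytes."""
    # Header: byte-order flag (1 = little endian), geometry type (3 = Polygon),
    # number of rings (always 1 in this sketch).
    parts = [struct.pack("<BII", 1, 3, 1)]
    # Ring header: number of points, then each point as two doubles.
    parts.append(struct.pack("<I", len(ring)))
    for x, y in ring:
        parts.append(struct.pack("<dd", x, y))
    return b"".join(parts)


def wkb_to_polygon(data):
    """Decode bytes written by polygon_to_wkb back into a list of (x, y) tuples."""
    byte_order, geom_type, n_rings = struct.unpack_from("<BII", data, 0)
    if (byte_order, geom_type, n_rings) != (1, 3, 1):
        raise ValueError("sketch only handles little-endian single-ring polygons")
    (n_points,) = struct.unpack_from("<I", data, 9)
    flat = struct.unpack_from("<%dd" % (2 * n_points), data, 13)
    return list(zip(flat[0::2], flat[1::2]))


# Arbitrary closed ring, used only to compare encodings.
ring = [
    (-93.265011, 44.977753),
    (-93.201234, 44.977753),
    (-93.201234, 45.012345),
    (-93.265011, 45.012345),
    (-93.265011, 44.977753),
]

wkb = polygon_to_wkb(ring)
geojson = json.dumps({"type": "Polygon", "coordinates": [ring]}).encode("utf-8")

print("WKB bytes:    ", len(wkb))
print("GeoJSON bytes:", len(geojson))
```

The binary form costs a fixed 16 bytes per point, while the text form grows with the number of digits in each coordinate, which is why the payload stored per item (and so the page size you need) tends to be smaller with WKB.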