On 04/18/2012 11:06 PM, Jason Roberts wrote:
For scenarios involving large numbers of features, I suspect it is much
harder to do it fast and within available memory. It is probably necessary
to do a multi-pass approach, where the first step operates only on the
spatial indexes of the layers involved. It is probably too slow--or at least
very suboptimal--to even read all of the features into memory, much less
call GEOS on them; all of that would happen in a later pass, when a subset
of features was identified via the spatial index pass. I suspect all of this
must be implemented in the core OGR code, where it is possible to quickly
and selectively probe spatial indexes, unless OGR wants to expose an
abstraction allowing higher level code to do that. I could definitely
appreciate the view that either of these approaches is significant scope
creep.

I agree that having a spatial index in OGR would be useful in many cases. Regarding the methods I propose the index would help of course, but IMO the index is not their responsibility. The algorithms usually simply do this (with variations - and I don't see much possibilities for speed ups) for a method C = A->method(B):

iterate trough all features x of A // the spatial filter of A may be in effect
    set (or include) x.geom into the spatial filter of B
iterate trough all features y of B // the spatial filter of B is in effect

Thus the key issue is to have the index help the iteration. A possibility would be to have a pair of layer methods:

InstallSpatialIndex() // iterate through all features and build the index into memory
UninstallSpatialIndex() // remove the index

There are free R-tree implementations which could be used (for example http://www.superliminal.com/sources/sources.htm#C%20&%20C++%20Code) and if available those methods would then use that. They could also be simple no-ops if R-tree or something else is not available. The GetNextFeature method would then use the spatial index if available.

Hm. The validity of the index is of course an issue - all changes to features and their geometries either need to be recorded into it or invalidate it. I would probably assume that the index is valid only the minimal time - basically leave it up to the user to recreate it.

After I get working examples of the layer methods I propose, I may look into that.

Cheers,

Ari

_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev

Reply via email to