On 04/18/2012 11:06 PM, Jason Roberts wrote:
For scenarios involving large numbers of features, I suspect it is much
harder to do it fast and within available memory. It is probably necessary
to do a multi-pass approach, where the first step operates only on the
spatial indexes of the layers involved. It is probably too slow--or at least
very suboptimal--to even read all of the features into memory, much less
call GEOS on them; all of that would happen in a later pass, when a subset
of features was identified via the spatial index pass. I suspect all of this
must be implemented in the core OGR code, where it is possible to quickly
and selectively probe spatial indexes, unless OGR wants to expose an
abstraction allowing higher level code to do that. I could definitely
appreciate the view that either of these approaches is significant scope
creep.
I agree that having a spatial index in OGR would be useful in many
cases. Regarding the methods I propose the index would help of course,
but IMO the index is not their responsibility. The algorithms usually
simply do this (with variations - and I don't see much possibilities for
speed ups) for a method C = A->method(B):
iterate trough all features x of A // the spatial filter of A may be in
effect
set (or include) x.geom into the spatial filter of B
iterate trough all features y of B // the spatial filter of B is in
effect
Thus the key issue is to have the index help the iteration. A
possibility would be to have a pair of layer methods:
InstallSpatialIndex() // iterate through all features and build the
index into memory
UninstallSpatialIndex() // remove the index
There are free R-tree implementations which could be used (for example
http://www.superliminal.com/sources/sources.htm#C%20&%20C++%20Code) and
if available those methods would then use that. They could also be
simple no-ops if R-tree or something else is not available. The
GetNextFeature method would then use the spatial index if available.
Hm. The validity of the index is of course an issue - all changes to
features and their geometries either need to be recorded into it or
invalidate it. I would probably assume that the index is valid only the
minimal time - basically leave it up to the user to recreate it.
After I get working examples of the layer methods I propose, I may look
into that.
Cheers,
Ari
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev