Homme,

> I've come up against a problem with `gdalbuildvrt` taking a long time to
> create a VRT when it is passed a large number of source datasets. I am
> trying to create a VRT file for a zoom level in a TMS structure containing
> JPEG tiles. The command I'm using is:
>
> gdalbuildvrt output.vrt `find ./tiles/18 -iname *.jpg -printf "%p "`
>
> where the number of tiles is:
>
> $ find ./tiles/18 -iname *.jpg | wc -l
> 767104
>
> The processing seemed to progress reasonably quickly, with the progress bar
> outputting '0... etc ...100 - done'. However, `gdalbuildvrt` continued
> running until I killed it 8 hours later. Looking at `output.vrt` just before
> I killed the program showed it remained empty (0 bytes).

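A side note on the command quoted above: with this many tiles, expanding every path onto the command line may hit the operating system's argument-length limit on some systems. `gdalbuildvrt` also accepts a text file of source paths through its `-input_file_list` option. The following is only a minimal sketch of producing such a file; the helper name `write_tile_list` and the file name `tiles.txt` are made up for illustration.

```python
import os


def write_tile_list(root, list_path, suffix=".jpg"):
    """Write the path of every file under `root` whose name ends with
    `suffix` (case-insensitive) to `list_path`, one path per line.

    Returns the number of paths written.
    """
    count = 0
    with open(list_path, "w") as out:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # visit subdirectories in a stable order
            for name in sorted(filenames):
                if name.lower().endswith(suffix):
                    out.write(os.path.join(dirpath, name) + "\n")
                    count += 1
    return count


if __name__ == "__main__":
    # Hypothetical paths, matching the layout in the quoted command.
    written = write_tile_list("./tiles/18", "tiles.txt")
    print(written, "paths written to tiles.txt")
```

The resulting list could then be passed as `gdalbuildvrt -input_file_list tiles.txt output.vrt`. This only changes how the sources are handed over; it does not by itself address the serialization time.
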
I've had a quick look at the code and spotted a potential performance problem when serializing the in-memory VRT to XML with a large number of sources. I've just committed an improvement to trunk that makes the complexity of source serialization linear instead of quadratic.

> Before digging any deeper is there something I'm missing? Am I expecting
> too much of `gdalbuildvrt`, or indeed the VRT format, in processing this
> many source datasets?
>
> Conceptually in this instance it seems as if it would be useful for a VRT
> file (and `gdalbuildvrt`) to reference the output of `gdaltindex` or
> something similar. I'm not sure how efficiently source datasets are indexed
> in VRTs and whether this might be contributing to the problem?

There's no indexing in VRT. So yes, for that many sources there might be performance problems, since each RasterIO() request has to test whether each source intersects the requested area of interest. Adding an in-memory spatial index after opening the VRT would likely be possible, provided that the non-negligible size of the VRT/XML doesn't make opening it too slow. That depends on the use case.

Yes, perhaps referencing a shapefile tile index could be a possible enhancement.

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com
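
To illustrate the indexing point made in the reply: the sketch below is not GDAL code, only a toy model of the difference between testing every source extent on each request and looking candidates up in a simple in-memory uniform grid. All names (`intersects`, `linear_query`, `GridIndex`) and the choice of a grid rather than some other spatial index are assumptions made for illustration.

```python
from collections import defaultdict


def intersects(a, b):
    """True if boxes (xmin, ymin, xmax, ymax) overlap with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def linear_query(sources, window):
    """Test every source against the window: O(number of sources)."""
    return [i for i, box in enumerate(sources) if intersects(box, window)]


class GridIndex:
    """Uniform grid mapping each cell to the sources that touch it."""

    def __init__(self, sources, cell):
        self.sources = sources
        self.cell = cell
        self.cells = defaultdict(list)
        for i, box in enumerate(sources):
            for key in self._keys(box):
                self.cells[key].append(i)

    def _keys(self, box):
        # Yield every grid cell the box touches.
        x0, y0 = int(box[0] // self.cell), int(box[1] // self.cell)
        x1, y1 = int(box[2] // self.cell), int(box[3] // self.cell)
        for cx in range(x0, x1 + 1):
            for cy in range(y0, y1 + 1):
                yield (cx, cy)

    def query(self, window):
        # Collect candidates from the touched cells, then test exactly.
        candidates = set()
        for key in self._keys(window):
            candidates.update(self.cells.get(key, ()))
        return sorted(
            i for i in candidates if intersects(self.sources[i], window)
        )


if __name__ == "__main__":
    # 100 x 100 hypothetical tiles of 256 x 256 units each.
    tiles = [
        (c * 256, r * 256, (c + 1) * 256, (r + 1) * 256)
        for r in range(100)
        for c in range(100)
    ]
    index = GridIndex(tiles, cell=256)
    window = (300, 300, 700, 700)
    print(linear_query(tiles, window) == index.query(window))
```

Building the grid costs time and memory up front, which mirrors the caveat above that opening a very large VRT must not become too slow; whether that trade-off pays off depends on how many requests follow.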
