Improvements have landed on master. I suspect some bottlenecks remain, but I currently lack the tests and means to investigate further in the short term, so I would appreciate feedback.
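For anyone following along: the quoted thread below suggests loading feature offsets on demand instead of reading the whole offsets index up front. Here is a minimal Python sketch of that idea. It is not the actual FlatGeobuf or GDAL code. The class name `LazyOffsetIndex`, the `read_range` callback, and the layout (consecutive little-endian uint64 offsets starting at `base`) are all assumptions made for illustration. The shapefile fix mentioned in the thread initialized the array to zeroes; this sketch uses `None` as the "not loaded yet" marker so that a genuine offset of 0 stays unambiguous.

```python
import struct
from typing import Callable, List, Optional


class LazyOffsetIndex:
    """Feature-offset table whose entries are fetched only when asked for.

    Assumption (hypothetical layout): offsets are consecutive little-endian
    uint64 values starting at byte `base` of the remote file.
    """

    ENTRY_SIZE = 8  # bytes per uint64 offset

    def __init__(self, read_range: Callable[[int, int], bytes],
                 base: int, count: int) -> None:
        self._read_range = read_range  # callable(start, length) -> bytes
        self._base = base
        # None means "not loaded yet"; nothing is read at construction time.
        self._cache: List[Optional[int]] = [None] * count
        self.bytes_read = 0

    def offset(self, i: int) -> int:
        cached = self._cache[i]
        if cached is None:
            # One small ranged read for this entry instead of the whole index.
            raw = self._read_range(self._base + i * self.ENTRY_SIZE,
                                   self.ENTRY_SIZE)
            self.bytes_read += len(raw)
            cached = struct.unpack("<Q", raw)[0]
            self._cache[i] = cached
        return cached


# Stand-in for a remote file: a 32-byte header followed by 1000 offsets.
blob = b"\x00" * 32 + b"".join(struct.pack("<Q", i * 10) for i in range(1000))


def read_range(start: int, length: int) -> bytes:
    return blob[start:start + length]


idx = LazyOffsetIndex(read_range, base=32, count=1000)
wanted = idx.offset(5)  # triggers one 8-byte read
again = idx.offset(5)   # served from the in-memory cache
```

In a real driver `read_range` would be an HTTP range request, and it would probably pay to fetch a block of neighbouring entries per request rather than a single 8-byte value.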
On Thu, 31 Oct 2019 at 02:17, Björn Harrtell <bjorn.harrt...@gmail.com> wrote:

> Thanks for trying out accessing FlatGeobuf via http.
>
> For the record I've been slightly aware of this particular efficiency
> problem and I aim to improve it when I can get to it, because this is a use
> case where I definitely want FlatGeobuf to take first place. :)
>
> /Björn
>
> On Thu, 24 Oct 2019 at 20:05, Even Rouault <even.roua...@spatialys.com> wrote:
>
>> On Thursday 24 October 2019 17:42:23 CEST Rahkonen Jukka (MML) wrote:
>> > Hi,
>> >
>> > I was experimenting with accessing some vector files through http (the
>> > same data as FlatGeoBuffers, GeoPackage, and shapefile). The file size in
>> > each format was about 850 MB and the amount of data was about 240000
>> > linestrings. I made an ogrinfo request with a spatial filter that selects
>> > one feature and checked the number of http requests and the amount of
>> > requested data.
>> >
>> > FlatGeoBuffers
>> > 19 http requests
>> > 33046509 bytes read
>>
>> Looking at the debug log, FlatGeoBuf currently loads the whole
>> index-of-features array ("Reading feature offsets index"), which accounts
>> for 32.7 MB of the above 33 MB. This could probably be avoided by only
>> loading the offsets of the selected features. The shapefile driver had the
>> same issue a few years ago, and this was fixed by initializing the offset
>> array to zeroes and loading the offsets on demand when needed.
>>
>> > If somebody really finds a use case for reading vector data from the web
>> > it seems obvious that having a possibility to cache and re-use the
>> > spatial index would be very beneficial. I can imagine that with shapefile
>> > it would mean downloading the .qix file, with GeoPackage reading the
>> > contents of the rtree index table, and with FlatGeoBuffers probably
>> > extracting the Static packed Hilbert R-tree index.
>>
>> A general caching logic in /vsicurl/ would be preferable (although the
>> download of the 'data' part of files might potentially evict the indexes,
>> having a dedicated logic in each driver to tell which files or regions of
>> the files should be cached would be a bit annoying). Basically, doing a
>> HEAD request on the file to get its last update date, and keeping a local
>> cache of downloaded pieces, would be a more general solution.
>>
>> Even
>>
>> --
>> Spatialys - Geospatial professional services
>> http://www.spatialys.com
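
The general caching idea at the end of the quoted thread (check the file's last update date with a HEAD request, keep downloaded pieces locally) can be sketched roughly as follows. This is only an illustration in Python, not how /vsicurl/ is implemented. The class name `ValidatedRangeCache` and the injected `head` and `get_range` callables are hypothetical, and the remote file is faked in memory so the example runs without a network.

```python
from typing import Callable, Dict, List, Tuple


class ValidatedRangeCache:
    """Cache of downloaded byte ranges, revalidated by last-modified stamp.

    Hypothetical helper: `head(url)` returns the file's last update date
    (for example the Last-Modified header of a HEAD response) and
    `get_range(url, start, length)` performs the ranged download.
    """

    def __init__(self, head: Callable[[str], str],
                 get_range: Callable[[str, int, int], bytes]) -> None:
        self._head = head
        self._get_range = get_range
        self._stamp: Dict[str, str] = {}
        self._chunks: Dict[Tuple[str, int, int], bytes] = {}

    def open(self, url: str) -> None:
        """Revalidate once per open; drop cached pieces if the file changed."""
        stamp = self._head(url)
        if self._stamp.get(url) != stamp:
            self._chunks = {k: v for k, v in self._chunks.items()
                            if k[0] != url}
            self._stamp[url] = stamp

    def read(self, url: str, start: int, length: int) -> bytes:
        key = (url, start, length)
        if key not in self._chunks:
            self._chunks[key] = self._get_range(url, start, length)
        return self._chunks[key]


# In-memory stand-in for the remote server (assumption for the demo only).
remote = {"stamp": "Thu, 24 Oct 2019 17:42:23 GMT", "data": bytes(range(256))}
fetches: List[Tuple[int, int]] = []


def fake_head(url: str) -> str:
    return remote["stamp"]


def fake_get_range(url: str, start: int, length: int) -> bytes:
    fetches.append((start, length))  # record every simulated download
    return remote["data"][start:start + length]


url = "https://example.com/data.fgb"  # placeholder URL
cache = ValidatedRangeCache(fake_head, fake_get_range)
cache.open(url)
first = cache.read(url, 16, 4)   # downloaded
second = cache.read(url, 16, 4)  # served from the cache
```

The sketch keys the cache on exact (start, length) pairs and never evicts anything. A usable version would need fixed-size blocks and an eviction policy, which is where the concern about 'data' reads pushing out the index pieces comes in.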
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev