More specifically, I'm guessing the MOST efficient approach is to read whole blocks from one band at a time, which presumably avoids crossing compression boundaries, etc.
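To make the access-pattern argument in this thread concrete, here is a minimal C++ sketch that counts how often a whole band would have to be decoded under the two read orders. It does not call GDAL: `BandCacheModel`, `decodes_line_by_line` and `decodes_band_by_band` are hypothetical names, and the least-recently-used eviction and the "capacity in bands" parameter are modelling assumptions, not a description of what the GRIB driver actually does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical model, not GDAL code: a cache holding up to `capacity`
// fully decoded bands, evicting the least recently used one (assumption).
class BandCacheModel {
public:
    explicit BandCacheModel(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Touch one band; counts a decode when the band is not cached.
    void access(int band) {
        auto it = std::find(cached_.begin(), cached_.end(), band);
        if (it != cached_.end()) {
            cached_.erase(it);  // hit: re-inserted at the front below
        } else {
            ++decodes_;  // miss: the whole band has to be decoded
            if (cached_.size() == capacity_) cached_.pop_back();
        }
        cached_.push_front(band);
    }

    std::size_t decodes() const { return decodes_; }

private:
    std::size_t capacity_;
    std::size_t decodes_ = 0;
    std::list<int> cached_;  // most recently used band at the front
};

// All bands of line 0, then all bands of line 1, and so on.
std::size_t decodes_line_by_line(int bands, int lines, std::size_t capacity) {
    BandCacheModel cache(capacity);
    for (int y = 0; y < lines; ++y)
        for (int b = 0; b < bands; ++b) cache.access(b);
    return cache.decodes();
}

// All lines of band 0, then all lines of band 1, and so on.
std::size_t decodes_band_by_band(int bands, int lines, std::size_t capacity) {
    BandCacheModel cache(capacity);
    for (int b = 0; b < bands; ++b)
        for (int y = 0; y < lines; ++y) cache.access(b);
    return cache.decodes();
}
```

With 49 bands, 1000 lines and room for 6 decoded bands (roughly 100 MB divided by about 14.4 MB per 1800 x 1000 Float64 band, rounded down), the line-by-line order costs 49,000 decodes in this model against 49 for band-by-band. At about 26 ms per decode that is roughly 21 minutes, which is in line with the import time reported in the original message.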
A block may be a scanline, or it may be a tile, depending on the format.

On Mon, Nov 1, 2021 at 2:29 PM Simon Eves <[email protected]> wrote:

> Band by band makes sense. I shall do that instead. Thank you! :)
>
> On Mon, Nov 1, 2021 at 2:05 PM Even Rouault <[email protected]>
> wrote:
>
>> Yes regarding multithreading. Regarding GRIB and performance issues, you
>> must be aware that the GRIB driver when accessing a single pixel of a band
>> needs to decompress data for the whole band. Hence there's a per-dataset
>> cache of band data which defaults to 100 MB (you can increase it by setting
>> the GRIB_CACHEMAX config option to a number in megabytes). So the most
>> performant access pattern for GRIB is to read band per band, and not
>> all-bands-of-a-line
>>
>> On 01/11/2021 at 21:58, Simon Eves wrote:
>>
>> You can ignore this.
>>
>> I have rather belatedly found the documentation that says that one must
>> open a GDALDataset per thread, even if it's on the same file.
>>
>> The multi-threading now works just fine.
>>
>> Interestingly, we're not actually doing that with our existing geo
>> importer. I guess it's OK because we're pulling the OGRFeatures out with
>> the process thread, and only converting and loading them with the child
>> threads. I guess I really ought to rewrite that code too now. Sigh.
>>
>> As you were...
>>
>> Simon
>>
>> On Sun, Oct 31, 2021 at 4:27 PM Simon Eves <[email protected]>
>> wrote:
>>
>>> We are writing a raster importer, and finding that
>>> GDALRasterBand::RasterIO() is unexpectedly slow for some GRIB2 files.
>>>
>>> We have a file which is about 1800x1000 pixels, with 49 bands of type
>>> DOUBLE. The file is about 47MB on disc.
>>>
>>> Reading all the bands of a single scanline from this file takes about
>>> 1300ms, which is about 26ms per band, hence the entire file takes around 20
>>> minutes to import. All the time seems to be spent in the RasterIO() call,
>>> even though it's not doing any raster rescaling or data format conversion
>>> (1:1 pixels, fetching as GDT_Float64).
>>>
>>> So, I figured we'd try multi-threading it, but evidently the call is not
>>> thread-safe. Here is just one of various stack traces it will throw.
>>>
>>> libc.so.6!raise (Unknown Source:0)
>>> libc.so.6!abort (Unknown Source:0)
>>> libc.so.6![Unknown/Just-In-Time compiled code] (Unknown Source:0)
>>> libgdal.so.28!GRIBRasterBand::UncacheData(GRIBRasterBand * const this)
>>> (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:948)
>>> libgdal.so.28!GRIBRasterBand::LoadData(GRIBRasterBand * const this)
>>> (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:730)
>>> libgdal.so.28!GRIBRasterBand::LoadData(GRIBRasterBand * const this)
>>> (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:697)
>>> libgdal.so.28!GRIBRasterBand::IReadBlock(GRIBRasterBand * const this,
>>> int nBlockYOff, void * pImage)
>>> (/build/scripts/gdal-3.2.2/frmts/grib/gribdataset.cpp:803)
>>> libgdal.so.28!GDALRasterBand::GetLockedBlockRef(int bJustInitialize, int
>>> nYBlockOff, int nXBlockOff, GDALRasterBand * const this)
>>> (/build/scripts/gdal-3.2.2/gcore/gdal_priv.h:963)
>>> libgdal.so.28!GDALRasterBand::GetLockedBlockRef(GDALRasterBand * const
>>> this, int nXBlockOff, int nYBlockOff, int bJustInitialize)
>>> (/build/scripts/gdal-3.2.2/gcore/gdalrasterband.cpp:1238)
>>> libgdal.so.28!GDALRasterBand::IRasterIO(GDALRasterBand * const this,
>>> GDALRWFlag eRWFlag, int nXOff, int nYOff, int nXSize, int nYSize, void *
>>> pData, int nBufXSize, int nBufYSize, GDALDataType eBufType, GSpacing
>>> nPixelSpace, GSpacing nLineSpace, GDALRasterIOExtraArg * psExtraArg)
>>> (/build/scripts/gdal-3.2.2/gcore/rasterio.cpp:149)
>>> libgdal.so.28!GDALRasterBand::RasterIO(GDALRasterBand * const this,
>>> GDALRWFlag eRWFlag, int nXOff, int nYOff, int nXSize, int nYSize, void *
>>> pData, int nBufXSize, int nBufYSize, GDALDataType eBufType, GSpacing
>>> nPixelSpace, GSpacing nLineSpace, GDALRasterIOExtraArg * psExtraArg)
>>> (/build/scripts/gdal-3.2.2/gcore/gdalrasterband.cpp:372)
>>> import_export::Importer::<lambda(size_t, int)>::operator()(size_t, int)
>>> const(const import_export::Importer::<lambda(size_t, int)> * const
>>> __closure, const size_t thread_id, const int y)
>>> (/home/simon.eves/work/omniscidb-internal/ImportExport/Importer.cpp:5721)
>>> ...
>>>
>>> All of the parameters to the call are either constant or uncontended
>>> simple variables, and obviously there is a unique data buffer (pData) per
>>> thread.
>>>
>>> Is there anything we can do to make this work?
>>>
>>> I was intending to look into the lower level block-based API, in the
>>> hope that it will be faster, but have not yet done so.
>>>
>>> This is all with a local static build of GDAL 3.2.2 on Ubuntu 20.04 with
>>> GCC 9.
>>>
>>> Yours,
>>>
>>> Simon Eves
>>>
>>> --
>>> <http://www.omnisci.com/>
>>> Simon Eves
>>> Senior Graphics Engineer, Rendering Group
>>> 100 Montgomery St (5th Floor), San Francisco, CA 94104, USA
>>>
>>> Email: [email protected] | Cell: +1 (415) 902-1996
>>
>> -- http://www.spatialys.com
>> My software is free, but my time generally not.

--
<http://www.omnisci.com/>
Simon Eves
Senior Graphics Engineer, Rendering Group
100 Montgomery St (5th Floor), San Francisco, CA 94104, USA

Email: [email protected] | Cell: +1 (415) 902-1996
_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
