Hi Sean,

I initialise dataTmp using CPLMalloc in the C++:

T* dataTmp = (T*)CPLMalloc(sizeof(T) * (this->dspRastXSize * this->dspRastYSize));

It is then freed using CPLFree once I have finished with it. I do re-malloc it for each band we want to read in a raster, so that adds some slowdown that could probably be avoided. In the Python, dataTmp is initialised by ReadAsArray.

I have taken a look at rasterio and was minded to use it, but we want to keep additional dependencies to an absolute minimum, so the C++ is being implemented with a check to see whether it is available; if not, the program will use the current method.

>Hi Gareth,
>
>How are you initializing your dataTmp array? In my Rasterio project, I've
>found that numpy.empty() is the fastest array allocator and use it whenever
>possible. I also use the GDAL C API and Cython (in case you're interested:
>https://github.com/mapbox/rasterio/blob/c80b568903ef7b902ce6254a42c73af9ddcc8362/rasterio/_io.pyx#L58-L69)
>and find the performance to be as good as ReadAsArray.
>
>On Wed, May 11, 2016 at 10:53 AM, Gareth James Jones [gjj12] <[email protected]> wrote:
>
>> I'm currently writing optimisations for a raster viewer program which uses
>> gdal as its base. It's currently written purely in python, and has some
>> major speed issues which cause problems when we are reading many files at a
>> time. After making some optimisations in the python, and getting quite a
>> minimal speed increase, I proceeded to profile the program quite heavily
>> and found that our getImage method was our slowest call. I had already
>> performed some optimisations on this function so decided to write a
>> C-Extension so that we could get some speed increases through a lower level
>> language.
>>
>> This has worked for the most part, however there is still one issue: we
>> have found a speed increase of ~2s for some of our larger files in the bulk
>> of the code, but this is negated by the GDALRasterIO call, which is
>> actually about 3s slower than the python ReadAsArray.
>>
>> This doesn't make any sense to me as ReadAsArray is a wrapper around a C++
>> call to GDALRasterIO, and thus should be slower than having a call straight
>> to GDALRasterIO.
>>
>> I was hoping someone here might know of a way to read the rasters more
>> efficiently. I have tried to implement a method using ReadBlock rather than
>> RasterIO, but due to the replication that RasterIO does it didn't work at
>> all. (I'm currently trying to figure out a way to do that replication
>> without losing too much speed).
>>
>> The RasterIO call I'm using is
>>
>> band->RasterIO(GF_Read, this->ovleft, this->ovtop, this->ovxsize,
>> this->ovysize, dataTmp, this->ovxsize, this->ovysize,
>> band->GetRasterDataType(), 0, 0);
>>
>> The old python call was:
>>
>> dataTmp = band.ReadAsArray(ovleft, ovtop,
>> ovxsize, ovysize,
>> dspRastXSize, dspRastYSize)
>>
>>
>> Thanks in advance
>>
>> Gareth Jones
>>
>> _______________________________________________
>> gdal-dev mailing list
>> [email protected]
>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>--
>Sean Gillies

Gareth Jones
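
A note on the two calls quoted above: the Python call passes dspRastXSize and dspRastYSize as the output buffer size, while the C++ call passes ovxsize and ovysize, so as written the two do not request the same resampling, which may account for some of the timing difference. The replication that RasterIO performs when the buffer size differs from the window size can be sketched in pure Python as below. This is an illustrative sketch only, not GDAL's implementation: the function name replicate_nearest, the flat row-major list layout, and the pixel-centre rounding rule are all assumptions made for the example.

```python
def replicate_nearest(src, win_w, win_h, buf_w, buf_h):
    """Nearest-neighbour resample of a row-major win_w x win_h window
    into a row-major buf_w x buf_h buffer (hypothetical helper)."""
    if len(src) != win_w * win_h:
        raise ValueError("src length does not match the window size")
    if buf_w <= 0 or buf_h <= 0:
        raise ValueError("buffer dimensions must be positive")

    # Map each buffer column to a source column once, using the centre of
    # the buffer pixel (assumed rounding rule), clamped to the window.
    cols = [min(win_w - 1, int((x + 0.5) * win_w / buf_w))
            for x in range(buf_w)]

    out = []
    for y in range(buf_h):
        # Same mapping for rows; row_start is the offset of the source row.
        src_row = min(win_h - 1, int((y + 0.5) * win_h / buf_h))
        row_start = src_row * win_w
        out.extend(src[row_start + c] for c in cols)
    return out


# Upsampling a 2 x 2 window to 4 x 4 repeats each source pixel.
upsampled = replicate_nearest([1, 2, 3, 4], 2, 2, 4, 4)
# Downsampling a 4 x 4 window to 2 x 2 keeps one pixel per 2 x 2 cell.
decimated = replicate_nearest(list(range(16)), 4, 4, 2, 2)
```

Precomputing the column mapping once per call, rather than once per pixel, keeps the inner loop to a plain index lookup; the same idea would carry over to a C++ loop over blocks returned by ReadBlock.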
