No comments on the below anyone? It does seem like a genuine bug that gdal_calc.py can produce output with statistics not up to date?
Should I create an issue in the issue tracker? -- Patrick. On Thu, Jun 19, 2014 at 2:45 PM, Patrick Sunter <[email protected]> wrote: > Hi there, > > I've been using gdal_calc.py as part of a workflow where I need to compute > the average of several raster files and save as a new output. (Using a > binary build of GDAL, v 1.10.0, on Mac Os X). > > It is a fairly standard calc operation (e.g. "(A+B+C)/3") but because the > original data are GeoTiff's storing data in a Byte format, I first create > VRT files referencing these with data type set to be Float32, then perform > the calc operation with these VRT files as input. This is to prevent > 'overflow' problems in the midst of the averaging operation given the size > limit of the Byte format (see > http://gis.stackexchange.com/questions/33152/why-do-i-get-different-results-with-gdal-calc-within-the-osgeo4w-shell-and-qgis > ) > > The output GeoTiff created by gdal_calc.py seems to be have correct values > at every point - but the saved statistics metadata are wrong - > specifically, the minimum value. > > An example is doing a gdalinfo -stats on one of these files, I get the > following: > Min=21.000 Max=128.000 > Minimum=21.000, Maximum=128.000, Mean=81.692, StdDev=29.397 > > but I know the minimum value should be 0, and this is confirmed by > inspection in QGIS. I also wrote a script to open up the raster in Python > and run the following queries: > > print "GetStatistics(): " + str(band_in.GetStatistics(True, True)) > print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax()) > print "Hand calc. min: " + str(min(map(min, raster_values))) > > I get: > GetStatistics(): [21.0, 128.0, 81.692344139650999, 29.397484687917] > ComputeRasterMinMax(): (0.0, 120.0) > Hand calc. min: 0 > > So you can see that ComputeRasterMinMax() is returning the right values, > as well as my manual calc using Python's min function. > > As a user I presume its reasonable for the output of a gdal_calc.py to > have latest stats stored correctly on any output it produces? So what is > going wrong? > > Maybe its something to do with doing a calc on a set of VRT files using a > different type to the original. This is also a case where the output file > already exists. > > Looking at the gdal_calc.py I guess a simple fix is to force an update of > statistic values using ComputeRasterMinMax() before closing the newly > created file, but maybe there is a better solution. > > cheers, Patrick. > > BTW example gdal_calc.py command: > gdal_calc.py -A > ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_45_00-BZEV1.vrt > -B > ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_50_00-BZEV1.vrt > -C > ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_55_00-BZEV1.vrt > -D > ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1.vrt > -E > ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_05_00-BZEV1.vrt > -F > ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_10_00-BZEV1.vrt > -G > ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_15_00-BZEV1.vrt > --outfile=./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1-avg7.tiff > --calc="(A+B+C+D+E+F+G)/7" --type=Byte --overwrite > > And full testing script on output: > #!/usr/bin/env python > from osgeo import ogr > from osgeo import gdal > > > in_file="./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/CHADSTONE-22_00_00-BZEV1-avg7.tiff" > > ds_in = gdal.Open(in_file) > band_in = ds_in.GetRasterBand(1) > xsize_in = band_in.XSize > ysize_in = band_in.YSize > raster_values = band_in.ReadAsArray(0, 0, xsize_in, ysize_in) > > print "GetStatistics(): " + str(band_in.GetStatistics(True, True)) > print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax()) > print "Hand calc. min: " + str(min(map(min, raster_values))) > > I can provide some sample input if needed for testing too. > >
_______________________________________________ gdal-dev mailing list [email protected] http://lists.osgeo.org/mailman/listinfo/gdal-dev
