Re: [gdal-dev] GDAL 2.3.1 on linux disregarding nodata values

Craig Bruce Fri, 06 Jul 2018 13:11:58 -0700

On 07/06/2018 07:14 AM, Even Rouault wrote:

  NoData Value=3.40282346638529011e+38
  Metadata:
    STATISTICS_MAXIMUM=3.4028234663853e+38

ok, the issue is that nodata value (stored as text in GeoTIFF) is slightly 
above the maximum value of a float32. Presumably due to rounding issues when 
formatting it. I've pushed a fix to detect that situation and clamp it to the 
max value of float32. A bit strange that the issue wasn't found on Windows 
though.

A wider issue is that if the NoData value is not a fully-representable integer, you can't expect a direct comparison to work every time. You also can't expect the text-encoded NoData value to match the binary value exactly, since people do silly things like truncate the precision or generate 'float' values using 'double' computations. The following C program:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <float.h>

int main()
{
    float f = (float) atof("3.40282346638529011e+38");
    double d = 3.40282346638529011e+38;
    long double e = 3.40282346638529011e+38L;

    printf( "float: %.18g (%.18g, eq=%d)\n", f, (float) FLT_MAX, f == FLT_MAX );
    printf( "double: %.18g (%.18g)\n", d, (double) (float) FLT_MAX );
    printf( "ldouble: %.19Lg (%.19Lg)\n", e, powl(2, 128) - 1 );
    return( 0 );
}

produces the output on Linux/GCC of:

float: 3.4028234663852886e+38 (3.4028234663852886e+38, eq=1)
double: 3.40282346638529011e+38 (3.4028234663852886e+38)
ldouble: 3.40282346638529011e+38 (3.402823669209384635e+38)

This means the given NoData value doesn't actually overflow the 'float' representation: converting it to 'float' yields FLT_MAX, and I get equality when executing the comparison using 'float's. The given value here is presumably meant to be 2^128-1, though it doesn't match this at any precision. The text is exactly representable as a 'double' but not as a 'float', which indicates some kind of tortured origin story.

I don't know how GDAL does the comparison, but it needs to be done carefully. In general, 'float's only keep 6 digits of accuracy when converting from decimal → binary → decimal and need 9 decimal digits to represent every conversion of binary → decimal → binary. I do comparisons in my own code using 'double's after running the NoData value through 'float' and back again in this case and then use a comparison tolerance of ±(NoData × 1e-7) for 'float' samples (maybe it should be 1e-6 to handle more arbitrary precision truncations).
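
The comparison described above could be sketched roughly as follows. This is only an illustration, not code from GDAL or from any existing library: the names nodata_at_float_precision, is_nodata_float and rel_tol are hypothetical, and it assumes IEEE 754 'float'/'double' arithmetic, so that a value marginally above FLT_MAX rounds to FLT_MAX when converted to 'float'.

```c
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse the text-encoded NoData value, round it to
   'float' precision, and widen it back to 'double' for later comparisons.
   Assumes IEEE 754 conversion behaviour (round to nearest). */
static double nodata_at_float_precision(const char *nodata_text)
{
    return (double) (float) strtod(nodata_text, NULL);
}

/* Hypothetical helper: treat 'sample' as NoData when it lies within
   +/-(|nodata| * rel_tol) of the NoData value, comparing in 'double'. */
static bool is_nodata_float(float sample, double nodata, double rel_tol)
{
    double tol = fabs(nodata) * rel_tol;
    return fabs((double) sample - nodata) <= tol;
}
```

With rel_tol = 1e-7, a sample equal to FLT_MAX matches nodata_at_float_precision("3.40282346638529011e+38"). Passing 1e-6 instead loosens the match for NoData text that was truncated more heavily.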

With the available information, if GDAL did the comparison using 'float's, then (float) STATISTICS_MAXIMUM == FLT_MAX == (float) 3.40282346638529011e+38, and the NoData value would be detected properly. If, OTOH, GDAL is representing NoData as 'double' and comparing this directly to 'float' samples, then the values wouldn't match, because the NoData text value does not match (double) FLT_MAX.
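
The two cases can be contrasted with a small sketch. The function names matches_as_float and matches_as_double are hypothetical and say nothing about how GDAL actually performs the test; IEEE 754 'float' and 'double' are assumed.

```c
#include <stdbool.h>

/* Compare at 'float' precision: the NoData value is first rounded to
   'float', so text slightly above FLT_MAX collapses onto FLT_MAX. */
static bool matches_as_float(float sample, double nodata)
{
    return sample == (float) nodata;
}

/* Compare at 'double' precision: the sample is widened exactly, so any
   extra digits carried by the 'double' NoData value break the equality. */
static bool matches_as_double(float sample, double nodata)
{
    return (double) sample == nodata;
}
```

For a sample equal to FLT_MAX and a NoData value of 3.40282346638529011e+38, matches_as_float() returns true while matches_as_double() returns false, which is consistent with the program output above.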

--
Dr. Craig S. Bruce
Senior Software Developer
CubeWerx Inc.
http://www.cubewerx.com

_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev