Hi,

Following a recent discussion on using PREDICTOR=2 with COMPRESS=DEFLATE in TIFF, I've implemented in trunk a trick suggested by Adobe in the TIFF specification to improve the effectiveness of horizontal prediction (which encodes the difference between consecutive pixels rather than their values).
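For readers who have not looked at horizontal prediction closely, here is a minimal sketch of the idea in Python. It is only an illustration for a single 8-bit band (in a real TIFF the differencing is done per sample within each row, and libtiff does it for you); the function names are mine, not anything from GDAL or libtiff.

```python
def horizontal_predict(row):
    """Keep the first sample, then store each sample as the difference
    from its left neighbour, modulo 256 (8-bit samples assumed)."""
    out = [row[0]]
    for prev, cur in zip(row, row[1:]):
        out.append((cur - prev) % 256)
    return bytes(out)


def horizontal_unpredict(encoded):
    """Inverse of horizontal_predict: a running sum modulo 256."""
    out = [encoded[0]]
    for diff in encoded[1:]:
        out.append((out[-1] + diff) % 256)
    return bytes(out)


row = bytes([100, 101, 103, 106, 106, 105])
encoded = horizontal_predict(row)
print(list(encoded))  # [100, 1, 2, 3, 0, 255]
assert horizontal_unpredict(encoded) == row
```

On smooth imagery the differences are small and repetitive, which is what DEFLATE likes. Noise in the low-order bits makes those differences less repetitive, which is why clearing those bits before prediction can help.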
The DISCARD_LSB=nbit creation option is an initial *lossy* compression step that discards the nbit least-significant bits of the pixel values. A different value can be specified per band with nbit_band1,nbit_band2,...nbit_bandN. A more practical view of this is that it decreases the number of colors per channel.

For example:

    gdal_translate world.topo.bathy.200406.3x21600x21600.C1.png out_lsb1.tif \
        -co tiled=yes -co compress=deflate -co predictor=2 -co discard_lsb=1

    gdal_translate world.topo.bathy.200406.3x21600x21600.C1.png out_lsb213.tif \
        -co tiled=yes -co compress=deflate -co predictor=2 -co discard_lsb=2,1,3

Resulting file sizes, in bytes, on the above-mentioned RGB BMNG tile (21600x21600 pixels):

    world.topo.bathy.200406.3x21600x21600.C1.png: 484 696 919
    out_lsb000.tif: 467 791 323 (i.e. lossless compression)
    out_lsb111.tif: 352 895 108
    out_lsb213.tif: 286 368 793
    out_lsb222.tif: 259 788 627
    out_lsb324.tif: 210 505 787
    out_lsb333.tif: 184 807 316
    out_lsb334.tif: 177 060 429

--> discard_lsb=1 gives nearly undetectable visual degradation.

--> discard_lsb=2,1,3: the rationale for this one is that the human eye is mostly sensitive to luminance, and in the usual computation of luminance from the red, green and blue channels, green has a weight of 72%, red 21% and blue 7%. So we discard more red bits than green bits, and more blue bits than red. Very good result overall. Some tiny artifacts can be seen in the blue gradients in the oceanic areas when looking closely.

--> the more you increase the number of discarded bits, the more artifacts appear in the blue gradients. Quality on land areas remains quite good.

To be compared with JPEG compression (quality of 95% and 90%, YCbCr 4:2:0):

    out_jpeg_95_ycbcr.tif: 108 487 980
    out_jpeg_90_ycbcr.tif: 72 054 360

So JPEG compression is more efficient and doesn't exhibit the issue with blue gradients, but it shows the typical JPEG artifacts around high-frequency content.
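If you want to get a feel for the effect without a 21600x21600 tile, the following Python sketch clears low-order bits on a small synthetic band, applies horizontal differencing, and deflates the result with zlib. Everything here is an assumption made for illustration: the synthetic gradient-plus-noise data, the helper names, and in particular the choice to replace the cleared bits with their midpoint when nbits > 1 (one simple way to keep the error small; it is not claimed to be what the GDAL code does). The sizes it prints say nothing about real imagery.

```python
import random
import zlib


def discard_lsb(values, nbits):
    """Clear the nbits lowest bits of each 8-bit value.

    Assumption of this sketch: for nbits > 1 the cleared bits are replaced
    by their midpoint, so the value moves by at most 2**(nbits - 1).
    """
    if nbits == 0:
        return bytes(values)
    mask = 0xFF & ~((1 << nbits) - 1)
    offset = (1 << (nbits - 1)) if nbits > 1 else 0
    return bytes((v & mask) | offset for v in values)


def horizontal_diff(row):
    """Single-band horizontal differencing, modulo 256."""
    return bytes([row[0]] + [(b - a) % 256 for a, b in zip(row, row[1:])])


def deflated_size(rows, nbits):
    """Size in bytes after bit discarding, differencing and DEFLATE."""
    payload = b"".join(horizontal_diff(discard_lsb(r, nbits)) for r in rows)
    return len(zlib.compress(payload, 6))


def synthetic_rows(width=256, height=64, seed=0):
    """Hypothetical test data: a smooth gradient with a little noise."""
    rng = random.Random(seed)
    return [
        bytes(min(255, max(0, x // 2 + y + rng.randint(-2, 2)))
              for x in range(width))
        for y in range(height)
    ]


rows = synthetic_rows()
for nbits in (0, 1, 2, 3):
    worst = max(abs(a - b)
                for r in rows
                for a, b in zip(r, discard_lsb(r, nbits)))
    print(f"nbits={nbits} deflated={deflated_size(rows, nbits)} max_error={worst}")
```

For an RGB image the same function would simply be called with a different nbits for each band, as in discard_lsb=2,1,3.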
The advantage of DISCARD_LSB is that you have a guarantee on the error: it cannot exceed 2^(nbits-1) (and the mean error should be half of that for evenly distributed values). It can also be used with RGBA images, where JPEG YCbCr in TIFF cannot be used.

Important note: this is applied only on the compression side and doesn't change the encoding scheme. So the output is 100% compatible with any reader that supports DEFLATE + PREDICTOR.

Theoretically, this could be enhanced to do adaptive compression per tile (or even within a tile), by adjusting the number of discarded bits depending on the sensitivity of the eye to the content. JPEG-in-TIFF, by contrast, doesn't allow this (quantization tables are common to the whole file).

Happy experimenting!

Even

--
Spatialys - Geospatial professional services
http://www.spatialys.com
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
