Gary Lucas created IMAGING-259:
----------------------------------
Summary: Enhance TIFF DataReaders speed for compressed RGB
Key: IMAGING-259
URL: https://issues.apache.org/jira/browse/IMAGING-259
Project: Commons Imaging
Issue Type: Improvement
Components: Format: TIFF
Reporter: Gary Lucas
TIFF files support many different formats, some of them legacy or specialty
formats, others that are widely used. DataReaderStrips and DataReaderTiled were
originally written with a single block of code that collected the raw data
(samples) for each pixel and then passed it into a single method that branched
depending on the format. As a result, the reader loops paid for a method call
and several conditional evaluations on every pixel. In 2012, Commons Imaging
was enhanced to execute dedicated blocks of code for a few commonly used
formats, most notably 3-byte RGB. However, the dedicated code does not
currently support the case where the RGB data is stored with a differencing
predictor. Predictors improve the compression ratios (often significantly) when
compressing RGB images. I therefore propose enhancing the dedicated RGB code to
support predictors.
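For readers unfamiliar with the predictor, here is a minimal sketch of what a
dedicated RGB loop has to do when the TIFF horizontal differencing predictor
(Predictor tag value 2) is in use. It assumes 8-bit interleaved RGB samples
that have already been decompressed into one byte array per row. The class and
method names (RgbPredictorSketch, decodeRow) are hypothetical and are not taken
from the Commons Imaging source.

```java
/** Hypothetical illustration only; not part of Commons Imaging. */
final class RgbPredictorSketch {

    /**
     * Decodes one row of interleaved 8-bit RGB samples that were stored
     * with horizontal differencing.
     *
     * @param row   decompressed bytes for one row, 3 bytes per pixel
     * @param width number of pixels in the row
     * @return packed 0xAARRGGBB pixels with alpha forced to opaque
     */
    static int[] decodeRow(byte[] row, int width) {
        int[] argb = new int[width];
        // Accumulators start at zero, so the first pixel is taken as stored.
        int r = 0, g = 0, b = 0;
        for (int x = 0, k = 0; x < width; x++, k += 3) {
            // Each stored byte is the difference from the same channel of
            // the previous pixel; sums wrap modulo 256.
            r = (r + row[k]) & 0xff;
            g = (g + row[k + 1]) & 0xff;
            b = (b + row[k + 2]) & 0xff;
            argb[x] = 0xff000000 | (r << 16) | (g << 8) | b;
        }
        return argb;
    }
}
```

The point of the sketch is that the predictor adds only three additions and
masks per pixel, so it can live inside the tight format-specific loop rather
than forcing a fall back to the general-purpose per-pixel method.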
The following performance test loads a large compressed RGB image with Commons
Imaging, before and after the change:
Processing file: CONUS_LandWaterMask_LZW_RGB.tif (original)
image size: 6000 by 4000
   time to load image   --                memory
   time ms     avg ms   --    used mb   total mb
   971.817      0.000   --    213.592    252.000
   921.690      0.000   --    143.229    260.000
   895.587    895.587   --     96.234    174.000
   899.227    897.407   --    117.259    154.000
   899.078    897.964   --    134.200    184.000
   889.602    895.873   --    143.226    180.000
   896.170    895.933   --    128.183    188.000
   894.250    895.652   --     97.187    178.000
   896.436    895.764   --    103.226    186.000
   891.540    895.236   --    119.185    171.000
Processing file: CONUS_LandWaterMask_LZW_RGB.tif (with changes)
image size: 6000 by 4000
   time to load image   --                memory
   time ms     avg ms   --    used mb   total mb
   498.123      0.000   --    212.589    252.000
   423.136      0.000   --    110.733    237.000
   396.021    396.021   --    100.735    164.000
   400.435    398.228   --    115.725    160.000
   400.901    399.119   --    114.726    162.000
   395.092    398.112   --    118.711    159.000
   394.106    397.311   --    118.710    159.000
   400.866    397.903   --    118.710    159.000
   400.972    398.342   --    115.710    160.000
   397.218    398.201   --    109.691    164.000
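A note on reading the tables: the avg ms column appears to be a running mean
that skips the first two loads, presumably to exclude JVM warm-up. This is an
inference from the numbers, not something the test harness documents. The
sketch below reproduces the column under that assumption; the class and method
names are hypothetical.

```java
/** Hypothetical illustration of how the "avg ms" column can be reproduced. */
final class RunningAverageSketch {

    /**
     * Computes a running mean of the load times, ignoring the first
     * warmUp entries (their average is reported as 0).
     */
    static double[] runningAverages(double[] timesMs, int warmUp) {
        double[] avg = new double[timesMs.length];
        double sum = 0;
        for (int i = 0; i < timesMs.length; i++) {
            if (i >= warmUp) {
                sum += timesMs[i];
                avg[i] = sum / (i - warmUp + 1);
            }
        }
        return avg;
    }
}
```

Comparing the final running averages (895.236 ms before, 398.201 ms after)
gives a ratio of roughly 2.25 for this image.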
Additionally, the special-purpose RGB block of code included extra logic to
support non-RGB formats in which the image samples were organized as three
one-byte samples but the photometric interpretation was not RGB. According to
Coveralls, this logic is not exercised by any of our test images, so I will
remove it to improve the code-coverage scores. I believe this change is
appropriate because, even if there are TIFF files "in the wild" that use this
configuration, the Commons Imaging library will still work properly: the image
samples would be handled by the original, non-specialized block of code.
Furthermore, I went through the TIFF specification and did not see any obvious
examples of a case where that configuration would be likely.
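The resulting selection rule can be sketched as follows. This is only an
illustration of the decision described above, assuming the reader knows the
PhotometricInterpretation value and the BitsPerSample array; the class and
method names are hypothetical and the real readers may check further
conditions. The value 2 is the TIFF PhotometricInterpretation code for RGB.

```java
/** Hypothetical illustration only; not part of Commons Imaging. */
final class ReaderDispatchSketch {

    /** TIFF PhotometricInterpretation value for RGB. */
    static final int PHOTOMETRIC_RGB = 2;

    /**
     * Returns true when the dedicated 3-byte RGB loop applies. Any other
     * combination, including three one-byte samples with a non-RGB
     * photometric interpretation, goes to the general-purpose code.
     */
    static boolean useDedicatedRgbPath(int photometricInterpretation,
                                       int[] bitsPerSample) {
        if (photometricInterpretation != PHOTOMETRIC_RGB
                || bitsPerSample.length != 3) {
            return false;
        }
        for (int bits : bitsPerSample) {
            if (bits != 8) {
                return false;
            }
        }
        return true;
    }
}
```

For example, a file with three 8-bit samples and a YCbCr photometric
interpretation (value 6) would return false here and be decoded by the
non-specialized path.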