[jira] [Updated] (IMAGING-259) Enhance TIFF DataReaders speed for compressed RGB

Gary Lucas (Jira) Fri, 29 May 2020 06:46:14 -0700


     [ 
https://issues.apache.org/jira/browse/IMAGING-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Gary Lucas updated IMAGING-259:
-------------------------------
    Description: 
TIFF files support many different formats, some of them legacy or specialty 
formats, others widely used. DataReaderStrips and DataReaderTiled were 
originally written with a single block of code that collected the raw data 
(samples) for each pixel and then passed it to a single method that branched 
depending on the format. This approach meant that, for each pixel, the reader 
loops incurred the overhead of a method call that evaluated multiple 
conditionals. In 2012, enhancements were added to Commons Imaging to execute 
dedicated blocks of code for a few commonly used formats, most notably 3-byte 
RGB. However, the dedicated code does not currently support the case where the 
RGB data is stored with a differencing predictor. Predictors improve 
compression ratios (often significantly) when compressing RGB images, so I 
propose to enhance the dedicated RGB code to support predictors.
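
As a rough illustration of what predictor support in a dedicated RGB loop 
involves, the sketch below reverses TIFF horizontal differencing (Predictor 
value 2) for one row of 8-bit, 3-sample pixels and packs the result as ARGB 
integers inside a single loop, with no per-pixel method call. The class and 
method names (RgbPredictorSketch, unpackRow) are hypothetical and are not taken 
from the Commons Imaging source; the actual change may be organized differently.

```java
/** Hypothetical sketch; not the Commons Imaging implementation. */
final class RgbPredictorSketch {

    private RgbPredictorSketch() {
    }

    /**
     * Reverses horizontal differencing for one row of 8-bit RGB samples
     * and writes one opaque ARGB int per pixel.
     *
     * @param row    decompressed bytes for the row, 3 per pixel (R, G, B)
     * @param width  number of pixels in the row
     * @param argb   destination array
     * @param offset index in argb of the first pixel of this row
     */
    static void unpackRow(byte[] row, int width, int[] argb, int offset) {
        // Accumulators start at zero, so the first pixel is taken as stored.
        int r = 0;
        int g = 0;
        int b = 0;
        for (int x = 0, k = 0; x < width; x++, k += 3) {
            // Each stored sample is the difference from the same channel
            // of the previous pixel; the running sum wraps modulo 256.
            r = (r + row[k]) & 0xff;
            g = (g + row[k + 1]) & 0xff;
            b = (b + row[k + 2]) & 0xff;
            argb[offset + x] = 0xff000000 | (r << 16) | (g << 8) | b;
        }
    }
}
```

Because the differences are accumulated per channel within a row, the 
accumulators have to be reset at the start of every row (and of every tile row 
in the tiled case), which is why they are local to the method here.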

Here's an example of some performance testing with Commons Imaging on a large 
image that uses compression. The time to load images was extracted using the 

{noformat}
Processing file: CONUS_LandWaterMask_LZW_RGB.tif (original)
image size: 6000 by 4000

time to load image    --         memory
time ms      avg ms   --    used mb   total mb
 971.817     0.000    --    213.592   252.000 
 921.690     0.000    --    143.229   260.000 
 895.587   895.587    --     96.234   174.000 
 899.227   897.407    --    117.259   154.000 
 899.078   897.964    --    134.200   184.000 
 889.602   895.873    --    143.226   180.000 
 896.170   895.933    --    128.183   188.000 
 894.250   895.652    --     97.187   178.000 
 896.436   895.764    --    103.226   186.000 
 891.540   895.236    --    119.185   171.000 

Processing file: CONUS_LandWaterMask_LZW_RGB.tif (with changes)
image size: 6000 by 4000

time to load image    --         memory
time ms      avg ms   --    used mb   total mb
 498.123     0.000    --    212.589   252.000 
 423.136     0.000    --    110.733   237.000 
 396.021   396.021    --    100.735   164.000 
 400.435   398.228    --    115.725   160.000 
 400.901   399.119    --    114.726   162.000 
 395.092   398.112    --    118.711   159.000 
 394.106   397.311    --    118.710   159.000 
 400.866   397.903    --    118.710   159.000 
 400.972   398.342    --    115.710   160.000 
 397.218   398.201    --    109.691   164.000 
{noformat}
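
The "avg ms" column in the listing is consistent with a running mean that 
skips the first two loads as warm-up (for example, 895.587 and 899.227 average 
to 897.407). A minimal sketch of that bookkeeping follows; the class name 
LoadTimeStats and the warm-up count of 2 are assumptions inferred from the 
numbers above, not taken from the actual test program.

```java
/** Hypothetical sketch of a running average that ignores warm-up runs. */
final class LoadTimeStats {

    private final int warmUpRuns;
    private int runs;
    private double sumMs;

    LoadTimeStats(int warmUpRuns) {
        this.warmUpRuns = warmUpRuns;
    }

    /**
     * Records one load time and returns the running average in
     * milliseconds, or 0 while still within the warm-up runs.
     */
    double record(double elapsedMs) {
        runs++;
        if (runs <= warmUpRuns) {
            return 0.0;
        }
        sumMs += elapsedMs;
        return sumMs / (runs - warmUpRuns);
    }
}
```

Comparing the final averages in the listing (895.236 ms before, 398.201 ms 
after) gives a load time that is roughly 2.2 times faster with the changes.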

 

 
Additionally, the special-purpose RGB block of code included logic to support 
non-RGB formats in which image samples were organized as three one-byte 
samples but the photometric interpretation was not RGB. According to 
Coveralls, this block of code is not exercised by any of our test images, so 
that part of the code is not covered by testing. I will be removing it to 
improve the code-coverage scores. I believe that this change is appropriate 
because, even if there are TIFF files "in the wild" that use this 
configuration, the Commons Imaging library will still work properly: in such a 
case, the image samples would be handled by the original, non-specialized 
block of code. Furthermore, I went through the TIFF specification and did not 
see any obvious examples of a case where that configuration would be likely.
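
The fallback behavior described above can be sketched as a simple path 
selection: the dedicated loop is used only for 3-sample, 8-bit data whose 
photometric interpretation is RGB (TIFF tag value 2), and everything else goes 
through the general per-pixel code. The names below (ReaderPathSketch, choose, 
Path) are hypothetical and only illustrate the decision, not the actual 
structure of DataReaderStrips or DataReaderTiled.

```java
/** Hypothetical sketch of choosing between the two reader paths. */
final class ReaderPathSketch {

    enum Path { DEDICATED_RGB, GENERAL }

    /** TIFF PhotometricInterpretation value for RGB. */
    static final int PHOTOMETRIC_RGB = 2;

    private ReaderPathSketch() {
    }

    static Path choose(int photometric, int samplesPerPixel, int bitsPerSample) {
        boolean threeByteSamples = samplesPerPixel == 3 && bitsPerSample == 8;
        if (threeByteSamples && photometric == PHOTOMETRIC_RGB) {
            return Path.DEDICATED_RGB;
        }
        // Three one-byte samples with a non-RGB interpretation are still
        // read correctly, just by the slower non-specialized code.
        return Path.GENERAL;
    }
}
```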

  was:
TIFF files support many different formats, some of them legacy or specialty 
formats, others that are widely used. DataReaderStrips and DataReaderTiled were 
originally written with a single block of code that collected the raw data 
(samples) for each pixel and then passed it into a single method that branched 
depending on the format.  This approach meant that for each pixel, the reader 
loops had the extra overhead of a method call that executed multiple 
conditional evaluations. In 2012, enhancements were added to imaging to execute 
dedicated blocks of code for a few commonly used formats, most notably 3-byte 
RGB.  However, at this time, the code does not support the case where the RGB 
is stored with a differencing predictor. Predictors improve the compression 
ratios (often significantly) when compressing RGB images. So I propose to 
enhance the dedicated RGB code to support predictors.

Here's an example of some performance testing on a large image that uses 
compression with imaging

Processing file: CONUS_LandWaterMask_LZW_RGB.tif (original)
image size: 6000 by 4000

time to load image -- memory
time ms avg ms -- used mb total mb
 971.817 0.000 -- 213.592 252.000 
 921.690 0.000 -- 143.229 260.000 
 895.587 895.587 -- 96.234 174.000 
 899.227 897.407 -- 117.259 154.000 
 899.078 897.964 -- 134.200 184.000 
 889.602 895.873 -- 143.226 180.000 
 896.170 895.933 -- 128.183 188.000 
 894.250 895.652 -- 97.187 178.000 
 896.436 895.764 -- 103.226 186.000 
 891.540 895.236 -- 119.185 171.000

Processing file: CONUS_LandWaterMask_LZW_RGB.tif (with chamges)
 image size: 6000 by 4000

time to load image -- memory
time ms avg ms -- used mb total mb
 498.123 0.000 -- 212.589 252.000 
 423.136 0.000 -- 110.733 237.000 
 396.021 396.021 -- 100.735 164.000 
 400.435 398.228 -- 115.725 160.000 
 400.901 399.119 -- 114.726 162.000 
 395.092 398.112 -- 118.711 159.000 
 394.106 397.311 -- 118.710 159.000 
 400.866 397.903 -- 118.710 159.000 
 400.972 398.342 -- 115.710 160.000 
 397.218 398.201 -- 109.691 164.000

 
Additionally, the special-purpose RGB block of code included additional logic 
to support a case for non-RGB formats where image samples were organized 3 one 
byte samples, but the photometric interpretation was not RGB.  According to 
Coveralls, this block of code is not exercised by any of our test images.  Thus 
that part of the code is uncovered by testing. So I will be removing it to 
improve the code-coverage scores. I believe that this change is appropriate 
because, even if there are TIFF files "in the wild" that use this 
configuration, the commons imaging library will still work properly.  In such a 
case, the image samples would be handled properly by the original, 
non-specialized block of code.    Furthermore, I went through the TIFF 
specification and did not see any obvious examples of a case where that 
configuration would be likely.


> Enhance TIFF DataReaders speed for compressed RGB
> -------------------------------------------------
>
>                 Key: IMAGING-259
>                 URL: https://issues.apache.org/jira/browse/IMAGING-259
>             Project: Commons Imaging
>          Issue Type: Improvement
>          Components: Format: TIFF
>            Reporter: Gary Lucas
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
