[ 
https://issues.apache.org/jira/browse/IMAGING-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261089#comment-14261089
 ] 

Gary Lucas commented on IMAGING-126:
------------------------------------

I've put together a patch to address the main issue, the size of the output 
file.  After I do a bit more testing, I'll upload the patch.  In the meantime, 
I'm going to describe the proposed solution below.  If you find it agreeable, 
feel free to add it to the code base.  If not, let me know and I'll do 
something else.

To recap the explanation above, the basic issue is that TIFF files are written 
in blocks (either strips or tiles) of a specified size in memory.  LZW 
compression tends to work better on bigger "texts", but the larger the text, 
the more processing time it requires.  This change provides an option that 
allows an application to direct the TiffImageWriterBase class to increase the 
block size from about 8K to either 32K or 64K.

 Here's an example:
 
                HashMap<String, Object> params = new HashMap<String, Object>();

                params.put(
                    ImagingConstants.PARAM_KEY_COMPRESSION,
                    new Integer(TiffConstants.TIFF_COMPRESSION_LZW));

                params.put(
                    TiffConstants.PARAM_KEY_LZW_COMPRESSION_BLOCK_SIZE,
                    new Integer(TiffConstants.TIFF_LZW_COMPRESSION_BLOCK_SIZE_MEDIUM));

                File file = new File(tiffOutputPath);
                Imaging.writeImage(outputImage, file, ImageFormats.TIFF, params);
 
The "compression" parameter is already needed by the Imaging API, so I haven't 
introduced any new complexity to the API with that.  The new block-size 
parameter instructs the TIFF writer to use a larger block size when storing 
the file.  With the original test PNG image supplied for this tracker item, 
the unmodified API produced an output image of 1063 kilobytes.  Adding the new 
"medium" specification (32K blocks) reduces it to 506 kilobytes.  With the 64K 
block size specification, the size is reduced to 402 kilobytes.

402 kilobytes is pretty much the "target size" suggested by the original 
posting.  However, there is a cost in terms of performance.  The output 
operation takes approximately twice as long: the average time rises from 2.02 
seconds for the original 8K block size to 4.04 seconds for the 64K block size.
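
To give a feel for what the larger block sizes mean in terms of strips, here 
is a rough sketch of how the number of rows per strip could be derived from a 
target block size.  This is not the code from the patch: the class name, the 
method names, and the 1000 x 1000 RGB image are made up for illustration, and 
the 8K/32K/64K values are taken as exact powers of two.  Each strip is 
compressed independently, so fewer, larger strips give LZW more data to work 
with per block.

```java
class StripSizing {

    /**
     * Number of rows per strip such that one uncompressed strip holds
     * roughly targetBlockBytes, but never less than one row.
     */
    static int rowsPerStrip(int width, int bitsPerSample,
            int samplesPerPixel, int targetBlockBytes) {
        // Bytes in one uncompressed row, rounded up to a whole byte.
        int bytesPerRow = (width * bitsPerSample * samplesPerPixel + 7) / 8;
        return Math.max(1, targetBlockBytes / bytesPerRow);
    }

    /** Number of strips needed to cover the image (last strip may be short). */
    static int stripCount(int height, int rowsPerStrip) {
        return (height + rowsPerStrip - 1) / rowsPerStrip;
    }

    public static void main(String[] args) {
        // Hypothetical image: 1000 x 1000 pixels, 8 bits per sample, RGB.
        int width = 1000;
        int height = 1000;
        int[] blockSizes = { 8 * 1024, 32 * 1024, 64 * 1024 };
        for (int blockSize : blockSizes) {
            int rows = rowsPerStrip(width, 8, 3, blockSize);
            System.out.println(blockSize + " byte blocks: " + rows
                    + " rows per strip, " + stripCount(height, rows)
                    + " strips");
        }
    }
}
```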

One other thing.  In the original post, Tilman Hausherr provided a PNG image as 
the source image to be stored as a TIFF.  The PNG is only 50 kilobytes in size. 
Why is the TIFF so much bigger?  Basically, the size of the TIFF could 
probably be reduced to something equivalent by using a technique known as 
horizontal differencing, which is not currently implemented in our TIFF writer. 
With horizontal differencing, TIFF performs an extra step in its processing, 
taking the difference between each pixel and the pixel before it in the row 
before sending the result to the compressor.  Mr. Hausherr's original image 
features a lot of repeated pixels, so the horizontal differences are almost 
all zeroes.  If we had that feature implemented, compression would be 
improved.  However, the technique only works for certain kinds of images (it's 
not good for photographs) and the level of effort is a little out of scope for 
me at this time.
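
For reference, the differencing step itself can be sketched in a few lines. 
This is only an illustration of the idea for 8-bit samples, not code from the 
TIFF writer or from the patch; the class and method names are hypothetical. 
Each sample is replaced by its difference from the same sample of the pixel to 
its left, with byte arithmetic wrapping modulo 256, and the reader reverses 
the step by accumulating from left to right.

```java
import java.util.Arrays;

class HorizontalDifferencing {

    /** Replaces each sample with its difference from the pixel to its left. */
    static void encodeRow(byte[] row, int samplesPerPixel) {
        // Work right to left so every subtraction still sees the
        // original (not yet differenced) left neighbor.
        for (int i = row.length - 1; i >= samplesPerPixel; i--) {
            row[i] = (byte) (row[i] - row[i - samplesPerPixel]);
        }
    }

    /** Reverses encodeRow by accumulating the differences left to right. */
    static void decodeRow(byte[] row, int samplesPerPixel) {
        for (int i = samplesPerPixel; i < row.length; i++) {
            row[i] = (byte) (row[i] + row[i - samplesPerPixel]);
        }
    }

    public static void main(String[] args) {
        // One row of four RGB pixels; the first three are identical.
        byte[] row = { 10, 20, 30, 10, 20, 30, 10, 20, 30, 12, 20, 30 };
        byte[] original = row.clone();

        encodeRow(row, 3);
        System.out.println("differenced: " + Arrays.toString(row));

        decodeRow(row, 3);
        System.out.println("round trip ok: " + Arrays.equals(row, original));
    }
}
```

In a row dominated by repeated pixels, nearly every differenced sample comes 
out as zero, which is the kind of input the compressor handles best.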





> TIFF and PNG images should not be bigger than the ones created by java ImageIO
> ------------------------------------------------------------------------------
>
>                 Key: IMAGING-126
>                 URL: https://issues.apache.org/jira/browse/IMAGING-126
>             Project: Commons Imaging
>          Issue Type: Improvement
>          Components: Format: PNG, Format: TIFF
>    Affects Versions: 1.0
>         Environment: W7
>            Reporter: Tilman Hausherr
>            Priority: Minor
>             Fix For: Discussion
>
>         Attachments: Imaging_126_patch_1.patch, pdfbox-1870-devicen3-01.png, 
> pdfbox-1870-devicen3-01.tif, pdfbox-1870-devicen3.pdf-1.png, 
> pdfbox-1870-devicen3.pdf-1.tif
>
>
> I tried to use Apache Imaging for the PDFBOX project (PDFBOX-1734) because of 
> problems with setting the tiff resolution in java imageio.
> While the code is pretty nice, I found that the generated images are 
> sometimes much bigger in size than the ones generated by java imageio.
> Example:
> pdfbox-1870-devicen3-01.png 50 KB (imageio)
> pdfbox-1870-devicen3.pdf-1.png 70 KB (imaging)
> pdfbox-1870-devicen3-01.tif 401 KB (imageio)
> pdfbox-1870-devicen3.pdf-1.tif 1063 KB (imaging)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
