Emmeran Seehuber created PDFBOX-4847:
----------------------------------------

             Summary: [PATCH] Allow to access raw image data and fix ICC 
profile embedding in PNGConverter
                 Key: PDFBOX-4847
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4847
             Project: PDFBox
          Issue Type: New Feature
          Components: PDModel, Writing
    Affects Versions: 2.0.19
            Reporter: Emmeran Seehuber
         Attachments: color_difference.png, pdfbox-rawimages.patch

This patch was primarily intended to add access to raw image data (i.e. without 
any kind of color conversion/reduction). While implementing and testing it I 
also found a bug in the ICC profile embedding in the PNGConverter.

This patch does the following:
 - add a method getRawRaster() to PDImage. This allows reading the original 
raster data in 8 or 16 bit without any kind of color interpretation. Callers 
must know for themselves what they want to do with it (e.g. to access the raw 
data of DeviceN images).
 - add a method getRawImage(). It tries to return the raster obtained by 
getRawRaster() as a BufferedImage. This only succeeds if there is a matching 
Java ColorSpace for the colorspace of the image, i.e. only for ICCBased 
images. In theory this should also work for PDIndexed sRGB images, but I first 
have to find a PDF with such an image to test it.
 - add a -noColorConversion switch to the ExtractImage utility to extract 
images in their original colorspace. For CMYK images this only works when a 
TIFF encoder (e.g. from TwelveMonkeys) is in the class path.
 - add support to export PNGs with ICC profile data in ImageIOUtil.
 - fix a bug in PNGConverter, which does not correctly embed the ICC profile 
from the PNG file.
 - the PNGConverterTest tests the raw images. While reading PNG files for 
comparison, it also ensures that the embedded ICC profile is correctly 
respected. The default PNG reader, at least up to JDK 11, does *not* respect 
the embedded ICC profile, i.e. the colors are wrong. But there is a workaround 
for this in the PNGConverterTest (which I have had in production for years 
now). See the screenshot for the correct color display of the 
png_rgb_romm_16.png test file (left side; macOS Preview app) and the wrong 
display (right side; Java; inside IDEA).
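
To illustrate the idea behind getRawImage(), here is a minimal sketch that 
uses only JDK classes. It is not the code from the patch: the class name 
RawImageSketch, the helper wrapRaw() and the synthetic raster are assumptions 
made for illustration. With the patch applied, the raster would come from 
getRawRaster() and the color space from the image's ICC profile.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

final class RawImageSketch {

    /**
     * Wraps an interleaved raster in a BufferedImage without touching the
     * sample values. Returns null when the band count does not match the
     * color space, i.e. when no matching Java ColorSpace is available.
     * (Hypothetical helper, not part of PDFBox.)
     */
    static BufferedImage wrapRaw(WritableRaster raster, ColorSpace colorSpace) {
        if (raster.getNumBands() != colorSpace.getNumComponents()) {
            return null;
        }
        // The transfer type follows the raster, so 8 and 16 bit data both work.
        int dataType = raster.getDataBuffer().getDataType();
        ColorModel colorModel = new ComponentColorModel(
                colorSpace, false, false, Transparency.OPAQUE, dataType);
        return new BufferedImage(colorModel, raster, false, null);
    }

    public static void main(String[] args) {
        // Stand-in for the ICC profile of an ICCBased image (assumption:
        // the built-in sRGB profile is used here only to keep this runnable).
        ICC_Profile profile = ICC_Profile.getInstance(ColorSpace.CS_sRGB);
        ColorSpace colorSpace = new ICC_ColorSpace(profile);

        // Stand-in for the raster a raw accessor would return: 2x2, 16 bit, 3 bands.
        WritableRaster raster =
                Raster.createInterleavedRaster(DataBuffer.TYPE_USHORT, 2, 2, 3, null);
        raster.setPixel(0, 0, new int[] {65535, 32768, 0});

        BufferedImage image = wrapRaw(raster, colorSpace);
        System.out.println("wrapped: " + (image != null));
    }
}
```

The samples stay exactly as stored; only the attached ColorModel tells 
consumers how to interpret them. A 4-band raster combined with an RGB color 
space makes wrapRaw() return null, which mirrors the "only successful if 
there is a matching ColorSpace" behavior described above.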

 

Besides finding bugs like the one in the PNGConverter, access to the raw image 
also allows all kinds of fun color things. E.g. a future patch could allow 
using the raw images to print PDFs. If the PDF you want to print has images 
with a gamut > sRGB (i.e. from any modern camera) and the target printer also 
has a gamut > sRGB (e.g. an ink photo printer), you will for sure see a 
difference in the resulting print. Such a mode would be rather slow, as the 
current sRGB image handling is optimized for speed and using the original raw 
images would need on-demand color conversions in the printer driver. But you 
get "high quality" out of it (at least with respect to colors).

I don’t think this will be ready in time for the 2.0.20 release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
