[ 
https://issues.apache.org/jira/browse/PDFBOX-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134810#comment-17134810
 ] 

Emmeran Seehuber commented on PDFBOX-4847:
------------------------------------------

The attached patch [^pdfbox-image-compare.patch] implements this check. For 
some reason it fails when running with openjdk-8 on Travis (see 
[https://travis-ci.com/github/rototor/pdfbox/jobs/348736754]), but it works for 
me when running locally on oracle-jdk8. I do not know whether the openjdk-8 
used by Travis is missing a bugfix, or whether it uses a different color 
management implementation that is not correct. 

The special case for 16-bit handling in ValidateXImage.convertToSRGB() 
needs some explanation. Without this block, a 16-bit image is converted to 8 
bit by the ColorConvertOp. This "should" normally be fine, but it is not in 
this case. The reason is that going from 16 bits to 8 bits per channel always 
loses some information. "The rest of the world" usually just does 
something like ((channelValue >> 8) & 0xFF), i.e. shifts 8 bits out and is 
done. But PDFBox, specifically SampledImageReader.fromAny(), uses floats and 
rounds the result (see SampledImageReader:555 in 2.0). I see the need to use 
floats here, as you must respect the domain entry for the value range, so you 
are more or less forced to use floats. But instead of truncating the value by 
just casting it to int, Math.round() is used. You can argue whether 
Math.round() or truncation is the "right" way here, but as you always lose 
information there is IMHO no "right" way. So this whole block is only needed 
because SampledImageReader.fromAny() uses Math.round(), which can produce 
8-bit values that differ by one from the shifted ones.
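
To illustrate the difference between the two reductions, here is a minimal 
sketch. It is not the actual PDFBox or ColorConvertOp code: the class and 
method names are made up for this example, and the float variant assumes a 
plain linear mapping of [0, 65535] to [0, 255] without any decode/domain 
handling.

```java
// Simplified illustration only; names are hypothetical and this is not
// the code used by SampledImageReader or ColorConvertOp.
class SixteenToEightBit {

    // Truncation: keep the high byte and drop the low byte.
    static int byShift(int value16) {
        return (value16 >> 8) & 0xFF;
    }

    // Float scaling from [0, 65535] to [0, 255], rounded to the nearest int.
    static int byRound(int value16) {
        return Math.round(value16 / 65535f * 255f);
    }

    public static void main(String[] args) {
        int[] samples = {0x0000, 0x00FF, 0x01FF, 0xFFFF};
        for (int v : samples) {
            System.out.printf("0x%04X shift=%d round=%d%n",
                    v, byShift(v), byRound(v));
        }
    }
}
```

For 0x00FF the shift yields 0 while rounding yields 1, and for 0x01FF the 
results are 1 and 2. The end points 0x0000 and 0xFFFF map to 0 and 255 in 
both variants, so a pixel-exact comparison fails only on in-between values.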

The PNGConverterTest.getImageWithProfileData() method is required because the 
ImageIO PNG reader does not respect the color profile of the PNG when reading 
it. So you must "tag" the BufferedImage with the right color profile, which 
this method does. This bug persists at least in JDK11; I do not know whether 
it has been fixed in newer JDKs.
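
The idea of "tagging" can be sketched roughly as follows. This is not the 
code from PNGConverterTest.getImageWithProfileData(); ProfileTagger and 
tagWithProfile are hypothetical names. It assumes the ICC profile has already 
been obtained separately (e.g. from the PNG's iCCP chunk) and that the image 
uses a component raster (byte or ushort samples, one band per channel), which 
is what a ComponentColorModel requires.

```java
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.WritableRaster;

// Hypothetical helper, for illustration only.
class ProfileTagger {

    // Wraps the unchanged pixel data of src in a new BufferedImage whose
    // color model refers to the given ICC profile. No pixel is converted;
    // only the interpretation of the sample values changes.
    static BufferedImage tagWithProfile(BufferedImage src, ICC_Profile profile) {
        ColorSpace colorSpace = new ICC_ColorSpace(profile);
        ColorModel srcModel = src.getColorModel();
        WritableRaster raster = src.getRaster();

        // Assumption: the raster is compatible with a ComponentColorModel.
        // Packed int rasters (e.g. TYPE_INT_RGB) would be rejected by the
        // BufferedImage constructor below.
        ColorModel taggedModel = new ComponentColorModel(
                colorSpace,
                srcModel.hasAlpha(),
                srcModel.isAlphaPremultiplied(),
                srcModel.getTransparency(),
                raster.getDataBuffer().getDataType());

        return new BufferedImage(
                taggedModel, raster, taggedModel.isAlphaPremultiplied(), null);
    }
}
```

The returned image shares its raster with the source image, so the sample 
values are untouched. A later ColorConvertOp then starts from the tagged 
color space instead of assuming sRGB.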

> [PATCH] Allow to access raw image data and fix ICC profile embedding in 
> PNGConverter
> ------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-4847
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4847
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: PDModel, Writing
>    Affects Versions: 2.0.19
>            Reporter: Emmeran Seehuber
>            Priority: Minor
>              Labels: feature, patch
>             Fix For: 2.0.21, 3.0.0 PDFBox
>
>         Attachments: color_difference.png, pdfbox-image-compare.patch, 
> pdfbox-rawimages.patch
>
>
> This patch was primarily intended to add access to raw image data (i.e. 
> without any kind of color conversion/reduction). While implementing and 
> testing it I also found a bug with ICC profile embedding in the PNGConverter.
> This patch does the following:
>  - add a method getRawRaster() to PDImage. This allows reading the original 
> raster data in 8 or 16 bit without any kind of color interpretation. The user 
> must know what to do with this data (e.g. to access the raw data 
> of DeviceN images).
>  - add a method getRawImage(). Tries to return the raster obtained by 
> getRawRaster() as a BufferedImage. This is only successful if there is a 
> matching java ColorSpace for the colorspace of the image. I.e. only for 
> ICCBased images. In theory this also should work for PDIndexed sRGB images. 
> But I have to find a PDF with such an image first to test it.
>  - add a -noColorConversion switch to the ExtractImage utility to extract 
> images in their original colorspace. For CMYK images this only works when a 
> TIFF encoder (e.g. from TwelveMonkeys) is in the class path.
>  - add support to export PNGs with ICC profile data in ImageIOUtil.
>  - fix a bug in PNGConverter which does not correctly embed the ICC profile 
> from the png file.
>  - the PNGConverterTest tests the raw images; While reading PNG files to 
> compare it also ensures that the embedded ICC profile is correctly respected. 
> The default PNG reader at least till JDK11 does *not* respect the embedded 
> ICC profile. I.e. the colors are wrong. But there is a workaround for this in 
> the PNGConverterTest (which I have in production for years now). See the 
> screenshot for the correct color display of the png_rgb_romm_16.png testfile 
> (left side; macOS Preview app) and the wrong display (right side; Java; 
> inside IDEA).
>  
> Besides finding bugs like the one in the PNGConverter, access to the raw 
> image also allows all kinds of fun color things. E.g. a future patch could 
> allow using the raw images to print PDFs. If the PDF you want to print has 
> images with a gamut > sRGB (i.e. all modern cameras) and the target printer 
> also has a gamut > sRGB (i.e. some ink photo printer) you will for sure see a 
> difference in the resulting print. Such a mode would be rather slow, as the 
> current sRGB image handling is optimized for speed and using the original raw 
> images would need on-demand color conversions in the printer driver. But you 
> get "high quality" out of it (at least with respect to colors).
> I don’t think this is in time for the 2.0.20 release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

