[
https://issues.apache.org/jira/browse/PDFBOX-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798287#comment-17798287
]
Harald Kuhr edited comment on PDFBOX-5738 at 12/18/23 5:16 PM:
---------------------------------------------------------------
Hmmm...
The code in {{org.apache.pdfbox.filter.DCTFilter}} that handles 4-channel
"YCbCr" and converts it to CMYK seems very fishy to me... There really isn't
such a thing as "4-channel YCbCr" (unless, perhaps, YCbCrA), and looking at the
code in {{fromYCbCrtoCMYK}}, it seems to apply exactly the same algorithm as
YCCK to CMYK, just with the ITU Rec.601 YCC->RGB conversion instead of the
correct JPEG-defined YCC->RGB conversion...
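To make the difference concrete, here is a minimal sketch of the two YCC->RGB conversions, followed by the shared "invert RGB, keep K" step. This is not the actual {{DCTFilter}} code: the class and method names are made up for illustration, and the coefficients are the commonly published ones (full-range JFIF vs. studio-range Rec.601).

```java
final class YccToCmykSketch {
    /** Rounds and clamps a sample to the 0..255 range. */
    static int clamp(float v) {
        return Math.max(0, Math.min(255, Math.round(v)));
    }

    /** Full-range YCbCr -> RGB as used by JPEG/JFIF (Y in 0..255). */
    static int[] jpegYccToRgb(int y, int cb, int cr) {
        return new int[] {
            clamp(y + 1.402f * (cr - 128)),
            clamp(y - 0.344136f * (cb - 128) - 0.714136f * (cr - 128)),
            clamp(y + 1.772f * (cb - 128))
        };
    }

    /** Studio-range Rec.601 YCbCr -> RGB (Y nominally in 16..235). */
    static int[] rec601YccToRgb(int y, int cb, int cr) {
        float luma = 1.164f * (y - 16);
        return new int[] {
            clamp(luma + 1.596f * (cr - 128)),
            clamp(luma - 0.392f * (cb - 128) - 0.813f * (cr - 128)),
            clamp(luma + 2.017f * (cb - 128))
        };
    }

    /** Shared final step: invert RGB to get CMY, pass K through unchanged. */
    static int[] rgbAndKToCmyk(int[] rgb, int k) {
        return new int[] {255 - rgb[0], 255 - rgb[1], 255 - rgb[2], k};
    }
}
```

For a neutral mid-gray sample (Y=128, Cb=128, Cr=128) the JPEG formula gives R=128 while the Rec.601 formula gives R=130, which is the kind of small shift that would show up as slightly different colors.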
I think TwelveMonkeys (TM) 3.9/3.10 would return "YCbCr" as the color space in
the standard metadata due to the bug in identifying the App14/Adobe marker, and
I believe this is the reason why you get the slightly different colors.
So I think a more appropriate action to take would be to just use
{{fromYCCKtoCMYK}} in both cases (i.e. ignore the obviously incorrect transform
value, and just fall back to the most likely case, YCCK), or to throw an
exception in the case of YCbCr.
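A rough sketch of what I mean, for 4-channel data only. The names here ({{chooseFor4Channels}}, the {{strict}} flag, the enum) are hypothetical and not taken from PDFBox; the transform values are the ones defined for the Adobe APP14 marker (0 = none, 1 = YCbCr, 2 = YCCK).

```java
final class TransformFallbackSketch {
    // Adobe APP14 "transform" flag values.
    static final int TRANSFORM_UNKNOWN = 0; // samples stored as-is (CMYK for 4 channels)
    static final int TRANSFORM_YCBCR = 1;   // only meaningful for 3 channels
    static final int TRANSFORM_YCCK = 2;

    enum Conversion { NONE, YCCK_TO_CMYK }

    /**
     * Picks the conversion for a 4-channel JPEG.
     * With strict == false, a reported "YCbCr" is treated as YCCK;
     * with strict == true, it is rejected instead.
     */
    static Conversion chooseFor4Channels(int adobeTransform, boolean strict) {
        switch (adobeTransform) {
            case TRANSFORM_UNKNOWN:
                return Conversion.NONE;
            case TRANSFORM_YCCK:
                return Conversion.YCCK_TO_CMYK;
            case TRANSFORM_YCBCR:
                if (strict) {
                    throw new IllegalArgumentException(
                            "YCbCr transform reported for 4-channel data");
                }
                // Most likely a misreported YCCK image: fall back to YCCK.
                return Conversion.YCCK_TO_CMYK;
            default:
                throw new IllegalArgumentException(
                        "Unknown Adobe transform: " + adobeTransform);
        }
    }
}
```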
PS: Feel free to "borrow" {{com.twelvemonkeys.imageio.color.YCbCrConverter}}
for faster LUT-based conversion. I got the Java version from Werner
Randelshofer, but I think it's really based on the C code from libjpeg.
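For readers unfamiliar with the LUT approach, the general idea is to precompute the per-sample chroma contributions in fixed point, so the per-pixel work is a few table lookups and adds. The following is only a sketch of that technique as I understand it, not the actual {{YCbCrConverter}} source; the class name, table names and the 16-bit scale are my own choices for illustration.

```java
final class YccLutSketch {
    private static final int SCALEBITS = 16;
    private static final int ONE_HALF = 1 << (SCALEBITS - 1);

    private static final int[] CR_R = new int[256]; // Cr contribution to R
    private static final int[] CB_B = new int[256]; // Cb contribution to B
    private static final int[] CR_G = new int[256]; // Cr contribution to G (still scaled)
    private static final int[] CB_G = new int[256]; // Cb contribution to G (scaled, incl. rounding)

    /** Converts a coefficient to fixed point with SCALEBITS fractional bits. */
    private static int fix(double v) {
        return (int) (v * (1 << SCALEBITS) + 0.5);
    }

    static {
        for (int i = 0; i < 256; i++) {
            int x = i - 128; // center chroma around zero
            CR_R[i] = (fix(1.40200) * x + ONE_HALF) >> SCALEBITS;
            CB_B[i] = (fix(1.77200) * x + ONE_HALF) >> SCALEBITS;
            CR_G[i] = -fix(0.71414) * x;
            CB_G[i] = -fix(0.34414) * x + ONE_HALF; // rounding term folded in once
        }
    }

    private static int clamp(int v) {
        return Math.max(0, Math.min(255, v));
    }

    /** Full-range JPEG YCbCr -> RGB using only lookups, adds and one shift. */
    static int[] toRgb(int y, int cb, int cr) {
        return new int[] {
            clamp(y + CR_R[cr]),
            clamp(y + ((CB_G[cb] + CR_G[cr]) >> SCALEBITS)),
            clamp(y + CB_B[cb])
        };
    }
}
```

The green channel keeps its two terms scaled until they are summed, so the rounding happens once rather than twice.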
> Wrong colors in PDF since PDFBOX-5488
> -------------------------------------
>
> Key: PDFBOX-5738
> URL: https://issues.apache.org/jira/browse/PDFBOX-5738
> Project: PDFBox
> Issue Type: Bug
> Components: Rendering
> Reporter: Oliver Schmidtmer
> Priority: Major
> Attachments: PDFBOX-5488-p1.pdf-1-old.png,
> PDFBOX-5488-p1.pdf-new.png, Rechnung 983312924 (Carbafas)_page1.jpg, Rechnung
> 983312924 (Carbagas).pdf, gre_research_validiity_data_page1.jpg
>
>
> Since the workaround for PDFBOX-5488, the attached PDF has wrong colors.
> The base issue from PDFBOX-5488 might be a difference between the reported
> color space from the metadata-tree for the user and the raw pixel data when
> readRaster is used, at least if I understand this correctly:
> [https://github.com/haraldk/TwelveMonkeys/issues/571]
> For the default JPEG Image Reader this is not a problem, as
> reader.getImageMetadata throws an Exception "javax.imageio.IIOException: JFIF
> APP0 must be first marker after SOI" and "getAdobeTransformByBruteForce" is
> used instead of "getAdobeTransform".