[ 
https://issues.apache.org/jira/browse/PDFBOX-5738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798287#comment-17798287
 ] 

Harald Kuhr edited comment on PDFBOX-5738 at 12/18/23 5:16 PM:
---------------------------------------------------------------

Hmmm...

The code in {{org.apache.pdfbox.filter.DCTFilter}} that handles 4-channel
"YCbCr" and converts it to CMYK seems very fishy to me... There really isn't
such a thing as "4-channel YCbCr" (unless, perhaps, YCbCrA), and looking at the
code in {{fromYCbCrtoCMYK}}, it seems to apply exactly the same algorithm as
YCCK to CMYK, just with the ITU Rec.601 YCC->RGB conversion instead of the
correct JPEG-defined YCC->RGB conversion...
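
To make the difference concrete, here is a minimal Java sketch (not the PDFBox
code) that converts one 4-channel sample to CMYK twice: once with the
full-range YCC->RGB equations used by JPEG/JFIF, and once with the
limited-range Rec.601 variant, which is assumed to be the one meant above. The
class and method names are hypothetical, the coefficients are the commonly
published ones, and the last step is the naive RGB->CMY inversion with K passed
through unchanged.

```java
/** Hypothetical sketch for illustration; this is not PDFBox's DCTFilter. */
final class YcckSketch {

    /** Rounds to the nearest integer and limits the result to 0..255. */
    static int clamp(float v) {
        return Math.max(0, Math.min(255, Math.round(v)));
    }

    /** Full-range (JPEG/JFIF) YCC->RGB, then naive RGB->CMY; K is passed through. */
    static int[] ycckToCmykJpeg(int y, int cb, int cr, int k) {
        int r = clamp(y + 1.402f * (cr - 128));
        int g = clamp(y - 0.344136f * (cb - 128) - 0.714136f * (cr - 128));
        int b = clamp(y + 1.772f * (cb - 128));
        return new int[] {255 - r, 255 - g, 255 - b, k};
    }

    /** Limited-range Rec.601 YCC->RGB (assumed variant), then the same naive step. */
    static int[] ycckToCmykRec601(int y, int cb, int cr, int k) {
        float luma = 1.164f * (y - 16);
        int r = clamp(luma + 1.596f * (cr - 128));
        int g = clamp(luma - 0.392f * (cb - 128) - 0.813f * (cr - 128));
        int b = clamp(luma + 2.017f * (cb - 128));
        return new int[] {255 - r, 255 - g, 255 - b, k};
    }

    public static void main(String[] args) {
        // Same neutral sample through both conversions.
        System.out.println(java.util.Arrays.toString(ycckToCmykJpeg(128, 128, 128, 40)));
        System.out.println(java.util.Arrays.toString(ycckToCmykRec601(128, 128, 128, 40)));
    }
}
```

For the neutral sample (128, 128, 128), the full-range equations give 127 per
CMY channel and the limited-range ones give 125. That is a small but visible
shift, of the kind described as "slightly different colors".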

I think TwelveMonkeys (TM) 3.9/3.10 would return "YCbCr" as the color space in
the standard metadata due to the bug in identifying the APP14/Adobe marker, and
I believe this is why you get the slightly different colors.

So I think a more appropriate action would be to just use
{{fromYCCKtoCMYK}} in both cases (i.e. ignore the obviously incorrect transform
value and fall back to the most likely case, YCCK), or to throw an exception
in the case of YCbCr.
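
The two options could look roughly like the sketch below. It is an assumption
about how such a decision might be structured, not existing PDFBox code: the
class, enum and method names are hypothetical, and the only external fact used
is the meaning of the Adobe APP14 transform values (0 = none/unknown,
1 = YCbCr, 2 = YCCK).

```java
/** Hypothetical sketch of the suggested fallback for 4-channel JPEGs. */
final class TransformPolicySketch {

    enum Conversion { NONE, YCCK_TO_CMYK }

    /**
     * Picks the conversion for a 4-channel image from the Adobe transform value.
     * With strict == false, the impossible "4-channel YCbCr" case is treated as YCCK;
     * with strict == true, it is rejected with an exception.
     */
    static Conversion chooseFor4Channels(int adobeTransform, boolean strict) {
        switch (adobeTransform) {
            case 0:
                return Conversion.NONE;           // plain CMYK, no color transform
            case 2:
                return Conversion.YCCK_TO_CMYK;   // the regular YCCK case
            case 1:
                if (strict) {
                    throw new IllegalArgumentException(
                            "Transform 1 (YCbCr) is not valid for a 4-channel image");
                }
                return Conversion.YCCK_TO_CMYK;   // fall back to the most likely case
            default:
                throw new IllegalArgumentException("Unknown transform: " + adobeTransform);
        }
    }
}
```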

PS: Feel free to "borrow" {{com.twelvemonkeys.imageio.color.YCbCrConverter}}
for faster LUT-based conversion. I got the Java version from Werner
Randelshofer, but I think it's really based on the C code from libjpeg.
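
For readers unfamiliar with the LUT idea, the following is a minimal sketch of
a table-driven full-range YCC->RGB conversion using 16-bit fixed-point tables,
in the style commonly associated with libjpeg. It is not the TwelveMonkeys
class; all names here are hypothetical, and the table layout is an assumption
made for illustration.

```java
/** Hypothetical LUT sketch; not the TwelveMonkeys YCbCrConverter. */
final class YccLutSketch {
    private static final int SCALE_BITS = 16;
    private static final int ONE_HALF = 1 << (SCALE_BITS - 1);

    // One entry per possible Cb or Cr sample value (0..255).
    private static final int[] CR_R = new int[256];
    private static final int[] CB_B = new int[256];
    private static final int[] CR_G = new int[256];
    private static final int[] CB_G = new int[256];

    /** Converts a coefficient to fixed point with SCALE_BITS fractional bits. */
    private static int fix(double x) {
        return (int) (x * (1 << SCALE_BITS) + 0.5);
    }

    static {
        for (int i = 0; i < 256; i++) {
            int x = i - 128; // center chroma around zero
            CR_R[i] = (fix(1.402) * x + ONE_HALF) >> SCALE_BITS;
            CB_B[i] = (fix(1.772) * x + ONE_HALF) >> SCALE_BITS;
            // The two green terms stay scaled and are summed before shifting;
            // the rounding constant is folded into the Cb table.
            CR_G[i] = -fix(0.714136) * x;
            CB_G[i] = -fix(0.344136) * x + ONE_HALF;
        }
    }

    private static int clamp(int v) {
        return v < 0 ? 0 : Math.min(v, 255);
    }

    /** Per-pixel conversion: table lookups, additions and one shift only. */
    static int[] toRgb(int y, int cb, int cr) {
        int r = clamp(y + CR_R[cr]);
        int g = clamp(y + ((CB_G[cb] + CR_G[cr]) >> SCALE_BITS));
        int b = clamp(y + CB_B[cb]);
        return new int[] {r, g, b};
    }
}
```

The per-pixel work avoids floating-point multiplication entirely, which is
where the speedup over a formula-based conversion would be expected to come
from.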


> Wrong colors in PDF since PDFBOX-5488
> -------------------------------------
>
>                 Key: PDFBOX-5738
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5738
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>            Reporter: Oliver Schmidtmer
>            Priority: Major
>         Attachments: PDFBOX-5488-p1.pdf-1-old.png, 
> PDFBOX-5488-p1.pdf-new.png, Rechnung 983312924 (Carbafas)_page1.jpg, Rechnung 
> 983312924 (Carbagas).pdf, gre_research_validiity_data_page1.jpg
>
>
> Since the workaround for PDFBOX-5488, the attached PDF renders with wrong
> colors.
> The underlying issue in PDFBOX-5488 might be a mismatch between the color
> space reported to the user in the metadata tree and the raw pixel data
> obtained when readRaster is used, at least if I understand this correctly:
> [https://github.com/haraldk/TwelveMonkeys/issues/571]
> For the default JPEG ImageReader this is not a problem, because
> reader.getImageMetadata throws "javax.imageio.IIOException: JFIF
> APP0 must be first marker after SOI", so "getAdobeTransformByBruteForce" is
> used instead of "getAdobeTransform".
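
As a rough picture of what a brute-force lookup of the Adobe transform involves,
here is a minimal Java sketch that scans raw JPEG bytes for an APP14 segment
(marker 0xFFEE) carrying the "Adobe" identifier and reads its transform byte.
It is not the PDFBox method of the same purpose: the class and method names
are hypothetical, and returning -1 when nothing is found is a choice made for
this sketch only.

```java
/** Hypothetical sketch; not PDFBox's getAdobeTransformByBruteForce. */
final class AdobeMarkerSketch {
    private static final byte[] ADOBE = {'A', 'd', 'o', 'b', 'e'};

    // Offset of the transform byte, counted from the start of "Adobe":
    // 5 (identifier) + 2 (version) + 2 (flags0) + 2 (flags1) = 11.
    private static final int POS_TRANSFORM = 11;

    /** Returns the transform byte, or -1 if no Adobe APP14 segment is found. */
    static int findAdobeTransform(byte[] jpeg) {
        // i points at the 0xFF of a candidate APP14 marker (0xFF 0xEE).
        for (int i = 0; i + 4 + POS_TRANSFORM < jpeg.length; i++) {
            if ((jpeg[i] & 0xFF) != 0xFF || (jpeg[i + 1] & 0xFF) != 0xEE) {
                continue;
            }
            // Segment length is big-endian and includes its own two bytes.
            int len = ((jpeg[i + 2] & 0xFF) << 8) | (jpeg[i + 3] & 0xFF);
            int id = i + 4;
            if (len < 2 + POS_TRANSFORM + 1 || !matchesAdobe(jpeg, id)) {
                continue;
            }
            return jpeg[id + POS_TRANSFORM] & 0xFF;
        }
        return -1;
    }

    private static boolean matchesAdobe(byte[] data, int offset) {
        for (int j = 0; j < ADOBE.length; j++) {
            if (data[offset + j] != ADOBE[j]) {
                return false;
            }
        }
        return true;
    }
}
```

Because this works on the raw bytes, it does not depend on the reader's
metadata parsing succeeding, which is the property the issue description
relies on.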



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
