[
https://issues.apache.org/jira/browse/PDFBOX-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160791#comment-13160791
]
Andreas Lehmkühler commented on PDFBOX-1169:
--------------------------------------------
I found 3 different issues:
- the given PDF contains 2 images embedded in an XObjectForm that is itself
embedded in another XObjectForm, so ExtractImages could not extract them. I
fixed that in revision 1209017.
- PDJpeg.write2OutputStream assumed that every PDJpeg contains JPEG image data
because the DCTFilter is used, but a PDJpeg may also contain CMYK-encoded image
data, as in the given PDF. I fixed that in revision 1209015.
- the colors of the image are wrong, but I don't know why yet. I'm still
investigating.
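
For illustration only, the first issue comes down to descending into form
XObjects at any depth instead of looking only at the top-level resources. The
sketch below uses hypothetical stand-in types (XObject, Image, Form); they are
not the PDFBox classes, and this is not the change made in revision 1209017.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for PDF resources; these are NOT PDFBox classes.
final class NestedImageWalk {

    /** Anything that can appear in a resource dictionary's XObject map. */
    interface XObject {}

    /** A leaf: an image, identified here only by a name. */
    static final class Image implements XObject {
        final String name;
        Image(String name) { this.name = name; }
    }

    /** A form XObject: carries its own map of nested XObjects. */
    static final class Form implements XObject {
        final Map<String, XObject> xobjects = new LinkedHashMap<>();
    }

    /** Collects every image reachable from the map, descending into forms. */
    static List<String> collectImages(Map<String, XObject> xobjects) {
        List<String> found = new ArrayList<>();
        collect(xobjects, found);
        return found;
    }

    private static void collect(Map<String, XObject> xobjects, List<String> found) {
        for (XObject x : xobjects.values()) {
            if (x instanceof Image) {
                found.add(((Image) x).name);
            } else if (x instanceof Form) {
                // A form may itself contain forms, so recurse rather than stop here.
                collect(((Form) x).xobjects, found);
            }
        }
    }
}
```

A real implementation would probably also need to guard against forms that
reference each other in a cycle; the sketch assumes the nesting is a tree.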
> Images extracted from PDF are losing color (are shown in black)
> ---------------------------------------------------------------
>
> Key: PDFBOX-1169
> URL: https://issues.apache.org/jira/browse/PDFBOX-1169
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 1.6.0
> Environment: Windows
> Reporter: susheel
> Attachments: eBook-Mini.pdf, image-1.jpg, image-2.jpg
>
>
> Using PDFBox, I tried to read the attached file eBook-Mini.pdf.
> When the images are extracted using the code mentioned below, they do not
> match the ones in the PDF: they have lost their color.
> Extracting the images with other tools produced correct results.
> The images extracted with PDFBox are attached as well.