[ https://issues.apache.org/jira/browse/PDFBOX-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160791#comment-13160791 ]
Andreas Lehmkühler commented on PDFBOX-1169:
--------------------------------------------

I found 3 different issues:

- The given PDF contains 2 images which are embedded in an XObjectForm that is itself embedded in another XObjectForm, so they couldn't be extracted using ExtractImages. I fixed that in revision 1209017.
- PDJpeg.write2OutputStream assumed that every PDJpeg contains JPEG image data because of the DCTFilter used, but PDJpegs may also contain CMYK-encoded image data, as in the given PDF. I fixed that in revision 1209015.
- The colors of the image are wrong, but I don't know why yet. I'm still investigating.

> Images extracted from PDF are loosing color (are shown in blackcolor)
> ---------------------------------------------------------------------
>
>                 Key: PDFBOX-1169
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1169
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.6.0
>         Environment: Windows
>            Reporter: susheel
>         Attachments: eBook-Mini.pdf, image-1.jpg, image-2.jpg
>
>
> Using PDFBox, tried to read file (eBook-Mini.pdf, which is attached)
> When images are extracted using below mentioned code, the extracted images
> aren't as per the ones in PDF, they have lost color.
> Checked extracting images, using other tools and images were extracted
> correctly.
> Attached images extracted using PDFBox as well.