Hi,
We've been using PDFBox to extract images from PDF files and recently upgraded to PDFBox version 2.0.0-RC2. I noticed that the class PDXObjectImage has been renamed/rewritten, and that the method PDXObjectImage.write2OutputStream, which we used to write images to disk, no longer exists.

I've therefore been trying to use the new class PDImageXObject, following your example org.apache.pdfbox.tools.ExtractImages#write2file, to extract images from a PDF and write them to disk. It appears that there is a code path (IOUtils.copy etc.) for the RGB or Gray colorspace that just copies the unmodified JPEG stream.

However, I have a couple of JPEG images with RGB colorspace in a PDF. When I used this code to extract them and write them to disk, the resulting files could not be opened by any image viewer, which suggests that the images may be damaged. If I change the code to call ImageIOUtil.writeImage instead, the extracted images can be viewed without problems. But I don't know the implications of this change, as the code suggests that the JPEG will be converted.

Could you please explain why IOUtils.copy did not work properly for RGB or Gray, and what the recommended way to process these images is?

Kind regards,
Joe
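
P.S. In case it is useful for narrowing this down: one quick check on an extracted file is whether it at least begins with the JPEG start-of-image marker (FF D8) and ends with the end-of-image marker (FF D9). Below is a minimal sketch in plain Java with no PDFBox calls; the class name JpegMarkerCheck and its method names are made up for illustration. A missing start marker would suggest that the copied bytes are not a bare JPEG stream. A missing end marker is only a hint, since some otherwise valid files carry trailing bytes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox: checks the JPEG framing markers only.
public class JpegMarkerCheck {

    // True if the data begins with the JPEG start-of-image marker (FF D8).
    static boolean startsWithSoi(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8;
    }

    // True if the data ends with the JPEG end-of-image marker (FF D9).
    static boolean endsWithEoi(byte[] data) {
        int n = data.length;
        return n >= 2
                && (data[n - 2] & 0xFF) == 0xFF
                && (data[n - 1] & 0xFF) == 0xD9;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JpegMarkerCheck <extracted-file>");
            return;
        }
        // Read the whole extracted file; fine for typical image sizes.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("starts with SOI (FF D8): " + startsWithSoi(data));
        System.out.println("ends with EOI (FF D9): " + endsWithEoi(data));
    }
}
```

This only inspects the first and last two bytes, so it says nothing about whether the image data in between is decodable.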

