[
https://issues.apache.org/jira/browse/PDFBOX-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335603#comment-14335603
]
Siegfried Goeschl commented on PDFBOX-2128:
-------------------------------------------
Would you consider a non-breaking API change to handle image conversion via
callback? I had a look at it and discussed the idea with Andreas at
ApacheCon@Budapest. I'm currently using some custom glue code to convert images
using TwelveMonkeys, JAI, an external JBIG2 library but you hit troubles if you
have multiple ImageIO Readers for the same graphic format. Therefore I think
having a callback hook allows you to solve the issues (if you really have to)
> CMYK images are not supported correctly
> ---------------------------------------
>
> Key: PDFBOX-2128
> URL: https://issues.apache.org/jira/browse/PDFBOX-2128
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 1.8.5, 1.8.6, 2.0.0
> Environment: Windows 7 Professional
> Running jvm: Java HotSpot(TM) 64-Bit Server VM - 1.6.0_26-b03 - 20.1-b02 -
> Sun Microsystems Inc
> Reporter: Ludovic Davoine
> Labels: PDJpeg, cmyk, images
> Fix For: 2.1.0
>
> Attachments: 573636.pdf, DCTFilter.patch,
> PDFBOX-2128-CAPITAL.pdf-1.png, PDFBOX-2128-CAPITAL.pdf-2.png,
> PDFBOX-2128-PORSCHE_CMYK.pdf-1.png, PDFBOX-2128-PORSCHE_CMYK.pdf-2.png,
> porsche_cmyk.pdf-2.png
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> I have a PDF with CMYK images inside and i need to extract the images in the
> RGB format. But the PDJpeg class seems to not work correctly; the colors are
> bad. Example:
> - Original image in te PDF : http://ludoda.free.fr/IMAGE_IN_PDF.jpg
> - Extracted image: http://ludoda.free.fr/IMAGE_EXTRACTED.jpg
> You can download the PDF : http://ludoda.free.fr/PORSCHE_CMYK.PDF
> and try my simple Test Case (I'm using PDFbox 1.8.5):
> {code}
> import java.awt.image.BufferedImage;
> import java.io.File;
> import java.io.IOException;
> import java.util.Iterator;
> import java.util.List;
> import java.util.Map;
> import javax.imageio.ImageIO;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDPage;
> import org.apache.pdfbox.pdmodel.PDResources;
> import org.apache.pdfbox.pdmodel.graphics.xobject.PDJpeg;
> import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObject;
> import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;
> public class TestCase {
>
> public static void main(String[] args)
> {
> try
> {
> System.out.println("START EXTRACTING IMAGES...");
> read_pdf();
> System.out.println("COMPLETE");
> }
> catch (IOException ex)
> {
> System.out.println("" + ex);
> }
> }
> public static void read_pdf() throws IOException
> {
> PDDocument document = null;
> document = PDDocument.load("C:\\temp\\PORSCHE_CMYK.pdf");
> @SuppressWarnings("unchecked")
> List<PDPage> pages =
> document.getDocumentCatalog().getAllPages();
> Iterator<PDPage> iter = pages.iterator();
> int i =1;
> while (iter.hasNext())
> {
> PDPage page = (PDPage) iter.next();
> PDResources resources = page.getResources();
> Map<String, PDXObject> pageImages =
> resources.getXObjects();
> if (pageImages != null)
> {
> Iterator<String> imageIter =
> pageImages.keySet().iterator();
> while (imageIter.hasNext())
> {
> String key = (String) imageIter.next();
> if(pageImages.get(key) instanceof
> PDXObjectImage)
> {
> PDJpeg image = (PDJpeg)
> pageImages.get(key);
>
> // Test 1 : write2file
>
> image.write2file("C:\\workspace\\JAVA_PDFTools\\temp\\image" + i);
>
> // Test 2: getRGBImage
> BufferedImage
> bimage=image.getRGBImage();
> File outputfile = new
> File("C:\\workspace\\JAVA_PDFTools\\temp\\image" + i+"_buffered.jpg");
> ImageIO.write(bimage, "jpg",
> outputfile);
> i ++;
> }
> }
> }
> }
> }
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]